matwrap -- Wrap C++ functions/classes for various matrix languages

matwrap is a script to generate wrapper functions for
matrix-oriented scripting languages so that C++ subroutines or member
functions can be called. It doesn't support non-matrix-oriented
scripting languages like perl and python and tcl because Dave Beazley's
program SWIG is such a good wrapper generator for those languages.
Someday I hope that all of the features in this wrapper generator are
incorporated into SWIG, but since I don't understand SWIG well enough to
do it myself, I'm releasing this separately. SWIG is available from
http://bifrost.lanl.gov/~dmb/SWIG/ or
http://www.cs.utah.edu/~beazley/SWIG/.

matwrap can handle the following constructs:

=item Ordinary functions

For example, suppose you have some functions defined in an C<.h> file:

  float fiddle(double arg);
  double tweedle(int x, char *name);

You can access these directly from MATLAB by using the following:

  matwrap -language matlab -o myfuncs_wrap.c fiddle.h
  cmex myfuncs.o myfuncs_wrap.c -o myfuncs_wrap

Then, in MATLAB, you can do the following:

  y = tweedle(3, 'Hello, world');
  A = fiddle([3, 4; 5, 6]);

Note especially the last statement, where instead of passing a scalar as
the argument, we pass a matrix. The C function fiddle() is called
repeatedly on each element of the matrix and the result is returned as a
2x2 matrix.

Floats, doubles, char *, integer, unsigned, and pointers to structures
may be used as arguments. Support for other data types (e.g., various
C++ classes) is possible and may be easily added since the modules have
been written for easy extensibility. Function pointers are not
currently supported in any form. C++ operator definitions are not
supported.

You can access public member functions and simple public data members of

  ABC(int constructor_arg);
  void do_something(float number, int idx);

From MATLAB or a similar language, you would access this structure like
this:

  ABC_ptr = ABC_new(3);             % Call the constructor and return a pointer.
  ABC_do_something(ABC_ptr, pi, 4); % Call the member function.
  abc_x = ABC_get_x(ABC_ptr);       % Get the value of a data member.
  ABC_set_x(ABC_ptr, 3.4);          % Set the data member.
  ABC_delete(ABC_ptr);              % Discard the structure.
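
The flat function names used from MATLAB above suggest what the generated layer looks like on the C++ side. The following is only a sketch under assumptions, not the code matwrap actually emits: the data member C<x>, its type, and the body of C<do_something()> are invented for illustration.

```cpp
#include <cassert>

// Hypothetical class loosely modelled on the ABC example. The member x,
// its type, and the body of do_something() are assumptions.
class ABC {
public:
  explicit ABC(int constructor_arg) : x(0.0f), arg_(constructor_arg) {}
  void do_something(float number, int idx) { x = number * idx + arg_; }
  float x;

private:
  int arg_;
};

// Flat functions in the style of the names used from the scripting
// language: the object is always passed as an explicit pointer.
ABC *ABC_new(int constructor_arg) { return new ABC(constructor_arg); }
void ABC_do_something(ABC *p, float number, int idx) {
  p->do_something(number, idx);
}
float ABC_get_x(const ABC *p) { return p->x; }
void ABC_set_x(ABC *p, float value) { p->x = value; }
void ABC_delete(ABC *p) { delete p; }
```

Because the scripting language only ever holds the pointer, the object lives until C<ABC_delete> is called explicitly.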

Accessing data members is often extremely useful when you are attempting
to figure out why your code returns 27.3421 when it ought to return

The same thing will work for C structs--the only difference is that they
have only data members and no member functions.

Only public members are accessible from the scripting language.
Operator overloading and function overloading are not supported.
Function pointers are not supported.

You can also call functions that take arrays of data, provided that they
accept the arrays in a standard format. For example, suppose you want
to use the pgplot distribution to make graphs (e.g., if you're using a
scripting language that doesn't have good graphing capability). The
following function generates a histogram of data:

  void cpgbin(int nbin, const float *x, const float *data, Logical center);

Here x[] are the abscissae values and data[] are the data values. If
you add to your .h file a simple statement indicating the dimensions of
the matrices, like this:

  //%input x(nbin), data(nbin)

then from a MATLAB-like language, you can call this function like this:

  cpgbin(X, Data, center)

where C<X> and C<Data> are vectors. The C<nbin> argument is determined
from the length of the C<X> and C<Data> vectors automatically (and the
wrapper generator makes sure they are of the same length!).
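
To make the idea concrete, here is a minimal sketch of a hand-written wrapper that takes the count from the array length and checks that both arrays agree. It is not matwrap output; C<sum_bins> and C<call_sum_bins> are hypothetical names, and C<sum_bins> merely stands in for a routine with a C<cpgbin>-style argument list.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Stand-in for a wrapped routine that takes an explicit element count.
// It only sums the data values so that the sketch computes something.
float sum_bins(int nbin, const float *x, const float *data) {
  (void)x;  // the abscissae are not needed for this toy computation
  float total = 0.0f;
  for (int i = 0; i < nbin; ++i) total += data[i];
  return total;
}

// Hypothetical wrapper: the caller never passes nbin. It is taken from
// the length of x, and data is required to have the same length.
float call_sum_bins(const std::vector<float> &x,
                    const std::vector<float> &data) {
  if (x.size() != data.size())
    throw std::invalid_argument("x and data must have the same length");
  const int nbin = static_cast<int>(x.size());
  return sum_bins(nbin, x.data(), data.data());
}
```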

This will also work with multidimensional arrays, provided that the
function expects the array to be a single one-dimensional array which is
really the concatenation of the columns of the two-dimensional array.
(This is normal for Fortran programs.) The first array dimension varies
the fastest, the second the next fastest, etc. (This is column major
order, as in Fortran, not row-major order, as in C. Most matlab-like
languages use the Fortran convention. Tela is an exception.)
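
As a small illustration of the two storage orders, the helper functions below (hypothetical names, not part of matwrap) compute the offset of element (i, j) in a flat array under each convention.

```cpp
#include <cassert>
#include <cstddef>

// Offset of element (i, j) in column-major (Fortran) order: the first
// index varies fastest, so each column is stored contiguously.
std::size_t column_major_index(std::size_t i, std::size_t j,
                               std::size_t n_rows) {
  return j * n_rows + i;
}

// Offset of the same element in row-major (C) order, for comparison:
// the last index varies fastest, so each row is stored contiguously.
std::size_t row_major_index(std::size_t i, std::size_t j,
                            std::size_t n_cols) {
  return i * n_cols + j;
}
```

For a 2 by 3 matrix, element (1, 2) is the last of six elements in both layouts, but element (0, 1) sits at offset 2 in column-major order and at offset 1 in row-major order.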

You may use a variable name or a constant for the array dimension.
You can also use expressions like C<2*nbin> or C<2*nbin+1>. If the
expression is sufficiently simple, the wrapper generator will determine
the values of any integer values (like C<nbin> in this example) from the
dimension of the input arrays, so they do not have to be specified as an
argument.

In theory, this could be made to work with an ANSI C compiler, but I
haven't tried to yet. Currently, you must have a full C++ compiler.
I've used primarily gcc and I tested very briefly with DEC's cxx.

If you are using matlab, then you can tell matwrap to use C<mxCalloc>
instead of C<alloca> by specifying C<-use_mxCalloc> somewhere on the
command line. Otherwise, you must have a compiler that supports
C<alloca()>. (gcc does.)

C<alloca()> is usually a little more efficient than C<mxCalloc()>. It
allocates space on the stack rather than the heap. Unfortunately, you
may have a limited stack size, and so C<alloca()> may fail for large
temporary arrays. In this case, you may need to issue a command like

  unix('unlimit stacksize')

or else use the C<-use_mxCalloc> option.

=item A relatively recent version of perl

I've tested this only with perl 5.004 and 5.005. Check out
http://www.perl.com/ for how to get perl.

  matwrap -language languagename [-options] infile1.h infile2.h

  matwrap -language languagename [-options] \
      -cpp cxx [-options_to_C_compiler] infile.cxx

Using the first form, without the C<-cpp> flag, files are parsed in the order
listed, so you should put any files with required typedefs and other
definitions first. These files are C<#include>d by the generated
wrapper code; in fact, they are the only files which are C<#include>d.
This form can be used 1) if you don't have any C<#if>s or macros that confuse
the parser in your code; 2) if you can easily list all of the include files
that define the relevant structures.

Alternatively, you can use the C<-cpp> flag to have matwrap
run the C preprocessor on your files. This means that all of the
relevant definitions of types will be found, however deeply they are
nested in the C<#include> hierarchy. It also means that wrapper
generation runs considerably slower. Matwrap will attempt
to guess which files need to be C<#include>d, but it may guess wrong.

Overloaded functions and definitions of operators are not supported. C++
classes are supported (this is the main reason for this script). Member
functions may be called, and member fields may be accessed.

Run the C preprocessor on your file before parsing it. This is
necessary if you are using any #ifdefs in your code. Following the -cpp
option should be a complete compiler command, e.g.,

  matwrap -language octave -o myfile_wrap.cxx \
      -cpp g++ -Iextra_includes -Dmy_thingy=3 myfile.cxx

All words after the -cpp option are ignored (and passed verbatim to the
compiler), so you must supply a C<-o> option before the C<-cpp>. Note that
C<-o> and similar compiler options relevant for actual compilation are
ignored when just running the preprocessor, so you can substitute your
actual compilation command without modification. If you do not supply
the C<-E> flag in the compiler command, it will be inserted for you
immediately after the name of the compiler. Also, the C<-C> option is
added along with the C<-E> option so that any comments can be processed
and put into the documentation strings. (As far as I know all compilers
support C<-C> and C<-E> but undoubtedly this won't work well with some. It
works fine with gcc.)

When run in this way, C<matwrap> does not generate wrappers for
any functions or classes defined in files located in C</usr/include> or
C</usr/local/include> or in subdirectories of C<*/gcc-lib>. (Most
likely you don't want to wrap the entire C library!) You can specify
additional directories to ignore with the -cpp_ignore option. If you
really want to wrap functions in one of those C<.h> files, either copy
the C<.h> file or just the relevant function definitions into a file in
another directory tree. You can also restrict the functions which are
wrapped using the -wraponly option (see below).

=item -cpp_ignore filename_or_directory

Ignored unless used with the -cpp option. Causes functions defined in
the given file name or in include files in the given directory or
subdirectories of it not to be wrapped. By default, functions defined
in C</usr/include>, C</usr/local/include>, or C<*/gcc-lib> are not
wrapped.

Specify the name of the output file. If this is not specified, the name is
inferred from the input files. Some language modules (e.g., MATLAB)
will not infer a file name from your source files (this is for your
protection, so we don't accidentally wipe out a C<.c> file with the same
name). If you use the C<-cpp> option, you must also specify the C<-o>
option before the C<-cpp> option.

=item -language <language_name>

Specify the language. This option is mandatory.

=item -wraponly <list>

Specify a list of global functions or variables or classes to wrap. The
list extends to the end of the command line, so this must be the last
option. Definitions of all functions and classes not explicitly listed
are ignored. This allows you to specify all the C<.h> files that you
need to define all the types, but only to wrap some of the functions.

Global functions and variables are specified simply by name. Classes
are specified by the word 'class' followed by the class name. For
example,

  matwrap -language matlab myfile.h \
      -wraponly myglobalfunc class myclass

Input files are designed to be your ordinary .h files, so your wrapper
and your C++ sources are never out of date. In general, the wrapper
generator does the obvious thing with each different kind of type. For
example, consider the function declaration:

  double abcize(float a, int b, char *c, SomeClass *d);

This will pass a single-precision floating point number as argument C<a>
(probably converting from double precision or integer, depending on what
the interpreted language stored the value as). An integer is passed as
argument C<b> (probably converted from a double precision value). A
null-terminated string is passed as argument C<c> (converted from
whatever weird format the language uses). The argument C<d> must be a
pointer value which was returned by another function.

Vectorization is automatically performed, so that if you pass a matrix
of C<m> by C<n> inputs as argument C<a> and arguments C<b> and C<c> as
either scalars or C<m> by C<n> matrices, then the function will be
called C<m*n> times and the result will be an C<m> by C<n> matrix.
By default, a function is vectorized if it has both inputs and outputs
(see under C<//%vectorize> below). Most matrix languages do not support
vectors of strings in a natural way, so C<char *> arguments are not
vectorized.
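
The vectorization rule can be sketched as a loop in which each argument is either a scalar (reused on every call) or an array with one value per call. This is an illustrative sketch only, not generated code; C<scale> and C<scale_vectorized> are hypothetical names, and flat vectors stand in for C<m> by C<n> matrices.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Scalar function of the kind being wrapped (an arbitrary stand-in).
double scale(float a, int b) { return static_cast<double>(a) * b; }

// Sketch of vectorization: a holds one value per call, while b is either
// a single value reused on every call or an array of the same size as a.
std::vector<double> scale_vectorized(const std::vector<float> &a,
                                     const std::vector<int> &b) {
  if (b.size() != 1 && b.size() != a.size())
    throw std::invalid_argument("b must be a scalar or match a in size");
  std::vector<double> out(a.size());
  for (std::size_t k = 0; k < a.size(); ++k)
    out[k] = scale(a[k], b.size() == 1 ? b[0] : b[k]);
  return out;
}
```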

Passing arguments by reference is handled in the expected way. For
example, given the declaration

  void fortran_sub(double *inarg1, float *inarg2);

pointers to double and single precision numbers will be passed to the
subroutine instead of the numbers themselves.

This creates an ambiguity for the type C<char *>. For example, consider
the following two functions:

  void f1(char *a);
  void f2(unsigned char *b);

Matwrap assumes that the function C<f1> is passed a null terminated
string, despite the fact that the argument C<a> could be a pointer to a
buffer where C<f1> returns a character. Although this situation can be
disambiguated with proper use of the C<const> qualifier, matwrap treats
C<char *> and C<const char *> as identical since many programs don't use
C<const> properly. Matwrap assumes, however, that C<unsigned char *>
is not a null terminated string but an C<unsigned char> variable passed
by reference. You can also force it to interpret C<char *> as a signed
char passed by reference by specifying the qualifier C<//%input a(1)>.

If you want to pass arguments as arrays, or if there are outputs other
than the return value of the function, you must declare these explicitly
using the C<//%input> or C<//%output> qualifiers. All qualifiers follow
the definition of the function (after the C<;> or the closing C<}> if it
is an inline function). Valid qualifiers are:

=item //%novectorize_type type1, type2, ...

Specifies that all arguments of the given types should not be vectorized
even if it is possible. This could be useful if you have a class which
there will be only one copy of, so it is pointless to vectorize.
(This qualifier may be present anywhere in the file.)

Following the definition of a global function or member function,
directs matwrap not to try to vectorize the function. For
some functions, vectorization simply doesn't make sense. By default,
matwrap won't vectorize a function if it has no output
arguments or no input arguments.

Following the definition of a global function or member function,
directs matwrap to vectorize the function. By default, matwrap won't
vectorize a function if it has no output arguments or no input
arguments. This is normally what you want, but sometimes it makes
sense to vectorize a function with no output arguments.

Don't wrap this function. It will therefore not be callable directly
from your scripting language.

=item //%name new_name

Specify a different name for the function when it is invoked from the
scripting language.

=item //%input argname(dim1, dim2, ...), argname(dim)

Following the declaration of a global function or member function,
declares the dimensions of the input arguments with the given name.
This declaration must immediately follow the prototype of the function.
Dimension strings may contain any arbitrary C expression. If the
expression is sufficiently simple, e.g., "n" or "n+1" or "2*n", and if
the expression includes another argument to the function ("n" in this
case), then the other argument will be calculated from the dimensions of
the input variable and need not be specified as an argument in the
scripting language.

For example, if you have a function which is declared like this:

  void myfunc(int n, double *x, double *y);
  //%input x(n)
  //%output y(n*(n+1)/2)

n would be calculated from the dimension of the variable x and then used
to compute the size of the output array. So you would call the function
without specifying n.

On the other hand, if you had a specification like this:

  void return_diag(int n, double *x, double *y);
  //%input x(n*(n+1)/2)

then n will have to be explicitly specified because it is too difficult
to infer from that expression.

=item //%modify argname(dim1, dim2, ...), argname(dim1)

=item //%output argname(dim1, dim2, ...), argname(dim1)

Same as C<//%input> except that this also tags the variables as modify
or output variables. If you don't specify a dimension expression (e.g.,
"//%output x") then the variable is tagged as a scalar output variable.
(This is the proper way to tell matwrap to make an argument an
output.)

=head2 Unsupported C++ constructs

=item Function overloading

=item Operator definition

=item Function and member function pointers

It would be really nice to support these, but I think it's also really
hard.

=item Two-dimensional arrays using a vector of pointers

You can use two-dimensional arrays as long as they are stored internally
as a single long vector, as in Fortran. In this case, the array
declaration would be C<float *x>, and the C<i,j>'th element is accessed
by C<x[j*n+i]>. You cannot use two dimensional arrays if they are
declared like C<float **x> and accessed like C<x[i][j]>. Unfortunately,
the Numerical Recipes library uses this format for all its
two-dimensional matrices, so at present you can only wrap Numerical
Recipes functions which take scalars or vectors. This restriction might
be lifted in the future.
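
The difference between the two layouts can be sketched as follows. The first function reads the flat layout described above. The C<RowPointerMatrix> class is a hypothetical shim, not something matwrap provides: one possible way to hand flat column-major data to a routine that insists on C<float **>, at the cost of a copy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Element (i, j) of a flat column-major array with n_rows rows.
float flat_at(const float *x, int n_rows, int i, int j) {
  return x[j * n_rows + i];
}

// Hypothetical shim: copies the flat data into row-major storage and
// builds a table of row pointers, so the result is usable as x[i][j].
class RowPointerMatrix {
public:
  RowPointerMatrix(const float *x, int n_rows, int n_cols)
      : storage_(static_cast<std::size_t>(n_rows) * n_cols), rows_(n_rows) {
    for (int i = 0; i < n_rows; ++i) {
      rows_[i] = storage_.data() + static_cast<std::size_t>(i) * n_cols;
      for (int j = 0; j < n_cols; ++j) rows_[i][j] = flat_at(x, n_rows, i, j);
    }
  }
  // Copying would leave the row pointers aimed at the old storage.
  RowPointerMatrix(const RowPointerMatrix &) = delete;
  RowPointerMatrix &operator=(const RowPointerMatrix &) = delete;

  float **data() { return rows_.data(); }

private:
  std::vector<float> storage_;
  std::vector<float *> rows_;
};
```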

=item Arrays with an offset

The Numerical Recipes code is written so that most of its indices begin
at 1 rather than at 0, I guess because its authors are Fortran junkies.
This causes a problem, because it means that the pointer you pass to the
subroutine is actually not the beginning of the array but before the
beginning. You can get around this restriction by passing an extra
blank element in your array. For example, suppose you want to wrap the
function to return the Savitzky-Golay filter coefficients:

  void savgol(float c[], int np, int nl, int nr, int ld, int m);

where the index in the array C<c> is declared to run from 1 to np.
You'd have to declare the array like this:

  //%output c(np+1)

and then ignore the first element. Thus from MATLAB you'd call it with
the following sequence:

  savgol_coefs = savgol(np, nl, nr, ld, m);
  savgol_coefs = savgol_coefs(2:length(savgol_coefs));
                                % Discard the unused first element.
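
The same padding trick can be sketched in C++. The routine C<fill_one_based> below is a made-up stand-in for a Numerical Recipes style function that writes C<c[1]> through C<c[np]>; it is not the real C<savgol>.

```cpp
#include <cassert>
#include <vector>

// Stand-in for a routine that uses 1-based indexing: it writes c[1..np]
// and never touches c[0].
void fill_one_based(float c[], int np) {
  for (int k = 1; k <= np; ++k) c[k] = 10.0f * k;
}

// Allocate np + 1 elements so that index np is valid, call the routine,
// then return the result without the unused element 0.
std::vector<float> call_fill_one_based(int np) {
  std::vector<float> padded(np + 1, 0.0f);
  fill_one_based(padded.data(), np);
  return std::vector<float>(padded.begin() + 1, padded.end());
}
```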

=item Passing structures by value or C++ reference

In other words, if Abc is the name of a class, declarations like

won't work. However, you can pass a pointer to the class:

The wrapper generator will do the type checking and it even handles
inheritance properly.

For more examples, see the subdirectories of F<share/matwrap/Examples>
in the distribution. This includes a wrapper for the entire PGPLOT
library (directory F<pgplot>) and a sample C++ simulator for a neuron
governed by the Hodgkin-Huxley equations (directory F<single_axon>).

=head1 Support for different languages

Currently, you must compile the generated wrapper code using C++, even
if you are wrapping only C functions with no C++ classes. You can
compile your C functions using C as you please; you may have to put a
C<extern "C" { }> statement in the .h file. This restriction may be
lifted in the future.
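
A common way to write such a header is to guard the C<extern "C"> block so that the same file is valid for both C and C++ compilers. The sketch below uses a hypothetical function C<add_ints>; for the sake of a self-contained example the definition is given in the same file, whereas in practice it would sit in a separately compiled C source file.

```cpp
#include <cassert>

// Header part: the guard makes the declarations usable from C and C++.
#ifdef __cplusplus
extern "C" {
#endif

int add_ints(int a, int b); /* hypothetical C function */

#ifdef __cplusplus
}
#endif

// Definition with C linkage. Normally this would live in a .c file
// compiled with a C compiler and linked with the wrapper.
extern "C" int add_ints(int a, int b) { return a + b; }
```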

The default maximum number of dimensions supported is four. You can
change this by modifying the $max_dimensions variable near the top of
the file share/matwrap/wrap_matlab.pl in the distribution.

Specify C<-language matlab> on the command line to use the matlab code
generator. You MUST also use C<-o> to specify the output file name.
(This is because matlab wrappers have an extension of C<.c> and if we
infer the file name from the name of include files, it's quite likely
that we'll wipe out something that shouldn't be wiped out.)

An annoying restriction of MATLAB is that only one C function can be
defined per mex file. To get around this problem, the wrapper generator
defines a C function which takes an extra parameter, which is a code for
the function you actually want to call. It also defines a series of
MATLAB stub functions to supply the extra parameter. Each of these must
be placed into its own separate file (because of another MATLAB design
inadequacy) so wrapper generation for MATLAB may actually create
hundreds of files if you have a lot of member functions.
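
The single-entry-point scheme can be sketched like this. It only illustrates the idea and is not the generated MEX code: the function codes, C<dispatch>, and the stub names are all hypothetical, and the stubs are written in C++ here although the real ones are MATLAB C<.m> files.

```cpp
#include <cassert>
#include <stdexcept>

// Two functions we would like to expose (arbitrary stand-ins).
static double times_two(double v) { return 2.0 * v; }
static double flip_sign(double v) { return -v; }

// The one exported entry point: the extra first parameter is a code
// that selects which function is actually called.
double dispatch(int function_code, double arg) {
  switch (function_code) {
    case 0: return times_two(arg);
    case 1: return flip_sign(arg);
    default: throw std::out_of_range("unknown function code");
  }
}

// Stubs play the role of the tiny per-function files: each one only
// supplies the code and forwards its arguments.
double stub_times_two(double v) { return dispatch(0, v); }
double stub_flip_sign(double v) { return dispatch(1, v); }
```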

You can specify where you want the C<.m> files to be placed using the
C<-outdir> option, like this:

  matwrap -language matlab -outdir wrap_m \
      myfuncs.h -o myfuncs_matlab.c
  mex -f mex_gcc_cxx myfunc

This will create dozens of tiny C<.m> files which are placed into the
directory C<wrap_m>, and a single mexfile with the name F<myfuncs>. DO
NOT CHANGE THE NAME OF THE MEX FILE! The C<.m> files assume that the
name of the C subroutine is the name of the file, in this case,
F<myfuncs>. (You can move the mex file to a different directory, if you
want, so long as it is still in your matlabpath).

To wrap C++ functions in MATLAB, you'll probably need to specify the
C<-f> option to the mex command, as shown above. You'll need to create
the mex options file so that the appropriate libraries get linked in for
C++. For example, on the machine that I use, I created the file
F<mex_gcc_cxx> which contains the following instructions:

  . mexopts.sh                  # Load the standard definitions.
  CLIBS='-lg++ -lstdc++ -lgcc -lm -lc'

This works with other C++ compilers if you set C<CC> and C<CLIBS> to use the
appropriate compiler and libraries (e.g., C<CLIBS=-lcxx> and C<CC=cxx>
for cxx on Digital Unix).

By default, matwrap uses C<alloca()> to allocate temporary memory. If
for some reason you want to use C<mxCalloc()>, specify C<-use_mxCalloc>
somewhere on the command line.

The following features of matlab are not currently supported:

=item Vectors of strings

It would be nice to be able to return whole C++ structures as MATLAB
structures. Maybe this will happen in the future.

Do not try to pass a cell array instead of a numeric array to a C++
function. It won't work; the wrapper code does not support it.

One quirk of operation which can be annoying is that MATLAB likes to use
row vectors instead of column vectors. This can be a problem if you
write some C code that expects a vector input, like this:

  void myfunc(double *d, int n_d); //%input d(n_d)

Suppose now you try to invoke it with the following matlab commands:

  >> myfunc(0:0.1:pi)

The range C<0:0.1:pi> is a row vector, not a column vector. As a
result, a dimension error will be returned if C<myfunc> is not vectorized
(which would be the default with these arguments), because the function
is expecting an n_d by 1 array instead of a 1 by n_d array. If you
allowed C<myfunc> to be vectorized, then C<myfunc()> will be called once
for each element of the range, with C<n_d = 1>. This is almost
certainly not what you wanted. I haven't yet figured out a good way to
handle this. Anyway, be careful, and always transpose ranges, like
this:

  >> myfunc((0:0.1:pi)')

Octave is much like matlab in that it only allows one callable function
to be put into a .oct file. The function in the .oct file therefore
takes an extra argument which indicates which C++ function you actually
wanted to call. Fortunately, unlike matlab, octave can define more than
one function per file so we don't have to have a separate .m file for
each function. Instead, the functions are all placed into a separate
file whose name you specify on the command line with the -stub option.

To compile an octave module, you would use the following command:

  matwrap -language octave -stub myfuncs_stubs.m \
      myfuncs.h -o myfuncs_octave.cc
  mkoctfile myfuncs_octave

Note that you can't do this unless you have the F<mkoctfile> script
installed. F<mkoctfile> is not available in some binary distributions.

Then, in octave, you must first load the stub functions:

  octave:1> myfuncs_stubs
  octave:2> # Now you may call the functions.

DO NOT CHANGE THE NAME OF THE .oct FILE! Its name is written into the
stub functions. You can move the file into a different directory,
however, so long as the directory is in your LOADPATH.

(The F<mkoctfile> script for octave versions below 2.0.8 has an annoying
restriction that prevents additional libraries from being linked into
your module if your linker is sensitive to the order of the libraries on
the command line. The F<mkoctfile> script for versions 2.0.8 and 2.0.9
in theory supports libraries on the command line but it doesn't work.
Patches to fix F<mkoctfile> for these versions of octave are provided in
F<share/matwrap/mkoctfile_2_0_8_or_9.patch> and
F<share/matwrap/mkoctfile_before_2_0_8.patch>.)

If you compile your source code to .o or .a files separately, on many
systems you need to force the compiler to make position-independent code
(C<-fPIC> option to gcc). Remember you are making a shared library, so
follow the rules for making shared libraries on your system. The
F<mkoctfile> script should do this for you automatically if you have it
compile your source files, but if you compile to .o files first and give
these to F<mkoctfile>, you may have to be careful to specify the
appropriate flags on the C<cc> or C<c++> command line.

Octave doesn't seem to provide a good way to support modify variables,
i.e., variables that are taken as input and modified and returned as
output. For example, suppose you have the function

  void myfunc(float *a, int a_n); //%modify a(a_n)

which takes the array C<a> as input, does something to it, and returns
its output in the same place. In octave, this would be called as:

  a_out = myfunc(a_in);

rather than modifying the argument in place,
as it might be from other languages.

Octave has the same quirk as MATLAB in the usage of row vectors where
matwrap expects column variables. See the end of the section on MATLAB
for details.

Tela (Tensor Language) is a MATLAB clone which is reputed to be considerably
faster than MATLAB and has a number of other nice features biased toward PDEs.
It can be found at http://www.geo.fmi.fi/prog/tela.html.

Specify C<-language tela> to invoke the Tela wrapper generator, like this:

  matwrap -language tela myfuncs.h -o myfuncs.ct
  telakka myfuncs.ct other_files.o -o tela

That's pretty much all there is to it. Tela doesn't support arrays of
strings so C<char *> parameters are not vectorized. Otherwise, just
about everything should work as you expect.

WARNING: Tela stores data internally using a row-major scheme instead of
the usual column-major ordering, so the indexes of Tela arrays are in
reverse order from the index specification order in the C<%input>,
C<%output>, and C<%modify> declarations. Sorry, it wasn't my idea.

The tela code generator does not currently support C<short> or

=head2 A note on debugging

Since both MATLAB and Octave use dynamically loadable libraries, it can
be tricky to debug your C++ code. MATLAB has a documented way of making
a standalone program, but I found this extremely inconvenient. If you
have gdb, it is sometimes easier to use the "attach" command if your
operating system supports it. (Linux and Digital Unix do; I do not know
about other operating systems.) Start up MATLAB or octave as you
normally would, and load the shared library by calling some function in
it that doesn't cause it to crash. (Or, put a "sleep(30)" in an
appropriate place in the code, so there is enough time for you to catch
it between when it loads the library and when it crashes.) Then while
MATLAB or Octave is at the prompt or waiting, attach to the
octave/MATLAB process using gdb, set your breakpoints, allow the program
to continue, type the command that fails, and debug away.

=head1 Writing new language support modules

Matlab 5, octave, and Tela are the only language modules that I've
written so far. It's not hard to write a language module--most of the
tricky stuff has been taken care of by the main wrapper generator
program. It's just a bit tedious.

The parsing in matwrap is entirely independent of the target language.
The back end is supplied by one of several language modules, as
specified by the C<-language> option.

The interface is designed to make it easy to generate automatically
vectorized functions. Vectorization is done automatically by the
matwrap code, independent of the language module. All subroutines
except those with no output arguments or no input arguments are
vectorized unless explicitly requested otherwise.

Typically, the function_start() function in the language module will
output the function header to the file and declare the arguments to the
function. After this, the wrapper generator writes C code to check the
dimensions of the arguments.

After checking the dimensions of all variables, the value of the
variable is obtained from the function get_c_arg_scalar/get_c_arg_ptr.
This returns a pointer to the variable, so if it is vectorized we can
easily step through the pointer array. Note that if the desired type is
"float" and the input is an array of "double", then the language module
will have to make a temporary array of floats. Output variables are
then created by calling make_output_scalar/make_output_ptr.

Next, the C function is called as many times as required.

Next, any modify/output arguments need to have the new values put back
into the scripting language variables. This is accomplished by the
put_val_scalar/put_val_ptr function. Temporary arrays may be freed
here. Note that put_val is not called for input arguments so temporary
arrays of input arguments will have to be freed some other way.

Finally, the function function_end is called to do any final cleanup and
terminate the function definition.

The following functions and variables must be supplied by the language
module. They should be in a package whose name is the same as the
argument to the C<-language> option.

=item C<$max_dimensions>

A scalar value indicating the maximum number of dimensions this language can
handle (or, at least, the maximum number of dimensions that our scripts will
handle). This is 2 for languages like Matlab or Octave which can only have
2-dimensional matrices.

=item C<arg_pass(\%function_def, $argname)>

A C or C++ expression used to pass the argument to another function
which does not know anything about the type of the argument. For
example, in the MATLAB module this function returns an expression for
the mxArray type for a given argument.

=item C<arg_declare(">arg_name_in_arglistC<")>

This returns a C/C++ declaration appropriate for the argument passed
using arg_pass. For example, in the MATLAB module this function returns
"mxArray *arg_name_in_arglist".

=item C<declare_const(">constant nameC<", ">class nameC<", ">typeC<")>

Output routines to make a given constant value accessible from the interpreter.
If "class name" is blank, this is a global constant.

None of the language modules currently support definition of constants,
but this function is called.

=item C<error_dimension(\%function_def, $argname)>

A C statement (including the final semicolon, if not surrounded by braces)
which indicates that an error has occurred because the dimension of argument

Called after all functions have been wrapped, to close the output file and do
whatever other cleanup is necessary.

=item C<function_start(\%function_def)>

This should prepare a documentation string entry for the function and it should
set up the definition of the function. It should return a string rather than

C<%function_def> is the array defining all the arguments and outputs for this
function. See below for its format.

=item C<function_end(\%function_def)>

Returns a string which finishes off the definition of a function wrapper.

=item C<get_outfile(\@files_processed)>

Get the name of an output file. This subroutine is only called if no output
file is specified on the command line. C<\@files_processed> is a list of the
C<.h> files which were parsed.

=item C<get_c_arg_scalar(\%function_def, $argname)>

Returns C statements to load the current value of the given argument
into the C variable C<$function_def{args}{$argname}{c_var_name}>. The
variable is guaranteed to be either a scalar or an array with dimensions
1,1,1... (depending on the scripting language, these may be identical).

=item C<get_c_arg_ptr(\%function_def, $argname)>

Returns C statements to set up a pointer which points to the first value
of a given argument. It is possible that the argument may be a scalar,
in which case we just want a pointer to that scalar value. (This
happens only for vectorizable arguments when the vectorization is not
used on this function call.) The dimensions are guaranteed to be
correct. The type of the argument should be checked. The pointer value
should be stored in the variable
C<$function_def{args}{$argname}{c_var_name}>.

The pointer should actually point to the array of all the values of the
variable. The array should have the same number of elements as the argument,
since to vectorize the function, the wrapper function will simply step through
this array. If we want a float type and the input vector is double or int,
then a temporary array must be made which is a copy of the double/int arrays.

=item C<get_size(\%function_def, $argname, $n)>

Returns a C expression which is the size of the C<$n>'th dimension of the given
argument. Dimension 0 is the least-significant dimension.

=item C<initialize($outfile, \@files_processed, \@cpp_command, $include_str)>

Write out header information.

  $outfile        The name of the output file. This file should
                  be opened, and the function should return the
                  name of a file handle (qualified with the
                  package name, e.g., "matlab::OUTFILE").

  @files          A list of files explicitly listed on the command
                  line. This will be a null array if no files
                  were explicitly listed.

  @cpp_command    The command string words passed to the C
                  preprocessor, if the C preprocessor was run.
                  Otherwise, this will be a null array.

  $include_str    A string of #include statements which represents
                  our best guess as to the proper files to include
                  to make this compilation work.

This function also should write out C++ code to define the following
functions:

  int _n_dims(argument)    Returns number of dimensions.
  int _dim(argument, n)    Returns the size in the n'th dimension,
                           where 0 is the first dimension.

=item C<make_output_scalar(\%function_def, $argname)>

Return C code to create the given output variable. The output variable

=item C<make_output_ptr(\%function_def, $argname, $n_dimensions, @dimensions)>

Return C code to set up a pointer to where to store the values of the output
variable. C<$n_dimensions> is a C expression, not necessarily a constant.
C<@dimensions> is a list of C expressions that are the sizes of each dimension.
There may be more values in @dimensions than are needed.

=item C<n_dimensions(\%function_def, $argname)>

Returns a C expression which is the number of dimensions of the argument whose

=item C<pointer_conversion_functions()>

Returns code to convert between pointer types and the language's
internal representation, if any special code is needed. If this
subroutine is not called, then there are no class types and pointers
will not need to be handled.

=item C<parse_argv(\@ARGV)>

Scan the argument list for language-specific options. This is called after the
C<-language> option has been parsed and removed from the C<@ARGV> array.

=item C<put_val_scalar(\%function_def, $argname)>

Returns C code to take the value from the C variable whose name is given
by C<$function_def{args}{$argname}{c_var_name}> and store it back in the
scripting language scalar variable.

=item C<put_val_ptr(\%function_def, $argname)>

Returns C code to take the value from the C array whose name is given by
C<$function_def{args}{$argname}{c_var_name}> and store it back in the
scripting language array at the specified index. The pointer
C<$function_def{args}{$argname}{c_var_name}> was set up by either
C<get_c_arg_ptr> or C<make_output_ptr>, depending on whether this is an
input/modify or an output variable.

=head2 The %function_def array

Many of these functions require a reference to the %function_def associative
array. This array defines everything that is known about the function.

First, there are a few entries that describe the interface to the scripting
language:

The name of the function.

The class of which this is a member function. This element will be blank
if it is a global function.

The name of the function in the scripting language. If this field is blank,
then the name of the function should be generated from the "class" and "name"
fields. This field is set by the C<%name> directive.
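
The fallback described above might look like the following sketch. The hash
key C<script_name> and the underscore-joined form are assumptions made for
illustration; each language module is free to build the name differently.

```perl
use strict;
use warnings;

# Hypothetical helper: choose the scripting-language name for a function.
# The key "script_name" and the "Class_name" form are assumptions.
sub script_name {
    my ($function_def) = @_;

    # An explicitly supplied name always wins.
    my $explicit = $function_def->{script_name};
    return $explicit if defined $explicit && $explicit ne '';

    # Global functions (blank class) keep their C name.
    my $class = $function_def->{class};
    return $function_def->{name} unless defined $class && $class ne '';

    # Member functions get the class name as a prefix.
    return "${class}_$function_def->{name}";
}

print script_name({ name => "do_something", class => "ABC" }), "\n";
print script_name({ name => "atan2", class => "" }), "\n";
```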

True if this is a static member function. Non-static member functions will
have the class pointer specified as the first argument in the argument list.

A list of the names of arguments to the scripting language function which are
only for input. Argument names are generated from the corresponding argument
names in the C function prototype.

A list of the names of arguments to the scripting language function which are
for both input and output. Argument names are generated from the corresponding
argument names in the C function prototype.

A list of the names of arguments to the scripting language function which are
for output. Argument names are generated from the corresponding argument names
in the C function prototype. "retval" is used as the name of the return value
of the function, if there is a return value.

An associative array indexed by the argument name which contains information
about each argument of the function. Note that there may be more arguments in
this associative array than in the inputs/modifies/outputs arrays because some
of the arguments to the function may be merely the dimension of arrays, which
are not arguments in the scripting language since they can be determined by

Note that there will also be an entry in the args array for "retval" if the
function has a return value, since the return value is treated as an output

The fields in this associative array are:

Whether this is an "input", "output", or "modify" variable, or whether
it can be calculated from the "dimension" of another variable. These
are the only legal values for this field.

The type of this argument, i.e., "float", "double", "int", "char *", or "<class
name> *" or various combinations involving "&", "*", and "const". All typedefs
have been translated to the basic types or class names, and "[]" is translated
to "*". Otherwise, no other modifications have been made.

Same as the "type" field, except that the "const" qualifiers have been
stripped, a trailing '&' has been deleted, and a trailing '*' has been
deleted if this is an array type or if it's a basic type like 'double',
'int', etc., which we recognize.
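
The stripping rules just described can be sketched as follows. This is an
illustration only, not the code matwrap uses; the helper name C<basic_type>,
the C<$is_array> flag, and the exact set of recognized basic types are
assumptions.

```perl
use strict;
use warnings;

# Types whose trailing '*' is dropped even for non-arrays. This set is an
# assumption; the text only says "a basic type like 'double', 'int', etc."
my %is_basic = map { $_ => 1 } qw(float double int unsigned);

# Hypothetical helper: reduce a full C type to its "basic_type" form.
sub basic_type {
    my ($type, $is_array) = @_;
    my $t = $type;
    $t =~ s/\bconst\b//g;        # strip const qualifiers
    $t =~ s/\s+/ /g;             # collapse runs of whitespace
    $t =~ s/^ //;                # trim leading space
    $t =~ s/ $//;                # trim trailing space
    $t =~ s/\s*&$//;             # delete a trailing '&'
    if ($t =~ /^(.*\S)\s*\*$/) { # trailing '*': drop it for arrays/basic types
        my $base = $1;
        $t = $base if $is_array || $is_basic{$base};
    }
    return $t;
}

print basic_type("const float *", 1), "\n";
print basic_type("double &", 0), "\n";
print basic_type("ABC *", 0), "\n";
```

In this sketch a class pointer such as "ABC *" and the string type "char *"
are left untouched, since neither is in the assumed set of basic types.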

The dimensions of this array argument. This is a reference to a list of
dimensions. Each element of the list must be the name of an integer argument
to the C function or else a decimal integer. If this argument is not an array,
then this field will still be present but will contain no elements.

Whether this argument may be supplied as a vector. If so, the wrapper
generator will automatically "vectorize" the function in the sense that MATLAB
functions like "sin" or "cos" are vectorized.

The variable name which contains the argument which is passed to the C
function. The c_var_name is guaranteed not to be the same as the argument name
itself, to avoid conflict with the argument declaration of the function.

If the argument is to be vectorized, or if the argument is an array,
then c_var_name is the name of a pointer to an array of the argument.
If the argument is not to be vectorized, then c_var_name is the name of
a variable containing the argument.

A C expression indicating how to calculate this particular variable from
the dimension of other input/modify variables. This field will not be
present if we don't see any way to calculate this variable from the

The remaining elements in the associative array for each function describe the
arguments to the C/C++ function and its return type:

A scalar containing the return type of the function. This information is also
contained in the "retval" entry in the "args" array.

A list containing the name of each argument in order in the C function's
argument list. If no name was specified in the prototype, a name is generated
for it, since our entire scheme depends on each argument having a unique name.

Whether a vectorized wrapper function should be generated at all, i.e., a
version which calls the C function once for each element of scalar arguments
which are passed in a vector. Note that vectors may be supplied for some
arguments but not others, depending on the "vectorize" field in the args array

=item pass_by_pointer_reference

True if we are supposed to pass a pointer to the argument, not the argument
itself. This is used for pass-by-reference when the type is "double *".
This is always 0 for arrays, which are handled separately.

=item Additional fields

The language module may add additional fields as necessary. Only those listed
above are set up or used by the main wrapper generator code.

For example, if the function prototype is

  double atan2(double y, double x)

  $global_functions{'atan2'} = {
      inputs   => ["y", "x"],
      outputs  => ["retval"],
      args     => { x      => { source     => "input",
                                basic_type => "double",
                                c_var_name => "_arg_x",
                                pass_by_pointer_reference => 0 },
                    y      => { source     => "input",
                                basic_type => "double",
                                c_var_name => "_arg_y",
                                pass_by_pointer_reference => 0 },
                    retval => { source     => "output",
                                basic_type => "double",
                                c_var_name => "_arg_retval",
                                pass_by_pointer_reference => 0 } },
      returns  => "double",
      argnames => ["y", "x"],
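
To show how a language module might consume such an entry, the sketch below
builds a MATLAB-style call signature from the C<inputs>, C<modifies>, and
C<outputs> lists. The helper name C<script_signature> and the placement of
modify arguments on both sides of the call are assumptions; this is not code
from matwrap itself.

```perl
use strict;
use warnings;

# Hypothetical helper: build "out = name(in1, in2)" from a function entry.
# Modify arguments are assumed to appear as both inputs and outputs.
sub script_signature {
    my ($name, $def) = @_;
    my @modifies = @{ $def->{modifies} || [] };
    my @in  = (@{ $def->{inputs}  || [] }, @modifies);
    my @out = (@modifies, @{ $def->{outputs} || [] });

    my $call = "$name(" . join(", ", @in) . ")";
    return $call             if @out == 0;   # no outputs at all
    return "$out[0] = $call" if @out == 1;   # single output
    return "[" . join(", ", @out) . "] = $call";
}

# Only the fields used by the helper are filled in here.
my %atan2 = (
    inputs  => ["y", "x"],
    outputs => ["retval"],
);

print script_signature("atan2", \%atan2), "\n";
```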

This function is sufficiently simple that all of the relevant
information can be filled out automatically, without any help from the
user. For a more complicated function, it may not be possible to do so.
For example, consider the following function (from the pgplot

  void cpgbin(int nbin, const float *x, const float *data, Logical center);

This function plots a histogram of the given data, where C<x[]> are the
abscissa values and C<data[]> are the data values. C<Logical> has been
defined by a typedef statement earlier in the .h file to be C<int>.

By default, the wrapper generator will interpret the C<float *> as a
declaration to pass a scalar argument by reference. In this case, this
is not what is wanted, so the definition file must contain additional

  void cpgbin(int nbin, const float *x, const float *data, Logical center);

This tells us that the x and data arrays are the same size, which is given by
nbin. With this information, then, the following will be produced:

  $global_functions{'cpgbin'} = {
      inputs   => ["x", "data", "center" ],
      args     => { "nbin"   => { source => "dimension",
                                  pass_by_pointer_reference => 0 },
                    "x"      => { source => "input",
                                  basic_type => "float",
                                  dimension => ["nbin"],
                                  pass_by_pointer_reference => 0 },
                    "data"   => { source => "input",
                                  basic_type => "float",
                                  dimension => ["nbin"],
                                  pass_by_pointer_reference => 0 },
                    "center" => { source => "input",
                                  pass_by_pointer_reference => 0 } },
      argnames => ["nbin", "x", "data", "center" ],

Note that since this function has no output arguments, we do not attempt
to provide a vectorized version of it.
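
A language module can tell which C arguments never appear in the scripting
language by checking the C<source> field. The following sketch works on a
trimmed-down copy of the cpgbin entry; the helper names
C<script_visible_args> and C<arrays_sized_by> are hypothetical and exist only
for this example.

```perl
use strict;
use warnings;

# Trimmed copy of the cpgbin entry; only the fields used here are kept.
my %cpgbin = (
    argnames => ["nbin", "x", "data", "center"],
    args     => {
        nbin   => { source => "dimension" },
        x      => { source => "input", dimension => ["nbin"] },
        data   => { source => "input", dimension => ["nbin"] },
        center => { source => "input", dimension => [] },
    },
);

# Hypothetical helper: C arguments that the script-level caller supplies.
sub script_visible_args {
    my ($def) = @_;
    return grep { $def->{args}{$_}{source} ne "dimension" }
           @{ $def->{argnames} };
}

# Hypothetical helper: array arguments whose size determines $dim_name.
sub arrays_sized_by {
    my ($def, $dim_name) = @_;
    return grep {
        my $dims = $def->{args}{$_}{dimension} || [];
        grep { $_ eq $dim_name } @$dims;   # count of matching dimensions
    } @{ $def->{argnames} };
}

print join(", ", script_visible_args(\%cpgbin)), "\n";
print join(", ", arrays_sized_by(\%cpgbin, "nbin")), "\n";
```

Here C<nbin> is filtered out of the script-level argument list, and both
C<x> and C<data> are reported as arrays from which it can be computed.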

Gary Holt (holt@LNC.usc.edu).

The latest version of matwrap should be available from
http://LNC.usc.edu/~holt/matwrap/.