PDL::Impatient - PDL for the impatient (quick overview)

Executive summary of what PDL is about.

Perl is an extremely good and versatile scripting language that is well
suited to beginners and allows rapid prototyping. However, until recently it
did not support data structures that allowed it to do fast number crunching.

With the development of Perl v5, Perl acquired 'Objects'. Put simply, users
can define their own special data types and write custom routines to
manipulate them, either in low-level languages (C and Fortran) or in Perl
itself.

This has been fully exploited by the PerlDL developers. The 'PDL' module is a
complete Object-Oriented extension to Perl (although you don't have to know
what an object is to use it) which allows large N-dimensional data sets, such
as large images, spectra, time series, etc., to be stored B<efficiently> and
manipulated B<en masse>. For example, with the PDL module we can write the
Perl code C<$a=$b+$c>, where C<$b> and C<$c> are large datasets (e.g. 2048x2048
images), and get the result in only a fraction of a second.
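
As a rough illustration of the whole-array addition described above, here is a minimal sketch. The variable names and the synthetic "images" built with C<sequence> and C<ones> are assumptions made for the example; real data would normally come from a file.

```perl
use strict;
use warnings;
use PDL;

# Two synthetic 2048x2048 "images" (illustrative stand-ins for real data).
my $image1 = sequence(2048, 2048);   # pixel values 0, 1, 2, ... in storage order
my $image2 = ones(2048, 2048);       # every pixel set to 1

# A single statement adds every pair of corresponding pixels.
my $sum = $image1 + $image2;

print "number of pixels: ", $sum->nelem, "\n";
print "first pixel:      ", $sum->at(0, 0), "\n";
```

No explicit loop over pixels appears anywhere; the loop runs inside PDL's compiled code.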

PDL variables (or 'piddles' as they have come to be known)
support a wide range of fundamental data types: arrays can be bytes,
short integers (signed or unsigned), long integers, floats or
double precision floats. And because of the Object-Oriented nature
of PDL, new customised datatypes can be derived from them.

As well as the PDL modules, which can be used by normal Perl programs, PerlDL
comes with a command-line Perl shell, called 'perldl', which supports command
line editing. In combination with the various PDL graphics modules this allows
data to be easily played with and visualised.

PDL contains extensive documentation, available both within the
I<perldl> shell and from the command line, using the C<pdldoc> program.
For further information try either of:

HTML copies of the documentation should also be available.
To find their location, try the following:

 perldl> foreach ( map{"$_/PDL/HtmlDocs"}@INC ) { p "$_\n" if -d $_ }

=head2 Perl Datatypes and how PDL extends them

The fundamental Perl data structures are scalar variables, e.g. C<$x>,
which can hold numbers or strings, lists or arrays of scalars, e.g. C<@x>,
and associative arrays/hashes of scalars, e.g. C<%x>.

Perl v5 introduced data structures and objects to Perl. A simple
scalar variable C<$x> can now be a user-defined data type or full-blown
object (it actually holds a reference (a smart "pointer") to this,
but that is not relevant for ordinary use of perlDL).

The fundamental idea behind perlDL is to allow C<$x> to hold a whole 1D
spectrum, or a 2D image, a 3D data cube, and so on up to large
N-dimensional data sets. These can be manipulated all at once, e.g.
C<$a = $b + 2> does a vector operation on each value in the data set.

You may well ask: "Why not just store a spectrum as a simple Perl C<@x>
style list with each pixel being a list item?" The two key answers to
this are I<memory> and I<speed>. Because we know our spectrum consists of
pure numbers we can compactly store them in a single block of memory
corresponding to a C-style numeric array. This takes up a LOT less
memory than the equivalent Perl list. It is then easy to pass this
block of memory to a fast addition routine, or to any other C function
which deals with arrays. As a result perlDL is very fast: for example,
one can multiply a 2048*2048 image in essentially the same time as it
would take in C or FORTRAN (0.1 sec on my SPARC). A further advantage
of this is that for simple operations (e.g. C<$x += 2>) one can manipulate
the whole array without caring about its dimensionality.
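
The point about dimensionality can be sketched as follows. The names C<$spectrum> and C<$cube> are made up for the example; the same C<+= 2> statement is applied to a 1D and a 3D piddle without any change.

```perl
use strict;
use warnings;
use PDL;

my $spectrum = zeroes(5);         # 1D: 5 elements
my $cube     = zeroes(4, 3, 2);   # 3D: 4 x 3 x 2 = 24 elements

# Identical code for both, whatever the number of dimensions.
$spectrum += 2;
$cube     += 2;

print "spectrum total: ", sum($spectrum), "\n";   # 5 elements, each now 2
print "cube total:     ", sum($cube), "\n";       # 24 elements, each now 2
```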

When using perlDL I find it most useful to think of standard Perl
C<@x> variables as "lists" of generic "things" and PDL variables like
C<$x> as "arrays" which can be contained in lists or hashes. Quite
often in my perlDL scripts I have C<@x> contain a list of spectra, or a
list of images (or even a mix!). Or perhaps one could have a hash
(e.g. C<%x>) of images... the only limit is memory!
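
A small sketch of that mental model, with invented names (C<@things>, C<%images>, the C<flat> and C<dark> keys): piddles are ordinary scalars, so they can sit in Perl lists and hashes like anything else.

```perl
use strict;
use warnings;
use PDL;

# A Perl list holding a mix: a 1D "spectrum" and a 2D "image".
my @things = ( sequence(5), zeroes(3, 3) );

# A Perl hash of images, looked up by name.
my %images = (
    flat => ones(4, 4),
    dark => zeroes(4, 4),
);

# Each list element is a complete array with its own shape.
print join('x', $_->dims), "\n" for @things;

# Hash entries take part in whole-array arithmetic directly.
my $corrected = $images{flat} - $images{dark};
print "corrected total: ", sum($corrected), "\n";
```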

perlDL variables support a range of data types: arrays can be bytes,
short integers (signed or unsigned), long integers, floats or
double precision floats.

PerlDL is loaded into your Perl script using this command:

 use PDL;  # in perl scripts: use the standard perlDL modules

There are also a lot of extension modules, e.g.
L<PDL::Graphics::TriD|PDL::Graphics::TriD>.
Most of these (but not all, as sometimes it is not appropriate) follow
a standard convention. If you say:

 use PDL::Graphics::TriD;

you import everything in a standard list from the module. Sometimes
you might want to import nothing (e.g. if you want to use OO syntax
all the time and save the import tax). For these you say:

 use PDL::Graphics::TriD '';

The blank quotes C<''> are recognised as meaning 'nothing'. You
can also specify a list of functions to import in the normal Perl way.
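
To show what "OO syntax all the time" can look like, here is a hedged sketch. It assumes the C<PDL::Lite> module is present in your installation; that module is not mentioned above and is used here only because it loads the core modules with empty import lists.

```perl
use strict;
use warnings;
use PDL::Lite;   # assumption: loads the PDL core without importing any functions

# With nothing imported, constructors are called as class methods
# and operations as object methods.
my $values = PDL->pdl([1, 2, 3, 4]);
my $total  = $values->sum;

print "total = $total\n";
```

The trade-off is a little more typing in exchange for a namespace that stays free of imported function names.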

There is also an interactive shell, C<perldl>; see I<perldl>.

=head2 To create a new PDL variable

Here are some ways of creating a PDL variable:

 $a = pdl [1..10];             # 1D array
 $a = pdl (1,2,3,4);           # Ditto
 $b = pdl [[1,2,3],[4,5,6]];   # 2D 3x2 array
 $b = pdl 42;                  # 0-dimensional scalar
 $c = pdl $a;                  # Make a new copy

 $d = byte [1..10];            # See "Type Conversion"
 $e = zeroes(3,2,4);           # 3x2x4 zero-filled array

 $c = rfits $file;             # Read FITS file

 @x = ( pdl(42), zeroes(3,2,4), rfits($file) ); # Is a LIST of PDL variables!

The L<pdl()|PDL::Core/pdl> function is used to initialise a PDL variable from a scalar,
list, list reference or another PDL variable.

In addition all PDL functions automatically convert normal Perl scalars
to PDL variables on-the-fly.

(Also see the "Type Conversion" and "Input/Output" sections below.)
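
The shapes produced by these constructors can be checked directly. The sketch below uses lexical variables with made-up names and leaves out C<rfits>, since that needs a FITS file on disk.

```perl
use strict;
use warnings;
use PDL;

my $vector = pdl([1 .. 10]);                  # 1D, 10 elements
my $matrix = pdl([[1, 2, 3], [4, 5, 6]]);     # 2D: 3 columns, 2 rows
my $scalar = pdl(42);                         # 0-dimensional
my $cube   = zeroes(3, 2, 4);                 # 3D, zero-filled

# pdl() applied to a piddle makes a new copy, so changing the copy
# leaves the original untouched.
my $copy = pdl($vector);
$copy += 100;

printf "matrix dims: %s\n", join('x', $matrix->dims);
printf "scalar has %d dimensions\n", $scalar->getndims;
printf "cube has %d elements\n", $cube->nelem;
printf "vector starts at %d, copy at %d\n", $vector->at(0), $copy->at(0);
```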

=head2 Arithmetic (and boolean expressions)

 $a = $b + 2; $a++; $a = $b / $c; # Etc.

 $c=sqrt($a); $d = log10($b+100); # Etc

 $e = $a>42; # Vector conditional

 $e = 42*($a>42) + $a*($a<=42); # Cap top

 $b = $a->log10 unless any ($a <= 0); # avoid floating point error

 $a = $a / ( max($a) - min($a) );

 $f = where($a, $a > 10); # where returns a piddle of elements for
                          # which the condition is true

 print $a; # $a in string context prints it in a N-dimensional format

(and other Perl operators/functions)
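
The "cap top" idiom and C<where> are easier to follow with concrete numbers. This is a sketch with invented data; C<hclip> is brought in only as a cross-check and is not part of the examples above.

```perl
use strict;
use warnings;
use PDL;

my $data = pdl(10, 50, 42, 100);

# A comparison yields a piddle of 0s and 1s, one per element: 0 1 0 1
my $mask = $data > 42;

# Multiplying by the mask selects 42 where the value is too big
# and the original value elsewhere: 10 42 42 42
my $capped = 42 * ($data > 42) + $data * ($data <= 42);

# where() keeps only the elements that satisfy the condition: 50 42 100
my $large = where($data, $data > 10);

# hclip() clips from above and should agree with the mask arithmetic.
my $clipped = $data->hclip(42);

print "mask:    $mask\n";
print "capped:  $capped\n";
print "large:   $large\n";
print "clipped: $clipped\n";
```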

When using piddles in conditional expressions (i.e. C<if>, C<unless> and
C<while> constructs) only piddles with exactly one element are allowed, e.g.

 print "is set" if $a->index(2);

Note that the boolean operators in general return multi-element
piddles. Therefore, the following will raise an error

 print "is ok" if $a > 3;

since, for a 4-element C<$a>, C<$a E<gt> 3> is a piddle with 4 elements. Rather use
L<all|PDL::Ufunc/all> or L<any|PDL::Ufunc/any>
to test if all or any of the elements fulfill the condition:

 print "some are > 3" if any $a>3;
 print "can't take logarithm" unless all $a>0;
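
A runnable sketch of these rules, using an invented 4-element piddle C<$flags>. The failing conditional is wrapped in C<eval> so that the script can report the error instead of stopping.

```perl
use strict;
use warnings;
use PDL;

my $flags = pdl(1, 0, 0, 1);

# A single element is fine in a conditional.
print "element 0 is set\n" if $flags->index(0);

# A multi-element result in a conditional raises an error; trap it.
my $ok = eval { print "never printed\n" if $flags > 3; 1 };
my $failed = !$ok;
print "multi-element conditional failed, as expected\n" if $failed;

# any() and all() reduce the comparison to a single truth value.
print "some are > 0\n"        if any($flags > 0);
print "not all of them are\n" unless all($flags > 0);
```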

There are also many predefined functions, which are described on other
manpages. Check L<PDL::Index>.

=head2 Matrix functions

C<'x'> is hijacked as the matrix multiplication operator, e.g.
C<$c = $a x $b>.

perlDL is row-major, not column-major, so this is actually
C<c(i,j) = sum_k a(k,j) b(i,k)>, but when matrices are printed the
results will look right. Just remember the indices are reversed.

Note: L<transpose()|PDL::Basic/transpose>
does what it says and is a convenient way
to turn row vectors into column vectors. In early perlDL releases it was
also bound to the unary operator C<'~'> for convenience; check your
version, since later releases use C<'~'> for bitwise NOT.
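
A worked sketch of the C<x> operator and C<transpose()>. The matrices are made up for illustration; note how the first PDL dimension counts columns, which is the index reversal mentioned above.

```perl
use strict;
use warnings;
use PDL;

# Printed as 3 rows of 4 columns; PDL dims are (4,3).
my $m1 = pdl([ [1,  2, 3, 0],
               [1, -1, 2, 7],
               [1,  0, 0, 1] ]);

# Printed as 4 rows of 2 columns; PDL dims are (2,4).
my $m2 = pdl([ [1, 1],
               [0, 2],
               [0, 2],
               [1, 1] ]);

# Matrix product: 3 rows of 2 columns, with rows (1 11), (8 10), (2 2).
my $product = $m1 x $m2;
print $product;

# transpose() swaps the first two dimensions: (4,3) becomes (3,4).
my $flipped = $m1->transpose;
print "transposed dims: ", join('x', $flipped->dims), "\n";
```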

=head2 How to write a simple function

A function saved in a file named after it (e.g. F<dotproduct.pdl>) is
autoloaded if you are using L<PDL::AutoLoader|PDL::AutoLoader> (see below).

Of course, a dot product is already available as the
L<inner|PDL::Primitive/inner>
function, see L<PDL::Primitive>.
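
A minimal sketch of such a function, assuming the name C<dotproduct> implied by the file name above. The body is one plausible implementation, not necessarily the original one, and it is checked against C<inner>.

```perl
use strict;
use warnings;
use PDL;

# Dot product of two piddles of the same shape:
# multiply element by element, then add everything up.
sub dotproduct {
    my ($u, $v) = @_;
    return sum($u * $v);
}

my $p = pdl(1, 2, 3);
my $q = pdl(4, 5, 6);

print "dotproduct: ", dotproduct($p, $q), "\n";   # 1*4 + 2*5 + 3*6 = 32
print "inner:      ", inner($p, $q), "\n";
```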

=head2 Type Conversion

The default type for pdl() is double. Conversions are:

 $c = long($d);   # "long" is generally a 4 byte int

Also double(), short(), ushort().

These routines also automatically convert Perl lists to
allow the convenient shorthand:

 $a = byte [[1..10],[1..10]];  # Create 2D byte array
 $a = float [1..1000];         # Create 1D float array
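
The effect of a conversion can be inspected with the C<type> method, as in this sketch with invented values. It relies on the type object turning into its name when formatted as a string.

```perl
use strict;
use warnings;
use PDL;

my $reals = pdl(1.7, 2.2, 300);    # pdl() defaults to double
my $ints  = long($reals);          # fractional parts are dropped: 1 2 300
my $grid  = byte([[1 .. 10], [1 .. 10]]);   # list shorthand: 10 x 2 byte array

# Format each type object as a string to get its name.
my $reals_type = sprintf "%s", $reals->type;
my $ints_type  = sprintf "%s", $ints->type;
my $grid_type  = sprintf "%s", $grid->type;

print "reals: $reals_type, ints: $ints_type, grid: $grid_type\n";
print "ints:  $ints\n";
```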

=head2 Piddles and boolean expressions

Using a piddle in string context automatically expands the array in
N-dimensional format:

 $b = "Answer is = $a ";
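
A short sketch of the string expansion, using an invented 1D piddle:

```perl
use strict;
use warnings;
use PDL;

my $values  = pdl(1, 2, 3);

# Interpolation turns the piddle into its printed form.
my $message = "Answer is = $values ";

print $message, "\n";
```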

perlDL betrays its Perl/C heritage in that arrays are zero-offset.
Thus a 100x100 image has indices C<0..99,0..99>.

Furthermore [Which modules!?!],
the convention is that the center of the pixel (0,0)
IS at coordinate (0.0,0.0). Thus the above image ranges from
C<-0.5..99.5, -0.5..99.5> in real space. All perlDL graphics functions
conform to this definition and hide away the unit-offsetness
of, for example, the PGPLOT FORTRAN library.

Again following the usual convention, coordinate (0,0) is displayed
at the bottom left when displaying an image. It appears at the
top left when using "C<print $a>" etc.

 $b = $a->slice("$x1:$x2,$y1:$y2,$z1:$z2"); # Take subsection

 # Set part of $bigimage to values from $smallimage
 ($tmp = $bigimage->slice("$xa:$xb,$ya:$yb")) .= $smallimage;

 $newimage = ins($bigimage,$smallimage,$x,$y,$z...) # Insert at x,y,z

 $c = nelem ($a); # Number of pixels

 $val = at($object, $x,$y,$z...)    # Pixel value at position
 $val = $object->at($x,$y,$z...)    # equivalent
 set($myimage, $x, $y, ... $value)  # Set value in image

 $b = xvals($a); # Fill array with X-coord values (also yvals(), zvals(),
                 # axisvals($x,$axis) and rvals() for radial distance)
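
The calls above can be tried on a tiny image. This is a sketch with invented names and sizes; it shows that assigning through a slice with C<.=> changes the parent array.

```perl
use strict;
use warnings;
use PDL;

my $big   = zeroes(5, 5);
my $small = ones(2, 2);

# A slice is a window onto $big; assigning to it with .= writes into $big.
(my $window = $big->slice("1:2,1:2")) .= $small;

print "pixels:       ", nelem($big), "\n";      # 5 * 5 = 25
print "inside (1,1): ", $big->at(1, 1), "\n";   # set by the slice assignment
print "outside:      ", $big->at(3, 3), "\n";   # still zero

# set() changes one pixel.
set($big, 0, 0, 7);

# xvals() fills an array of the same shape with each pixel's X index.
my $xcoords = xvals(zeroes(3, 2));
print $xcoords;
```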

The C<PDL::IO> modules implement several useful IO format functions.
It would be too much to give examples of each so you are referred to
the individual manpages for details.

Ascii, FITS and FIGARO/NDF IO routines.

=item PDL::IO::FastRaw

Using the raw data types of your machine, an unportable but blindingly
fast IO format. Also supports memory mapping to conserve memory as
well as get more speed.

=item PDL::IO::FlexRaw

General raw data formats.

=item PDL::IO::Browser

A Curses browser for arrays.

Portable bitmap and pixmap support.

Using the previous module and netpbm, makes it possible to easily
write GIF, JPEG and whatever with simple commands.
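
As one small taste of the IO modules, here is a hedged sketch of a round trip through C<PDL::IO::FastRaw>. The file name and the temporary directory are assumptions for the example; see the module's manpage for the authoritative description.

```perl
use strict;
use warnings;
use PDL;
use PDL::IO::FastRaw;
use File::Temp qw(tempdir);

# Work in a scratch directory that is removed when the script exits.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/ramp.raw";       # hypothetical file name

my $data = sequence(4, 3);

writefraw($data, $file);          # raw data in $file, description in "$file.hdr"
my $restored = readfraw($file);

print "restored dims: ", join('x', $restored->dims), "\n";
```

Because the format uses the machine's own data representation, files written this way are best read back on the same kind of machine.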

The philosophy behind perlDL is to make it work with a variety of
existing graphics libraries, since no single package will satisfy all
needs and all people, and this allows one to work with packages one
already knows and likes. Obviously there will be some overlaps in
functionality and some lack of consistency and uniformity. However,
this allows PDL to keep up with a rapidly developing field: the
latest PDL modules provide interfaces to OpenGL and VRML graphics!

=item PDL::Graphics::PGPLOT

PGPLOT provides a simple library for line graphics and image display.

There is an easy interface to this in the internal module
L<PDL::Graphics::PGPLOT|PDL::Graphics::PGPLOT>, which
calls routines in the separately available
PGPLOT top-level module.

=item PDL::Graphics::IIS

Many astronomers like to use SAOimage and Ximtool (or their
derivations/clones). These are useful free widgets for inspection and
visualisation of images. (They are not provided with perlDL but can
easily be obtained from their official sites off the Net.)

The L<PDL::Graphics::IIS|PDL::Graphics::IIS>
package allows one to display images
in these ("IIS" is the name of an ancient item of image display
hardware whose protocols these tools conform to).

The L<PDL::Graphics::Karma|PDL::Graphics::Karma>
module provides an interface to the Karma visualisation
suite. This is a set of GUI applications which are specially designed for
visualising noisy 2D and 3D data sets.

=item PDL::Graphics::TriD

See L<PDL::Graphics::TriD|PDL::Graphics::TriD> (the name sucks...).
This is a collection of 3D routines for
OpenGL and (soon) VRML and other 3D formats which allow 3D point, line,
and surface plots from PDL.

See L<PDL::AutoLoader>. This allows one to autoload functions
on demand, in a way perhaps familiar to users of MatLab.

One can also write PDL extensions as normal Perl modules.

The Perl script C<perldl> provides a simple command line. If the latest
Readlines/ReadKey modules have been installed, C<perldl> detects this
and enables command line recall and editing. See the manpage for details.

  PDL comes with ABSOLUTELY NO WARRANTY. For details, see the file
  'COPYING' in the PDL distribution. This is free software and you
  are welcome to redistribute it under certain conditions, see
  the same file for details.
 Reading PDL/default.perldlrc...
 Found docs database /home/kgb/soft/dev/lib/perl5/site_perl/PDL/pdldoc.db
 Type 'help' for online help
 Type 'demo' for online demos

 perldl> $x = rfits 'm51.fits'
 BITPIX = 16 size = 65536 pixels

 BSCALE = 1.0000000000E0 && BZERO = 0.0000000000E0

 Displaying 256 x 256 image from 24 to 500 ...

You can also run it from the Perl debugger (C<perl -MPDL -d -e 1>)

Miscellaneous shell features:

The shell aliases C<p> to be a convenient short form of C<print>, e.g.

The files C<~/.perldlrc> and C<local.perldlrc> (in the current
directory) are sourced if found. This allows the user to have global
and local PDL code for startup.

Type 'help'! One can search the PDL documentation, and look up documentation

Any line starting with the C<#> character is treated as a shell
escape. This character is configurable by setting the Perl variable
C<$PERLDL_ESCAPE>. This could, for example, be set in C<~/.perldlrc>.

=head2 Overload operators

The following builtin Perl operators and functions have been overloaded
to work on PDL variables:

 + - * / > < >= <= << >> & | ^ == != <=> ** % ! ~
 sin log abs atan2 sqrt cos exp

[All the unary functions (sin etc.) may be used with inplace() - see
"Memory usage and references" below.]

=head2 Object-Orientation and perlDL

PDL operations are available as functions and methods.
Thus one can derive new types of object, to represent

By using overloading one can make mathematical operators
do whatever you please, and PDL has some built-in tricks
which allow existing PDL functions to work unchanged, even
if the underlying data representation is vastly changed!

=head2 Memory usage and references

Messing around with really huge data arrays may require some care.
perlDL provides many facilities to let you perform operations on big
arrays without generating extra copies, though this does require a bit
more thought and care from the programmer.

NOTE: On most systems it is better to configure Perl (during the
build options) to use the system C<malloc()> function rather than Perl's
built-in one. This is because Perl's one is optimised for speed rather
than consumption of virtual memory; this can result in a factor of
two improvement in the amount of memory storage you can use.
The Perl malloc in 5.004 and later does have a number of compile-time
options you can use to tune the behaviour.

=item Simple arithmetic

If C<$a> is a big image (e.g. occupying 10MB) then the command
C<$a = $a + 1>
eats up another 10MB of memory. This is because
the expression C<$a+1> creates a temporary copy of C<$a> to hold the
result, then C<$a> is assigned a reference to that.
After this, the original C<$a> is destroyed so there is no I<permanent>
memory waste. But on a small machine, the growth in the memory footprint
can be considerable. It is done
this way so C<$c=$a+1> works as expected.

 $b = $a;  # $b and $a now point to same data

Then C<$b> and C<$a> end up being different, as one naively expects,
because a new reference is created and C<$a> is assigned to it.

However if C<$a> was a huge memory hog (e.g. a 3D volume) creating a copy
of it may not be a good thing. One can avoid this memory overhead in
the above example by saying:

The operations C<++,+=,--,-=>, etc. all call a special "in-place"
version of the arithmetic subroutine. This means no more memory is
needed. The downside of this is that if C<$b=$a> then C<$b> is also
incremented. To force a copy explicitly:

 $b = pdl $a; # Real copy

or, alternatively, perhaps better style:

Most functions, e.g. C<log()>, return a result which is a transformation
of their argument. This makes for good programming practice. However, many
operations can be done "in-place" and this may be required when large
arrays are in use and memory is at a premium. For these circumstances
the operator L<inplace()|PDL::Core/inplace>
is provided which prevents the extra copy and
allows the argument to be modified, e.g.:

 $x = log($array);          # $array unaffected
 log( inplace($bigarray) ); # $bigarray changed in situ

The usual caveats about duplicate references apply.

Obviously when used with some functions which can not be applied
in situ (e.g. C<convolve()>) unexpected effects may occur! We try to
indicate C<inplace()>-safe functions in the documentation.

Type conversions, such as C<float()>, may cause hidden copying.
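
The reference and in-place behaviour described in this section can be checked with a few lines. This is a sketch with invented variable names; it uses the C<copy> method for an explicit copy and the method form C<< $x->inplace->sqrt >> in place of the function form shown above.

```perl
use strict;
use warnings;
use PDL;

my $original = zeroes(3);
my $alias    = $original;         # plain assignment: both names share one data block
my $separate = $original->copy;   # explicit copy: independent data

# += works in place, so the alias sees the change and the copy does not.
$original += 1;

print "alias:    $alias\n";
print "separate: $separate\n";

# inplace() flags the piddle so the next operation overwrites its data
# rather than allocating a result.
my $values = pdl(4, 9, 16);
$values->inplace->sqrt;
print "values:   $values\n";      # square roots: 2 3 4
```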

=head2 Ensuring piddleness

If you have written a simple function and
you don't want it to blow up in your face if you pass it a simple
number rather than a PDL variable, simply call the function
L<topdl()|PDL::Core/topdl> first to make it safe, e.g.:

 sub myfiddle { my $pdl = topdl(shift); $pdl->fiddle_foo(...); ... }

C<topdl()> does NOT perform a copy if a pdl variable is passed (it
just falls through), which is obviously the desired behaviour. The
routine is not of course necessary in normal user defined functions
which do not care about internals.
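
A runnable sketch of the pattern, with a made-up function C<count_elements>. Without C<topdl()> the method call would fail when a plain Perl number is passed.

```perl
use strict;
use warnings;
use PDL;

# Hypothetical helper: reports how many elements its argument has.
sub count_elements {
    my $pdl = topdl(shift);   # plain numbers become piddles; piddles pass through
    return $pdl->nelem;       # a method call, so the argument must be a piddle
}

my $from_number = count_elements(7);
my $from_piddle = count_elements(pdl(1, 2, 3));

print "plain number: $from_number element\n";
print "piddle:       $from_piddle elements\n";
```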

Copyright (C) Karl Glazebrook (kgb@aaoepp.aao.gov.au), Tuomas J. Lukka
(lukka@husc.harvard.edu) and Christian Soeller (c.soeller@auckland.ac.nz) 1997.
Commercial reproduction of this documentation in a different format is forbidden.