README: library "PCRE-OCaml"
****************************
Copyright (C) 2008 Markus Mottl (1)
===================================
Vienna, November 29, 2008
=========================

------------------------------------------------------------------------------
| Changes              | History of code changes                             |
------------------------------------------------------------------------------
| INSTALL              | Short notes on compiling and                        |
|                      | installing the library                              |
------------------------------------------------------------------------------
| LICENSE              | "GNU LESSER GENERAL PUBLIC LICENSE"                 |
------------------------------------------------------------------------------
| Makefile             | Top Makefile                                        |
------------------------------------------------------------------------------
| OCamlMakefile        | Makefile for easy handling of                       |
|                      | compilation of not so easy                          |
|                      | OCaml-projects. It generates dependencies           |
|                      | of OCaml-files automatically,                       |
|                      | is able to handle "ocamllex"-,                      |
|                      | "ocamlyacc"-, IDL- and C-files and                  |
|                      | generates native- or byte-code                      |
|                      | as executable or as library -                       |
|                      | with thread-support if you want!                    |
------------------------------------------------------------------------------
| README.txt           | This file                                           |
------------------------------------------------------------------------------
| README.win32         | Platform-specific information for Win32             |
------------------------------------------------------------------------------
| examples/subst/      | Example for fast and convenient                     |
|                      | substitution of patterns in files                   |
------------------------------------------------------------------------------
| examples/pcregrep/   | Basic "grep"-like command.                          |
------------------------------------------------------------------------------
| examples/cloc/       | Removes comments + empty lines in C-files           |
------------------------------------------------------------------------------
| examples/count_hash/ | Counts equal words in texts                         |
------------------------------------------------------------------------------
| lib/                 | OCaml-library for interfacing the PCRE-C-library    |
|                      | Contains lots of higher level functions             |
------------------------------------------------------------------------------
| pcre_make.win32/     | Additional files for Win32                          |
------------------------------------------------------------------------------

2 What is the "PCRE-OCaml"-library?
*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=

This OCaml-library interfaces the PCRE (Perl-compatible regular
expressions) library, which is written in C. It can be used for matching
regular expressions written in "PERL"-style.

Searching for, replacing or splitting text should become much easier with
this library.
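
As a rough illustration of the three tasks named above, the following sketch
matches, replaces and splits a sample string. It assumes the library is
installed and linked into the program (for example through a findlib package
called "pcre", which is an assumption about your setup). The names subject,
has_quick, normalized and words are purely illustrative. Functions are written
with the "Pcre." prefix here, whereas the shorter snippets in this README
assume that the module has been opened.

```ocaml
(* Sample input with irregular whitespace (illustrative data). *)
let subject = "The quick  brown\tfox"

(* Searching: does the pattern match anywhere in the subject? *)
let has_quick = Pcre.pmatch ~pat:"quick" subject

(* Replacing: collapse every run of whitespace into a single space. *)
let normalized = Pcre.replace ~pat:"\\s+" ~templ:" " subject

(* Splitting: cut the subject at every run of whitespace. *)
let words = Pcre.split ~pat:"\\s+" subject

let () =
  Printf.printf "%b\n%s\n" has_quick normalized;
  List.iter print_endline words
```

Each call takes the pattern as a labelled argument and the subject string as
the last, unlabelled argument.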

3 Why would you need it?
*=*=*=*=*=*=*=*=*=*=*=*=*

Here is a list of features:

 - The PCRE-library by Philip Hazel has been under development for quite some
   time now and is fairly advanced and stable. It implements just about all of
   the convenient functionality of regular expressions as one can find them in
   PERL. The higher-level functions written in OCaml (split, replace) are also
   compatible with the corresponding PERL-functions (to the extent that
   OCaml allows). Most people find the syntax of PERL-style regular
   expressions more straightforward than the Emacs-style one used in the
   "Str"-module.

 - It is reentrant and thus thread-safe. This is not the case with the
   "Str"-module of OCaml, which builds on the GNU "regex"-library. Using
   reentrant libraries also means more convenience for programmers: they do
   not have to reason about the states the library might be in.

 - The high-level functions for replacement and substitution, all of which
   are implemented in OCaml, are much faster than those of the "Str"-module.
   In fact, when compiled to native code, they even seem to be significantly
   faster than those of PERL (PERL is written in C).

 - You can rely on the data returned being unique. In other words: if the
   result of a function is a string, you can safely use destructive updates
   on it without having to fear side effects.

 - The interface to the library makes use of labels and default arguments to
   give you a high degree of programming comfort.
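
The last point can be sketched as follows. This is only a small illustration
under the assumption that the library is linked into the program; the names
text, a, b, plain and caseless are made up for the example. It shows that
labelled arguments may be given in any order and that optional arguments can
simply be left out.

```ocaml
let text = "one, two,three"

(* Labelled arguments may be passed in any order. *)
let a = Pcre.replace ~pat:",\\s*" ~templ:"; " text
let b = Pcre.replace ~templ:"; " ~pat:",\\s*" text

(* No flags given: the default (case-sensitive) behaviour applies. *)
let plain = Pcre.pmatch ~rex:(Pcre.regexp "TWO") text

(* Same call with an optional compile flag supplied explicitly. *)
let caseless = Pcre.pmatch ~rex:(Pcre.regexp ~flags:[`CASELESS] "TWO") text

let () =
  Printf.printf "%s\n%s\n%b %b\n" a b plain caseless
```

Note that `CASELESS is a compile-time flag, so it is passed to "regexp"
rather than to the matching function.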

4 How can you use it?
*=*=*=*=*=*=*=*=*=*=*=

Most functions allow additional parameters, which often have to be
translated to an internal format for the PCRE. Two ways of passing arguments
are possible in all such cases: one is convenient, the other improves
speed. You can also often omit such arguments, in which case the intuitive
default (= no special behaviour) will be used.

Convenient way of passing arguments - flags passed as list:

<< regexp ~flags:[`ANCHORED; `CASELESS] "some_pattern"
>>

This makes it easy to pass flags on the fly. They will be translated to the
internal format automatically. However, if this happens in a loop, the
translation will occur on each iteration. If you really need to squeeze out as
much performance as possible, you should use the next approach.

Efficient way of passing flags - translate them beforehand:

<< let iflags = cflags [`ANCHORED; `CASELESS] in
   regexp ~iflags "some runtime-constructed pattern"
>>
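
The two ways of passing flags can be compared side by side. The following is
a sketch under the assumption that the library is linked into the program;
compile_convenient and compile_efficient are hypothetical helper names chosen
for this example, and the word list stands in for patterns that are only known
at runtime.

```ocaml
(* Convenient: the flag list is translated inside every call to regexp. *)
let compile_convenient pat =
  Pcre.regexp ~flags:[`ANCHORED; `CASELESS] pat

(* Efficient: translate the flag list once, then reuse the result. *)
let iflags = Pcre.cflags [`ANCHORED; `CASELESS]
let compile_efficient pat = Pcre.regexp ~iflags pat

let () =
  let subj = "Hello world" in
  List.iter
    (fun word ->
      (* Pattern built at runtime; quote escapes special characters. *)
      let pat = Pcre.quote word in
      let r1 = compile_convenient pat in
      let r2 = compile_efficient pat in
      Printf.printf "%s: %b %b\n" word
        (Pcre.pmatch ~rex:r1 subj) (Pcre.pmatch ~rex:r2 subj))
    ["hello"; "world"]
```

Both helpers should yield regular expressions that behave the same way. Only
the point at which the flag list is translated differs.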

Factoring out the translation of flags for regular expressions may save some
cycles, but don't expect too much. You can save more CPU time by lifting the
creation of regular expressions out of loops. E.g. instead of:

<< for i = 1 to 1000 do
     ignore (split ~pat:"[ \t]+" "foo bar")
   done
>>

write:

<< let rex = regexp "[ \t]+" in
   for i = 1 to 1000 do
     ignore (split ~rex "foo bar")
   done
>>
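
The same advice applies when the loop is hidden inside a list traversal. The
sketch below assumes the library is linked into the program; split_lines is a
hypothetical helper name, not part of the library. The regular expression is
compiled once and then reused for every line.

```ocaml
(* Split each line into whitespace-separated fields.
   The pattern is compiled once, outside the traversal. *)
let split_lines lines =
  let rex = Pcre.regexp "[ \t]+" in
  List.map (fun line -> Pcre.split ~rex line) lines

let () =
  split_lines ["foo bar"; "a\tb  c"]
  |> List.iter (fun fields -> print_endline (String.concat "|" fields))
```

Passing ~pat inside the anonymous function instead would recompile the
pattern for every element of the list.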

Take a look at the interface "pcre.mli" to see which ways exist to pass
parameters and to learn about the defaults.

5 Contact information
*=*=*=*=*=*=*=*=*=*=*=

In the case of bugs, feature requests and similar, you can contact me here:

   markus.mottl@gmail.com

Up-to-date information concerning this library should be available here:

   http://www.ocaml.info/ocaml_sources

-----------------------------------------------------------------------------

This document was translated from LaTeX by HeVeA (2).
--------------------------------------

(1) http://www.ocaml.info/

(2) http://hevea.inria.fr/index.html