1
Copyright 2001 Free Software Foundation, Inc.
3
This file is part of the GNU MP Library.
5
The GNU MP Library is free software; you can redistribute it and/or modify
6
it under the terms of the GNU Lesser General Public License as published by
7
the Free Software Foundation; either version 2.1 of the License, or (at your
8
option) any later version.
10
The GNU MP Library is distributed in the hope that it will be useful, but
11
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
12
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public
13
License for more details.
15
You should have received a copy of the GNU Lesser General Public License
16
along with the GNU MP Library; see the file COPYING.LIB. If not, write to
17
the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
25
GMP EXPRESSION EVALUATION
26
-------------------------
30
THIS CODE IS PRELIMINARY AND MAY BE SUBJECT TO INCOMPATIBLE CHANGES IN
31
FUTURE VERSIONS OF GMP.
35
The files in this directory implement a simple scheme of string based
36
expression parsing and evaluation, supporting mpz, mpq, mpf and mpfr.
38
This will be slower than direct GMP library calls, but may be convenient in
39
various circumstances, such as while prototyping, or for letting a user
40
enter values in symbolic form. "2**5723-7" for example is a lot easier to
41
enter or maintain than the equivalent written out in decimal.
47
Nothing in this directory is a normal part of libgmp, and nothing is built
48
or installed, but various Makefile rules are available to compile
51
All the functions are available through a little library (there's no shared
52
library since upward binary compatibility is not guaranteed).
56
In a program, prototypes are available using
60
run-expr.c is a sample program doing evaluations from the command line.
65
t-expr.c is self-test program, it prints nothing if successful.
70
The expr*.c sources don't depend on gmp-impl.h and can be compiled with just
71
a standard installed GMP. This isn't true of t-expr though, since it uses
72
some of the internal tests/libtests.la.
78
The mpfr support in libexpr.a is built if GMP is configured with
79
--enable-mpfr. Programs must include mpfr.h before expr.h to get the extra
85
If the expr sources are being compiled outside a GMP build tree, then change
86
HAVE_MPFR in expr-impl.h if necessary to indicate whether mpfr is available.
87
exprfr.c and exprfra.c are the C files providing mpfr support.
94
int mpz_expr (mpz_t res, int base, const char *e, ...);
95
int mpq_expr (mpq_t res, int base, const char *e, ...);
96
int mpf_expr (mpf_t res, int base, const char *e, ...);
97
int mpfr_expr (mpfr_t res, int base, const char *e, ...);
99
These functions evaluate simple arithmetic expressions. For example,
101
mpz_expr (result, 0, "123+456", NULL);
103
Numbers are parsed by mpz_expr and mpq_expr the same as mpz_set_str with the
104
given base. mpf_expr and mpfr_expr follow mpf_set_str and mpfr_set_str
105
respectively, but supporting an "0x" prefix for hex when base==0.
107
mpz_expr (result, 0, "0xAAAA * 0x5555", NULL);
109
White space, as indicated by <ctype.h> isspace(), is ignored except for the
110
purpose of separating tokens.
112
Variables can be included in expressions by putting them in the varargs list
113
after the string. "a", "b", "c" etc in the expression string designate
114
those values. For example,
118
mpq_expr (q, 10, "2/3 + 1/a + b/2", foo, bar, NULL);
120
Here "a" will be the value from foo and "b" from bar. Up to 26 variables
121
can be included this way. The NULL must be present to indicate the end of
124
Variables can also be written "$a", "$b" etc. This is necessary when using
125
bases greater than 10 since plain "a", "b" etc will otherwise be interpreted
126
as numbers. For example,
129
mpf_expr (f, 16, "F00F@-6 * $a", quux, NULL);
131
All the standard C operators are available, with the usual precedences, plus
132
"**" for exponentiation at the highest precedence (and right associative).
149
Currently only mpz_expr has the bitwise ~ % & ^ and | operators. The
150
precedence numbers are of interest in the advanced usage described below.
152
Various functions are available too. For example,
154
mpz_expr (res, 10, "gcd(123,456,789) * abs(a)", var, NULL);
156
The following is the full set of functions,
159
abs bin clrbit cmp cmpabs congruent_p divisible_p even_p fib fac
160
gcd hamdist invert jacobi kronecker lcm lucnum max min nextprime
161
odd_p perfect_power_p perfect_square_p popcount powm
162
probab_prime_p root scan0 scan1 setbit sgn sqrt
165
abs, cmp, den, max, min, num, sgn
168
abs, ceil, cmp, eq, floor, integer_p, max, min, reldiff, sgn,
172
abs, agm, ceil, cmp, cos, eq, exp, floor, inf_p, log, max, min,
173
nan_p, number_p, reldiff, sgn, sin, sqrt, trunc
175
In addition mpfr_expr has constants "log2" and "pi". For example,
177
mpfr_expr (res, 0, "2 * pi - log2", NULL);
179
All these are the same as the GMP library functions, except that min and max
180
don't exist in the library. Note also that min, max, gcd and lcm take any
181
number of arguments, not just two.
184
mpf_expr and mpfr_expr do all calculations to the precision of the
185
destination variable. mpfr_expr rounds using GMP_RNDZ (but this can be
186
changed at build time in expr-impl.h).
189
Expression parsing can succeed or fail. The return value indicates this,
190
and will be one of the following
193
MPEXPR_RESULT_BAD_VARIABLE
194
MPEXPR_RESULT_BAD_TABLE
195
MPEXPR_RESULT_PARSE_ERROR
198
BAD_VARIABLE is when a variable is referenced that hasn't been provided.
199
For example if "c" is used when only two parameters have been passed.
200
BAD_TABLE is applicable to the advanced usage described below.
202
PARSE_ERROR is a general syntax error, returned for any mal-formed input
205
NOT_UI is returned when an attempt is made to use an operand that's bigger
206
than an "unsigned long" with a function that's restricted to that range.
207
For example "fib" is mpz_fib_ui and only accepts an "unsigned long".
214
int mpz_expr_a (const struct mpexpr_operator_t *table,
215
mpz_ptr res, int base, const char *e, size_t elen,
217
int mpq_expr_a (const struct mpexpr_operator_t *table,
218
mpq_ptr res, int base, const char *e, size_t elen,
220
int mpf_expr_a (const struct mpexpr_operator_t *table,
221
mpf_ptr res, int base, unsigned long prec,
222
const char *e, size_t elen,
224
int mpfr_expr_a (const struct mpexpr_operator_t *table,
225
mpfr_ptr res, int base, unsigned long prec,
226
const char *e, size_t elen,
229
These functions are an advanced interface to expression parsing.
231
The string is taken as pointer and length. This makes it possible to parse
232
an expression in the middle of somewhere without copying and null
235
Variables are an array of 26 pointers to the appropriate operands, or NULL
236
for variables that are not available. Any combination of variables can be
237
given, for example just "x" and "y" (var[23] and var[24]) could be set.
239
Operators and functions are specified with a table. This makes it possible
240
to provide additional operators or functions, or to completely change the
241
syntax. The standard tables used by the simple functions above are
244
const struct mpexpr_operator_t * const mpz_expr_standard_table;
245
const struct mpexpr_operator_t * const mpq_expr_standard_table;
246
const struct mpexpr_operator_t * const mpf_expr_standard_table;
247
const struct mpexpr_operator_t * const mpfr_expr_standard_table;
249
struct mpexpr_operator_t is the following
251
struct mpexpr_operator_t {
258
typedef void (*mpexpr_fun_t) (void);
260
As an example, the standard mpz_expr table entry for multiplication is as
261
follows. See the source code for the full set of standard entries.
263
{ "*", (mpexpr_fun_t) mpz_mul, MPEXPR_TYPE_BINARY, 200 },
265
"name" is the string to parse, "fun" is the function to call for it, "type"
266
indicates what parameters the function takes (among other things), and
267
"precedence" sets its operator precedence.
269
A NULL for "name" indicates the end of the table, so for example an mpf
270
table with nothing but addition could be
272
struct mpexpr_operator_t table[] = {
273
{ "+", (mpexpr_fun_t) mpf_add, MPEXPR_TYPE_BINARY, 190 },
277
A special type MPEXPR_TYPE_NEW_TABLE makes it possible to chain from one
278
table to another. For example the following would add a "mod" operator to
279
the standard mpz table,
281
struct mpexpr_operator_t table[] = {
282
{ "mod", (mpexpr_fun_t) mpz_fdiv_r, MPEXPR_TYPE_BINARY, 125 },
283
{ (const char *) mpz_expr_standard_table, NULL, MPEXPR_TYPE_NEW_TABLE }
286
Notice the low precedence on "mod", so that for instance "45+26 mod 7"
287
parses as "(45+26)mod7".
290
Functions are designated by a precedence of 0. They always occur as
291
"foo(expr)" and so have no need for a precedence level. mpq_abs in the
292
standard mpq table is
294
{ "abs", (mpexpr_fun_t) mpq_abs, MPEXPR_TYPE_UNARY },
296
Functions expecting no arguments as in "foo()" can be given with
297
MPEXPR_TYPE_0ARY, or actual constants to be parsed as just "foo" are
298
MPEXPR_TYPE_CONSTANT. For example if a "void mpf_const_pi(mpf_t f)"
299
function existed (which it doesn't) it could be,
301
{ "pi", (mpexpr_fun_t) mpf_const_pi, MPEXPR_TYPE_CONSTANT },
304
Parsing of operator names is done by seeking the table entry with the
305
longest matching name. So for instance operators "<" and "<=" exist, and
306
when presented with "x <= y" the parser matches "<=" because it's longer.
308
Parsing of function names, on the other hand, is done by requiring a whole
309
alphanumeric word to match. For example presented with "fib2zz(5)" the
310
parser will attempt to find a function called "fib2zz". A function "fib"
311
wouldn't be used because it doesn't match the whole word.
313
The flag MPEXPR_TYPE_WHOLEWORD can be ORed into an operator type to override
314
the default parsing style. Similarly MPEXPR_TYPE_OPERATOR into a function.
317
Binary operators are left associative by default, meaning they're evaluated
318
from left to right, so for example "1+2+3" is treated as "(1+2)+3".
319
MPEXPR_TYPE_RIGHTASSOC can be ORed into the operator type to work from right
320
to left as in "1+(2+3)". This is generally what's wanted for
321
exponentiation, and for example the standard mpz table has
323
{ "**", (mpexpr_fun_t) mpz_pow_ui,
324
MPEXPR_TYPE_BINARY_UI | MPEXPR_TYPE_RIGHTASSOC, 220 }
326
Unary operators are postfix by default. For example a factorial to be used
329
{ "!", (mpexpr_fun_t) mpz_fac_ui, MPEXPR_TYPE_UNARY_UI, 215 }
331
MPEXPR_TYPE_PREFIX can be ORed into the type to get a prefix operator. For
332
instance negation (unary minus) in the standard mpf table is
334
{ "-", (mpexpr_fun_t) mpf_neg,
335
MPEXPR_TYPE_UNARY | MPEXPR_TYPE_PREFIX, 210 },
338
The same operator can exist as a prefix unary and a binary, or as a prefix
339
and postfix unary, simply by putting two entries in the table. While
340
parsing the context determines which style is sought. But note that the
341
same operator can't be both a postfix unary and a binary, since the parser
342
doesn't try to look ahead to decide which ought to be used.
344
When there's two entries for an operator, both prefix or both postfix (or
345
binary), then the first in the table will be used. This makes it possible
346
to override an entry in a standard table, for example to change the function
347
it calls, or perhaps its precedence level. The following would change mpz
348
division from tdiv to cdiv,
350
struct mpexpr_operator_t table[] = {
351
{ "/", (mpexpr_fun_t) mpz_cdiv_q, MPEXPR_TYPE_BINARY, 200 },
352
{ "%", (mpexpr_fun_t) mpz_cdiv_r, MPEXPR_TYPE_BINARY, 200 },
353
{ (char *) mpz_expr_standard_table, NULL, MPEXPR_TYPE_NEW_TABLE }
357
The type field indicates what parameters the given function expects. The
358
following styles of functions are supported. mpz_t is shown, but of course
359
this is mpq_t for mpq_expr_a, mpf_t for mpf_expr_a, etc.
361
MPEXPR_TYPE_CONSTANT void func (mpz_t result);
363
MPEXPR_TYPE_0ARY void func (mpz_t result);
364
MPEXPR_TYPE_I_0ARY int func (void);
366
MPEXPR_TYPE_UNARY void func (mpz_t result, mpz_t op);
367
MPEXPR_TYPE_UNARY_UI void func (mpz_t result, unsigned long op);
368
MPEXPR_TYPE_I_UNARY int func (mpz_t op);
369
MPEXPR_TYPE_I_UNARY_UI int func (unsigned long op);
371
MPEXPR_TYPE_BINARY void func (mpz_t result, mpz_t op1, mpz_t op2);
372
MPEXPR_TYPE_BINARY_UI void func (mpz_t result,
373
mpz_t op1, unsigned long op2);
374
MPEXPR_TYPE_I_BINARY int func (mpz_t op1, mpz_t op2);
375
MPEXPR_TYPE_I_BINARY_UI int func (mpz_t op1, unsigned long op2);
377
MPEXPR_TYPE_TERNARY void func (mpz_t result,
378
mpz_t op1, mpz_t op2, mpz_t op3);
379
MPEXPR_TYPE_TERNARY_UI void func (mpz_t result, mpz_t op1, mpz_t op2,
381
MPEXPR_TYPE_I_TERNARY int func (mpz_t op1, mpz_t op2, mpz_t op3);
382
MPEXPR_TYPE_I_TERNARY_UI int func (mpz_t op1, mpz_t op2,
385
Notice the pattern of "UI" for the last parameter as an unsigned long, or
386
"I" for the result as an "int" return value.
388
It's important that the declared type for an operator or function matches
389
the function pointer given. Any mismatch will have unpredictable results.
391
For binary functions, a further type attribute is MPEXPR_TYPE_PAIRWISE which
392
indicates that any number of arguments should be accepted, and evaluated by
393
applying the given binary function to them pairwise. This is used by gcd,
394
lcm, min and max. For example the standard mpz gcd is
396
{ "gcd", (mpexpr_fun_t) mpz_gcd,
397
MPEXPR_TYPE_BINARY | MPEXPR_TYPE_PAIRWISE },
399
Some special types exist for comparison operators (or functions).
400
MPEXPR_TYPE_CMP_LT through MPEXPR_TYPE_CMP_GE expect an MPEXPR_TYPE_I_BINARY
401
function, returning positive, negative or zero like mpz_cmp and similar.
402
For example the standard mpf "!=" operator is
404
{ "!=", (mpexpr_fun_t) mpf_cmp, MPEXPR_TYPE_CMP_NE, 160 },
406
But there's no obligation to use these types, for instance the standard mpq
407
table just uses a plain MPEXPR_TYPE_I_BINARY and mpq_equal for "==".
409
Further special types MPEXPR_TYPE_MIN and MPEXPR_TYPE_MAX exist to implement
410
the min and max functions, and they take a function like mpf_cmp similarly.
411
The standard mpf max function is
413
{ "max", (mpexpr_fun_t) mpf_cmp,
414
MPEXPR_TYPE_MAX | MPEXPR_TYPE_PAIRWISE },
416
These can be used as operators too, for instance the following would be the
417
>? operator which is a feature of GNU C++,
419
{ ">?", (mpexpr_fun_t) mpf_cmp, MPEXPR_TYPE_MAX, 175 },
421
Other special types are used to define "(" ")" parentheses, "," function
422
argument separator, "!" through "||" logical booleans, ternary "?" ":", and
423
the "$" which introduces variables. See the sources for how they should be
427
User definable operator tables will have various uses. For example,
429
- a subset of the C operators, to be rid of infrequently used things
430
- a more mathematical syntax like "." for multiply, "^" for powering,
431
and "!" for factorial
432
- a boolean evaluator with "^" for AND, "v" for OR
433
- variables introduced with "%" instead of "$"
434
- brackets as "[" and "]" instead of "(" and ")"
435
- new mpfr operators ^+^ or _+_ which round in a particular direction
437
The only fixed parts of the parsing are the treatment of numbers, whitespace
438
and the two styles of operator/function name recognition.
440
As a final example, the following would be a complete mpz table implementing
441
some operators with a more mathematical syntax. Notice there's no need to
442
preserve the standard precedence values, anything can be used so long as
443
they're in the desired relation to each other. There's also no need to have
444
entries in precedence order, but it's convenient to do so to show what comes
447
static const struct mpexpr_operator_t table[] = {
448
{ "^", (mpexpr_fun_t) mpz_pow_ui,
449
MPEXPR_TYPE_BINARY_UI | MPEXPR_TYPE_RIGHTASSOC, 9 },
451
{ "!", (mpexpr_fun_t) mpz_fac_ui, MPEXPR_TYPE_UNARY_UI, 8 },
452
{ "-", (mpexpr_fun_t) mpz_neg,
453
MPEXPR_TYPE_UNARY | MPEXPR_TYPE_PREFIX, 7 },
455
{ "*", (mpexpr_fun_t) mpz_mul, MPEXPR_TYPE_BINARY, 6 },
456
{ "/", (mpexpr_fun_t) mpz_fdiv_q, MPEXPR_TYPE_BINARY, 6 },
458
{ "+", (mpexpr_fun_t) mpz_add, MPEXPR_TYPE_BINARY, 5 },
459
{ "-", (mpexpr_fun_t) mpz_sub, MPEXPR_TYPE_BINARY, 5 },
461
{ "mod", (mpexpr_fun_t) mpz_mod, MPEXPR_TYPE_BINARY, 6 },
463
{ ")", NULL, MPEXPR_TYPE_CLOSEPAREN, 4 },
464
{ "(", NULL, MPEXPR_TYPE_OPENPAREN, 3 },
465
{ ",", NULL, MPEXPR_TYPE_ARGSEP, 2 },
467
{ "$", NULL, MPEXPR_TYPE_VARIABLE, 1 },
476
Operator precedence is implemented using a control and data stack, there's
477
no C recursion. When an expression like 1+2*3 is read the "+" is held on
478
the control stack and 1 on the data stack until "*" has been parsed and
479
applied to 2 and 3. This happens any time a higher precedence operator
480
follows a lower one, or when a right-associative operator like "**" is
483
Parentheses are handled by making "(" a special prefix unary with a low
484
precedence so a whole following expression is read. The special operator
485
")" knows to discard the pending "(". Function arguments are handled
486
similarly, with the function pretending to be a low precedence prefix unary
487
operator, and with "," allowed within functions. The same special ")"
488
operator recognises a pending function and will invoke it appropriately.
490
The ternary "? :" operator is also handled using precedences. ":" is one
491
level higher than "?", so when a valid a?b:c is parsed the ":" finds a "?"
492
on the control stack. It's a parse error for ":" to find anything else.
498
The ternary "?:" operator evaluates the "false" side of its pair, which is
499
wasteful, though it ought to be harmless. It'd be better if it could
500
evaluate only the "true" side. Similarly for the logical booleans "&&" and
501
"||" if they know their result already.
503
Functions like MPEXPR_TYPE_BINARY could return a status indicating operand
504
out of range or whatever, to get an error back through mpz_expr etc. That
505
would want to be just an option, since plain mpz_add etc have no such
508
Could have assignments like "a = b*c" modifying the input variables.
509
Assignment could be an operator attribute, making it expect an lvalue.
510
There would want to be a standard table without assignments available
511
though, so user input could be safely parsed.
513
The closing parethesis table entry could specify the type of open paren it
514
expects, so that "(" and ")" could match and "[" and "]" match but not a
515
mixture of the two. Currently "[" and "]" can be added, but there's no
516
error on writing a mixed expression like "2*(3+4]". Maybe also there could
517
be a way to say that functions can only be written with one or the other