\input texinfo @c -*-texinfo-*-
@settitle GNU MP @value{VERSION}
@comment %**end of header

This manual describes how to install and use the GNU multiple precision
arithmetic library, version @value{VERSION}.

Copyright 1991, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
2003, 2004 Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document under
the terms of the GNU Free Documentation License, Version 1.1 or any later
version published by the Free Software Foundation; with no Invariant Sections,
with the Front-Cover Texts being ``A GNU Manual'', and with the Back-Cover
Texts being ``You have freedom to copy and modify this GNU Manual, like GNU
software''. A copy of the license is included in
@ref{GNU Free Documentation License}.

@c Note the @ref above must be on one line, a line break in an @ref within
@c @copying will bomb in recent texinfo.tex (eg. 2004-04-07.08 which comes
@c with texinfo 4.7), with messages about missing @endcsname.
@c
@c Texinfo version 4.2 or up will be needed to process this into .info files.
@c
@c The supplied texinfo.tex (or newer) should be used when processing into
@c
@c The version number and edition number are taken from version.texi provided
@c by automake (note it's regenerated only if you configure with
@c --enable-maintainer-mode).
@c
@c Discussions about this version in relation to previous ones (for instance
@c in the "Compatibility" section) obviously must be looked at manually
@c
@c "cindex" entries have been made for function categories and programming
@c topics. Minutiae like particular systems and processors mentioned in
@c various places have been left out so as not to bury important topics under
@c a lot of junk. "mpn" functions aren't in the concept index because a
@c beginner looking for "GCD" or something is only going to be confused by
@c pointers to low level routines.

@dircategory GNU libraries
@direntry
* gmp: (gmp). GNU Multiple Precision Arithmetic Library.
@end direntry

@c html <meta name=description content="...">
@documentdescription
How to install and use the GNU multiple precision arithmetic library, version @value{VERSION}.
@end documentdescription

@node Top, Copying, (dir), (dir)

@subtitle The GNU Multiple Precision Arithmetic Library
@subtitle Edition @value{EDITION}
@subtitle @value{UPDATED}

@author by Torbj@"orn Granlund, Swox AB
@email{tege@@swox.com}

@c Include the Distribution inside the titlepage so
@c that headings are turned off.

\global\baselineskip=13pt

@vskip 0pt plus 1filll

@c Don't bother with contents for html, the menus seem adequate.

@menu
* Copying:: GMP Copying Conditions (LGPL).
* Introduction to GMP:: Brief introduction to GNU MP.
* Installing GMP:: How to configure and compile the GMP library.
* GMP Basics:: What every GMP user should know.
* Reporting Bugs:: How to usefully report bugs.
* Integer Functions:: Functions for arithmetic on signed integers.
* Rational Number Functions:: Functions for arithmetic on rational numbers.
* Floating-point Functions:: Functions for arithmetic on floats.
* Low-level Functions:: Fast functions for natural numbers.
* Random Number Functions:: Functions for generating random numbers.
* Formatted Output:: @code{printf} style output.
* Formatted Input:: @code{scanf} style input.
* C++ Class Interface:: Class wrappers around GMP types.
* BSD Compatible Functions:: All functions found in BSD MP.
* Custom Allocation:: How to customize the internal allocation.
* Language Bindings:: Using GMP from other languages.
* Algorithms:: What happens behind the scenes.
* Internals:: How values are represented behind the scenes.

* Contributors:: Who brings you this library?
* References:: Some useful papers and books to read.
* GNU Free Documentation License::
@end menu

@c @m{T,N} is $T$ in tex or @math{N} otherwise. This is an easy way to give
@c different forms for math in tex and info. Commas in N or T don't work,
@c but @C{} can be used instead. \, works in info but not in tex.

@c @ms{V,N} is $V_N$ in tex or just vn otherwise. This suits simple
@c subscripts like @ms{x,0}.
@tex$\V\_{\N\}$@end tex

@c @nicode{S} is plain S in info, or @code{S} elsewhere. This can be used
@c when the quotes that @code{} gives in info aren't wanted, but the
@c fontification in tex or html is wanted. Doesn't work as @nicode{'\\0'}
@c though (gives two backslashes in tex).

@c @nisamp{S} is plain S in info, or @samp{S} elsewhere. This can be used
@c when the quotes that @samp{} gives in info aren't wanted, but the
@c fontification in tex or html is wanted.

@c Usage: @GMPtimes{}
@c Give either \times or the word "times".
\gdef\GMPtimes{\times}

@c Usage: @GMPmultiply{}
@c Give * in info, or nothing in tex.

@c Give either |x| in tex, or abs(x) in info or html.

@c Usage: @GMPfloor{x}
@c Give either \lfloor x\rfloor in tex, or floor(x) in info or html.
\gdef\GMPfloor#1{\lfloor #1\rfloor}

@c Usage: @GMPceil{x}
@c Give either \lceil x\rceil in tex, or ceil(x) in info or html.
\gdef\GMPceil#1{\lceil #1 \rceil}

@c Math operators already available in tex, made available in info too.
@c For example @bmod{} can be used in both tex and info.

@c New math operators.
@c @abs{} can be used in both tex and info, or just \abs in tex.
\gdef\abs{\mathop{\rm abs}}

@c @cross{} is a \times symbol in tex, or an "x" in info. In tex it works
@c inside or outside $ $.
\gdef\cross{\ifmmode\times\else$\times$\fi}

@c @times{} made available as a "*" in info and html (already works in tex).

@c Like @w{} but working in math mode too.
\gdef\W#1{\ifmmode{#1}\else\w{#1}\fi}

@c Usage: \GMPdisplay{text}
@c Put the given text in an @display style indent, but without turning off
@c paragraph reflow etc.
\advance\leftskip by \lispnarrowing

@c A new \hat that will work in math mode, unlike the texinfo redefined
\gdef\GMPhat{\mathaccent"705E}

@c Usage: \GMPraise{text}
@c For use in a $ $ math expression as an alternative to "^". This is good
@c for @code{} in an exponent, since there seems to be no superscript font
\gdef\GMPraise#1{\mskip0.5\thinmuskip\hbox{\raise0.8ex\hbox{#1}}}

@c Usage: @texlinebreak{}
@c A line break as per @*, but only in tex.

@c Usage: @maybepagebreak
@c Allow tex to insert a page break, if it feels the urge.
@c Normally blocks of @deftypefun/funx are kept together, which can lead to
@c some poor page break positioning if it's a big block, like the sets of
@c division functions etc.
\gdef\maybepagebreak{\penalty0}
@macro maybepagebreak
@end macro

@node Copying, Introduction to GMP, Top, Top
@comment node-name, next, previous, up
@unnumbered GNU MP Copying Conditions
@cindex Copying conditions
@cindex Conditions for copying GNU MP
@cindex License conditions

This library is @dfn{free}; this means that everyone is free to use it and
free to redistribute it on a free basis. The library is not in the public
domain; it is copyrighted and there are restrictions on its distribution, but
these restrictions are designed to permit everything that a good cooperating
citizen would want to do. What is not allowed is to try to prevent others
from further sharing any version of this library that they might get from
you.@refill

Specifically, we want to make sure that you have the right to give away copies
of the library, that you receive source code or else can get it if you want
it, that you can change this library or use pieces of it in new free programs,
and that you know you can do these things.@refill

To make sure that everyone has such rights, we have to forbid you to deprive
anyone else of these rights. For example, if you distribute copies of the GNU
MP library, you must give the recipients all the rights that you have. You
must make sure that they, too, receive or can get the source code. And you
must tell them their rights.@refill

Also, for our own protection, we must make certain that everyone finds out
that there is no warranty for the GNU MP library. If it is modified by
someone else and passed on, we want their recipients to know that what they
have is not what we distributed, so that any problems introduced by others
will not reflect on our reputation.@refill

The precise conditions of the license for the GNU MP library are found in the
Lesser General Public License version 2.1 that accompanies the source code,
see @file{COPYING.LIB}. Certain demonstration programs are provided under the
terms of the plain General Public License version 2, see @file{COPYING}.


@node Introduction to GMP, Installing GMP, Copying, Top
@comment node-name, next, previous, up
@chapter Introduction to GNU MP

GNU MP is a portable library written in C for arbitrary precision arithmetic
on integers, rational numbers, and floating-point numbers. It aims to provide
the fastest possible arithmetic for all applications that need higher
precision than is directly supported by the basic C types.

Many applications use just a few hundred bits of precision; but some
applications may need thousands or even millions of bits. GMP is designed to
give good performance for both, by choosing algorithms based on the sizes of
the operands, and by carefully keeping the overhead at a minimum.

The speed of GMP is achieved by using fullwords as the basic arithmetic type,
by using sophisticated algorithms, by including carefully optimized assembly
code for the most common inner loops for many different CPUs, and by a general
emphasis on speed (as opposed to simplicity or elegance).

There is carefully optimized assembly code for these CPUs:
@cindex CPUs supported

DEC Alpha 21064, 21164, and 21264,
AMD K6, K6-2 and Athlon,
Hitachi SuperH and SH-2,
HPPA 1.0, 1.1 and 2.0,
Intel Pentium, Pentium Pro/II/III, Pentium 4, generic x86,
Motorola MC68000, MC68020, MC88100, and MC88110,
Motorola/IBM PowerPC 32 and 64,
SPARCv7, SuperSPARC, generic SPARCv8, UltraSPARC,

Some optimizations also for

For up-to-date information on GMP, please see the GMP web pages at

@uref{http://swox.com/gmp/}

@cindex Latest version of GMP
@cindex Anonymous FTP of latest version
@cindex FTP of latest version

The latest version of the library is available at

@uref{ftp://ftp.gnu.org/gnu/gmp}

Many sites around the world mirror @samp{ftp.gnu.org}; please use a mirror
near you. See @uref{http://www.gnu.org/order/ftp.html} for a full list.

@cindex Mailing lists
There are three public mailing lists of interest: one for release
announcements, one for general questions and discussions about usage of the
GMP library, and one for discussions about development of GMP. These lists are
@strong{not} for bug reports. For more information, see

@uref{http://swox.com/mailman/listinfo/}.

The proper place for bug reports is @email{bug-gmp@@gnu.org}. See
@ref{Reporting Bugs} for information about reporting bugs.

@section How to use this Manual
@cindex About this manual

Everyone should read @ref{GMP Basics}. If you need to install the library
yourself, then read @ref{Installing GMP}. If you have a system with multiple
ABIs, then read @ref{ABI and ISA}, for the compiler options that must be used
on applications.

The rest of the manual can be used for later reference, although it is
probably a good idea to glance through it.


@node Installing GMP, GMP Basics, Introduction to GMP, Top
@comment node-name, next, previous, up
@chapter Installing GMP
@cindex Installing GMP
@cindex Configuring GMP

GMP has an autoconf/automake/libtool based configuration system. On a
Unix-like system a basic build can be done with @samp{./configure} followed
by @samp{make}.

Some self-tests can be run with @samp{make check}.

And you can install (under @file{/usr/local} by default) with
@samp{make install}.

If you experience problems, please report them to @email{bug-gmp@@gnu.org}.
See @ref{Reporting Bugs}, for information on what to include in useful bug
reports.

@menu
* Build Options::
* ABI and ISA::
* Notes for Package Builds::
* Notes for Particular Systems::
* Known Build Problems::
@end menu


@node Build Options, ABI and ISA, Installing GMP, Installing GMP
@section Build Options
@cindex Build options

All the usual autoconf configure options are available; run @samp{./configure
--help} for a summary. The file @file{INSTALL.autoconf} has some generic
installation information too.

@item Non-Unix Systems

@samp{configure} requires various Unix-like tools. On an MS-DOS system DJGPP
can be used, and on MS Windows Cygwin or MINGW can be used,

@display
@uref{http://www.cygwin.com/}
@uref{http://www.delorie.com/djgpp}
@uref{http://www.mingw.org}
@end display

Microsoft also publishes an Interix ``Services for Unix'' which can be used to
build GMP on Windows (with a normal @samp{./configure}), but it's not free
software.

The @file{macos} directory contains an unsupported port to MacOS 9 on Power
Macintosh, see @file{macos/README}. Note that MacOS X ``Darwin'' should use
the normal Unix-style @samp{./configure}.

It might be possible to build without the help of @samp{configure}, certainly
all the code is there, but unfortunately you'll be on your own.

@item Build Directory

To compile in a separate build directory, @command{cd} to that directory, and
prefix the configure command with the path to the GMP source directory. For
example,

@example
/my/sources/gmp-@value{VERSION}/configure
@end example

Not all @samp{make} programs have the necessary features (@code{VPATH}) to
support this. In particular, SunOS and Solaris @command{make} have bugs that
make them unable to build in a separate directory. Use GNU @command{make}
instead.

@item @option{--prefix} and @option{--exec-prefix}

@cindex Install prefix
@cindex @code{--prefix}
@cindex @code{--exec-prefix}
The @option{--prefix} option can be used in the normal way to direct GMP to
install under a particular tree. The default is @samp{/usr/local}.

@option{--exec-prefix} can be used to direct architecture-dependent files like
@file{libgmp.a} to a different location. This can be used to share
architecture-independent parts like the documentation, but separate the
dependent parts. Note however that @file{gmp.h} and @file{mp.h} are
architecture-dependent since they encode certain aspects of @file{libgmp}, so
it will be necessary to ensure both @file{$prefix/include} and
@file{$exec_prefix/include} are available to the compiler.
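
As an illustration only, a split installation and a matching application
compile might look like the following. The install locations
@file{/opt/gmp} and @file{/opt/gmp/sparc} and the program name @file{prog.c}
are placeholders made up for this sketch, not names GMP provides.

@example
# Hypothetical paths: shared files under /opt/gmp, CPU-specific ones below it.
./configure --prefix=/opt/gmp --exec-prefix=/opt/gmp/sparc
make
make install

# Give the compiler both include directories, as described above.
cc -I/opt/gmp/include -I/opt/gmp/sparc/include prog.c \
   -L/opt/gmp/sparc/lib -lgmp
@end example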

@item @option{--disable-shared}, @option{--disable-static}

By default both shared and static libraries are built (where possible), but
one or the other can be disabled. Shared libraries result in smaller executables
and permit code sharing between separate running processes, but on some CPUs
are slightly slower, having a small cost on each function call.

@item Native Compilation, @option{--build=CPU-VENDOR-OS}

For normal native compilation, the system can be specified with
@samp{--build}. By default @samp{./configure} uses the output from running
@samp{./config.guess}. On some systems @samp{./config.guess} can determine
the exact CPU type, on others it will be necessary to give it explicitly. For
example,

@example
./configure --build=ultrasparc-sun-solaris2.7
@end example

In all cases the @samp{OS} part is important, since it controls how libtool
generates shared libraries. Running @samp{./config.guess} is the simplest way
to see what it should be, if you don't know already.

@item Cross Compilation, @option{--host=CPU-VENDOR-OS}

When cross-compiling, the system used for compiling is given by @samp{--build}
and the system where the library will run is given by @samp{--host}. For
example when using a FreeBSD Athlon system to build GNU/Linux m68k binaries,

@example
./configure --build=athlon-pc-freebsd3.5 --host=m68k-mac-linux-gnu
@end example

Compiler tools are sought first with the host system type as a prefix. For
example @command{m68k-mac-linux-gnu-ranlib} is tried, then plain
@command{ranlib}. This makes it possible for a set of cross-compiling tools
to co-exist with native tools. The prefix is the argument to @samp{--host},
and this can be an alias, such as @samp{m68k-linux}. But note that tools
don't have to be set up this way, it's enough to just have a @env{PATH} with a
suitable cross-compiling @command{cc} etc.
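
The search order described above can be pictured with a small shell sketch.
This is only an illustration of the lookup, not the code @samp{./configure}
actually runs, and the variable names are made up for the example.

@example
# Illustration only: try the host-prefixed tool first, then the plain name.
host=m68k-mac-linux-gnu
for tool in "$host-ranlib" ranlib; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "using $tool"
    break
  fi
done
@end example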

Compiling for a different CPU in the same family as the build system is a form
of cross-compilation, though very possibly this would merely be special
options on a native compiler. In any case @samp{./configure} avoids depending
on being able to run code on the build system, which is important when
creating binaries for a newer CPU since they very possibly won't run on the
build system.

In all cases the compiler must be able to produce an executable (of whatever
format) from a standard C @code{main}. Although only object files will go to
make up @file{libgmp}, @samp{./configure} uses linking tests for various
purposes, such as determining what functions are available on the host system.

Currently a warning is given unless an explicit @samp{--build} is used when
cross-compiling, because it may not be possible to correctly guess the build
system type if the @env{PATH} has only a cross-compiling @command{cc}.

Note that the @samp{--target} option is not appropriate for GMP. It's for use
when building compiler tools, with @samp{--host} being where they will run,
and @samp{--target} what they'll produce code for. Ordinary programs or
libraries like GMP are only interested in the @samp{--host} part, being where
they'll run. (Some past versions of GMP used @samp{--target} incorrectly.)

In general, if you want a library that runs as fast as possible, you should
configure GMP for the exact CPU type your system uses. However, this may mean
the binaries won't run on older members of the family, and might run slower on
other members, older or newer. The best idea is always to build GMP for the
exact machine type you intend to run it on.

The following CPUs have specific support. See @file{configure.in} for details
of what code and compiler options they select.

@c Keep this formatting, it's easy to read and it can be grepped to
@c automatically test that CPUs listed get through ./config.sub

@nisamp{powerpc603e},
@nisamp{powerpc604e},
@nisamp{powerpc7400},
@nisamp{powerpc7450},
@nisamp{ultrasparc2},
@nisamp{ultrasparc2i},
@nisamp{ultrasparc3},

CPUs not listed will use generic C code.

@item Generic C Build

If some of the assembly code causes problems, or if otherwise desired, the
generic C code can be selected with CPU @samp{none}. For example,

@example
./configure --host=none-unknown-freebsd3.5
@end example

Note that this will run quite slowly, but it should be portable and should at
least make it possible to get something running if all else fails.

On some systems GMP supports multiple ABIs (application binary interfaces),
meaning data type sizes and calling conventions. By default GMP chooses the
best ABI available, but a particular ABI can be selected. For example

@example
./configure --host=mips64-sgi-irix6 ABI=n32
@end example

See @ref{ABI and ISA}, for the available choices on relevant CPUs, and what
applications need to do.

@item @option{CC}, @option{CFLAGS}

By default the C compiler used is chosen from among some likely candidates,
with @command{gcc} normally preferred if it's present. The usual
@samp{CC=whatever} can be passed to @samp{./configure} to choose something
different.

For some systems, default compiler flags are set based on the CPU and
compiler. The usual @samp{CFLAGS="-whatever"} can be passed to
@samp{./configure} to use something different or to set good flags for systems
GMP doesn't otherwise know.

The @samp{CC} and @samp{CFLAGS} used are printed during @samp{./configure},
and can be found in each generated @file{Makefile}. This is the easiest way
to check the defaults when considering changing or adding something.

Note that when @samp{CC} and @samp{CFLAGS} are specified on a system
supporting multiple ABIs it's important to give an explicit
@samp{ABI=whatever}, since GMP can't determine the ABI just from the flags and
won't be able to select the correct assembler code.

If just @samp{CC} is selected then normal default @samp{CFLAGS} for that
compiler will be used (if GMP recognises it). For example @samp{CC=gcc} can
be used to force the use of GCC, with default flags (and default ABI).
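
For instance, on a system with more than one ABI a compiler choice might be
combined with an explicit ABI as sketched below. The flags shown are
assumptions chosen for illustration (an IRIX-style n32 build with GCC), so
substitute whatever suits your compiler.

@example
# Assumed flags for illustration; ABI= is given explicitly with CC/CFLAGS.
./configure CC=gcc CFLAGS="-O2 -mabi=n32" ABI=n32

# Check what ended up in the generated Makefile.
grep -E '^(CC|CFLAGS) *=' Makefile
@end example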

@item @option{CPPFLAGS}

Any flags like @samp{-D} defines or @samp{-I} includes required by the
preprocessor should be set in @samp{CPPFLAGS} rather than @samp{CFLAGS}.
Compiling is done with both @samp{CPPFLAGS} and @samp{CFLAGS}, but
preprocessing uses just @samp{CPPFLAGS}. This distinction is because most
preprocessors won't accept all the flags the compiler does. Preprocessing is
done separately in some configure tests, and in the @samp{ansi2knr} support
for K&R compilers.

@item C++ Support, @option{--enable-cxx}
C++ support in GMP can be enabled with @samp{--enable-cxx}, in which case a
C++ compiler will be required. As a convenience @samp{--enable-cxx=detect}
can be used to enable C++ support only if a compiler can be found. The C++
support consists of a library @file{libgmpxx.la} and header file
@file{gmpxx.h}.

A separate @file{libgmpxx.la} has been adopted rather than having C++ objects
within @file{libgmp.la} in order to ensure dynamic linked C programs aren't
bloated by a dependency on the C++ standard library, and to avoid any chance
that the C++ compiler could be required when linking plain C programs.

@file{libgmpxx.la} will use certain internals from @file{libgmp.la} and can
only be expected to work with @file{libgmp.la} from the same GMP version.
Future changes to the relevant internals will be accompanied by renaming, so a
mismatch will cause unresolved symbols rather than perhaps mysterious
misbehaviour.

In general @file{libgmpxx.la} will be usable only with the C++ compiler that
built it, since name mangling and runtime support are usually incompatible
between different compilers.

@item @option{CXX}, @option{CXXFLAGS}
When C++ support is enabled, the C++ compiler and its flags can be set with
variables @samp{CXX} and @samp{CXXFLAGS} in the usual way. The default for
@samp{CXX} is the first compiler that works from a list of likely candidates,
with @command{g++} normally preferred when available. The default for
@samp{CXXFLAGS} is to try @samp{CFLAGS}, @samp{CFLAGS} without @samp{-g}, then
for @command{g++} either @samp{-g -O2} or @samp{-O2}, or for other compilers
@samp{-g} or nothing. Trying @samp{CFLAGS} this way is convenient when using
@samp{gcc} and @samp{g++} together, since the flags for @samp{gcc} will
usually suit @samp{g++}.

It's important that the C and C++ compilers match, meaning their startup and
runtime support routines are compatible and that they generate code in the
same ABI (if there's a choice of ABIs on the system). @samp{./configure}
isn't currently able to check these things very well itself, so for that
reason @samp{--disable-cxx} is the default, to avoid a build failure due to a
compiler mismatch. Perhaps this will change in the future.

Incidentally, it's normally not good enough to set @samp{CXX} to the same as
@samp{CC}. Although @command{gcc} for instance recognises @file{foo.cc} as
C++ code, only @command{g++} will invoke the linker the right way when
building an executable or shared library from object files.
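
A minimal sketch of a C++ enabled build with a matching compiler pair,
followed by linking a program against both libraries. The source file name
@file{prog.cc} is hypothetical; @samp{-lgmpxx} is listed before @samp{-lgmp}
because the C++ library depends on the C one.

@example
# Matching GNU compilers for the C and C++ parts.
./configure --enable-cxx CC=gcc CXX=g++
make
make check

# Link with g++ so the C++ runtime is pulled in correctly.
g++ prog.cc -lgmpxx -lgmp
@end example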

@item Temporary Memory, @option{--enable-alloca=<choice>}
@cindex Stack overflow segfaults
@cindex @code{alloca}

GMP allocates temporary workspace using one of the following three methods,
which can be selected with for instance
@samp{--enable-alloca=malloc-reentrant}.

@itemize @bullet
@item
@samp{alloca} - C library or compiler builtin.
@item
@samp{malloc-reentrant} - the heap, in a re-entrant fashion.
@item
@samp{malloc-notreentrant} - the heap, with global variables.
@end itemize

For convenience, the following choices are also available.
@samp{--disable-alloca} is the same as @samp{--enable-alloca=no}.

@itemize @bullet
@item
@samp{yes} - a synonym for @samp{alloca}.
@item
@samp{no} - a synonym for @samp{malloc-reentrant}.
@item
@samp{reentrant} - @code{alloca} if available, otherwise
@samp{malloc-reentrant}. This is the default.
@item
@samp{notreentrant} - @code{alloca} if available, otherwise
@samp{malloc-notreentrant}.
@end itemize

@code{alloca} is reentrant and fast, and is recommended, but when working with
large numbers it can overflow the available stack space, in which case one of
the two malloc methods will need to be used. Alternatively it might be possible
to increase available stack with @command{limit}, @command{ulimit} or
@code{setrlimit}, or under DJGPP with @command{stubedit} or
@code{@w{_stklen}}. Note that depending on the system the only indication of
stack overflow might be a segmentation violation.

@samp{malloc-reentrant} is, as the name suggests, reentrant and thread safe,
but @samp{malloc-notreentrant} is faster and should be used if reentrancy is
not required.

The two malloc methods in fact use the memory allocation functions selected by
@code{mp_set_memory_functions}, these being @code{malloc} and friends by
default. @xref{Custom Allocation}.

An additional choice @samp{--enable-alloca=debug} is available, to help when
debugging memory related problems (@pxref{Debugging}).
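
As a sketch, the heap-based method can be selected at configure time, or, if
@code{alloca} is kept, the stack limit can be inspected and raised in a
Bourne-style shell before running a program. The limit value below is an
arbitrary example (in kilobytes on most systems), and whether it can be
raised depends on the system's hard limit.

@example
# Use the heap instead of the stack for temporary workspace.
./configure --enable-alloca=malloc-reentrant

# Or keep alloca and raise the soft stack limit for this shell.
ulimit -s          # show the current limit
ulimit -s 65536    # arbitrary example value
@end example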

@item FFT Multiplication, @option{--disable-fft}

By default multiplications are done using Karatsuba, 3-way Toom-Cook, and
Fermat FFT. The FFT is only used on large to very large operands and can be
disabled to save code size if desired.

@item Berkeley MP, @option{--enable-mpbsd}

The Berkeley MP compatibility library (@file{libmp}) and header file
(@file{mp.h}) are built and installed only if @option{--enable-mpbsd} is used.
@xref{BSD Compatible Functions}.

@item MPFR, @option{--enable-mpfr}

The optional MPFR functions are built and installed only if
@option{--enable-mpfr} is used. These are in a separate library
@file{libmpfr.a} and are documented separately too (@pxref{Introduction to
MPFR,, Introduction to MPFR, mpfr, MPFR}).

@item Assertion Checking, @option{--enable-assert}

This option enables some consistency checking within the library. This can be
of use while debugging, see @ref{Debugging}.

@item Execution Profiling, @option{--enable-profiling=prof/gprof}

Profiling support can be enabled either for @command{prof} or @command{gprof}.
This adds @samp{-p} or @samp{-pg} respectively to @samp{CFLAGS}, and for some
systems adds corresponding @code{mcount} calls to the assembler code.

@item @option{MPN_PATH}

Various assembler versions of each mpn subroutine are provided. For a given
CPU, a search is made through a path to choose a version of each. For example,

@example
MPN_PATH="sparc32/v8 sparc32 generic"
@end example

which means look first for v8 code, then plain sparc32 (which is v7), and
finally fall back on generic C. Knowledgeable users with special requirements
can specify a different path. Normally this is completely unnecessary.
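
A different path would normally be given as a variable on the
@samp{./configure} command line. The following is only a sketch: it assumes a
32-bit SPARC build where the v8 code is to be skipped, and reuses the
directory names from the example above.

@example
# Skip the v8 code: plain sparc32 first, then generic C.
./configure MPN_PATH="sparc32 generic"
@end example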

The document you're now reading is @file{gmp.texi}. The usual automake
targets are available to make PostScript @file{gmp.ps} and/or DVI
@file{gmp.dvi}.

HTML can be produced with @samp{makeinfo --html}, see @ref{Generating HTML,,,
texinfo, Texinfo}. Alternatively, use @samp{texi2html}, see @ref{Top,Texinfo to
HTML,About,texi2html,Texinfo To HTML}.

PDF can be produced with @samp{texi2dvi --pdf} (@pxref{PDF
Output,PDF,,texinfo,Texinfo}) or with @samp{pdftex}.

Some supplementary notes can be found in the @file{doc} subdirectory.


@node ABI and ISA, Notes for Package Builds, Build Options, Installing GMP
@section ABI and ISA
@cindex Application Binary Interface
@cindex Instruction Set Architecture

ABI (Application Binary Interface) refers to the calling conventions between
functions, meaning what registers are used and what sizes the various C data
types are. ISA (Instruction Set Architecture) refers to the instructions and
registers a CPU has available.

Some 64-bit ISA CPUs have both a 64-bit ABI and a 32-bit ABI defined, the
latter for compatibility with older CPUs in the family. GMP supports some
CPUs like this in both ABIs. In fact within GMP @samp{ABI} means a
combination of chip ABI, plus how GMP chooses to use it. For example in some
32-bit ABIs, GMP may support a limb as either a 32-bit @code{long} or a 64-bit
@code{long long}.

By default GMP chooses the best ABI available for a given system, and this
generally gives significantly greater speed. But an ABI can be chosen
explicitly to make GMP compatible with other libraries, or particular
application requirements. For example,
@samp{./configure --host=mips64-sgi-irix6 ABI=n32}.

In all cases it's vital that all object code used in a given program is
compiled for the same ABI.

Usually a limb is implemented as a @code{long}. When a @code{long long} limb
is used this is encoded in the generated @file{gmp.h}. This is convenient for
applications, but it does mean that @file{gmp.h} will vary, and can't be just
copied around. @file{gmp.h} remains compiler independent though, since all
compilers for a particular ABI will be expected to use the same limb type.

Currently no attempt is made to follow whatever conventions a system has for
installing library or header files built for a particular ABI. This will
probably only matter when installing multiple builds of GMP, and it might be
as simple as configuring with a special @samp{libdir}, or it might require
more than that. Note that builds for different ABIs need to be done separately,
with a fresh @command{./configure} and @command{make} each.
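
One way to keep two ABI builds apart is a separate build directory and
install location for each, as sketched here. The directory names are
hypothetical, @samp{--libdir} and @samp{--includedir} are the standard
autoconf options, and whether this is sufficient depends on the conventions
of the system. A @command{make} supporting @code{VPATH} is assumed.

@example
# Hypothetical layout: one build tree and one install location per ABI.
mkdir build-n32 build-64

cd build-n32
../configure ABI=n32 --libdir=/usr/local/lib32 \
             --includedir=/usr/local/include/n32
make && make check && make install
cd ..

cd build-64
../configure ABI=64 --libdir=/usr/local/lib64 \
             --includedir=/usr/local/include/64
make && make check && make install
@end example

The separate include directories reflect the point above that the generated
@file{gmp.h} can differ between ABIs.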

@item HPPA 2.0 (@samp{hppa2.0*})

@item @samp{ABI=2.0w}

The 2.0w ABI uses 64-bit limbs and pointers and is available on HP-UX 11 or up
when using @command{cc}. @command{gcc} support for this is in progress.
Applications must be compiled with

@item @samp{ABI=2.0n}

The 2.0n ABI means the 32-bit HPPA 1.0 ABI but with a 64-bit limb using
@code{long long}. This is available on HP-UX 10 or up when using
@command{cc}. No @command{gcc} support is planned for this. Applications
must be compiled with

@item @samp{ABI=1.0}

HPPA 2.0 CPUs can run all HPPA 1.0 and 1.1 code in the 32-bit HPPA 1.0 ABI.
No special compiler options are needed for applications.

All three ABIs are available for CPUs @samp{hppa2.0w} and @samp{hppa2.0}, but
for CPU @samp{hppa2.0n} only 2.0n or 1.0 are allowed.

@item MIPS under IRIX 6 (@samp{mips*-*-irix[6789]})

IRIX 6 supports the n32 and 64 ABIs and always has a 64-bit MIPS 3 or better
CPU. In both these ABIs GMP uses a 64-bit limb. A new enough @command{gcc}
is required (2.95 for instance).

@item @samp{ABI=n32}

The n32 ABI is 32-bit pointers and integers, but with a 64-bit limb using a
@code{long long}. Applications must be compiled with

The 64-bit ABI is 64-bit pointers and integers. Applications must be compiled

Note that MIPS GNU/Linux, as of kernel version 2.2, doesn't have the necessary
support for n32 or 64 and so only gets a 32-bit limb and the MIPS 2 code.
1164
@item PowerPC 64 (@samp{powerpc64}, @samp{powerpc620}, @samp{powerpc630})
1167
@item @samp{ABI=aix64}
1169
The AIX 64 ABI uses 64-bit limbs and pointers and is available on systems
1170
@samp{*-*-aix*}. Applications must be compiled (and linked) with
1179
This is the basic 32-bit PowerPC ABI. No special compiler options are needed
1185
@item Sparc V9 (@samp{sparcv9} and @samp{ultrasparc*})
1190
The 64-bit V9 ABI is available on Solaris 2.7 and up and GNU/Linux. GCC 2.95
1191
or up, or Sun @command{cc} is required. Applications must be compiled with
1194
gcc -m64 -mptr64 -Wa,-xarch=v9 -mcpu=v9
1200
On Solaris 2.6 and earlier, and on Solaris 2.7 with the kernel in 32-bit mode,
1201
only the plain V8 32-bit ABI can be used, since the kernel doesn't save all
1202
registers. GMP still uses as much of the V9 ISA as it can in these
1203
circumstances. No special compiler options are required for applications,
1204
though using something like the following requesting V9 code within the V8 ABI
1212
@command{gcc} 2.8 and earlier only supports @samp{-mv8} though.
1215
Don't be confused by the names of these sparc @samp{-m} and @samp{-x} options;
1216
they're called @samp{arch} but they effectively control the ABI.
1218
On Solaris 2.7 with the kernel in 32-bit mode, a normal native build will
1219
reject @samp{ABI=64} because the resulting executables won't run.
1220
@samp{ABI=64} can still be built if desired by making it look like a
1221
cross-compile, for example
1224
./configure --build=none --host=sparcv9-sun-solaris2.7 ABI=64
1230
@node Notes for Package Builds, Notes for Particular Systems, ABI and ISA, Installing GMP
1231
@section Notes for Package Builds
1232
@cindex Build notes for binary packaging
1233
@cindex Packaged builds
1235
GMP should present no great difficulties for packaging in a binary
1238
@cindex Libtool versioning
1239
@cindex Shared library versioning
1240
Libtool is used to build the library and @samp{-version-info} is set
1241
appropriately, having started from @samp{3:0:0} in GMP 3.0. The GMP 4 series
1242
will be upwardly binary compatible in each release and will be upwardly binary
1243
compatible with all of the GMP 3 series. Additional function interfaces may
1244
be added in each release, so on systems where libtool versioning is not fully
1245
checked by the loader an auxiliary mechanism may be needed to express that a
1246
dynamically linked application depends on a new enough GMP.
1248
An auxiliary mechanism may also be needed to express that @file{libgmpxx.la}
1249
(from @option{--enable-cxx}, @pxref{Build Options}) requires @file{libgmp.la}
1250
from the same GMP version, since this is not done by the libtool versioning,
1251
nor otherwise. A mismatch will result in unresolved symbols from the linker,
1252
or perhaps the loader.
1254
Using @samp{DESTDIR} or a @samp{prefix} override with @samp{make install} and
1255
a shared @file{libgmpxx} may run into a libtool relinking problem, see
1256
@ref{Known Build Problems}.
1258
When building a package for a CPU family, care should be taken to use
1259
@samp{--host} (or @samp{--build}) to choose the least common denominator among
1260
the CPUs which might use the package. For example this might necessitate
1261
@samp{i386} for x86s, or plain @samp{sparc} (meaning V7) for SPARCs.
1263
Users who care about speed will want GMP built for their exact CPU type, to
1264
make use of the available optimizations. Providing a way to suitably rebuild
1265
a package may be useful. This could be as simple as making it possible for a
1266
user to omit @samp{--build} (and @samp{--host}) so @samp{./config.guess} will
1267
detect the CPU. But a way to manually specify a @samp{--build} will be wanted
1268
for systems where @samp{./config.guess} is inexact.
1270
Note that @file{gmp.h} is a generated file, and will be architecture and ABI
1275
@node Notes for Particular Systems, Known Build Problems, Notes for Package Builds, Installing GMP
1276
@section Notes for Particular Systems
1277
@cindex Build notes for particular systems
1278
@cindex Particular systems
1282
@c This section is more or less meant for notes about performance or about
1283
@c build problems that have been worked around but might leave a user
1284
@c scratching their head. Fun with different ABIs on a system belongs in the
1289
On systems @samp{*-*-aix[34]*} shared libraries are disabled by default, since
1290
some versions of the native @command{ar} fail on the convenience libraries
1291
used. A shared build can be attempted with
1294
./configure --enable-shared --disable-static
1297
Note that the @samp{--disable-static} is necessary because in a shared build
1298
libtool makes @file{libgmp.a} a symlink to @file{libgmp.so}, apparently for
1299
the benefit of old versions of @command{ld} which only recognise @file{.a},
1300
but unfortunately this is done even if a fully functional @command{ld} is
1305
On systems @samp{arm*-*-*}, versions of GCC up to and including 2.95.3 have a
1306
bug in unsigned division, giving wrong results for some operands. GMP
1307
@samp{./configure} will demand GCC 2.95.4 or later.
1310
Compaq C++ on OSF 5.1 has two flavours of @code{iostream}, a standard one and
1311
an old pre-standard one (see @samp{man iostream_intro}). GMP can only use the
1312
standard one, which unfortunately is not the default but must be selected by
1313
defining @code{__USE_STD_IOSTREAM}. Configure with for instance
1316
./configure --enable-cxx CPPFLAGS=-D__USE_STD_IOSTREAM
1319
@item Floating Point Mode
1320
@cindex Floating point mode
1321
@cindex Hardware floating point mode
1322
@cindex Precision of hardware floating point
1324
On some systems, the hardware floating point has a control mode which can set
1325
all operations to be done in a particular precision, for instance single,
1326
double or extended on x86 systems (x87 floating point). The GMP functions
1327
involving a @code{double} cannot be expected to operate to their full
1328
precision when the hardware is in single precision mode. Of course this
1329
affects all code, including application code, not just GMP.
1331
@item Microsoft Windows
1332
On systems @samp{*-*-cygwin*}, @samp{*-*-mingw*} and @samp{*-*-pw32*} by
1333
default GMP builds only a static library, but a DLL can be built instead using
1336
./configure --disable-static --enable-shared
1339
Static and DLL libraries can't both be built, since certain export directives
1340
in @file{gmp.h} must be different. @samp{--enable-cxx} cannot be used when
1341
building a DLL, since libtool doesn't currently support C++ DLLs. This might
1342
change in the future.
1345
A MINGW DLL build of GMP can be used with Microsoft C. Libtool doesn't
1346
install @file{.lib} and @file{.exp} files, but they can be created with the
1347
following commands, where @file{/my/inst/dir} is the install directory (with a
1348
@file{lib} subdirectory).
1351
lib /machine:IX86 /def:.libs/libgmp-3.dll-def
1352
cp libgmp-3.lib /my/inst/dir/lib
1353
cp .libs/libgmp-3.dll-exp /my/inst/dir/lib/libgmp-3.exp
1356
MINGW uses the C runtime library @samp{msvcrt.dll} for I/O, so applications
1357
wanting to use the GMP I/O routines must be compiled with @samp{cl /MD} to do
1358
the same. If one of the other C runtime library choices provided by MS C is
1359
desired then the suggestion is to use the GMP string functions and confine I/O
1362
@item Motorola 68k CPU Types
1364
@samp{m68k} is taken to mean 68000. @samp{m68020} or higher will give a
1365
performance boost on applicable CPUs. @samp{m68360} can be used for CPU32
1366
series chips. @samp{m68302} can be used for ``Dragonball'' series chips,
1367
though this is merely a synonym for @samp{m68000}.
1371
@command{m4} in this release of OpenBSD has a bug in @code{eval} that makes it
1372
unsuitable for @file{.asm} file processing. @samp{./configure} will detect
1373
the problem and either abort or choose another m4 in the @env{PATH}. The bug
1374
is fixed in OpenBSD 2.7, so either upgrade or use GNU m4.
1376
@item Power CPU Types
1378
In GMP, CPU types @samp{power*} and @samp{powerpc*} will each use instructions
1379
not available on the other, so it's important to choose the right one for the
1380
CPU that will be used. Currently GMP has no assembler code support for using
1381
just the common instruction subset. To get executables that run on both, the
1382
current suggestion is to use the generic C code (CPU @samp{none}), possibly
1383
with appropriate compiler options (like @samp{-mcpu=common} for
1384
@command{gcc}). CPU @samp{rs6000} (which is not a CPU but a family of
1385
workstations) is accepted by @file{config.sub}, but is currently equivalent to
1388
@item Sparc CPU Types
1390
@samp{sparcv8} or @samp{supersparc} on relevant systems will give a
1391
significant performance increase over the V7 code.
1393
@item Sparc App Regs
1395
The GMP assembler code for both 32-bit and 64-bit Sparc clobbers the
1396
``application registers'' @code{g2}, @code{g3} and @code{g4}, the same way
1397
that the GCC default @samp{-mapp-regs} does (@pxref{SPARC Options,,, gcc,
1398
Using the GNU Compiler Collection (GCC)}).
1400
This makes that code unsuitable for use with the special V9
1401
@samp{-mcmodel=embmedany} (which uses @code{g4} as a data segment pointer),
1402
and for applications wanting to use those registers for special purposes. In
1403
these cases the only suggestion currently is to build GMP with CPU @samp{none}
1404
to avoid the assembler code.
1408
@command{/usr/bin/m4} lacks various features needed to process @file{.asm}
1409
files, and instead @samp{./configure} will automatically use
1410
@command{/usr/5bin/m4}, which we believe is always available (if not then use
1415
@samp{i386} selects generic code which will run reasonably well on all x86
1418
@samp{i586}, @samp{pentium} or @samp{pentiummmx} code is good for the intended
1419
P5 Pentium chips, but quite slow when run on Intel P6 class chips (PPro, P-II,
1420
P-III)@. @samp{i386} is a better choice when making binaries that must run on
1423
@samp{pentium4} and an SSE2 capable assembler are important for best results
1424
on Pentium 4. The specific code is for instance roughly a 2@cross{} to
1425
3@cross{} speedup over the generic @samp{i386} code.
1427
@item x86 MMX and SSE2 Code
1429
If the CPU selected has MMX code but the assembler doesn't support it, a
1430
warning is given and non-MMX code is used instead. This will be an inferior
1431
build, since the MMX code that's present is there because it's faster than the
1432
corresponding plain integer code. The same applies to SSE2.
1434
Old versions of @samp{gas} don't support MMX instructions, in particular
1435
version 1.92.3 that comes with FreeBSD 2.2.8 doesn't (and unfortunately
1436
there's no newer assembler for that system).
1438
Solaris 2.6 and 2.7 @command{as} generate incorrect object code for register
1439
to register @code{movq} instructions, and so can't be used for MMX code.
1440
Install a recent @command{gas} if MMX code is wanted on these systems.
1445
@node Known Build Problems, , Notes for Particular Systems, Installing GMP
1446
@section Known Build Problems
1447
@cindex Build problems known
1449
@c This section is more or less meant for known build problems that are not
1450
@c otherwise worked around and require some sort of manual intervention.
1452
You might find more up-to-date information at @uref{http://swox.com/gmp/}.
1455
@item Compiler link options
1456
The version of libtool currently in use rather aggressively strips compiler
1457
options when linking a shared library. This will hopefully be relaxed in the
1458
future, but for now if this is a problem the suggestion is to create a little
1459
script to hide them, and for instance configure with
1462
./configure CC=gcc-with-my-options
1466
The DJGPP port of @command{bash} 2.03 is unable to run the @samp{configure}
1467
script; it exits silently, having died writing a preamble to
1468
@file{config.log}. Use @command{bash} 2.04 or higher.
1470
@samp{make all} was found to run out of memory during the final
1471
@file{libgmp.la} link on one system tested, despite having 64MB available. A
1472
separate @samp{make libgmp.la} helped; perhaps recursing into the various
1473
subdirectories uses up memory.
1475
@item @samp{DESTDIR} and shared @file{libgmpxx}
1476
@cindex @samp{DESTDIR}
1477
@samp{make install DESTDIR=/my/staging/area}, or the same with a @samp{prefix}
1478
override, to install to a temporary directory is not fully supported by
1479
current versions of libtool when building a shared version of a library which
1480
depends on another being built at the same time, like @file{libgmpxx} and
1483
The problem is that @file{libgmpxx} is relinked at the install stage to ensure
1484
that if the system puts a hard-coded path to @file{libgmp} within
1485
@file{libgmpxx} then that path will be correct. Naturally the linker is
1486
directed to look only at the final location, not the staging area, so if
1487
@file{libgmp} is not already in that final location then the link will fail.
1489
A workaround for this on SVR4 style systems, such as GNU/Linux, where paths
1490
are not hard-coded, is to include the staging area in the linker's search
1491
using @code{LD_LIBRARY_PATH}. For example with @samp{--prefix=/usr} but
1492
installing under @samp{/my/staging/area},
1495
LD_LIBRARY_PATH=/my/staging/area/usr/lib \
1496
make install DESTDIR=/my/staging/area
1499
@item GNU binutils @command{strip} prior to 2.12
1500
@cindex Stripped libraries
1502
@command{strip} from GNU binutils 2.11 and earlier should not be used on the
1503
static libraries @file{libgmp.a} and @file{libmp.a} since it will discard all
1504
but the last of multiple archive members with the same name, like the three
1505
versions of @file{init.o} in @file{libgmp.a}. Binutils 2.12 or higher can be
1508
The shared libraries @file{libgmp.so} and @file{libmp.so} are not affected by
1509
this and any version of @command{strip} can be used on them.
1511
@item @command{make} syntax error
1513
On certain versions of SCO OpenServer 5 and IRIX 6.5 the native @command{make}
1514
is unable to handle the long dependencies list for @file{libgmp.la}. The
1515
symptom is a ``syntax error'' on the following line of the top-level
1519
libgmp.la: $(libgmp_la_OBJECTS) $(libgmp_la_DEPENDENCIES)
1522
Either use GNU Make, or as a workaround remove
1523
@code{$(libgmp_la_DEPENDENCIES)} from that line (which will make the initial
1524
build work, but if any recompiling is done @file{libgmp.la} might not be
1527
@item MacOS X and GCC
1528
Libtool currently only knows how to create shared libraries on MacOS X using
1529
the native @command{cc} (which is a modified GCC), not a plain GCC. A
1530
static-only build should work though (@samp{--disable-shared}).
1532
Also, libtool currently cannot build C++ shared libraries on MacOS X, so if
1533
@samp{--enable-cxx} is desired then @samp{--disable-shared} must be used.
1534
Hopefully this will be fixed in the future.
1536
@item Motorola 68k ABI
1539
The GMP assembler code has been written for the SVR4 standard ABI. GCC option
1540
@samp{-mshort} changes the calling conventions and is not currently supported.
1541
We believe the PalmOS calling conventions are similarly different and are
1542
likewise not currently supported.
1544
@c For reference, -mshort doesn't just change the size of an int but also
1545
@c changes the stack alignment to only 16-bits, where in svr4 it's 32-bits.
1546
@c This affects mpn_lshift and mpn_rshift in the gmp code, but perhaps
1547
@c nowhere else. Having those routines understand the variant stack frame
1548
@c wouldn't be hard, if anyone was keen. (PalmOS had problems building due
1549
@c to lack of stdio.h last time it was tried, so it's not yet really a
1552
@item NeXT prior to 3.3
1554
The system compiler on old versions of NeXT was a massacred and old GCC, even
1555
if it called itself @file{cc}. This compiler cannot be used to build GMP; you
1556
need to get a real GCC and install that. (NeXT may have fixed this in
1557
release 3.3 of their system.)
1559
@item POWER and PowerPC
1561
Bugs in GCC 2.7.2 (and 2.6.3) mean it can't be used to compile GMP on POWER or
1562
PowerPC. If you want to use GCC for these machines, get GCC 2.7.2.1 (or
1565
@item Sequent Symmetry
1567
Use the GNU assembler instead of the system assembler, since the latter has
1572
The system @command{sed} prints an error ``Output line too long'' when libtool
1573
builds @file{libgmp.la}. This doesn't seem to cause any obvious ill effects,
1574
but GNU @command{sed} is recommended, to avoid any doubt.
1576
@item Sparc Solaris 2.7 with gcc 2.95.2 in ABI=32
1578
A shared library build of GMP seems to fail in this combination: it builds but
1579
then fails the tests, apparently due to some incorrect data relocations within
1580
@code{gmp_randinit_lc_2exp_size}. The exact cause is unknown;
1581
@samp{--disable-shared} is recommended.
1583
@item Windows DLL test programs
1585
When creating a DLL version of @file{libgmp}, libtool creates wrapper scripts
1586
like @file{t-mul} for programs that would normally be @file{t-mul.exe}, in
1587
order to set up the right library paths etc. This works fine, but the absence
1588
of @file{t-mul.exe} etc causes @command{make} to think they need recompiling
1589
every time, which is an annoyance when re-running a @samp{make check}.
1593
@node GMP Basics, Reporting Bugs, Installing GMP, Top
1594
@comment node-name, next, previous, up
1598
@strong{Using functions, macros, data types, etc.@: not documented in this
1599
manual is strongly discouraged. If you do so your application is guaranteed
1600
to be incompatible with future versions of GMP.}
1603
* Headers and Libraries::
1604
* Nomenclature and Types::
1605
* Function Classes::
1606
* Variable Conventions::
1607
* Parameter Conventions::
1608
* Memory Management::
1610
* Useful Macros and Constants::
1611
* Compatibility with older versions::
1612
* Demonstration Programs::
1620
@node Headers and Libraries, Nomenclature and Types, GMP Basics, GMP Basics
1621
@section Headers and Libraries
1624
@cindex @file{gmp.h}
1625
All declarations needed to use GMP are collected in the include file
1626
@file{gmp.h}. It is designed to work with both C and C++ compilers.
1632
Note however that prototypes for GMP functions with @code{FILE *} parameters
1633
are only provided if @code{<stdio.h>} is included too.
1640
Likewise @code{<stdarg.h>} (or @code{<varargs.h>}) is required for prototypes
1641
with @code{va_list} parameters, such as @code{gmp_vprintf}. And
1642
@code{<obstack.h>} for prototypes with @code{struct obstack} parameters, such
1643
as @code{gmp_obstack_printf}, when available.
1647
All programs using GMP must link against the @file{libgmp} library. On a
1648
typical Unix-like system this can be done with @samp{-lgmp}, for example
1651
gcc myprogram.c -lgmp
1654
GMP C++ functions are in a separate @file{libgmpxx} library. This is built
1655
and installed if C++ support has been enabled (@pxref{Build Options}). For
1659
g++ mycxxprog.cc -lgmpxx -lgmp
1662
GMP is built using Libtool and an application can use that to link if desired,
1663
@pxref{Top,Shared library support for GNU,Introduction,libtool,GNU Libtool}.
1665
If GMP has been installed to a non-standard location then it may be necessary
1666
to use @samp{-I} and @samp{-L} compiler options to point to the right
1667
directories, and some sort of run-time path for a shared library. Consult
1668
your compiler documentation, for instance @ref{Top,,Introduction,gcc,Using and
1669
Porting the GNU Compiler Collection}.
1672
@node Nomenclature and Types, Function Classes, Headers and Libraries, GMP Basics
1673
@section Nomenclature and Types
1674
@cindex Nomenclature
1678
@tindex @code{mpz_t}
1680
In this manual, @dfn{integer} usually means a multiple precision integer, as
1681
defined by the GMP library. The C data type for such integers is @code{mpz_t}.
1682
Here are some examples of how to declare such integers:
1687
struct foo @{ mpz_t x, y; @};
1692
@cindex Rational number
1693
@tindex @code{mpq_t}
1695
@dfn{Rational number} means a multiple precision fraction. The C data type
1696
for these fractions is @code{mpq_t}. For example:
1702
@cindex Floating-point number
1703
@tindex @code{mpf_t}
1705
@dfn{Floating point number}, or @dfn{Float} for short, is an arbitrary precision
1706
mantissa with a limited precision exponent. The C data type for such objects
1710
@tindex @code{mp_limb_t}
1712
A @dfn{limb} means the part of a multi-precision number that fits in a single
1713
machine word. (We chose this word because a limb of the human body is
1714
analogous to a digit, only larger, and containing several digits.) Normally a
1715
limb is 32 or 64 bits. The C data type for a limb is @code{mp_limb_t}.
1718
@node Function Classes, Variable Conventions, Nomenclature and Types, GMP Basics
1719
@section Function Classes
1720
@cindex Function classes
1722
There are six classes of functions in the GMP library:
1726
Functions for signed integer arithmetic, with names beginning with
1727
@code{mpz_}. The associated type is @code{mpz_t}. There are about 150
1728
functions in this class.
1731
Functions for rational number arithmetic, with names beginning with
1732
@code{mpq_}. The associated type is @code{mpq_t}. There are about 40
1733
functions in this class, but the integer functions can be used for arithmetic
1734
on the numerator and denominator separately.
1737
Functions for floating-point arithmetic, with names beginning with
1738
@code{mpf_}. The associated type is @code{mpf_t}. There are about 60
1739
functions in this class.
1742
Functions compatible with Berkeley MP, such as @code{itom}, @code{madd}, and
1743
@code{mult}. The associated type is @code{MINT}.
1746
Fast low-level functions that operate on natural numbers. These are used by
1747
the functions in the preceding groups, and you can also call them directly
1748
from very time-critical user programs. These functions' names begin with
1749
@code{mpn_}. The associated type is an array of @code{mp_limb_t}. There are
1750
about 30 (hard-to-use) functions in this class.
1753
Miscellaneous functions. Functions for setting up custom allocation and
1754
functions for generating random numbers.
1758
@node Variable Conventions, Parameter Conventions, Function Classes, GMP Basics
1759
@section Variable Conventions
1760
@cindex Variable conventions
1761
@cindex Conventions for variables
1763
GMP functions generally have output arguments before input arguments. This
1764
notation is by analogy with the assignment operator. The BSD MP compatibility
1765
functions are exceptions, having the output arguments last.
1767
GMP lets you use the same variable for both input and output in one call. For
1768
example, the main function for integer multiplication, @code{mpz_mul}, can be
1769
used to square @code{x} and put the result back in @code{x} with
1775
Before you can assign to a GMP variable, you need to initialize it by calling
1776
one of the special initialization functions. When you're done with a
1777
variable, you need to clear it out, using one of the functions for that
1778
purpose. Which function to use depends on the type of variable. See the
1779
chapters on integer functions, rational number functions, and floating-point
1780
functions for details.
1782
A variable should only be initialized once, or at least cleared between each
1783
initialization. After a variable has been initialized, it may be assigned to
1784
any number of times.
1786
For efficiency reasons, avoid excessive initializing and clearing. In
1787
general, initialize near the start of a function and clear near the end. For
1797
for (i = 1; i < 100; i++)
1799
mpz_mul (n, @dots{});
1800
mpz_fdiv_q (n, @dots{});
1808
@node Parameter Conventions, Memory Management, Variable Conventions, GMP Basics
1809
@section Parameter Conventions
1810
@cindex Parameter conventions
1811
@cindex Conventions for parameters
1813
When a GMP variable is used as a function parameter, it's effectively a
1814
call-by-reference, meaning if the function stores a value there it will change
1815
the original in the caller. Parameters which are input-only can be designated
1816
@code{const} to provoke a compiler error or warning on attempting to modify
1819
When a function is going to return a GMP result, it should designate a
1820
parameter that it sets, like the library functions do. More than one value
1821
can be returned by having more than one output parameter, again like the
1822
library functions. A @code{return} of an @code{mpz_t} etc doesn't return the
1823
object, only a pointer, and this is almost certainly not what's wanted.
1825
Here's an example accepting an @code{mpz_t} parameter, doing a calculation,
1826
and storing the result to the indicated parameter.
1830
foo (mpz_t result, const mpz_t param, unsigned long n)
1833
mpz_mul_ui (result, param, n);
1834
for (i = 1; i < n; i++)
1835
mpz_add_ui (result, result, i*7);
1843
mpz_init_set_str (n, "123456", 0);
1845
gmp_printf ("%Zd\n", r);
1850
@code{foo} works even if the mainline passes the same variable for
1851
@code{param} and @code{result}, just like the library functions. But
1852
sometimes it's tricky to make that work, and an application might not want to
1853
bother supporting that sort of thing.
1855
For interest, the GMP types @code{mpz_t} etc are implemented as one-element
1856
arrays of certain structures. This is why declaring a variable creates an
1857
object with the fields GMP needs, but then using it as a parameter passes a
1858
pointer to the object. Note that the actual fields in each @code{mpz_t} etc
1859
are for internal use only and should not be accessed directly by code that
1860
expects to be compatible with future GMP releases.
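The one-element array technique can be illustrated with a stand-in type. The names @code{demo_num}, @code{demo_num_t}, @code{demo_set} and @code{demo_get} below are hypothetical, and the structure shown is deliberately not the real layout of any GMP type.

```c
/* Hypothetical stand-in for a GMP type; NOT the real mpz_t layout.  */
struct demo_num
{
  long value;
};
typedef struct demo_num demo_num_t[1];

/* As a parameter, demo_num_t is adjusted to "struct demo_num *", so the
   function modifies the caller's object rather than a copy.  */
static void
demo_set (demo_num_t n, long v)
{
  n->value = v;
}

/* An input-only parameter: const makes the pointed-to object read-only.  */
static long
demo_get (const demo_num_t n)
{
  return n->value;
}
```

Declaring @code{demo_num_t n;} in a caller reserves storage for one whole structure, while @code{demo_set (n, 42)} passes only its address.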
1864
@node Memory Management, Reentrancy, Parameter Conventions, GMP Basics
1865
@section Memory Management
1866
@cindex Memory Management
1868
The GMP types like @code{mpz_t} are small, containing only a couple of sizes,
1869
and pointers to allocated data. Once a variable is initialized, GMP takes
1870
care of all space allocation. Additional space is allocated whenever a
1871
variable doesn't have enough.
1873
@code{mpz_t} and @code{mpq_t} variables never reduce their allocated space.
1874
Normally this is the best policy, since it avoids frequent reallocation.
1875
Applications that need to return memory to the heap at some particular point
1876
can use @code{mpz_realloc2}, or clear variables no longer needed.
1878
@code{mpf_t} variables, in the current implementation, use a fixed amount of
1879
space, determined by the chosen precision and allocated at initialization, so
1880
their size doesn't change.
1882
All memory is allocated using @code{malloc} and friends by default, but this
1883
can be changed, see @ref{Custom Allocation}. Temporary memory on the stack is
1884
also used (via @code{alloca}), but this can be changed at build-time if
1885
desired, see @ref{Build Options}.
1888
@node Reentrancy, Useful Macros and Constants, Memory Management, GMP Basics
1891
@cindex Thread safety
1892
@cindex Multi-threading
1894
GMP is reentrant and thread-safe, with some exceptions:
1898
If configured with @option{--enable-alloca=malloc-notreentrant} (or with
1899
@option{--enable-alloca=notreentrant} when @code{alloca} is not available),
1900
then naturally GMP is not reentrant.
1903
@code{mpf_set_default_prec} and @code{mpf_init} use a global variable for the
1904
selected precision. @code{mpf_init2} can be used instead, and in the C++
1905
interface an explicit precision to the @code{mpf_class} constructor.
1908
@code{mpz_random} and the other old random number functions use a global
1909
random state and are hence not reentrant. The newer random number functions
1910
that accept a @code{gmp_randstate_t} parameter can be used instead.
1913
@code{gmp_randinit} (obsolete) returns an error indication through a global
1914
variable, which is not thread safe. Applications are advised to use
1915
@code{gmp_randinit_lc_2exp} instead.
1918
@code{mp_set_memory_functions} uses global variables to store the selected
1919
memory allocation functions.
1922
If the memory allocation functions set by a call to
1923
@code{mp_set_memory_functions} (or @code{malloc} and friends by default) are
1924
not reentrant, then GMP will not be reentrant either.
1927
If the standard I/O functions such as @code{fwrite} are not reentrant then the
1928
GMP I/O functions using them will not be reentrant either.
1931
It's safe for two threads to read from the same GMP variable simultaneously,
1932
but it's not safe for one to read while another might be writing, nor for
1933
two threads to write simultaneously. It's not safe for two threads to
1934
generate a random number from the same @code{gmp_randstate_t} simultaneously,
1935
since this involves an update of that variable.
1940
@node Useful Macros and Constants, Compatibility with older versions, Reentrancy, GMP Basics
1941
@section Useful Macros and Constants
1942
@cindex Useful macros and constants
1945
@deftypevr {Global Constant} {const int} mp_bits_per_limb
1946
@findex mp_bits_per_limb
1947
@cindex Bits per limb
1949
The number of bits per limb.
1952
@defmac __GNU_MP_VERSION
1953
@defmacx __GNU_MP_VERSION_MINOR
1954
@defmacx __GNU_MP_VERSION_PATCHLEVEL
1955
@cindex Version number
1956
@cindex GMP version number
1957
The major and minor GMP version, and patch level, respectively, as integers.
1958
For GMP i.j, these numbers will be i, j, and 0, respectively.
1959
For GMP i.j.k, these numbers will be i, j, and k, respectively.
1962
@deftypevr {Global Constant} {const char * const} gmp_version
1964
The GMP version number, as a null-terminated string, in the form ``i.j'' or
1965
``i.j.k''. This release is @nicode{"@value{VERSION}"}.
1969
@node Compatibility with older versions, Demonstration Programs, Useful Macros and Constants, GMP Basics
1970
@section Compatibility with older versions
1971
@cindex Compatibility with older versions
1972
@cindex Upward compatibility
1974
This version of GMP is upwardly binary compatible with all 4.x and 3.x
1975
versions, and upwardly compatible at the source level with all 2.x versions,
1976
with the following exceptions.
1980
@code{mpn_gcd} had its source arguments swapped as of GMP 3.0, for consistency
1981
with other @code{mpn} functions.
1984
@code{mpf_get_prec} counted precision slightly differently in GMP 3.0 and
1985
3.0.1, but in 3.1 reverted to the 2.x style.
1988
There are a number of compatibility issues between GMP 1 and GMP 2 that of
1989
course also apply when porting applications from GMP 1 to GMP 4. Please
1990
see the GMP 2 manual for details.
1992
The Berkeley MP compatibility library (@pxref{BSD Compatible Functions}) is
1993
source and binary compatible with the standard @file{libmp}.

@c @item Integer division functions round the result differently. The obsolete
@c functions (@code{mpz_div}, @code{mpz_divmod}, @code{mpz_mdiv},
@c @code{mpz_mdivmod}, etc) now all use floor rounding (i.e., they round the
@c @minus{}infinity).
@c There are a lot of functions for integer division, giving the user better
@c control over the rounding.
@c @item The function @code{mpz_mod} now compute the true @strong{mod} function.
@c @item The functions @code{mpz_powm} and @code{mpz_powm_ui} now use
@c @strong{mod} for reduction.
@c @item The assignment functions for rational numbers do no longer canonicalize
@c their results. In the case a non-canonical result could arise from an
@c assignment, the user need to insert an explicit call to
@c @code{mpq_canonicalize}. This change was made for efficiency.
@c @item Output generated by @code{mpz_out_raw} in this release cannot be read
@c by @code{mpz_inp_raw} in previous releases. This change was made for making
@c the file format truly portable between machines with different word sizes.
@c @item Several @code{mpn} functions have changed. But they were intentionally
@c undocumented in previous releases.
@c @item The functions @code{mpz_cmp_ui}, @code{mpz_cmp_si}, and @code{mpq_cmp_ui}
@c are now implemented as macros, and thereby sometimes evaluate their
@c arguments multiple times.
@c @item The functions @code{mpz_pow_ui} and @code{mpz_ui_pow_ui} now yield 1
@c for 0^0. (In version 1, they yielded 0.)
@c In version 1 of the library, @code{mpq_set_den} handled negative
@c denominators by copying the sign to the numerator. That is no longer done.
@c Pure assignment functions do not canonicalize the assigned variable. It is
@c the responsibility of the user to canonicalize the assigned variable before
@c any arithmetic operations are performed on that variable.
@c Note that this is an incompatible change from version 1 of the library.

@node Demonstration Programs, Efficiency, Compatibility with older versions, GMP Basics
@section Demonstration programs
@cindex Demonstration programs
@cindex Example programs
@cindex Sample programs
The @file{demos} subdirectory has some sample programs using GMP. These
aren't built or installed, but there's a @file{Makefile} with rules for them.

The following programs are provided:

@itemize @bullet
@item
@samp{pexpr} is an expression evaluator, the program used on the GMP web page.
@item
The @samp{calc} subdirectory has a similar but simpler evaluator using
@command{lex} and @command{yacc}.
@item
The @samp{expr} subdirectory is yet another expression evaluator, a library
designed for ease of use within a C program. See @file{demos/expr/README} for
more information.
@item
@samp{factorize} is a Pollard-Rho factorization program.
@item
@samp{isprime} is a command-line interface to the @code{mpz_probab_prime_p}
function.
@item
@samp{primes} counts or lists primes in an interval, using a sieve.
@item
@samp{qcn} is an example use of @code{mpz_kronecker_ui} to estimate quadratic
class numbers.
@item
The @samp{perl} subdirectory is a comprehensive perl interface to GMP. See
@file{demos/perl/INSTALL} for more information. Documentation is in POD
format in @file{demos/perl/GMP.pm}.
@end itemize

@node Efficiency, Debugging, Demonstration Programs, GMP Basics
@section Efficiency

@item Small operands
On small operands, the time for function call overheads and memory allocation
can be significant in comparison to actual calculation. This is unavoidable
in a general purpose variable precision library, although GMP attempts to be
as efficient as it can on both large and small operands.

@item Static Linking
On some CPUs, in particular the x86s, the static @file{libgmp.a} should be
used for maximum speed, since the PIC code in the shared @file{libgmp.so} will
have a small overhead on each function call and global data address. For many
programs this will be insignificant, but for long calculations there's a gain
to be had.

@item Initializing and clearing
Avoid excessive initializing and clearing of variables, since this can be
quite time consuming, especially in comparison to otherwise fast operations
like addition.

A language interpreter might want to keep a free list or stack of
initialized variables ready for use. It should be possible to integrate
something like that with a garbage collector too.

@item Reallocations
An @code{mpz_t} or @code{mpq_t} variable used to hold successively increasing
values will have its memory repeatedly @code{realloc}ed, which could be quite
slow or could fragment memory, depending on the C library. If an application
can estimate the final size then @code{mpz_init2} or @code{mpz_realloc2} can
be called to allocate the necessary space from the beginning
(@pxref{Initializing Integers}).

It doesn't matter if a size set with @code{mpz_init2} or @code{mpz_realloc2}
is too small, since all functions will do a further reallocation if necessary.
Badly overestimating memory required will waste space though.
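
As a minimal sketch of such pre-sizing, consider building @math{3^n} by
repeated multiplication. The final size can be bounded in advance, since
@math{3^n} is less than @math{4^n} and so fits in @math{2n+1} bits. The
helper name @code{pow3_presized} is made up for this illustration and is not
part of GMP.

@example
#include <gmp.h>

/* Hypothetical helper, not a GMP function: set r to 3^n.
   r must be uninitialized on entry; the caller clears it later.  */
void
pow3_presized (mpz_t r, unsigned long n)
@{
  unsigned long i;

  /* 3^n < 4^n = 2^(2n), so 2n+1 bits hold the final value.  */
  mpz_init2 (r, 2 * n + 1);
  mpz_set_ui (r, 1UL);
  for (i = 0; i < n; i++)
    mpz_mul_ui (r, r, 3UL);
@}
@end example

With plain @code{mpz_init} the same loop gives the same result, only with
more reallocation along the way.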

@item @code{2exp} functions
It's up to an application to call functions like @code{mpz_mul_2exp} when
appropriate. General purpose functions like @code{mpz_mul} make no attempt to
identify powers of two or other special forms, because such inputs will
usually be very rare and testing every time would be wasteful.
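
For instance, an application that knows it is scaling by 1024 can ask for a
shift directly. This is only a sketch, and the helper name
@code{times_1024} is hypothetical.

@example
#include <gmp.h>

/* Hypothetical helper: r = x * 1024, done as a shift by 10 bits
   rather than a general multiply.  */
void
times_1024 (mpz_t r, mpz_t x)
@{
  mpz_mul_2exp (r, x, 10UL);
@}
@end example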

@item @code{ui} and @code{si} functions
The @code{ui} functions and the small number of @code{si} functions exist for
convenience and should be used where applicable. But if for example an
@code{mpz_t} contains a value that fits in an @code{unsigned long} there's no
need to extract it and call a @code{ui} function, just use the regular @code{mpz}
function.

@item In-Place Operations
@code{mpz_abs}, @code{mpq_abs}, @code{mpf_abs}, @code{mpz_neg}, @code{mpq_neg}
and @code{mpf_neg} are fast when used for in-place operations like
@code{mpz_abs(x,x)}, since in the current implementation only a single field
of @code{x} needs changing. On suitable compilers (GCC for instance) this is
inlined too.

@code{mpz_add_ui}, @code{mpz_sub_ui}, @code{mpf_add_ui} and @code{mpf_sub_ui}
benefit from an in-place operation like @code{mpz_add_ui(x,x,y)}, since
usually only one or two limbs of @code{x} will need to be changed. The same
applies to the full precision @code{mpz_add} etc if @code{y} is small. If
@code{y} is big then cache locality may be helped, but that's all.

@code{mpz_mul} is currently the opposite: a separate destination is slightly
better. A call like @code{mpz_mul(x,x,y)} will, unless @code{y} is only one
limb, make a temporary copy of @code{x} before forming the result. Normally
that copying will only be a tiny fraction of the time for the multiply, so
this is not a particularly important consideration.

@code{mpz_set}, @code{mpq_set}, @code{mpq_set_num}, @code{mpf_set}, etc, make
no attempt to recognise a copy of something to itself, so a call like
@code{mpz_set(x,x)} will be wasteful. Naturally that would never be written
deliberately, but if it might arise from two pointers to the same object then
a test to avoid it might be desirable.
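
Such a test can be a plain pointer comparison. The following sketch uses a
hypothetical wrapper, @code{copy_unless_same}, which is not a GMP function.

@example
#include <gmp.h>

/* Hypothetical wrapper: copy src to dst, skipping the work when both
   arguments refer to the same object.  */
void
copy_unless_same (mpz_t dst, mpz_t src)
@{
  if (dst != src)
    mpz_set (dst, src);
@}
@end example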

Note that it's never worth introducing extra @code{mpz_set} calls just to get
in-place operations. If a result should go to a particular variable then just
direct it there and let GMP take care of data movement.

@item Divisibility Testing (Small Integers)
@code{mpz_divisible_ui_p} and @code{mpz_congruent_ui_p} are the best functions
for testing whether an @code{mpz_t} is divisible by an individual small
integer. They use an algorithm which is faster than @code{mpz_tdiv_ui}, but
which gives no useful information about the actual remainder, only whether
it's zero (or a particular value).

However when testing divisibility by several small integers, it's best to take
a remainder modulo their product, to save multi-precision operations. For
instance to test whether a number is divisible by any of 23, 29 or 31 take a
remainder modulo @math{23@times{}29@times{}31 = 20677} and then test that.
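
The 23, 29, 31 case might be sketched as follows. The helper name
@code{has_factor_23_29_31} is hypothetical; only @code{mpz_tdiv_ui} is a GMP
function here.

@example
#include <gmp.h>

/* Hypothetical helper: nonzero if n is divisible by any of 23, 29 or 31.  */
int
has_factor_23_29_31 (mpz_t n)
@{
  /* The only multi-precision step: a remainder modulo 23*29*31.
     mpz_tdiv_ui gives the absolute value of the remainder, which is
     all a divisibility test needs.  */
  unsigned long r = mpz_tdiv_ui (n, 20677UL);

  /* The individual tests are now single-word arithmetic.  */
  return r % 23 == 0 || r % 29 == 0 || r % 31 == 0;
@}
@end example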

The division functions like @code{mpz_tdiv_q_ui} which give a quotient as well
as a remainder are generally a little slower than the remainder-only functions
like @code{mpz_tdiv_ui}. If the quotient is only rarely wanted then it's
probably best to just take a remainder and then go back and calculate the
quotient if and when it's wanted (@code{mpz_divexact_ui} can be used if the
remainder is zero).

@item Rational Arithmetic
The @code{mpq} functions operate on @code{mpq_t} values with no common factors
in the numerator and denominator. Common factors are checked for and cast out
as necessary. In general, cancelling factors every time is the best approach
since it minimizes the sizes for subsequent operations.

However, applications that know something about the factorization of the
values they're working with might be able to avoid some of the GCDs used for
canonicalization, or swap them for divisions. For example when multiplying by
a prime it's enough to check for factors of it in the denominator instead of
doing a full GCD. Or when forming a big product it might be known that very
little cancellation will be possible, and so canonicalization can be left to
the end.
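
The prime case could be sketched as below. This assumes @code{q} is already
in canonical form and that @code{p} really is prime; the helper name
@code{rat_mul_prime_ui} is hypothetical and not part of GMP.

@example
#include <gmp.h>

/* Hypothetical helper: q = q * p for a prime p, keeping q canonical
   without a full GCD.  q must be canonical on entry.  */
void
rat_mul_prime_ui (mpq_t q, unsigned long p)
@{
  if (mpz_divisible_ui_p (mpq_denref (q), p))
    /* p divides the denominator, so cancel it there.  */
    mpz_divexact_ui (mpq_denref (q), mpq_denref (q), p);
  else
    /* No common factor can arise, so just scale the numerator.  */
    mpz_mul_ui (mpq_numref (q), mpq_numref (q), p);
@}
@end example

Since the numerator and denominator start with no common factor, the only
factor the product can share with the denominator is @code{p} itself, which
is why a single divisibility test is enough.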

The @code{mpq_numref} and @code{mpq_denref} macros give access to the
numerator and denominator to do things outside the scope of the supplied
@code{mpq} functions. @xref{Applying Integer Functions}.

The canonical form for rationals allows mixed-type @code{mpq_t} and integer
additions or subtractions to be done directly with multiples of the
denominator. This will be somewhat faster than @code{mpq_add}. For example,

@example
/* mpq increment */
mpz_add (mpq_numref(q), mpq_numref(q), mpq_denref(q));

/* mpq += unsigned long */
mpz_addmul_ui (mpq_numref(q), mpq_denref(q), 123UL);

/* mpq -= mpz */
mpz_submul (mpq_numref(q), mpq_denref(q), z);
@end example

@item Number Sequences
Functions like @code{mpz_fac_ui}, @code{mpz_fib_ui} and @code{mpz_bin_uiui}
are designed for calculating isolated values. If a range of values is wanted
it's probably best to call one of them once to get a starting point and
iterate from there.
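
For factorials, a sketch of that approach is below. The helper name
@code{factorial_run} and its calling convention are assumptions made for the
illustration.

@example
#include <gmp.h>

/* Hypothetical helper: store lo!, (lo+1)!, ..., (lo+count-1)! in
   out[0] to out[count-1], which must already be initialized.  */
void
factorial_run (mpz_t out[], unsigned long lo, unsigned long count)
@{
  unsigned long i;

  if (count == 0)
    return;

  /* One isolated value as the starting point.  */
  mpz_fac_ui (out[0], lo);

  /* Each further value is a single small multiply away.  */
  for (i = 1; i < count; i++)
    mpz_mul_ui (out[i], out[i - 1], lo + i);
@}
@end example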

@item Text Input/Output
Hexadecimal or octal are suggested for input or output in text form.
Power-of-2 bases like these can be converted much more efficiently than other
bases, like decimal. For big numbers there's usually nothing of particular
interest to be seen in the digits, so the base doesn't matter much.
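
A hexadecimal round trip through a text file might look like the following
sketch. The helper names @code{write_hex} and @code{read_hex} are
hypothetical; @code{mpz_out_str} and @code{mpz_inp_str} are the GMP
functions doing the work, and @file{stdio.h} must be included before
@file{gmp.h} for them to be declared.

@example
#include <stdio.h>
#include <gmp.h>

/* Hypothetical helper: write x to fp in hexadecimal, followed by a
   newline.  Returns nonzero on success.  */
int
write_hex (FILE *fp, mpz_t x)
@{
  return mpz_out_str (fp, 16, x) != 0 && putc ('\n', fp) != EOF;
@}

/* Hypothetical helper: read a hexadecimal value from fp into x.
   Returns nonzero on success.  */
int
read_hex (FILE *fp, mpz_t x)
@{
  return mpz_inp_str (x, fp, 16) != 0;
@}
@end example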

Maybe we can hope octal will one day become the normal base for everyday use,
as proposed by King Charles XII of Sweden and later reformers.
@c Reference: Knuth volume 2 section 4.1, page 184 of second edition. :-)

@node Debugging, Profiling, Efficiency, GMP Basics
@section Debugging

@item Stack Overflow
Depending on the system, a segmentation violation or bus error might be the
only indication of stack overflow. See @samp{--enable-alloca} choices in
@ref{Build Options}, for how to address this.

In new enough versions of GCC, @samp{-fstack-check} may be able to ensure an
overflow is recognised by the system before too much damage is done, or
@samp{-fstack-limit-symbol} or @samp{-fstack-limit-register} may be able to
add checking if the system itself doesn't do any (@pxref{Code Gen Options,,
Options for Code Generation, gcc, Using the GNU Compiler Collection (GCC)}).
These options must be added to the @samp{CFLAGS} used in the GMP build
(@pxref{Build Options}); adding them just to an application will have no
effect. Note also they're a slowdown, adding overhead to each function call
and each stack allocation.

@item Heap Problems
The most likely cause of application problems with GMP is heap corruption.
Failing to @code{init} GMP variables will have unpredictable effects, and
corruption arising elsewhere in a program may well affect GMP. Initializing
GMP variables more than once or failing to clear them will cause memory leaks.

In all such cases a malloc debugger is recommended. On a GNU or BSD system
the standard C library @code{malloc} has some diagnostic facilities, see
@ref{Allocation Debugging,,,libc,The GNU C Library Reference Manual}, or
@samp{man 3 malloc}. Other possibilities, in no particular order, include

@display
@uref{http://www.inf.ethz.ch/personal/biere/projects/ccmalloc}
@uref{http://dmalloc.com}
@uref{http://www.perens.com/FreeSoftware} @ (electric fence)
@uref{http://packages.debian.org/fda}
@uref{http://www.gnupdate.org/components/leakbug}
@uref{http://people.redhat.com/~otaylor/memprof}
@uref{http://www.cbmamiga.demon.co.uk/mpatrol}
@end display

The GMP default allocation routines in @file{memory.c} also have a simple
sentinel scheme which can be enabled with @code{#define DEBUG} in that file.
This is mainly designed for detecting buffer overruns during GMP development,
but might find other uses.

@item Stack Backtraces
On some systems the compiler options GMP uses by default can interfere with
debugging. In particular on x86 and 68k systems @samp{-fomit-frame-pointer}
is used and this generally inhibits stack backtracing. Recompiling without
such options may help while debugging, though the usual caveats about it
potentially moving a memory problem or hiding a compiler bug will apply.

@item GDB
A sample @file{.gdbinit} is included in the distribution, showing how to call
some undocumented dump functions to print GMP variables from within GDB. Note
that these functions shouldn't be used in final application code since they're
undocumented and may be subject to incompatible changes in future versions of
GMP.

@item Source File Paths
GMP has multiple source files with the same name, in different directories.
For example @file{mpz}, @file{mpq}, @file{mpf} and @file{mpfr} each have an
@file{init.c}. If the debugger can't already determine the right one it may
help to build with absolute paths on each C file. One way to do that is to
use a separate object directory with an absolute path to the source directory.

@example
cd /my/build/dir
/my/source/dir/gmp-@value{VERSION}/configure
@end example

This works via @code{VPATH}, and might require GNU @command{make}.
Alternatively it might be possible to change the @code{.c.lo} rules
appropriately.

@item Assertion Checking
The build option @option{--enable-assert} is available to add some consistency
checks to the library (see @ref{Build Options}). These are likely to be of
limited value to most applications. Assertion failures are just as likely to
indicate memory corruption as a library or compiler bug.

Applications using the low-level @code{mpn} functions, however, will benefit
from @option{--enable-assert} since it adds checks on the parameters of most
such functions, many of which have subtle restrictions on their usage. Note
however that only the generic C code has checks, not the assembler code, so
CPU @samp{none} should be used for maximum checking.

@item Temporary Memory Checking
The build option @option{--enable-alloca=debug} arranges that each block of
temporary memory in GMP is allocated with a separate call to @code{malloc} (or
the allocation function set with @code{mp_set_memory_functions}).

This can help a malloc debugger detect accesses outside the intended bounds,
or detect memory not released. In a normal build, on the other hand,
temporary memory is allocated in blocks which GMP divides up for its own use,
or may be allocated with a compiler builtin @code{alloca} which will go
nowhere near any malloc debugger hooks.

@item Maximum Debuggability
To summarize the above, a GMP build for maximum debuggability would be

@example
./configure --disable-shared --enable-assert \
  --enable-alloca=debug --host=none CFLAGS=-g
@end example

For C++, add @samp{--enable-cxx CXXFLAGS=-g}.

@item Checker
The checker program (@uref{http://savannah.gnu.org/projects/checker}) can be
used with GMP. It contains a stub library which means GMP applications
compiled with checker can use a normal GMP build.

A build of GMP with checking within GMP itself can be made. This will run
very slowly. Configure with

@example
./configure --host=none-pc-linux-gnu CC=checkergcc
@end example

@samp{--host=none} must be used, since the GMP assembler code doesn't support
the checking scheme. The GMP C++ features cannot be used, since current
versions of checker (0.9.9.1) don't yet support the standard C++ library.

@item Valgrind
The valgrind program (@uref{http://valgrind.kde.org/}) is a memory
checker for x86s. It translates and emulates machine instructions to do
strong checks for uninitialized data (at the level of individual bits), memory
accesses through bad pointers, and memory leaks.

Recent versions of Valgrind are getting support for MMX and SSE/SSE2
instructions; for past versions GMP will need to be configured not to use
those, i.e.@: for an x86 without them (for instance plain @samp{i486}).

@item Other Problems
Any suspected bug in GMP itself should be isolated to make sure it's not an
application problem, see @ref{Reporting Bugs}.

@node Profiling, Autoconf, Debugging, GMP Basics
@section Profiling

Running a program under a profiler is a good way to find where it's spending
most time and where improvements can be best sought.

Depending on the system, it may be possible to get a flat profile, meaning
simple timer sampling of the program counter, with no special GMP build
options, just a @samp{-p} when compiling the mainline. This is a good way to
ensure minimum interference with normal operation. The necessary symbol type
and size information exists in most of the GMP assembler code.

The @samp{--enable-profiling} build option can be used to add suitable
compiler flags, either for @command{prof} (@samp{-p}) or @command{gprof}
(@samp{-pg}), see @ref{Build Options}. Which of the two is available and what
they do will depend on the system, and possibly on support available in
@file{libc}. For some systems appropriate corresponding @code{mcount} calls
are added to the assembler code too.

On x86 systems @command{prof} gives call counting, so that average time spent
in a function can be determined. @command{gprof}, where supported, adds call
graph construction, so for instance calls to @code{mpn_add_n} from
@code{mpz_add} and from @code{mpz_mul} can be differentiated.

On x86 and 68k systems @samp{-pg} and @samp{-fomit-frame-pointer} are
incompatible, so the latter is not used when @command{gprof} profiling is
selected, which may result in poorer code generation. If @command{prof}
profiling is selected instead it should still be possible to use
@command{gprof}, but only the @samp{gprof -p} flat profile and call counts can
be expected to be valid, not the @samp{gprof -q} call graph.

@node Autoconf, Emacs, Profiling, GMP Basics
@section Autoconf
@cindex Autoconf detections

Autoconf based applications can easily check whether GMP is installed. The
only thing to be noted is that GMP library symbols from version 3 onwards have
prefixes like @code{__gmpz}. The following therefore would be a simple test,

@example
AC_CHECK_LIB(gmp, __gmpz_init)
@end example

This just uses the default @code{AC_CHECK_LIB} actions for found or not found,
but an application that must have GMP would want to generate an error if not
found. For example,

@example
AC_CHECK_LIB(gmp, __gmpz_init, , [AC_MSG_ERROR(
  [GNU MP not found, see http://swox.com/gmp])])
@end example

If functions added in some particular version of GMP are required, then one of
those can be used when checking. For example @code{mpz_mul_si} was added in
GMP 3.1,

@example
AC_CHECK_LIB(gmp, __gmpz_mul_si, , [AC_MSG_ERROR(
  [GNU MP not found, or not 3.1 or up, see http://swox.com/gmp])])
@end example

An alternative would be to test the version number in @file{gmp.h} using say
@code{AC_EGREP_CPP}. That would make it possible to test the exact version,
if some particular sub-minor release is known to be necessary.
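
The preprocessor side of such a version test could look like the following
sketch. It relies on the @code{__GNU_MP_VERSION} and
@code{__GNU_MP_VERSION_MINOR} macros from @file{gmp.h}; the 4.1 threshold
and the helper name @code{gmp_header_at_least} are only assumptions for the
illustration.

@example
#include <gmp.h>

/* Stop the compile when gmp.h is older than GMP 4.1.  */
#if __GNU_MP_VERSION < 4 \
    || (__GNU_MP_VERSION == 4 && __GNU_MP_VERSION_MINOR < 1)
#error "GMP 4.1 or later is required"
#endif

/* Hypothetical helper: nonzero if gmp.h is version major.minor or
   later.  */
int
gmp_header_at_least (int major, int minor)
@{
  return __GNU_MP_VERSION > major
    || (__GNU_MP_VERSION == major && __GNU_MP_VERSION_MINOR >= minor);
@}
@end example

Note this checks the header seen at compile time, not the library actually
linked or loaded at run time.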

An application that can use either GMP 2 or 3 will need to test for
@code{__gmpz_init} (GMP 3 and up) or @code{mpz_init} (GMP 2), and it's also
worth checking for @file{libgmp2} since Debian GNU/Linux systems used that
name in the past. For example,

@example
AC_CHECK_LIB(gmp, __gmpz_init, ,
  [AC_CHECK_LIB(gmp, mpz_init, ,
    [AC_CHECK_LIB(gmp2, mpz_init)])])
@end example

In general it's suggested that applications should simply demand a new enough
GMP rather than trying to provide supplements for features not available in
past versions.

Occasionally an application will need or want to know the size of a type at
configuration or preprocessing time, not just with @code{sizeof} in the code.
This can be done in the normal way with @code{mp_limb_t} etc, but GMP 4.0 or
up is best for this, since prior versions needed certain @samp{-D} defines on
systems using a @code{long long} limb. The following would suit Autoconf 2.50
or up,

@example
AC_CHECK_SIZEOF(mp_limb_t, , [#include <gmp.h>])
@end example

The optional @code{mpfr} functions are provided in a separate
@file{libmpfr.a}, and this might be from GMP with @option{--enable-mpfr} or
from MPFR installed separately. Either way @file{libmpfr} depends on
@file{libgmp}; it doesn't stand alone. Currently only a static
@file{libmpfr.a} will be available, not a shared library, since upward binary
compatibility is not guaranteed.

@example
AC_CHECK_LIB(mpfr, mpfr_add, , [AC_MSG_ERROR(
  [Need MPFR either from GNU MP 4 or separate MPFR package.
  See http://www.mpfr.org or http://swox.com/gmp])])
@end example

@node Emacs, , Autoconf, GMP Basics
@section Emacs

@key{C-h C-i} (@code{info-lookup-symbol}) is a good way to find documentation
on C functions while editing (@pxref{Info Lookup, , Info Documentation Lookup,
emacs, The Emacs Editor}).

The GMP manual can be included in such lookups by putting the following in
your @file{.emacs},

@c This isn't pretty, but there doesn't seem to be a better way (in emacs
@c 21.2 at least). info-lookup->mode-value could be used for the "assoc"s,
@c but that function isn't documented, whereas info-lookup-alist is.
@example
(eval-after-load "info-look"
  '(let ((mode-value (assoc 'c-mode (assoc 'symbol info-lookup-alist))))
     (setcar (nthcdr 3 mode-value)
             (cons '("(gmp)Function Index" nil "^ -.* " "\\>")
                   (nth 3 mode-value)))))
@end example

The same can be done for MPFR, with @code{(mpfr)} in place of @code{(gmp)}.


@node Reporting Bugs, Integer Functions, GMP Basics, Top
@comment node-name, next, previous, up
@chapter Reporting Bugs
@cindex Reporting bugs
@cindex Bug reporting

If you think you have found a bug in the GMP library, please investigate it
and report it. We have made this library available to you, and it is not too
much to ask you to report the bugs you find.

Before you report a bug, check it's not already addressed in @ref{Known Build
Problems}, or perhaps @ref{Notes for Particular Systems}. You may also want
to check @uref{http://swox.com/gmp/} for patches for this release.

Please include the following in any report,

@itemize @bullet
@item
The GMP version number, and if pre-packaged or patched then say so.

@item
A test program that makes it possible for us to reproduce the bug. Include
instructions on how to run the program.

@item
A description of what is wrong. If the results are incorrect, in what way.
If you get a crash, say so.

@item
If you get a crash, include a stack backtrace from the debugger if it's
informative (@samp{where} in @command{gdb}, or @samp{$C} in @command{adb}).

@item
Please do not send core dumps, executables or @command{strace}s.

@item
The configuration options you used when building GMP, if any.

@item
The name of the compiler and its version. For @command{gcc}, get the version
with @samp{gcc -v}, otherwise perhaps @samp{what `which cc`}, or similar.

@item
The output from running @samp{uname -a}.

@item
The output from running @samp{./config.guess}, and from running
@samp{./configfsf.guess} (might be the same).

@item
If the bug is related to @samp{configure}, then the contents of
@file{config.log}.

@item
If the bug is related to an @file{asm} file not assembling, then the contents
of @file{config.m4} and the offending line or lines from the temporary
@file{mpn/tmp-<file>.s}.
@end itemize

Please make an effort to produce a self-contained report, with something
definite that can be tested or debugged. Vague queries or piecemeal messages
are difficult to act on and don't help the development effort.

It is not uncommon that an observed problem is actually due to a bug in the
compiler; the GMP code tends to explore interesting corners in compilers.

If your bug report is good, we will do our best to help you get a corrected
version of the library; if the bug report is poor, we won't do anything about
it (except maybe ask you to send a better report).

Send your report to: @email{bug-gmp@@gnu.org}.

If you think something in this manual is unclear, or downright incorrect, or if
the language needs to be improved, please send a note to the same address.


@node Integer Functions, Rational Number Functions, Reporting Bugs, Top
@comment node-name, next, previous, up
@chapter Integer Functions
@cindex Integer functions

This chapter describes the GMP functions for performing integer arithmetic.
These functions start with the prefix @code{mpz_}.

GMP integers are stored in objects of type @code{mpz_t}.

@menu
* Initializing Integers::
* Assigning Integers::
* Simultaneous Integer Init & Assign::
* Converting Integers::
* Integer Arithmetic::
* Integer Division::
* Integer Exponentiation::
* Integer Roots::
* Number Theoretic Functions::
* Integer Comparisons::
* Integer Logic and Bit Fiddling::
* I/O of Integers::
* Integer Random Numbers::
* Integer Import and Export::
* Miscellaneous Integer Functions::
@end menu

@node Initializing Integers, Assigning Integers, Integer Functions, Integer Functions
@comment node-name, next, previous, up
@section Initialization Functions
@cindex Integer initialization functions
@cindex Initialization functions

The functions for integer arithmetic assume that all integer objects are
initialized. You do that by calling the function @code{mpz_init}. For
example,

@example
@{
  mpz_t integ;
  mpz_init (integ);
  @dots{}
  mpz_add (integ, @dots{});
  @dots{}
  mpz_sub (integ, @dots{});

  /* Unless the program is about to exit, do ... */
  mpz_clear (integ);
@}
@end example

As you can see, you can store new values any number of times, once an
object is initialized.

@deftypefun void mpz_init (mpz_t @var{integer})
Initialize @var{integer}, and set its value to 0.
@end deftypefun

@deftypefun void mpz_init2 (mpz_t @var{integer}, unsigned long @var{n})
Initialize @var{integer}, with space for @var{n} bits, and set its value to 0.

@var{n} is only the initial space, @var{integer} will grow automatically in
the normal way, if necessary, for subsequent values stored. @code{mpz_init2}
makes it possible to avoid such reallocations if a maximum size is known in
advance.
@end deftypefun

@deftypefun void mpz_clear (mpz_t @var{integer})
Free the space occupied by @var{integer}. Call this function for all
@code{mpz_t} variables when you are done with them.
@end deftypefun

@deftypefun void mpz_realloc2 (mpz_t @var{integer}, unsigned long @var{n})
Change the space allocated for @var{integer} to @var{n} bits. The value in
@var{integer} is preserved if it fits, or is set to 0 if not.

This function can be used to increase the space for a variable in order to
avoid repeated automatic reallocations, or to decrease it to give memory back
to the heap.
@end deftypefun
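
As a sketch of the shrinking case, a variable that was once much larger can
be trimmed to just fit its current value. The helper name
@code{trim_allocation} is hypothetical.

@example
#include <gmp.h>

/* Hypothetical helper: shrink x's allocation to fit its current value.
   mpz_sizeinbase (x, 2) is the number of bits in x (1 if x is zero),
   so the value always fits and is preserved.  */
void
trim_allocation (mpz_t x)
@{
  mpz_realloc2 (x, mpz_sizeinbase (x, 2));
@}
@end example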

@deftypefun void mpz_array_init (mpz_t @var{integer_array}[], size_t @var{array_size}, @w{mp_size_t @var{fixed_num_bits}})
This is a special type of initialization. @strong{Fixed} space of
@var{fixed_num_bits} bits is allocated to each of the @var{array_size}
integers in @var{integer_array}.

The space will not be automatically increased, unlike the normal
@code{mpz_init}, but instead an application must ensure it's sufficient for
any value stored. The following space requirements apply to various
functions,

@itemize @bullet
@item
@code{mpz_abs}, @code{mpz_neg}, @code{mpz_set}, @code{mpz_set_si} and
@code{mpz_set_ui} need room for the value they store.

@item
@code{mpz_add}, @code{mpz_add_ui}, @code{mpz_sub} and @code{mpz_sub_ui} need
room for the larger of the two operands, plus an extra
@code{mp_bits_per_limb}.

@item
@code{mpz_mul}, @code{mpz_mul_ui} and @code{mpz_mul_si} need room for the sum
of the number of bits in their operands, but each rounded up to a multiple of
@code{mp_bits_per_limb}.

@item
@code{mpz_swap} can be used between two array variables, but not between an
array and a normal variable.
@end itemize

For other functions, or if in doubt, the suggestion is to calculate in a
regular @code{mpz_init} variable and copy the result to an array variable with
@code{mpz_set}.

@code{mpz_array_init} can reduce memory usage in algorithms that need large
arrays of integers, since it avoids allocating and reallocating lots of small
memory blocks. There is no way to free the storage allocated by this
function. Don't call @code{mpz_clear}!
@end deftypefun

@deftypefun {void *} _mpz_realloc (mpz_t @var{integer}, mp_size_t @var{new_alloc})
Change the space for @var{integer} to @var{new_alloc} limbs. The value in
@var{integer} is preserved if it fits, or is set to 0 if not. The return
value is not useful to applications and should be ignored.

@code{mpz_realloc2} is the preferred way to accomplish allocation changes like
this. @code{mpz_realloc2} and @code{_mpz_realloc} are the same except that
@code{_mpz_realloc} takes the new size in limbs.
@end deftypefun


@node Assigning Integers, Simultaneous Integer Init & Assign, Initializing Integers, Integer Functions
@comment node-name, next, previous, up
@section Assignment Functions
@cindex Integer assignment functions
@cindex Assignment functions

These functions assign new values to already initialized integers
(@pxref{Initializing Integers}).

@deftypefun void mpz_set (mpz_t @var{rop}, mpz_t @var{op})
@deftypefunx void mpz_set_ui (mpz_t @var{rop}, unsigned long int @var{op})
@deftypefunx void mpz_set_si (mpz_t @var{rop}, signed long int @var{op})
@deftypefunx void mpz_set_d (mpz_t @var{rop}, double @var{op})
@deftypefunx void mpz_set_q (mpz_t @var{rop}, mpq_t @var{op})
@deftypefunx void mpz_set_f (mpz_t @var{rop}, mpf_t @var{op})
Set the value of @var{rop} from @var{op}.

@code{mpz_set_d}, @code{mpz_set_q} and @code{mpz_set_f} truncate @var{op} to
make it an integer.
@end deftypefun

@deftypefun int mpz_set_str (mpz_t @var{rop}, char *@var{str}, int @var{base})
Set the value of @var{rop} from @var{str}, a null-terminated C string in base
@var{base}. White space is allowed in the string, and is simply ignored. The
base may vary from 2 to 36. If @var{base} is 0, the actual base is determined
from the leading characters: if the first two characters are ``0x'' or ``0X'',
hexadecimal is assumed, otherwise if the first character is ``0'', octal is
assumed, otherwise decimal is assumed.

This function returns 0 if the entire string is a valid number in base
@var{base}. Otherwise it returns @minus{}1.

@c It turns out that it is not entirely true that this function ignores
@c white-space. It does ignore it between digits, but not after a minus sign
@c or within or after ``0x''. Some thought was given to disallowing all
@c whitespace, but that would be an incompatible change, whitespace has been
@c documented as ignored ever since GMP 1.
@end deftypefun
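
A small sketch of base detection and the return value is below. The helper
name @code{parse_integer} is hypothetical; it simply maps the 0 or
@minus{}1 result of @code{mpz_set_str} to a true or false answer.

@example
#include <gmp.h>

/* Hypothetical helper: parse str into rop, with the base taken from
   the leading characters (base 0).  rop must already be initialized.
   Returns 1 if the whole string was a valid number, 0 otherwise.  */
int
parse_integer (mpz_t rop, const char *str)
@{
  return mpz_set_str (rop, str, 0) == 0;
@}
@end example

With this, @code{"0x1f"} is read as hexadecimal giving 31, @code{"017"} as
octal giving 15, and @code{"42"} as decimal, while a string such as
@code{"12z"} is rejected.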
2773
@deftypefun void mpz_swap (mpz_t @var{rop1}, mpz_t @var{rop2})
2774
Swap the values @var{rop1} and @var{rop2} efficiently.
2778
@node Simultaneous Integer Init & Assign, Converting Integers, Assigning Integers, Integer Functions
2779
@comment node-name, next, previous, up
2780
@section Combined Initialization and Assignment Functions
2781
@cindex Initialization and assignment functions
2782
@cindex Integer init and assign
2784
For convenience, GMP provides a parallel series of initialize-and-set functions
2785
which initialize the output and then store the value there. These functions'
2786
names have the form @code{mpz_init_set@dots{}}.
2788
Here is an example of using one:
@example
mpz_t pie;
mpz_init_set_str (pie, "3141592653589793238462643383279502884", 10);
mpz_sub (pie, @dots{});
mpz_clear (pie);
@end example
Once the integer has been initialized by any of the @code{mpz_init_set@dots{}}
2803
functions, it can be used as the source or destination operand for the ordinary
2804
integer functions. Don't use an initialize-and-set function on a variable
2805
already initialized!
2807
@deftypefun void mpz_init_set (mpz_t @var{rop}, mpz_t @var{op})
2808
@deftypefunx void mpz_init_set_ui (mpz_t @var{rop}, unsigned long int @var{op})
2809
@deftypefunx void mpz_init_set_si (mpz_t @var{rop}, signed long int @var{op})
2810
@deftypefunx void mpz_init_set_d (mpz_t @var{rop}, double @var{op})
2811
Initialize @var{rop} with limb space and set the initial numeric value from
@var{op}.
2815
@deftypefun int mpz_init_set_str (mpz_t @var{rop}, char *@var{str}, int @var{base})
2816
Initialize @var{rop} and set its value like @code{mpz_set_str} (see its
2817
documentation above for details).
2819
If the string is a correct base @var{base} number, the function returns 0;
2820
if an error occurs it returns @minus{}1. @var{rop} is initialized even if
2821
an error occurs. (I.e., you have to call @code{mpz_clear} for it.)
2825
@node Converting Integers, Integer Arithmetic, Simultaneous Integer Init & Assign, Integer Functions
2826
@comment node-name, next, previous, up
2827
@section Conversion Functions
2828
@cindex Integer conversion functions
2829
@cindex Conversion functions
2831
This section describes functions for converting GMP integers to standard C
2832
types. Functions for converting @emph{to} GMP integers are described in
2833
@ref{Assigning Integers} and @ref{I/O of Integers}.
2835
@deftypefun {unsigned long int} mpz_get_ui (mpz_t @var{op})
2836
Return the value of @var{op} as an @code{unsigned long}.
2838
If @var{op} is too big to fit an @code{unsigned long} then just the least
2839
significant bits that do fit are returned. The sign of @var{op} is ignored,
2840
only the absolute value is used.
2843
@deftypefun {signed long int} mpz_get_si (mpz_t @var{op})
2844
If @var{op} fits into a @code{signed long int} return the value of @var{op}.
2845
Otherwise return the least significant part of @var{op}, with the same sign
as @var{op}.
2848
If @var{op} is too big to fit in a @code{signed long int}, the returned
2849
result is probably not very useful. To find out if the value will fit, use
2850
the function @code{mpz_fits_slong_p}.
2853
@deftypefun double mpz_get_d (mpz_t @var{op})
2854
Convert @var{op} to a @code{double}.
2857
@deftypefun double mpz_get_d_2exp (signed long int *@var{exp}, mpz_t @var{op})
2858
Find @var{d} and @var{exp} such that @m{@var{d}\times 2^{exp}, @var{d} times 2
2859
raised to @var{exp}}, with @math{0.5@le{}@GMPabs{@var{d}}<1}, is a good
2860
approximation to @var{op}.
2863
@deftypefun {char *} mpz_get_str (char *@var{str}, int @var{base}, mpz_t @var{op})
2864
Convert @var{op} to a string of digits in base @var{base}. The base may vary
from 2 to 36.
2867
If @var{str} is @code{NULL}, the result string is allocated using the current
2868
allocation function (@pxref{Custom Allocation}). The block will be
2869
@code{strlen(str)+1} bytes, that being exactly enough for the string and
null-terminator.
2872
If @var{str} is not @code{NULL}, it should point to a block of storage large
2873
enough for the result, that being @code{mpz_sizeinbase (@var{op}, @var{base})
2874
+ 2}. The two extra bytes are for a possible minus sign, and the
null-terminator.
2877
A pointer to the result string is returned, being either the allocated block,
2878
or the given @var{str}.
2881
@deftypefun mp_limb_t mpz_getlimbn (mpz_t @var{op}, mp_size_t @var{n})
2882
Return limb number @var{n} from @var{op}. The sign of @var{op} is ignored,
2883
just the absolute value is used. The least significant limb is number 0.
2885
@code{mpz_size} can be used to find how many limbs make up @var{op}.
2886
@code{mpz_getlimbn} returns zero if @var{n} is outside the range 0 to
2887
@code{mpz_size(@var{op})-1}.
2892
@node Integer Arithmetic, Integer Division, Converting Integers, Integer Functions
2893
@comment node-name, next, previous, up
2894
@section Arithmetic Functions
2895
@cindex Integer arithmetic functions
2896
@cindex Arithmetic functions
2898
@deftypefun void mpz_add (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
2899
@deftypefunx void mpz_add_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
2900
Set @var{rop} to @math{@var{op1} + @var{op2}}.
2903
@deftypefun void mpz_sub (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
2904
@deftypefunx void mpz_sub_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
2905
@deftypefunx void mpz_ui_sub (mpz_t @var{rop}, unsigned long int @var{op1}, mpz_t @var{op2})
2906
Set @var{rop} to @var{op1} @minus{} @var{op2}.
2909
@deftypefun void mpz_mul (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
2910
@deftypefunx void mpz_mul_si (mpz_t @var{rop}, mpz_t @var{op1}, long int @var{op2})
2911
@deftypefunx void mpz_mul_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
2912
Set @var{rop} to @math{@var{op1} @GMPtimes{} @var{op2}}.
2915
@deftypefun void mpz_addmul (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
2916
@deftypefunx void mpz_addmul_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
2917
Set @var{rop} to @math{@var{rop} + @var{op1} @GMPtimes{} @var{op2}}.
2920
@deftypefun void mpz_submul (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
2921
@deftypefunx void mpz_submul_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
2922
Set @var{rop} to @math{@var{rop} - @var{op1} @GMPtimes{} @var{op2}}.
2925
@deftypefun void mpz_mul_2exp (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
2926
@cindex Bit shift left
2927
Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
2928
@var{op2}}. This operation can also be defined as a left shift by @var{op2}
bits.
2932
@deftypefun void mpz_neg (mpz_t @var{rop}, mpz_t @var{op})
2933
Set @var{rop} to @minus{}@var{op}.
2936
@deftypefun void mpz_abs (mpz_t @var{rop}, mpz_t @var{op})
2937
Set @var{rop} to the absolute value of @var{op}.
2942
@node Integer Division, Integer Exponentiation, Integer Arithmetic, Integer Functions
2943
@section Division Functions
2944
@cindex Integer division functions
2945
@cindex Division functions
2947
Division is undefined if the divisor is zero. Passing a zero divisor to the
2948
division or modulo functions (including the modular powering functions
2949
@code{mpz_powm} and @code{mpz_powm_ui}) will cause an intentional division by
2950
zero. This lets a program handle arithmetic exceptions in these functions the
2951
same way as for normal C @code{int} arithmetic.
2953
@c Separate deftypefun groups for cdiv, fdiv and tdiv produce a blank line
2954
@c between each, and seem to let tex do a better job of page breaks than an
2955
@c @sp 1 in the middle of one big set.
2957
@deftypefun void mpz_cdiv_q (mpz_t @var{q}, mpz_t @var{n}, mpz_t @var{d})
2958
@deftypefunx void mpz_cdiv_r (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
2959
@deftypefunx void mpz_cdiv_qr (mpz_t @var{q}, mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
2961
@deftypefunx {unsigned long int} mpz_cdiv_q_ui (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{d}})
2962
@deftypefunx {unsigned long int} mpz_cdiv_r_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}})
2963
@deftypefunx {unsigned long int} mpz_cdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{mpz_t @var{n}}, @w{unsigned long int @var{d}})
2964
@deftypefunx {unsigned long int} mpz_cdiv_ui (mpz_t @var{n}, @w{unsigned long int @var{d}})
2966
@deftypefunx void mpz_cdiv_q_2exp (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{b}})
2967
@deftypefunx void mpz_cdiv_r_2exp (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{b}})
2970
@deftypefun void mpz_fdiv_q (mpz_t @var{q}, mpz_t @var{n}, mpz_t @var{d})
2971
@deftypefunx void mpz_fdiv_r (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
2972
@deftypefunx void mpz_fdiv_qr (mpz_t @var{q}, mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
2974
@deftypefunx {unsigned long int} mpz_fdiv_q_ui (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{d}})
2975
@deftypefunx {unsigned long int} mpz_fdiv_r_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}})
2976
@deftypefunx {unsigned long int} mpz_fdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{mpz_t @var{n}}, @w{unsigned long int @var{d}})
2977
@deftypefunx {unsigned long int} mpz_fdiv_ui (mpz_t @var{n}, @w{unsigned long int @var{d}})
2979
@deftypefunx void mpz_fdiv_q_2exp (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{b}})
2980
@deftypefunx void mpz_fdiv_r_2exp (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{b}})
2983
@deftypefun void mpz_tdiv_q (mpz_t @var{q}, mpz_t @var{n}, mpz_t @var{d})
2984
@deftypefunx void mpz_tdiv_r (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
2985
@deftypefunx void mpz_tdiv_qr (mpz_t @var{q}, mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
2987
@deftypefunx {unsigned long int} mpz_tdiv_q_ui (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{d}})
2988
@deftypefunx {unsigned long int} mpz_tdiv_r_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}})
2989
@deftypefunx {unsigned long int} mpz_tdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{mpz_t @var{n}}, @w{unsigned long int @var{d}})
2990
@deftypefunx {unsigned long int} mpz_tdiv_ui (mpz_t @var{n}, @w{unsigned long int @var{d}})
2992
@deftypefunx void mpz_tdiv_q_2exp (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{b}})
2993
@deftypefunx void mpz_tdiv_r_2exp (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{b}})
2994
@cindex Bit shift right
2997
Divide @var{n} by @var{d}, forming a quotient @var{q} and/or remainder
2998
@var{r}. For the @code{2exp} functions, @m{@var{d}=2^b, @var{d}=2^@var{b}}.
2999
The rounding is in three styles, each suiting different applications.
3003
@code{cdiv} rounds @var{q} up towards @m{+\infty, +infinity}, and @var{r} will
3004
have the opposite sign to @var{d}. The @code{c} stands for ``ceil''.
3007
@code{fdiv} rounds @var{q} down towards @m{-\infty, @minus{}infinity}, and
3008
@var{r} will have the same sign as @var{d}. The @code{f} stands for
``floor''.
3012
@code{tdiv} rounds @var{q} towards zero, and @var{r} will have the same sign
3013
as @var{n}. The @code{t} stands for ``truncate''.
3016
In all cases @var{q} and @var{r} will satisfy
3017
@m{@var{n}=@var{q}@var{d}+@var{r}, @var{n}=@var{q}*@var{d}+@var{r}}, and
3018
@var{r} will satisfy @math{0@le{}@GMPabs{@var{r}}<@GMPabs{@var{d}}}.
3020
The @code{q} functions calculate only the quotient, the @code{r} functions
3021
only the remainder, and the @code{qr} functions calculate both. Note that for
3022
@code{qr} the same variable cannot be passed for both @var{q} and @var{r}, or
3023
results will be unpredictable.
3025
For the @code{ui} variants the return value is the remainder, and in fact
3026
returning the remainder is all the @code{div_ui} functions do. For
3027
@code{tdiv} and @code{cdiv} the remainder can be negative, so for those the
3028
return value is the absolute value of the remainder.
3030
The @code{2exp} functions are right shifts and bit masks, but they round in
the same way as the other functions. For positive @var{n} both
3032
@code{mpz_fdiv_q_2exp} and @code{mpz_tdiv_q_2exp} are simple bitwise right
3033
shifts. For negative @var{n}, @code{mpz_fdiv_q_2exp} is effectively an
3034
arithmetic right shift treating @var{n} as twos complement the same as the
3035
bitwise logical functions do, whereas @code{mpz_tdiv_q_2exp} effectively
3036
treats @var{n} as sign and magnitude.
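The three rounding styles can be compared with a small plain-C model on @code{long} operands. It is a sketch rather than GMP code: the type @code{qr_t} and the function names are hypothetical, and @var{d} is assumed to be non-zero.

```c
/* Quotient and remainder pair; hypothetical type for this sketch only. */
typedef struct { long q, r; } qr_t;

/* Truncate: q rounds towards zero, r takes the sign of n. */
static qr_t tdiv_qr(long n, long d)
{
    qr_t x = { n / d, n % d };   /* C99 integer division truncates */
    return x;
}

/* Floor: q rounds towards -infinity, r takes the sign of d. */
static qr_t fdiv_qr(long n, long d)
{
    qr_t x = tdiv_qr(n, d);
    if (x.r != 0 && ((x.r < 0) != (d < 0))) {
        x.q -= 1;
        x.r += d;
    }
    return x;
}

/* Ceil: q rounds towards +infinity, r takes the opposite sign to d. */
static qr_t cdiv_qr(long n, long d)
{
    qr_t x = tdiv_qr(n, d);
    if (x.r != 0 && ((x.r < 0) == (d < 0))) {
        x.q += 1;
        x.r -= d;
    }
    return x;
}
```

With @var{n} = @minus{}7 and @var{d} = 2 the model gives quotient and remainder (@minus{}3, @minus{}1) for truncation, (@minus{}4, 1) for floor and (@minus{}3, @minus{}1) for ceiling, and each pair satisfies @var{n} = @var{q}*@var{d} + @var{r}.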
3039
@deftypefun void mpz_mod (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
3040
@deftypefunx {unsigned long int} mpz_mod_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}})
3041
Set @var{r} to @var{n} @code{mod} @var{d}. The sign of the divisor is
3042
ignored; the result is always non-negative.
3044
@code{mpz_mod_ui} is identical to @code{mpz_fdiv_r_ui} above, returning the
3045
remainder as well as setting @var{r}. See @code{mpz_fdiv_ui} above if only
3046
the return value is wanted.
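The sign convention of @code{mpz_mod} can be modelled on @code{long} values as follows. The name @code{mod_nonneg} is hypothetical, and the sketch assumes @var{d} is non-zero and that its absolute value is representable.

```c
/* Model of "n mod d" with the divisor's sign ignored and a
   result in the range 0 <= r < |d|. */
static long mod_nonneg(long n, long d)
{
    long m = d < 0 ? -d : d;     /* the sign of the divisor is ignored */
    long r = n % m;              /* truncating remainder, sign of n */
    return r < 0 ? r + m : r;    /* shift a negative remainder up */
}
```

For example, both @code{mod_nonneg(-7, 3)} and @code{mod_nonneg(-7, -3)} give 2.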
3049
@deftypefun void mpz_divexact (mpz_t @var{q}, mpz_t @var{n}, mpz_t @var{d})
3050
@deftypefunx void mpz_divexact_ui (mpz_t @var{q}, mpz_t @var{n}, unsigned long @var{d})
3051
@cindex Exact division functions
3052
Set @var{q} to @var{n}/@var{d}. These functions produce correct results only
3053
when it is known in advance that @var{d} divides @var{n}.
3055
These routines are much faster than the other division functions, and are the
3056
best choice when exact division is known to occur, for example reducing a
3057
rational to lowest terms.
3060
@deftypefun int mpz_divisible_p (mpz_t @var{n}, mpz_t @var{d})
3061
@deftypefunx int mpz_divisible_ui_p (mpz_t @var{n}, unsigned long int @var{d})
3062
@deftypefunx int mpz_divisible_2exp_p (mpz_t @var{n}, unsigned long int @var{b})
3063
Return non-zero if @var{n} is exactly divisible by @var{d}, or in the case of
3064
@code{mpz_divisible_2exp_p} by @m{2^b,2^@var{b}}.
3067
@deftypefun int mpz_congruent_p (mpz_t @var{n}, mpz_t @var{c}, mpz_t @var{d})
3068
@deftypefunx int mpz_congruent_ui_p (mpz_t @var{n}, unsigned long int @var{c}, unsigned long int @var{d})
3069
@deftypefunx int mpz_congruent_2exp_p (mpz_t @var{n}, mpz_t @var{c}, unsigned long int @var{b})
3070
Return non-zero if @var{n} is congruent to @var{c} modulo @var{d}, or in the
3071
case of @code{mpz_congruent_2exp_p} modulo @m{2^b,2^@var{b}}.
3076
@node Integer Exponentiation, Integer Roots, Integer Division, Integer Functions
3077
@section Exponentiation Functions
3078
@cindex Integer exponentiation functions
3079
@cindex Exponentiation functions
3080
@cindex Powering functions
3082
@deftypefun void mpz_powm (mpz_t @var{rop}, mpz_t @var{base}, mpz_t @var{exp}, mpz_t @var{mod})
3083
@deftypefunx void mpz_powm_ui (mpz_t @var{rop}, mpz_t @var{base}, unsigned long int @var{exp}, mpz_t @var{mod})
3084
Set @var{rop} to @m{base^{exp} \bmod mod, (@var{base} raised to @var{exp})
modulo @var{mod}}.
3087
Negative @var{exp} is supported if an inverse @math{@var{base}^@W{-1} @bmod
3088
@var{mod}} exists (see @code{mpz_invert} in @ref{Number Theoretic Functions}).
3089
If an inverse doesn't exist then a divide by zero is raised.
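For non-negative exponents, modular powering can be illustrated with binary (square-and-multiply) exponentiation on machine words. This section does not say which algorithm GMP uses, so the technique here is an assumption chosen for simplicity; the name @code{powm_model} is hypothetical and the modulus must be below 2^32 so that products fit in 64 bits.

```c
#include <stdint.h>

/* base^exp mod m by square-and-multiply.
   Model only: requires 0 < m < 2^32 and does not handle negative exp. */
static uint64_t powm_model(uint64_t base, uint64_t exp, uint64_t m)
{
    uint64_t result = 1 % m;          /* gives 0 when m == 1 */
    base %= m;
    while (exp > 0) {
        if (exp & 1)                  /* this exponent bit is set */
            result = result * base % m;
        base = base * base % m;       /* square for the next bit */
        exp >>= 1;
    }
    return result;
}
```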
3092
@deftypefun void mpz_pow_ui (mpz_t @var{rop}, mpz_t @var{base}, unsigned long int @var{exp})
3093
@deftypefunx void mpz_ui_pow_ui (mpz_t @var{rop}, unsigned long int @var{base}, unsigned long int @var{exp})
3094
Set @var{rop} to @m{base^{exp}, @var{base} raised to @var{exp}}. The case
3095
@math{0^0} yields 1.
3100
@node Integer Roots, Number Theoretic Functions, Integer Exponentiation, Integer Functions
3101
@section Root Extraction Functions
3102
@cindex Integer root functions
3103
@cindex Root extraction functions
3105
@deftypefun int mpz_root (mpz_t @var{rop}, mpz_t @var{op}, unsigned long int @var{n})
3106
Set @var{rop} to @m{\lfloor\root n \of {op}\rfloor@C{},} the truncated integer
3107
part of the @var{n}th root of @var{op}. Return non-zero if the computation
3108
was exact, i.e., if @var{op} is @var{rop} to the @var{n}th power.
3111
@deftypefun void mpz_sqrt (mpz_t @var{rop}, mpz_t @var{op})
3112
Set @var{rop} to @m{\lfloor\sqrt{@var{op}}\rfloor@C{},} the truncated
3113
integer part of the square root of @var{op}.
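The truncated square root, together with the remainder that @code{mpz_sqrtrem} reports, can be modelled on 64-bit words with a bit-by-bit search. This is a sketch under that assumption, not GMP's method, and @code{sqrtrem_model} is a hypothetical name.

```c
#include <stdint.h>

/* Returns floor(sqrt(op)) and stores op - root*root in *rem. */
static uint64_t sqrtrem_model(uint64_t op, uint64_t *rem)
{
    uint64_t root = 0;
    /* The root of a 64-bit value fits in 32 bits, so try bits 31..0. */
    for (uint64_t bit = (uint64_t)1 << 31; bit != 0; bit >>= 1) {
        uint64_t t = root | bit;
        if (t * t <= op)              /* keep the bit if the square still fits */
            root = t;
    }
    *rem = op - root * root;          /* zero exactly for perfect squares */
    return root;
}
```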
3116
@deftypefun void mpz_sqrtrem (mpz_t @var{rop1}, mpz_t @var{rop2}, mpz_t @var{op})
3117
Set @var{rop1} to @m{\lfloor\sqrt{@var{op}}\rfloor, the truncated integer part
3118
of the square root of @var{op}}, like @code{mpz_sqrt}. Set @var{rop2} to the
3119
remainder @m{(@var{op} - @var{rop1}^2),
3120
@var{op}@minus{}@var{rop1}*@var{rop1}}, which will be zero if @var{op} is a
perfect square.
3123
If @var{rop1} and @var{rop2} are the same variable, the results are
undefined.
3127
@deftypefun int mpz_perfect_power_p (mpz_t @var{op})
3128
Return non-zero if @var{op} is a perfect power, i.e., if there exist integers
3129
@m{a,@var{a}} and @m{b,@var{b}}, with @m{b>1, @var{b}>1}, such that
3130
@m{@var{op}=a^b, @var{op} equals @var{a} raised to the power @var{b}}.
3132
Under this definition both 0 and 1 are considered to be perfect powers.
3133
Negative values of @var{op} are accepted, but of course can only be odd
perfect powers.
3137
@deftypefun int mpz_perfect_square_p (mpz_t @var{op})
3138
Return non-zero if @var{op} is a perfect square, i.e., if the square root of
3139
@var{op} is an integer. Under this definition both 0 and 1 are considered to
be perfect squares.
3145
@node Number Theoretic Functions, Integer Comparisons, Integer Roots, Integer Functions
3146
@section Number Theoretic Functions
3147
@cindex Number theoretic functions
3149
@deftypefun int mpz_probab_prime_p (mpz_t @var{n}, int @var{reps})
3150
@cindex Prime testing functions
3151
Determine whether @var{n} is prime. Return 2 if @var{n} is definitely prime,
3152
return 1 if @var{n} is probably prime (without being certain), or return 0 if
3153
@var{n} is definitely composite.
3155
This function does some trial divisions, then some Miller-Rabin probabilistic
3156
primality tests. @var{reps} controls how many such tests are done; 5 to 10 is
a reasonable number, and more will reduce the chances of a composite being
3158
returned as ``probably prime''.
3160
Miller-Rabin and similar tests can be more properly called compositeness
3161
tests. Numbers which fail are known to be composite but those which pass
3162
might be prime or might be composite. Only a few composites pass, hence those
3163
which pass are considered probably prime.
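The combination of trial division and Miller-Rabin rounds, and the 2, 1, 0 return convention, can be sketched for 32-bit inputs. This is a model only: the names are hypothetical, the bases are fixed small primes rather than whatever GMP chooses, and the sizes are limited so that products fit in 64 bits.

```c
#include <stdint.h>

/* b^e mod m; products fit in 64 bits because m < 2^32. */
static uint64_t powmod(uint64_t b, uint64_t e, uint64_t m)
{
    uint64_t r = 1;
    b %= m;
    while (e > 0) {
        if (e & 1)
            r = r * b % m;
        b = b * b % m;
        e >>= 1;
    }
    return r;
}

/* One Miller-Rabin round for odd n > 2 and base 1 < a < n.
   Returns 0 if a proves n composite, 1 if n passes. */
static int mr_round(uint32_t n, uint32_t a)
{
    uint32_t d = n - 1;
    int s = 0;
    while ((d & 1) == 0) {            /* write n - 1 = d * 2^s, d odd */
        d >>= 1;
        s++;
    }
    uint64_t x = powmod(a, d, n);
    if (x == 1 || x == n - 1)
        return 1;
    for (int i = 1; i < s; i++) {
        x = x * x % n;
        if (x == n - 1)
            return 1;
    }
    return 0;
}

/* 2 = definitely prime, 1 = probably prime, 0 = definitely composite. */
static int probab_prime_model(uint32_t n, int reps)
{
    static const uint32_t p[] = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37};
    const int np = (int)(sizeof p / sizeof p[0]);
    if (n < 2)
        return 0;
    for (int i = 0; i < np; i++) {    /* trial division by small primes */
        if (n == p[i])
            return 2;
        if (n % p[i] == 0)
            return 0;
    }
    if (n < 37u * 37u)                /* no factor up to 37: certainly prime */
        return 2;
    for (int i = 0; i < reps && i < np; i++)
        if (!mr_round(n, p[i]))
            return 0;
    return 1;
}
```

A number that fails any round is known to be composite, which is why 0 is a definite answer while 1 is not.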
3166
@deftypefun void mpz_nextprime (mpz_t @var{rop}, mpz_t @var{op})
3167
Set @var{rop} to the next prime greater than @var{op}.
3169
This function uses a probabilistic algorithm to identify primes. For
3170
practical purposes it's adequate; the chance of a composite passing will be
extremely small.
3174
@c mpz_prime_p not implemented as of gmp 3.0.
3176
@c @deftypefun int mpz_prime_p (mpz_t @var{n})
3177
@c Return non-zero if @var{n} is prime and zero if @var{n} is a non-prime.
3178
@c This function is far slower than @code{mpz_probab_prime_p}, but then it
3179
@c never returns non-zero for composite numbers.
3181
@c (For practical purposes, using @code{mpz_probab_prime_p} is adequate.
3182
@c The likelihood of a programming error or hardware malfunction is orders
3183
@c of magnitudes greater than the likelihood for a composite to pass as a
3184
@c prime, if the @var{reps} argument is in the suggested range.)
3187
@deftypefun void mpz_gcd (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
3188
@cindex Greatest common divisor functions
3189
Set @var{rop} to the greatest common divisor of @var{op1} and @var{op2}.
3190
The result is always positive even if one or both input operands
are negative.
3194
@deftypefun {unsigned long int} mpz_gcd_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
3195
Compute the greatest common divisor of @var{op1} and @var{op2}. If
3196
@var{rop} is not @code{NULL}, store the result there.
3198
If the result is small enough to fit in an @code{unsigned long int}, it is
3199
returned. If the result does not fit, 0 is returned, and the result is equal
3200
to the argument @var{op1}. Note that the result will always fit if @var{op2}
is non-zero.
3204
@deftypefun void mpz_gcdext (mpz_t @var{g}, mpz_t @var{s}, mpz_t @var{t}, mpz_t @var{a}, mpz_t @var{b})
3205
@cindex Extended GCD
3206
Set @var{g} to the greatest common divisor of @var{a} and @var{b}, and in
3207
addition set @var{s} and @var{t} to coefficients satisfying
3208
@math{@var{a}@GMPmultiply{}@var{s} + @var{b}@GMPmultiply{}@var{t} = @var{g}}.
3209
@var{g} is always positive, even if one or both of @var{a} and @var{b} are
negative.
3212
If @var{t} is @code{NULL} then that value is not computed.
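The relation between @var{g}, @var{s} and @var{t} can be illustrated with the classical extended Euclidean algorithm on @code{long} values. This is a sketch, not GMP's implementation; @code{gcdext_model} is a hypothetical name, and the particular cofactors it returns need not match those GMP would choose.

```c
#include <stddef.h>

/* Returns g = gcd(a, b) >= 0 and stores cofactors with a*s + b*t == g.
   As in the description above, t may be NULL if it is not wanted. */
static long gcdext_model(long a, long b, long *s, long *t)
{
    long old_r = a, r = b;
    long old_s = 1, cs = 0;
    long old_t = 0, ct = 1;
    while (r != 0) {
        long q = old_r / r, tmp;
        tmp = old_r - q * r;   old_r = r;   r = tmp;
        tmp = old_s - q * cs;  old_s = cs;  cs = tmp;
        tmp = old_t - q * ct;  old_t = ct;  ct = tmp;
    }
    if (old_r < 0) {            /* normalise so that g is not negative */
        old_r = -old_r;
        old_s = -old_s;
        old_t = -old_t;
    }
    *s = old_s;
    if (t != NULL)
        *t = old_t;
    return old_r;
}
```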
3215
@deftypefun void mpz_lcm (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
3216
@deftypefunx void mpz_lcm_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long @var{op2})
3217
@cindex Least common multiple functions
3218
Set @var{rop} to the least common multiple of @var{op1} and @var{op2}.
3219
@var{rop} is always positive, irrespective of the signs of @var{op1} and
3220
@var{op2}. @var{rop} will be zero if either @var{op1} or @var{op2} is zero.
3223
@deftypefun int mpz_invert (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
3224
@cindex Modular inverse functions
3225
Compute the inverse of @var{op1} modulo @var{op2} and put the result in
3226
@var{rop}. If the inverse exists, the return value is non-zero and @var{rop}
3227
will satisfy @math{0 @le{} @var{rop} < @var{op2}}. If an inverse doesn't exist
3228
the return value is zero and @var{rop} is undefined.
3231
@deftypefun int mpz_jacobi (mpz_t @var{a}, mpz_t @var{b})
3232
@cindex Jacobi symbol functions
3233
Calculate the Jacobi symbol @m{\left(a \over b\right),
3234
(@var{a}/@var{b})}. This is defined only for @var{b} odd.
3237
@deftypefun int mpz_legendre (mpz_t @var{a}, mpz_t @var{p})
3238
Calculate the Legendre symbol @m{\left(a \over p\right),
3239
(@var{a}/@var{p})}. This is defined only for @var{p} an odd positive
3240
prime, and for such @var{p} it's identical to the Jacobi symbol.
3243
@deftypefun int mpz_kronecker (mpz_t @var{a}, mpz_t @var{b})
3244
@deftypefunx int mpz_kronecker_si (mpz_t @var{a}, long @var{b})
3245
@deftypefunx int mpz_kronecker_ui (mpz_t @var{a}, unsigned long @var{b})
3246
@deftypefunx int mpz_si_kronecker (long @var{a}, mpz_t @var{b})
3247
@deftypefunx int mpz_ui_kronecker (unsigned long @var{a}, mpz_t @var{b})
3248
@cindex Kronecker symbol functions
3249
Calculate the Jacobi symbol @m{\left(a \over b\right),
3250
(@var{a}/@var{b})} with the Kronecker extension @m{\left(a \over
3251
2\right) = \left(2 \over a\right), (a/2)=(2/a)} when @math{a} is odd, or
@m{\left(a \over 2\right) = 0, (a/2)=0} when @math{a} is even.
3254
When @var{b} is odd the Jacobi symbol and Kronecker symbol are
3255
identical, so @code{mpz_kronecker_ui} etc.@: can be used for mixed
3256
precision Jacobi symbols too.
3258
For more information see Henri Cohen section 1.4.2 (@pxref{References}),
3259
or any number theory textbook. See also the example program
3260
@file{demos/qcn.c} which uses @code{mpz_kronecker_ui}.
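For odd positive @var{b} the Jacobi symbol can be computed with the textbook reciprocity loop, shown here on @code{long} values. This is a sketch of that standard algorithm, not GMP's code, and @code{jacobi_model} is a hypothetical name; the Kronecker extension to even @var{b} is not covered.

```c
/* Jacobi symbol (a/b) for odd b > 0; returns -1, 0 or 1. */
static int jacobi_model(long a, long b)
{
    int result = 1;
    a %= b;
    if (a < 0)
        a += b;                       /* reduce a into 0 <= a < b */
    while (a != 0) {
        while ((a & 1) == 0) {        /* pull out factors of 2 */
            a >>= 1;
            if ((b & 7) == 3 || (b & 7) == 5)
                result = -result;     /* (2/b) = -1 when b = 3, 5 mod 8 */
        }
        long tmp = a;                 /* quadratic reciprocity: swap */
        a = b;
        b = tmp;
        if ((a & 3) == 3 && (b & 3) == 3)
            result = -result;
        a %= b;
    }
    return b == 1 ? result : 0;       /* 0 when a and b share a factor */
}
```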
3263
@deftypefun {unsigned long int} mpz_remove (mpz_t @var{rop}, mpz_t @var{op}, mpz_t @var{f})
3264
Remove all occurrences of the factor @var{f} from @var{op} and store the
3265
result in @var{rop}. The return value is how many such occurrences were
removed.
3269
@deftypefun void mpz_fac_ui (mpz_t @var{rop}, unsigned long int @var{op})
3270
@cindex Factorial functions
3271
Set @var{rop} to @var{op}!, the factorial of @var{op}.
3274
@deftypefun void mpz_bin_ui (mpz_t @var{rop}, mpz_t @var{n}, unsigned long int @var{k})
3275
@deftypefunx void mpz_bin_uiui (mpz_t @var{rop}, unsigned long int @var{n}, @w{unsigned long int @var{k}})
3276
@cindex Binomial coefficient functions
3277
Compute the binomial coefficient @m{\left({n}\atop{k}\right), @var{n} over
3278
@var{k}} and store the result in @var{rop}. Negative values of @var{n} are
3279
supported by @code{mpz_bin_ui}, using the identity
3280
@m{\left({-n}\atop{k}\right) = (-1)^k \left({n+k-1}\atop{k}\right),
3281
bin(-n@C{}k) = (-1)^k * bin(n+k-1@C{}k)}, see Knuth volume 1 section 1.2.6.
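The identity for negative @var{n} can be checked with a small model on @code{long} values. The function names are hypothetical and the sketch ignores overflow, so it is only meaningful for small arguments.

```c
/* bin(n, k) for n >= 0 by the multiplicative formula; each step
   divides exactly, so the running value stays an integer. */
static long bin_nonneg(long n, unsigned long k)
{
    long r = 1;
    for (unsigned long i = 1; i <= k; i++)
        r = r * (n - (long)k + (long)i) / (long)i;
    return r;
}

/* Extends bin_nonneg to negative n using
   bin(-m, k) = (-1)^k * bin(m + k - 1, k), with m = -n. */
static long bin_model(long n, unsigned long k)
{
    if (n >= 0)
        return bin_nonneg(n, k);
    long v = bin_nonneg(-n + (long)k - 1, k);
    return (k & 1) ? -v : v;
}
```

For example, @code{bin_model(-2, 3)} evaluates @minus{}bin(4,3) and gives @minus{}4, which agrees with (@minus{}2)(@minus{}3)(@minus{}4)/3!.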
3285
@deftypefun void mpz_fib_ui (mpz_t @var{fn}, unsigned long int @var{n})
3286
@deftypefunx void mpz_fib2_ui (mpz_t @var{fn}, mpz_t @var{fnsub1}, unsigned long int @var{n})
3287
@cindex Fibonacci sequence functions
3288
@code{mpz_fib_ui} sets @var{fn} to @m{F_n,F[n]}, the @var{n}'th Fibonacci
number. @code{mpz_fib2_ui} sets @var{fn} to @m{F_n,F[n]}, and @var{fnsub1} to
@m{F_{n-1},F[n-1]}.
3292
These functions are designed for calculating isolated Fibonacci numbers. When
3293
a sequence of values is wanted it's best to start with @code{mpz_fib2_ui} and
3294
iterate the defining @m{F_{n+1} = F_n + F_{n-1}, F[n+1]=F[n]+F[n-1]} or
similar.
3298
@deftypefun void mpz_lucnum_ui (mpz_t @var{ln}, unsigned long int @var{n})
3299
@deftypefunx void mpz_lucnum2_ui (mpz_t @var{ln}, mpz_t @var{lnsub1}, unsigned long int @var{n})
3300
@cindex Lucas number functions
3301
@code{mpz_lucnum_ui} sets @var{ln} to @m{L_n,L[n]}, the @var{n}'th Lucas
3302
number. @code{mpz_lucnum2_ui} sets @var{ln} to @m{L_n,L[n]}, and @var{lnsub1}
3303
to @m{L_{n-1},L[n-1]}.
3305
These functions are designed for calculating isolated Lucas numbers. When a
3306
sequence of values is wanted it's best to start with @code{mpz_lucnum2_ui} and
3307
iterate the defining @m{L_{n+1} = L_n + L_{n-1}, L[n+1]=L[n]+L[n-1]} or
similar.
3310
The Fibonacci numbers and Lucas numbers are related sequences, so it's never
3311
necessary to call both @code{mpz_fib2_ui} and @code{mpz_lucnum2_ui}. The
3312
formulas for going from Fibonacci to Lucas can be found in @ref{Lucas Numbers
3313
Algorithm}; the reverse is straightforward too.
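Stepping a sequence from one starting pair, and deriving Lucas numbers from the same pair, can be sketched with 64-bit words standing in for @code{mpz_t}. The function name is hypothetical; in real code the starting pair F[n], F[n-1] would come from @code{mpz_fib2_ui}. The sketch relies on the well-known identity L[k] = F[k] + 2*F[k-1].

```c
#include <stddef.h>
#include <stdint.h>

/* Given fn = F[n] and fnsub1 = F[n-1], fill fib[i] = F[n+i] and
   luc[i] = L[n+i] for i = 0 .. count-1. */
static void fib_lucas_sequence(uint64_t fn, uint64_t fnsub1,
                               uint64_t *fib, uint64_t *luc, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        uint64_t next = fn + fnsub1;  /* F[k+1] = F[k] + F[k-1] */
        fib[i] = fn;
        luc[i] = fn + 2 * fnsub1;     /* L[k] = F[k] + 2*F[k-1] */
        fnsub1 = fn;
        fn = next;
    }
}
```

Each further term costs one addition, which is why iterating is preferable to calling the isolated-value functions repeatedly.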
3317
@node Integer Comparisons, Integer Logic and Bit Fiddling, Number Theoretic Functions, Integer Functions
3318
@comment node-name, next, previous, up
3319
@section Comparison Functions
3320
@cindex Integer comparison functions
3321
@cindex Comparison functions
3323
@deftypefn Function int mpz_cmp (mpz_t @var{op1}, mpz_t @var{op2})
3324
@deftypefnx Function int mpz_cmp_d (mpz_t @var{op1}, double @var{op2})
3325
@deftypefnx Macro int mpz_cmp_si (mpz_t @var{op1}, signed long int @var{op2})
3326
@deftypefnx Macro int mpz_cmp_ui (mpz_t @var{op1}, unsigned long int @var{op2})
3327
Compare @var{op1} and @var{op2}. Return a positive value if @math{@var{op1} >
3328
@var{op2}}, zero if @math{@var{op1} = @var{op2}}, or a negative value if
3329
@math{@var{op1} < @var{op2}}.
3331
Note that @code{mpz_cmp_ui} and @code{mpz_cmp_si} are macros and will evaluate
3332
their arguments more than once.
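The practical consequence of a macro evaluating its argument more than once can be shown with a stand-in macro. @code{SGN_MODEL} and the helper functions are hypothetical and are not GMP's definitions; they only demonstrate the pitfall.

```c
/* Stand-in macro that, like many comparison macros, mentions its
   argument twice in the expansion. */
#define SGN_MODEL(x) ((x) < 0 ? -1 : (x) > 0)

static int calls;                     /* counts evaluations of the argument */

static int next_value(void)
{
    calls++;
    return 5;
}

/* Passes an expression with a side effect to the macro. */
static int sgn_with_side_effect(void)
{
    calls = 0;
    return SGN_MODEL(next_value());   /* next_value() runs twice here */
}
```

Storing the value in a variable first and passing that variable avoids the repeated side effect.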
3335
@deftypefn Function int mpz_cmpabs (mpz_t @var{op1}, mpz_t @var{op2})
3336
@deftypefnx Function int mpz_cmpabs_d (mpz_t @var{op1}, double @var{op2})
3337
@deftypefnx Function int mpz_cmpabs_ui (mpz_t @var{op1}, unsigned long int @var{op2})
3338
Compare the absolute values of @var{op1} and @var{op2}. Return a positive
3339
value if @math{@GMPabs{@var{op1}} > @GMPabs{@var{op2}}}, zero if
3340
@math{@GMPabs{@var{op1}} = @GMPabs{@var{op2}}}, or a negative value if
3341
@math{@GMPabs{@var{op1}} < @GMPabs{@var{op2}}}.
3344
@deftypefn Macro int mpz_sgn (mpz_t @var{op})
3346
@cindex Integer sign tests
3347
Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
3348
@math{-1} if @math{@var{op} < 0}.
3350
This function is actually implemented as a macro. It evaluates its argument
multiple times.
3355
@node Integer Logic and Bit Fiddling, I/O of Integers, Integer Comparisons, Integer Functions
3356
@comment node-name, next, previous, up
3357
@section Logical and Bit Manipulation Functions
3358
@cindex Logical functions
3359
@cindex Bit manipulation functions
3360
@cindex Integer bit manipulation functions
3362
These functions behave as if twos complement arithmetic were used (although
3363
sign-magnitude is the actual implementation). The least significant bit is
number 0.
3366
@deftypefun void mpz_and (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
3367
Set @var{rop} to @var{op1} bitwise-and @var{op2}.
3370
@deftypefun void mpz_ior (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
3371
Set @var{rop} to @var{op1} bitwise inclusive-or @var{op2}.
3374
@deftypefun void mpz_xor (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
3375
Set @var{rop} to @var{op1} bitwise exclusive-or @var{op2}.
3378
@deftypefun void mpz_com (mpz_t @var{rop}, mpz_t @var{op})
3379
Set @var{rop} to the one's complement of @var{op}.
3382
@deftypefun {unsigned long int} mpz_popcount (mpz_t @var{op})
3383
If @math{@var{op}@ge{}0}, return the population count of @var{op}, which is
3384
the number of 1 bits in the binary representation. If @math{@var{op}<0}, the
3385
number of 1s is infinite, and the return value is @var{ULONG_MAX}, the largest
3386
possible @code{unsigned long}.
3389
@deftypefun {unsigned long int} mpz_hamdist (mpz_t @var{op1}, mpz_t @var{op2})
3390
If @var{op1} and @var{op2} are both @math{@ge{}0} or both @math{<0}, return
3391
the Hamming distance between the two operands, which is the number of bit
3392
positions where @var{op1} and @var{op2} have different bit values. If one
3393
operand is @math{@ge{}0} and the other @math{<0} then the number of bits
3394
different is infinite, and the return value is @var{ULONG_MAX}, the largest
3395
possible @code{unsigned long}.
3398
@deftypefun {unsigned long int} mpz_scan0 (mpz_t @var{op}, unsigned long int @var{starting_bit})
3399
@deftypefunx {unsigned long int} mpz_scan1 (mpz_t @var{op}, unsigned long int @var{starting_bit})
3400
Scan @var{op}, starting from bit @var{starting_bit}, towards more significant
3401
bits, until the first 0 or 1 bit (respectively) is found. Return the index of
the found bit.
3404
If the bit at @var{starting_bit} is already what's sought, then
3405
@var{starting_bit} is returned.
3407
If there's no bit found, then @var{ULONG_MAX} is returned. This will happen
3408
in @code{mpz_scan0} past the end of a negative number, or @code{mpz_scan1}
past the end of a non-negative number.
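The scanning rule can be modelled on a 64-bit value viewed as an infinitely sign-extended twos complement number, in line with the convention stated at the start of this section. The names @code{bit_at} and @code{scan_model} are hypothetical; @code{want} is 0 or 1 and selects the @code{scan0} or @code{scan1} behaviour.

```c
#include <limits.h>
#include <stdint.h>

/* Bit i of v, with every bit from 63 upwards equal to the sign. */
static int bit_at(int64_t v, unsigned long i)
{
    if (i >= 63)
        return v < 0;
    return (int)(((uint64_t)v >> i) & 1);
}

/* Index of the first bit equal to want at or above start,
   or ULONG_MAX if no such bit exists. */
static unsigned long scan_model(int64_t v, unsigned long start, int want)
{
    if (start > 63)                   /* only sign-extension bits remain */
        return (v < 0) == want ? start : ULONG_MAX;
    for (unsigned long i = start; i < 64; i++)
        if (bit_at(v, i) == want)
            return i;
    return ULONG_MAX;
}
```

In this model a search for a 1 bit above the top of 12 finds nothing, because a non-negative value has only 0 bits there, and likewise a search for a 0 bit above the top of @minus{}4 finds nothing.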
3412
@deftypefun void mpz_setbit (mpz_t @var{rop}, unsigned long int @var{bit_index})
3413
Set bit @var{bit_index} in @var{rop}.
3416
@deftypefun void mpz_clrbit (mpz_t @var{rop}, unsigned long int @var{bit_index})
3417
Clear bit @var{bit_index} in @var{rop}.
3420
@deftypefun int mpz_tstbit (mpz_t @var{op}, unsigned long int @var{bit_index})
3421
Test bit @var{bit_index} in @var{op} and return 0 or 1 accordingly.
3424
@node I/O of Integers, Integer Random Numbers, Integer Logic and Bit Fiddling, Integer Functions
3425
@comment node-name, next, previous, up
3426
@section Input and Output Functions
3427
@cindex Integer input and output functions
3428
@cindex Input functions
3429
@cindex Output functions
3430
@cindex I/O functions
3432
This section describes functions that read integers from a stdio stream and
functions that write them to a stdio stream. Passing a @code{NULL} pointer for
a @var{stream} argument makes the input functions read from @code{stdin} and
the output functions write to @code{stdout}.
3437
When using any of these functions, it is a good idea to include @file{stdio.h}
3438
before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
3439
for these functions.
3441
@deftypefun size_t mpz_out_str (FILE *@var{stream}, int @var{base}, mpz_t @var{op})
3442
Output @var{op} on stdio stream @var{stream}, as a string of digits in base
3443
@var{base}. The base may vary from 2 to 36.
3445
Return the number of bytes written, or if an error occurred, return 0.
3448
@deftypefun size_t mpz_inp_str (mpz_t @var{rop}, FILE *@var{stream}, int @var{base})
3449
Input a possibly white-space preceded string in base @var{base} from stdio
3450
stream @var{stream}, and put the read integer in @var{rop}. The base may vary
3451
from 2 to 36. If @var{base} is 0, the actual base is determined from the
3452
leading characters: if the first two characters are `0x' or `0X', hexadecimal
3453
is assumed, otherwise if the first character is `0', octal is assumed,
3454
otherwise decimal is assumed.
3456
Return the number of bytes read, or if an error occurred, return 0.
3459
@deftypefun size_t mpz_out_raw (FILE *@var{stream}, mpz_t @var{op})
3460
Output @var{op} on stdio stream @var{stream}, in raw binary format. The
3461
integer is written in a portable format, with 4 bytes of size information, and
3462
that many bytes of limbs. Both the size and the limbs are written in
3463
decreasing significance order (i.e., in big-endian).
3465
The output can be read with @code{mpz_inp_raw}.
3467
Return the number of bytes written, or if an error occurred, return 0.
3469
The output of this cannot be read by @code{mpz_inp_raw} from GMP 1, because
3470
of changes necessary for compatibility between 32-bit and 64-bit machines.

@deftypefun size_t mpz_inp_raw (mpz_t @var{rop}, FILE *@var{stream})
Input from stdio stream @var{stream} in the format written by
@code{mpz_out_raw}, and put the result in @var{rop}. Return the number of
bytes read, or if an error occurred, return 0.

This routine can also read the output of @code{mpz_out_raw} from GMP 1, in
spite of the changes necessary for compatibility between 32-bit and 64-bit
machines.
@end deftypefun


@node Integer Random Numbers, Integer Import and Export, I/O of Integers, Integer Functions
@comment node-name, next, previous, up
@section Random Number Functions
@cindex Integer random number functions
@cindex Random number functions

The random number functions of GMP come in two groups: older functions
that rely on a global state, and newer functions that accept a state
parameter that is read and modified. Please see @ref{Random Number Functions}
for more information on how to use and not to use random number functions.

@deftypefun void mpz_urandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, unsigned long int @var{n})
Generate a uniformly distributed random integer in the range 0 to @m{2^n-1,
2^@var{n}@minus{}1}, inclusive.

The variable @var{state} must be initialized by calling one of the
@code{gmp_randinit} functions (@pxref{Random State Initialization}) before
invoking this function.
@end deftypefun

@deftypefun void mpz_urandomm (mpz_t @var{rop}, gmp_randstate_t @var{state}, mpz_t @var{n})
Generate a uniform random integer in the range 0 to @math{@var{n}-1},
inclusive.

The variable @var{state} must be initialized by calling one of the
@code{gmp_randinit} functions (@pxref{Random State Initialization})
before invoking this function.
@end deftypefun
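
The difference between the two ranges, a power of two for
@code{mpz_urandomb} and an arbitrary bound for @code{mpz_urandomm}, can be
illustrated with machine integers. The sketch below is hypothetical and does
not use GMP: @code{toy_state} stands in for @code{gmp_randstate_t}, the
generator is a simple xorshift, and rejection sampling is just one common way
to reach an arbitrary bound, not necessarily what GMP does internally.

```c
#include <stdint.h>

/* Toy stand-in for gmp_randstate_t (hypothetical, not GMP's generator):
   a 64-bit xorshift state, which must be seeded with a non-zero value. */
typedef struct { uint64_t s; } toy_state;

static uint64_t toy_next(toy_state *st)
{
    uint64_t x = st->s;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    return st->s = x;
}

/* Pseudo-random value in 0 to 2^n - 1 inclusive, for n from 0 to 64. */
uint64_t toy_urandomb(toy_state *st, unsigned n)
{
    uint64_t r = toy_next(st);      /* always advance the state */
    if (n == 0)
        return 0;
    return (n >= 64) ? r : r >> (64 - n);
}

/* Pseudo-random value in 0 to n - 1 inclusive, for n > 0: draw just
   enough bits to cover n - 1 and retry until the draw falls below n. */
uint64_t toy_urandomm(toy_state *st, uint64_t n)
{
    unsigned bits = 0;
    uint64_t r;

    for (uint64_t t = n - 1; t != 0; t >>= 1)
        bits++;
    do {
        r = toy_urandomb(st, bits);
    } while (r >= n);
    return r;
}
```

As with the GMP functions, the state is passed explicitly and modified by
every call, so two independent states give two independent sequences.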

@deftypefun void mpz_rrandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, unsigned long int @var{n})
Generate a random integer with long strings of zeros and ones in the
binary representation. Useful for testing functions and algorithms,
since random numbers of this kind have proven to be more likely to
trigger corner-case bugs. The random number will be in the range
0 to @m{2^n-1, 2^@var{n}@minus{}1}, inclusive.

The variable @var{state} must be initialized by calling one of the
@code{gmp_randinit} functions (@pxref{Random State Initialization})
before invoking this function.
@end deftypefun

@deftypefun void mpz_random (mpz_t @var{rop}, mp_size_t @var{max_size})
Generate a random integer of at most @var{max_size} limbs. The generated
random number doesn't satisfy any particular requirements of randomness.
Negative random numbers are generated when @var{max_size} is negative.

This function is obsolete. Use @code{mpz_urandomb} or
@code{mpz_urandomm} instead.
@end deftypefun

@deftypefun void mpz_random2 (mpz_t @var{rop}, mp_size_t @var{max_size})
Generate a random integer of at most @var{max_size} limbs, with long strings
of zeros and ones in the binary representation. Useful for testing functions
and algorithms, since random numbers of this kind have proven to be more
likely to trigger corner-case bugs. Negative random numbers are generated
when @var{max_size} is negative.

This function is obsolete. Use @code{mpz_rrandomb} instead.
@end deftypefun


@node Integer Import and Export, Miscellaneous Integer Functions, Integer Random Numbers, Integer Functions
@section Integer Import and Export

@code{mpz_t} variables can be converted to and from arbitrary words of binary
data with the following functions.

@deftypefun void mpz_import (mpz_t @var{rop}, size_t @var{count}, int @var{order}, int @var{size}, int @var{endian}, size_t @var{nails}, const void *@var{op})
@cindex Integer import
Set @var{rop} from an array of word data at @var{op}.

The parameters specify the format of the data. @var{count} many words are
read, each @var{size} bytes. @var{order} can be 1 for most significant word
first or -1 for least significant first. Within each word @var{endian} can be
1 for most significant byte first, -1 for least significant first, or 0 for
the native endianness of the host CPU. The most significant @var{nails} bits
of each word are skipped; this can be 0 to use the full words.

There is no sign taken from the data; @var{rop} will simply be a non-negative
integer. An application can handle any sign itself, and apply it for instance
with @code{mpz_neg}.

There are no data alignment restrictions on @var{op}; any address is allowed.

Here's an example converting an array of @code{unsigned long} data, most
significant element first, and host byte order within each value.

@example
unsigned long a[20];
mpz_t z;
mpz_import (z, 20, 1, sizeof(a[0]), 0, 0, a);
@end example

This example assumes the full @code{sizeof} bytes are used for data in the
given type, which is usually true, and certainly true for @code{unsigned long}
everywhere we know of. However on Cray vector systems it may be noted that
@code{short} and @code{int} are always stored in 8 bytes (and with
@code{sizeof} indicating that) but use only 32 or 46 bits. The @var{nails}
feature can account for this, by passing for instance
@code{8*sizeof(int)-INT_BIT}.
@end deftypefun
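
The effect of the @var{order} and @var{endian} parameters can be seen on a
value small enough for a machine word. The function @code{import_u64} below is
a hypothetical plain C illustration, not GMP code: it follows the parameter
meanings described for @code{mpz_import}, with @var{nails} fixed at 0 and the
result limited to 64 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration, not GMP code: combine count words of size
   bytes each into one value, interpreting order and endian as described
   for mpz_import (nails fixed at 0).  The result must fit in 64 bits. */
uint64_t import_u64(size_t count, int order, size_t size, int endian,
                    const void *op)
{
    const unsigned char *data = op;
    uint64_t result = 0;

    if (endian == 0) {
        /* resolve "native" to -1 (little) or 1 (big) for this host */
        const uint16_t probe = 1;
        endian = (*(const unsigned char *) &probe == 1) ? -1 : 1;
    }

    for (size_t w = 0; w < count; w++) {
        /* visit words from most significant to least significant */
        size_t word = (order == 1) ? w : count - 1 - w;
        for (size_t b = 0; b < size; b++) {
            /* visit bytes from most significant to least significant */
            size_t byte = (endian == 1) ? b : size - 1 - b;
            result = (result << 8) | data[word * size + byte];
        }
    }
    return result;
}
```

For the four bytes 01 02 03 04 read as two 2-byte words, @var{order} 1 with
@var{endian} 1 gives 0x01020304, reversing only @var{order} gives 0x03040102,
and reversing only @var{endian} gives 0x02010403.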

@deftypefun {void *} mpz_export (void *@var{rop}, size_t *@var{countp}, int @var{order}, int @var{size}, int @var{endian}, size_t @var{nails}, mpz_t @var{op})
@cindex Integer export
Fill @var{rop} with word data from @var{op}.

The parameters specify the format of the data produced. Each word will be
@var{size} bytes and @var{order} can be 1 for most significant word first or
-1 for least significant first. Within each word @var{endian} can be 1 for
most significant byte first, -1 for least significant first, or 0 for the
native endianness of the host CPU. The most significant @var{nails} bits of
each word are unused and set to zero; this can be 0 to produce full words.

The number of words produced is written to @code{*@var{countp}}, or
@var{countp} can be @code{NULL} to discard the count. @var{rop} must have
enough space for the data, or if @var{rop} is @code{NULL} then a result array
of the necessary size is allocated using the current GMP allocation function
(@pxref{Custom Allocation}). In either case the return value is the
destination used, either @var{rop} or the allocated block.

If @var{op} is non-zero then the most significant word produced will be
non-zero. If @var{op} is zero then the count returned will be zero and
nothing written to @var{rop}. If @var{rop} is @code{NULL} in this case, no
block is allocated; just @code{NULL} is returned.

The sign of @var{op} is ignored; just the absolute value is exported. An
application can use @code{mpz_sgn} to get the sign and handle it as desired
(@pxref{Integer Comparisons}).

There are no data alignment restrictions on @var{rop}; any address is allowed.

When an application is allocating space itself the required size can be
determined with a calculation like the following. Since @code{mpz_sizeinbase}
always returns at least 1, @code{count} here will be at least one, which
avoids any portability problems with @code{malloc(0)}, though if @code{z} is
zero no space at all is actually needed (or written).

@example
numb = 8*size - nail;
count = (mpz_sizeinbase (z, 2) + numb-1) / numb;
p = malloc (count * size);
@end example
@end deftypefun


@node Miscellaneous Integer Functions, , Integer Import and Export, Integer Functions
@comment node-name, next, previous, up
@section Miscellaneous Functions
@cindex Miscellaneous integer functions
@cindex Integer miscellaneous functions

@deftypefun int mpz_fits_ulong_p (mpz_t @var{op})
@deftypefunx int mpz_fits_slong_p (mpz_t @var{op})
@deftypefunx int mpz_fits_uint_p (mpz_t @var{op})
@deftypefunx int mpz_fits_sint_p (mpz_t @var{op})
@deftypefunx int mpz_fits_ushort_p (mpz_t @var{op})
@deftypefunx int mpz_fits_sshort_p (mpz_t @var{op})
Return non-zero iff the value of @var{op} fits in an @code{unsigned long int},
@code{signed long int}, @code{unsigned int}, @code{signed int}, @code{unsigned
short int}, or @code{signed short int}, respectively. Otherwise, return zero.
@end deftypefun

@deftypefn Macro int mpz_odd_p (mpz_t @var{op})
@deftypefnx Macro int mpz_even_p (mpz_t @var{op})
Determine whether @var{op} is odd or even, respectively. Return non-zero if
yes, zero if no. These macros evaluate their argument more than once.
@end deftypefn

@deftypefun size_t mpz_size (mpz_t @var{op})
Return the size of @var{op} measured in number of limbs. If @var{op} is zero,
the returned value will be zero.
@c (@xref{Nomenclature}, for an explanation of the concept @dfn{limb}.)
@end deftypefun

@deftypefun size_t mpz_sizeinbase (mpz_t @var{op}, int @var{base})
@cindex Size in digits
@cindex Digits in an integer
Return the size of @var{op} measured in number of digits in the given
@var{base}. @var{base} can vary from 2 to 36. The sign of @var{op} is
ignored; just the absolute value is used. The result will be either exact or
1 too big. If @var{base} is a power of 2, the result is always exact. If
@var{op} is zero the return value is always 1.

This function can be used to determine the space required when converting
@var{op} to a string. The right amount of allocation is normally two more
than the value returned by @code{mpz_sizeinbase}, one extra for a minus sign
and one for the null-terminator.

@cindex Most significant bit
Note that @code{mpz_sizeinbase(@var{op},2)} can be used to locate the most
significant 1 bit in @var{op}, counting from 1. (This differs from the bitwise
functions, which start from 0; @pxref{Integer Logic and Bit Fiddling,, Logical
and Bit Manipulation Functions}.)
@end deftypefun
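
The allocation rule given for @code{mpz_sizeinbase} can be tried out on
machine integers. The two helpers below are hypothetical stand-ins, not GMP
functions: @code{size_in_base} counts digits exactly for a 64-bit value,
whereas the real function may report one more for bases that are not powers
of 2.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for mpz_sizeinbase on a 64-bit value: the exact
   number of digits of op in the given base (2 to 36), with zero counted
   as one digit. */
size_t size_in_base(uint64_t op, int base)
{
    size_t digits = 1;

    while (op >= (uint64_t) base) {
        op /= (uint64_t) base;
        digits++;
    }
    return digits;
}

/* Bytes to allocate for a string conversion: the digits, plus one for a
   possible minus sign, plus one for the null-terminator. */
size_t string_alloc_size(uint64_t op, int base)
{
    return size_in_base(op, base) + 2;
}
```

In base 2 the digit count is also the position of the most significant 1 bit
counting from 1, so 255 gives 8 and 256 gives 9.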


@node Rational Number Functions, Floating-point Functions, Integer Functions, Top
@comment node-name, next, previous, up
@chapter Rational Number Functions
@cindex Rational number functions

This chapter describes the GMP functions for performing arithmetic on rational
numbers. These functions start with the prefix @code{mpq_}.

Rational numbers are stored in objects of type @code{mpq_t}.

All rational arithmetic functions assume operands have a canonical form, and
canonicalize their result. Canonical form means that the denominator and
the numerator have no common factors, and that the denominator is positive.
Zero has the unique representation 0/1.

Pure assignment functions do not canonicalize the assigned variable. It is
the responsibility of the user to canonicalize the assigned variable before
any arithmetic operations are performed on that variable.

@deftypefun void mpq_canonicalize (mpq_t @var{op})
Remove any factors that are common to the numerator and denominator of
@var{op}, and make the denominator positive.
@end deftypefun
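
What canonical form means can be shown with ordinary machine integers. The
@code{fraction} type and @code{canonicalize} function below are hypothetical
illustrations, not GMP code; they apply the two steps named above, removing
common factors and making the denominator positive.

```c
/* Hypothetical illustration with machine integers, not GMP code. */
typedef struct { long num; long den; } fraction;

/* Greatest common divisor of |a| and |b| by Euclid's algorithm. */
static long gcd_long(long a, long b)
{
    if (a < 0) a = -a;
    if (b < 0) b = -b;
    while (b != 0) {
        long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Return q in canonical form.  q.den must be non-zero; overflow when
   negating the most negative long is not handled in this sketch. */
fraction canonicalize(fraction q)
{
    long g = gcd_long(q.num, q.den);

    q.num /= g;                 /* remove common factors */
    q.den /= g;
    if (q.den < 0) {            /* make the denominator positive */
        q.num = -q.num;
        q.den = -q.den;
    }
    return q;
}
```

Under these rules 6/@minus{}4 becomes @minus{}3/2 and 0/5 becomes 0/1, the
unique representation of zero.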

@menu
* Initializing Rationals::
* Rational Conversions::
* Rational Arithmetic::
* Comparing Rationals::
* Applying Integer Functions::
* I/O of Rationals::
@end menu

@node Initializing Rationals, Rational Conversions, Rational Number Functions, Rational Number Functions
@comment node-name, next, previous, up
@section Initialization and Assignment Functions
@cindex Initialization and assignment functions
@cindex Rational init and assign

@deftypefun void mpq_init (mpq_t @var{dest_rational})
Initialize @var{dest_rational} and set it to 0/1. Each variable should
normally only be initialized once, or at least cleared out (using the function
@code{mpq_clear}) between each initialization.
@end deftypefun

@deftypefun void mpq_clear (mpq_t @var{rational_number})
Free the space occupied by @var{rational_number}. Make sure to call this
function for all @code{mpq_t} variables when you are done with them.
@end deftypefun

@deftypefun void mpq_set (mpq_t @var{rop}, mpq_t @var{op})
@deftypefunx void mpq_set_z (mpq_t @var{rop}, mpz_t @var{op})
Assign @var{rop} from @var{op}.
@end deftypefun

@deftypefun void mpq_set_ui (mpq_t @var{rop}, unsigned long int @var{op1}, unsigned long int @var{op2})
@deftypefunx void mpq_set_si (mpq_t @var{rop}, signed long int @var{op1}, unsigned long int @var{op2})
Set the value of @var{rop} to @var{op1}/@var{op2}. Note that if @var{op1} and
@var{op2} have common factors, @var{rop} has to be passed to
@code{mpq_canonicalize} before any operations are performed on @var{rop}.
@end deftypefun

@deftypefun int mpq_set_str (mpq_t @var{rop}, char *@var{str}, int @var{base})
Set @var{rop} from a null-terminated string @var{str} in the given @var{base}.

The string can be an integer like ``41'' or a fraction like ``41/152''. The
fraction must be in canonical form (@pxref{Rational Number Functions}), or if
not then @code{mpq_canonicalize} must be called.

The numerator and optional denominator are parsed the same as in
@code{mpz_set_str} (@pxref{Assigning Integers}). White space is allowed in
the string, and is simply ignored. The @var{base} can vary from 2 to 36, or
if @var{base} is 0 then the leading characters are used: @code{0x} for hex,
@code{0} for octal, or decimal otherwise. Note that this is done separately
for the numerator and denominator, so for instance @code{0xEF/100} is 239/100,
whereas @code{0xEF/0x100} is 239/256.

The return value is 0 if the entire string is a valid number, or @minus{}1 if
not.
@end deftypefun

@deftypefun void mpq_swap (mpq_t @var{rop1}, mpq_t @var{rop2})
Swap the values @var{rop1} and @var{rop2} efficiently.
@end deftypefun


@node Rational Conversions, Rational Arithmetic, Initializing Rationals, Rational Number Functions
@comment node-name, next, previous, up
@section Conversion Functions
@cindex Rational conversion functions
@cindex Conversion functions

@deftypefun double mpq_get_d (mpq_t @var{op})
Convert @var{op} to a @code{double}.
@end deftypefun

@deftypefun void mpq_set_d (mpq_t @var{rop}, double @var{op})
@deftypefunx void mpq_set_f (mpq_t @var{rop}, mpf_t @var{op})
Set @var{rop} to the value of @var{op}, without rounding.
@end deftypefun

@deftypefun {char *} mpq_get_str (char *@var{str}, int @var{base}, mpq_t @var{op})
Convert @var{op} to a string of digits in base @var{base}. The base may vary
from 2 to 36. The string will be of the form @samp{num/den}, or if the
denominator is 1 then just @samp{num}.

If @var{str} is @code{NULL}, the result string is allocated using the current
allocation function (@pxref{Custom Allocation}). The block will be
@code{strlen(str)+1} bytes, that being exactly enough for the string and
null-terminator.

If @var{str} is not @code{NULL}, it should point to a block of storage large
enough for the result, that being

@example
mpz_sizeinbase (mpq_numref(@var{op}), @var{base})
+ mpz_sizeinbase (mpq_denref(@var{op}), @var{base}) + 3
@end example

The three extra bytes are for a possible minus sign, possible slash, and the
null-terminator.

A pointer to the result string is returned, being either the allocated block,
or the given @var{str}.
@end deftypefun


@node Rational Arithmetic, Comparing Rationals, Rational Conversions, Rational Number Functions
@comment node-name, next, previous, up
@section Arithmetic Functions
@cindex Rational arithmetic functions
@cindex Arithmetic functions

@deftypefun void mpq_add (mpq_t @var{sum}, mpq_t @var{addend1}, mpq_t @var{addend2})
Set @var{sum} to @var{addend1} + @var{addend2}.
@end deftypefun

@deftypefun void mpq_sub (mpq_t @var{difference}, mpq_t @var{minuend}, mpq_t @var{subtrahend})
Set @var{difference} to @var{minuend} @minus{} @var{subtrahend}.
@end deftypefun

@deftypefun void mpq_mul (mpq_t @var{product}, mpq_t @var{multiplier}, mpq_t @var{multiplicand})
Set @var{product} to @math{@var{multiplier} @GMPtimes{} @var{multiplicand}}.
@end deftypefun

@deftypefun void mpq_mul_2exp (mpq_t @var{rop}, mpq_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
@var{op2}}.
@end deftypefun

@deftypefun void mpq_div (mpq_t @var{quotient}, mpq_t @var{dividend}, mpq_t @var{divisor})
@cindex Division functions
Set @var{quotient} to @var{dividend}/@var{divisor}.
@end deftypefun

@deftypefun void mpq_div_2exp (mpq_t @var{rop}, mpq_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to
@var{op2}}.
@end deftypefun

@deftypefun void mpq_neg (mpq_t @var{negated_operand}, mpq_t @var{operand})
Set @var{negated_operand} to @minus{}@var{operand}.
@end deftypefun

@deftypefun void mpq_abs (mpq_t @var{rop}, mpq_t @var{op})
Set @var{rop} to the absolute value of @var{op}.
@end deftypefun

@deftypefun void mpq_inv (mpq_t @var{inverted_number}, mpq_t @var{number})
Set @var{inverted_number} to 1/@var{number}. If the new denominator is
zero, this routine will divide by zero.
@end deftypefun

@node Comparing Rationals, Applying Integer Functions, Rational Arithmetic, Rational Number Functions
@comment node-name, next, previous, up
@section Comparison Functions
@cindex Rational comparison functions
@cindex Comparison functions

@deftypefun int mpq_cmp (mpq_t @var{op1}, mpq_t @var{op2})
Compare @var{op1} and @var{op2}. Return a positive value if @math{@var{op1} >
@var{op2}}, zero if @math{@var{op1} = @var{op2}}, and a negative value if
@math{@var{op1} < @var{op2}}.

To determine if two rationals are equal, @code{mpq_equal} is faster than
@code{mpq_cmp}.
@end deftypefun

@deftypefn Macro int mpq_cmp_ui (mpq_t @var{op1}, unsigned long int @var{num2}, unsigned long int @var{den2})
@deftypefnx Macro int mpq_cmp_si (mpq_t @var{op1}, long int @var{num2}, unsigned long int @var{den2})
Compare @var{op1} and @var{num2}/@var{den2}. Return a positive value if
@math{@var{op1} > @var{num2}/@var{den2}}, zero if @math{@var{op1} =
@var{num2}/@var{den2}}, and a negative value if @math{@var{op1} <
@var{num2}/@var{den2}}.

@var{num2} and @var{den2} are allowed to have common factors.

These functions are implemented as macros and evaluate their arguments
multiple times.
@end deftypefn

@deftypefn Macro int mpq_sgn (mpq_t @var{op})
@cindex Rational sign tests
Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
@math{-1} if @math{@var{op} < 0}.

This function is actually implemented as a macro. It evaluates its
argument multiple times.
@end deftypefn
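
The practical consequence of a macro evaluating its argument more than once
is that an argument with side effects may run several times. The sketch below
is a hypothetical plain C illustration: @code{SIGN_OF} is not GMP's definition
of @code{mpq_sgn}, it merely has the same property.

```c
/* Hypothetical macro, not GMP's definition: like the macros described
   above, it expands its argument more than once. */
#define SIGN_OF(x) ((x) < 0 ? -1 : (x) > 0)

static int call_count = 0;

/* An operand with a visible side effect: counts its own evaluations. */
int next_operand(void)
{
    call_count++;
    return 5;
}

/* How many times next_operand runs when passed directly to the macro. */
int evaluations_for_direct_use(void)
{
    call_count = 0;
    (void) SIGN_OF(next_operand());
    return call_count;
}

/* How many times it runs when the value is stored in a variable first. */
int evaluations_when_hoisted(void)
{
    int v;

    call_count = 0;
    v = next_operand();         /* evaluate once, then pass the variable */
    (void) SIGN_OF(v);
    return call_count;
}
```

Passing a plain variable, as in the second function, is the safe pattern for
@code{mpq_sgn}, @code{mpq_cmp_ui} and @code{mpq_cmp_si}.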

@deftypefun int mpq_equal (mpq_t @var{op1}, mpq_t @var{op2})
Return non-zero if @var{op1} and @var{op2} are equal, zero if they are
non-equal. Although @code{mpq_cmp} can be used for the same purpose, this
function is much faster.
@end deftypefun

@node Applying Integer Functions, I/O of Rationals, Comparing Rationals, Rational Number Functions
@comment node-name, next, previous, up
@section Applying Integer Functions to Rationals
@cindex Rational numerator and denominator
@cindex Numerator and denominator

The set of @code{mpq} functions is quite small. In particular, there are few
functions for either input or output. The following functions give direct
access to the numerator and denominator of an @code{mpq_t}.

Note that if an assignment to the numerator and/or denominator could take an
@code{mpq_t} out of the canonical form described at the start of this chapter
(@pxref{Rational Number Functions}) then @code{mpq_canonicalize} must be
called before any other @code{mpq} functions are applied to that @code{mpq_t}.

@deftypefn Macro mpz_t mpq_numref (mpq_t @var{op})
@deftypefnx Macro mpz_t mpq_denref (mpq_t @var{op})
Return a reference to the numerator and denominator of @var{op}, respectively.
The @code{mpz} functions can be used on the result of these macros.
@end deftypefn

@deftypefun void mpq_get_num (mpz_t @var{numerator}, mpq_t @var{rational})
@deftypefunx void mpq_get_den (mpz_t @var{denominator}, mpq_t @var{rational})
@deftypefunx void mpq_set_num (mpq_t @var{rational}, mpz_t @var{numerator})
@deftypefunx void mpq_set_den (mpq_t @var{rational}, mpz_t @var{denominator})
Get or set the numerator or denominator of a rational. These functions are
equivalent to calling @code{mpz_set} with an appropriate @code{mpq_numref} or
@code{mpq_denref}. Direct use of @code{mpq_numref} or @code{mpq_denref} is
recommended instead of these functions.
@end deftypefun


@node I/O of Rationals, , Applying Integer Functions, Rational Number Functions
@comment node-name, next, previous, up
@section Input and Output Functions
@cindex Rational input and output functions
@cindex Input functions
@cindex Output functions
@cindex I/O functions

When using any of these functions, it's a good idea to include @file{stdio.h}
before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
for these functions.

Passing a @code{NULL} pointer for a @var{stream} argument to any of these
functions will make them read from @code{stdin} and write to @code{stdout},
respectively.

@deftypefun size_t mpq_out_str (FILE *@var{stream}, int @var{base}, mpq_t @var{op})
Output @var{op} on stdio stream @var{stream}, as a string of digits in base
@var{base}. The base may vary from 2 to 36. Output is in the form
@samp{num/den} or if the denominator is 1 then just @samp{num}.

Return the number of bytes written, or if an error occurred, return 0.
@end deftypefun

@deftypefun size_t mpq_inp_str (mpq_t @var{rop}, FILE *@var{stream}, int @var{base})
Read a string of digits from @var{stream} and convert them to a rational in
@var{rop}. Any initial white-space characters are read and discarded. Return
the number of characters read (including white space), or 0 if a rational
could not be read.

The input can be a fraction like @samp{17/63} or just an integer like
@samp{123}. Reading stops at the first character not in this form, and white
space is not permitted within the string. If the input might not be in
canonical form, then @code{mpq_canonicalize} must be called (@pxref{Rational
Number Functions}).

The @var{base} can be between 2 and 36, or can be 0 in which case the leading
characters of the string determine the base, @samp{0x} or @samp{0X} for
hexadecimal, @samp{0} for octal, or decimal otherwise. The leading characters
are examined separately for the numerator and denominator of a fraction, so
for instance @samp{0x10/11} is 16/11, whereas @samp{0x10/0x11} is 16/17.
@end deftypefun
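
The separate treatment of numerator and denominator when @var{base} is 0 can
be reproduced with the standard C library, since @code{strtol} and
@code{strtoul} apply the same prefix rule when given base 0. The helper
@code{parse_fraction} below is hypothetical and not part of GMP; it handles
machine integers only and, unlike @code{mpq_inp_str}, reads from a string
and accepts whatever signs and leading white space @code{strtol} accepts.

```c
#include <stdlib.h>

/* Hypothetical helper, not part of GMP: parse "num" or "num/den" with the
   base of each part detected separately.  strtol and strtoul with base 0
   use the prefix rule described above (0x or 0X for hexadecimal, a leading
   0 for octal, decimal otherwise).  Returns 0 on success, or -1 if either
   part could not be parsed. */
int parse_fraction(const char *s, long *num, unsigned long *den)
{
    char *end;

    *num = strtol(s, &end, 0);
    if (end == s)
        return -1;              /* no numerator digits */

    *den = 1;                   /* a plain integer has denominator 1 */
    if (*end == '/') {
        const char *d = end + 1;
        *den = strtoul(d, &end, 0);
        if (end == d)
            return -1;          /* slash with no denominator digits */
    }
    return 0;
}
```

A result such as 16/0x20 would still need reducing; the sketch does not
canonicalize, just as the GMP input functions leave that to
@code{mpq_canonicalize}.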


@node Floating-point Functions, Low-level Functions, Rational Number Functions, Top
@comment node-name, next, previous, up
@chapter Floating-point Functions
@cindex Floating-point functions
@cindex Float functions
@cindex User-defined precision
@cindex Precision of floats

GMP floating point numbers are stored in objects of type @code{mpf_t} and
functions operating on them have an @code{mpf_} prefix.

The mantissa of each float has a user-selectable precision, limited only by
available memory. Each variable has its own precision, and that can be
increased or decreased at any time.

The exponent of each float has a fixed precision, one machine word on most
systems. In the current implementation the exponent is a count of limbs, so
for example on a 32-bit system this means a range of roughly
@math{2^@W{-68719476768}} to @math{2^@W{68719476736}}, or on a 64-bit system
this will be greater. Note however @code{mpf_get_str} can only return an
exponent which fits an @code{mp_exp_t} and currently @code{mpf_set_str}
doesn't accept exponents bigger than a @code{long}.
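
The magnitudes quoted for a 32-bit system can be checked with a little
arithmetic. The sketch assumes, and this is an assumption rather than
something stated above, that the exponent is a signed 32-bit count of 32-bit
limbs; the helper @code{exponent_span_bits} is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: the number of bits covered by 2^(exp_bits-1) limbs
   of limb_bits each, i.e. the reach of a signed exp_bits-wide limb count.
   Only meant for small inputs such as (32, 32); a 64-bit exponent would
   overflow int64_t here. */
int64_t exponent_span_bits(int limb_bits, int exp_bits)
{
    return (int64_t) limb_bits * ((int64_t) 1 << (exp_bits - 1));
}
```

For 32-bit limbs and a 32-bit exponent this gives 32 times 2^31, which is
68719476736, the upper figure quoted. The lower figure, 68719476768, is
larger by exactly one further 32-bit limb.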

Each variable keeps a size for the mantissa data actually in use. This means
that if a float is exactly represented in only a few bits then only those bits
will be used in a calculation, even if the selected precision is high.

All calculations are performed to the precision of the destination variable.
Each function is defined to calculate with ``infinite precision'' followed by
a truncation to the destination precision, but of course the work done is only
what's needed to determine a result under that definition.

The precision selected for a variable is a minimum value; GMP may increase it
a little to facilitate efficient calculation. Currently this means rounding
up to a whole limb, and then sometimes having a further partial limb,
depending on the high limb of the mantissa. But applications shouldn't be
concerned by such details.

The mantissa is stored in binary, as might be imagined from the fact
precisions are expressed in bits. One consequence of this is that decimal
fractions like @math{0.1} cannot be represented exactly. The same is true of
plain IEEE @code{double} floats. This makes both highly unsuitable for
calculations involving money or other values that should be exact decimal
fractions. (Suitably scaled integers, or perhaps rationals, are better
suited.)

@code{mpf} functions and variables have no special notion of infinity or
not-a-number, and applications must take care not to overflow the exponent or
results will be unpredictable. This might change in a future release.
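
The earlier remark that decimal fractions like @math{0.1} cannot be
represented exactly in a binary mantissa follows from a simple rule: a
fraction has a finite binary expansion only if its denominator, once common
factors are removed, is a power of two. The helper @code{exact_in_binary}
below is a hypothetical plain C check of that rule and does not use GMP.

```c
/* Hypothetical helper, not part of GMP: returns 1 if num/den (den > 0)
   has a finite binary expansion, that is, if the denominator of the
   reduced fraction is a power of two; returns 0 otherwise. */
int exact_in_binary(unsigned long num, unsigned long den)
{
    unsigned long a = num, b = den;

    /* Euclid's algorithm: afterwards a holds gcd(num, den) */
    while (b != 0) {
        unsigned long t = a % b;
        a = b;
        b = t;
    }
    den /= a;                           /* reduced denominator */
    return (den & (den - 1)) == 0;      /* power of two has one bit set */
}
```

So 1/10 fails the test because the reduced denominator 10 has the factor 5,
while 3/8 and 5/10 (which reduces to 1/2) pass.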

Note that the @code{mpf} functions are @emph{not} intended as a smooth
extension to IEEE P754 arithmetic. In particular results obtained on one
computer often differ from the results on a computer with a different word
size.

@menu
* Initializing Floats::
* Assigning Floats::
* Simultaneous Float Init & Assign::
* Converting Floats::
* Float Arithmetic::
* Float Comparison::
* I/O of Floats::
* Miscellaneous Float Functions::
@end menu

@node Initializing Floats, Assigning Floats, Floating-point Functions, Floating-point Functions
@comment node-name, next, previous, up
@section Initialization Functions
@cindex Float initialization functions
@cindex Initialization functions

@deftypefun void mpf_set_default_prec (unsigned long int @var{prec})
Set the default precision to be @strong{at least} @var{prec} bits. All
subsequent calls to @code{mpf_init} will use this precision, but previously
initialized variables are unaffected.
@end deftypefun

@deftypefun {unsigned long int} mpf_get_default_prec (void)
Return the default precision actually used.
@end deftypefun

An @code{mpf_t} object must be initialized before storing the first value in
it. The functions @code{mpf_init} and @code{mpf_init2} are used for that
purpose.

@deftypefun void mpf_init (mpf_t @var{x})
Initialize @var{x} to 0. Normally, a variable should be initialized once only
or at least be cleared, using @code{mpf_clear}, between initializations. The
precision of @var{x} is undefined unless a default precision has already been
established by a call to @code{mpf_set_default_prec}.
@end deftypefun

@deftypefun void mpf_init2 (mpf_t @var{x}, unsigned long int @var{prec})
Initialize @var{x} to 0 and set its precision to be @strong{at least}
@var{prec} bits. Normally, a variable should be initialized once only or at
least be cleared, using @code{mpf_clear}, between initializations.
@end deftypefun

@deftypefun void mpf_clear (mpf_t @var{x})
Free the space occupied by @var{x}. Make sure to call this function for all
@code{mpf_t} variables when you are done with them.
@end deftypefun

Here is an example of how to initialize floating-point variables:

@example
mpf_t x, y;
mpf_init (x);           /* use default precision */
mpf_init2 (y, 256);     /* precision @emph{at least} 256 bits */
/* Unless the program is about to exit, do ... */
mpf_clear (x);
mpf_clear (y);
@end example

The following three functions are useful for changing the precision during a
calculation. A typical use would be for adjusting the precision gradually in
iterative algorithms like Newton-Raphson, making the computation precision
closely match the actual accurate part of the numbers.

@deftypefun {unsigned long int} mpf_get_prec (mpf_t @var{op})
Return the current precision of @var{op}, in bits.
@end deftypefun

@deftypefun void mpf_set_prec (mpf_t @var{rop}, unsigned long int @var{prec})
Set the precision of @var{rop} to be @strong{at least} @var{prec} bits. The
value in @var{rop} will be truncated to the new precision.

This function requires a call to @code{realloc}, and so should not be used in
a tight loop.
@end deftypefun

@deftypefun void mpf_set_prec_raw (mpf_t @var{rop}, unsigned long int @var{prec})
Set the precision of @var{rop} to be @strong{at least} @var{prec} bits,
without changing the memory allocated.

@var{prec} must be no more than the allocated precision for @var{rop}, that
being the precision when @var{rop} was initialized, or in the most recent
@code{mpf_set_prec}.

The value in @var{rop} is unchanged, and in particular if it had a higher
precision than @var{prec} it will retain that higher precision. New values
written to @var{rop} will use the new @var{prec}.

Before calling @code{mpf_clear} or the full @code{mpf_set_prec}, another
@code{mpf_set_prec_raw} call must be made to restore @var{rop} to its original
allocated precision. Failing to do so will have unpredictable results.

@code{mpf_get_prec} can be used before @code{mpf_set_prec_raw} to get the
original allocated precision. After @code{mpf_set_prec_raw} it reflects the
@var{prec} value set.

@code{mpf_set_prec_raw} is an efficient way to use an @code{mpf_t} variable at
different precisions during a calculation, perhaps to gradually increase
precision in an iteration, or just to use various different precisions for
different purposes during a calculation.
@end deftypefun
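
For an iteration such as Newton-Raphson, where the number of correct bits
roughly doubles at each step, the working precisions can be planned in
advance. The helper @code{precision_schedule} below is a hypothetical plain C
sketch, not part of GMP; each value it produces is the kind of @var{prec} an
application might pass to @code{mpf_set_prec_raw} before the corresponding
step, with the variable allocated at the target precision from the start.

```c
#include <stddef.h>

/* Hypothetical helper, not part of GMP: fill out[] with working
   precisions that double from start until target is reached, finishing
   at exactly target.  Requires start >= 1; overflow of the doubling is
   not handled.  Returns the number of entries written (at most max). */
size_t precision_schedule(unsigned long start, unsigned long target,
                          unsigned long *out, size_t max)
{
    size_t n = 0;
    unsigned long prec = start;

    while (prec < target && n < max) {
        out[n++] = prec;
        prec *= 2;
    }
    if (n < max)
        out[n++] = target;      /* last step runs at the full precision */
    return n;
}
```

Starting from 64 bits with a target of 256 bits the schedule is 64, 128, 256.
After the final step the precision is back at the allocated value, which is
the state required before @code{mpf_clear} or @code{mpf_set_prec}.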
4143
@node Assigning Floats, Simultaneous Float Init & Assign, Initializing Floats, Floating-point Functions
4144
@comment node-name, next, previous, up
4145
@section Assignment Functions
4146
@cindex Float assignment functions
4147
@cindex Assignment functions
4149
These functions assign new values to already initialized floats
4150
(@pxref{Initializing Floats}).
4152
@deftypefun void mpf_set (mpf_t @var{rop}, mpf_t @var{op})
4153
@deftypefunx void mpf_set_ui (mpf_t @var{rop}, unsigned long int @var{op})
4154
@deftypefunx void mpf_set_si (mpf_t @var{rop}, signed long int @var{op})
4155
@deftypefunx void mpf_set_d (mpf_t @var{rop}, double @var{op})
4156
@deftypefunx void mpf_set_z (mpf_t @var{rop}, mpz_t @var{op})
4157
@deftypefunx void mpf_set_q (mpf_t @var{rop}, mpq_t @var{op})
4158
Set the value of @var{rop} from @var{op}.
4161
@deftypefun int mpf_set_str (mpf_t @var{rop}, char *@var{str}, int @var{base})
4162
Set the value of @var{rop} from the string in @var{str}. The string is of the
4163
form @samp{M@@N} or, if the base is 10 or less, alternatively @samp{MeN}.
4164
@samp{M} is the mantissa and @samp{N} is the exponent. The mantissa is always
4165
in the specified base. The exponent is either in the specified base or, if
4166
@var{base} is negative, in decimal. The decimal point expected is taken from
4167
the current locale, on systems providing @code{localeconv}.
4169
The argument @var{base} may be in the ranges 2 to 36, or @minus{}36 to
4170
@minus{}2. Negative values are used to specify that the exponent is in
4173
Unlike the corresponding @code{mpz} function, the base will not be determined
4174
from the leading characters of the string if @var{base} is 0. This is so that
4175
numbers like @samp{0.23} are not interpreted as octal.
4177
White space is ignored at the beginning of the string and within the
mantissa, but not in other places, such as after a minus sign or in the
exponent. For example, @nicode{"3 14"} is currently accepted as meaning 314.
[We are considering changing the definition of this function, making it fail
when there is any white-space in the input, since that makes a lot of sense.
Please tell us your opinion about this change.]
4185
This function returns 0 if the entire string is a valid number in base
4186
@var{base}. Otherwise it returns @minus{}1.
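The rules for the @var{base} argument can be summarized in a small sketch. The helper names below are hypothetical and not part of GMP; they only model the ranges and the meaning of a negative base as described above.

```c
#include <assert.h>

/* Hypothetical helpers, not GMP API: they model the documented rules
   for the base argument of mpf_set_str. */

/* Non-zero if BASE lies in 2..36 or in -36..-2. */
int base_is_valid(int base)
{
    return (base >= 2 && base <= 36) || (base >= -36 && base <= -2);
}

/* Base in which the mantissa digits are read. */
int mantissa_base(int base)
{
    assert(base_is_valid(base));
    return base < 0 ? -base : base;
}

/* Base in which the exponent digits are read: decimal when BASE is
   negative, otherwise the same base as the mantissa. */
int exponent_base(int base)
{
    assert(base_is_valid(base));
    return base < 0 ? 10 : base;
}
```

Under this model a call with @var{base} equal to @minus{}16 reads a hexadecimal mantissa and a decimal exponent.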
4189
@deftypefun void mpf_swap (mpf_t @var{rop1}, mpf_t @var{rop2})
4190
Swap @var{rop1} and @var{rop2} efficiently. Both the values and the
4191
precisions of the two variables are swapped.
4195
@node Simultaneous Float Init & Assign, Converting Floats, Assigning Floats, Floating-point Functions
4196
@comment node-name, next, previous, up
4197
@section Combined Initialization and Assignment Functions
4198
@cindex Initialization and assignment functions
4199
@cindex Float init and assign functions
4201
For convenience, GMP provides a parallel series of initialize-and-set functions
4202
which initialize the output and then store the value there. These functions'
4203
names have the form @code{mpf_init_set@dots{}}.
4205
Once the float has been initialized by any of the @code{mpf_init_set@dots{}}
4206
functions, it can be used as the source or destination operand for the ordinary
4207
float functions. Don't use an initialize-and-set function on a variable
4208
already initialized!
4210
@deftypefun void mpf_init_set (mpf_t @var{rop}, mpf_t @var{op})
4211
@deftypefunx void mpf_init_set_ui (mpf_t @var{rop}, unsigned long int @var{op})
4212
@deftypefunx void mpf_init_set_si (mpf_t @var{rop}, signed long int @var{op})
4213
@deftypefunx void mpf_init_set_d (mpf_t @var{rop}, double @var{op})
4214
Initialize @var{rop} and set its value from @var{op}.
4216
The precision of @var{rop} will be taken from the active default precision, as
4217
set by @code{mpf_set_default_prec}.
4220
@deftypefun int mpf_init_set_str (mpf_t @var{rop}, char *@var{str}, int @var{base})
4221
Initialize @var{rop} and set its value from the string in @var{str}. See
4222
@code{mpf_set_str} above for details on the assignment operation.
4224
Note that @var{rop} is initialized even if an error occurs. (I.e., you have to
4225
call @code{mpf_clear} for it.)
4227
The precision of @var{rop} will be taken from the active default precision, as
4228
set by @code{mpf_set_default_prec}.
4232
@node Converting Floats, Float Arithmetic, Simultaneous Float Init & Assign, Floating-point Functions
4233
@comment node-name, next, previous, up
4234
@section Conversion Functions
4235
@cindex Float conversion functions
4236
@cindex Conversion functions
4238
@deftypefun double mpf_get_d (mpf_t @var{op})
4239
Convert @var{op} to a @code{double}.
4242
@deftypefun double mpf_get_d_2exp (signed long int *@var{exp}, mpf_t @var{op})
4243
Find @var{d} and @var{exp} such that @m{@var{d}\times 2^{exp}, @var{d} times 2
4244
raised to @var{exp}}, with @math{0.5@le{}@GMPabs{@var{d}}<1}, is a good
4245
approximation to @var{op}. This is similar to the standard C function @code{frexp}.
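The normalization condition can be illustrated on a plain @code{double}. This is only a sketch with a hypothetical function name, not GMP code, and it assumes a non-zero finite input; scaling by 2 is exact in binary floating point, so the product of the two results reproduces the input.

```c
#include <assert.h>

/* Hypothetical model on a plain double, not GMP code: split a non-zero
   finite VALUE into d and *exp such that VALUE == d * 2^(*exp) and
   0.5 <= |d| < 1. */
double split_2exp(long *exp, double value)
{
    long e = 0;
    double mag;

    assert(value != 0.0);
    assert(value - value == 0.0);      /* rejects infinities and NaN */

    mag = value < 0.0 ? -value : value;
    while (mag >= 1.0) {               /* too large: halve, raise exponent */
        value /= 2.0;
        mag /= 2.0;
        e++;
    }
    while (mag < 0.5) {                /* too small: double, lower exponent */
        value *= 2.0;
        mag *= 2.0;
        e--;
    }
    *exp = e;
    return value;
}
```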
4249
@deftypefun long mpf_get_si (mpf_t @var{op})
4250
@deftypefunx {unsigned long} mpf_get_ui (mpf_t @var{op})
4251
Convert @var{op} to a @code{long} or @code{unsigned long}, truncating any
4252
fraction part. If @var{op} is too big for the return type, the result is undefined.
4255
See also @code{mpf_fits_slong_p} and @code{mpf_fits_ulong_p}
4256
(@pxref{Miscellaneous Float Functions}).
4259
@deftypefun {char *} mpf_get_str (char *@var{str}, mp_exp_t *@var{expptr}, int @var{base}, size_t @var{n_digits}, mpf_t @var{op})
4260
Convert @var{op} to a string of digits in base @var{base}. @var{base} can be
4261
2 to 36. Up to @var{n_digits} digits will be generated. Trailing zeros are
4262
not returned. No more digits than can be accurately represented by @var{op}
4263
are ever generated. If @var{n_digits} is 0 then that accurate maximum number
4264
of digits are generated.
4266
If @var{str} is @code{NULL}, the result string is allocated using the current
4267
allocation function (@pxref{Custom Allocation}). The block will be
4268
@code{strlen(str)+1} bytes, that being exactly enough for the string and null-terminator.
4271
If @var{str} is not @code{NULL}, it should point to a block of
4272
@math{@var{n_digits} + 2} bytes, that being enough for the mantissa, a
4273
possible minus sign, and a null-terminator. When @var{n_digits} is 0 to get
4274
all significant digits, an application won't be able to know the space
4275
required, and @var{str} should be @code{NULL} in that case.
4277
The generated string is a fraction, with an implicit radix point immediately
4278
to the left of the first digit. The applicable exponent is written through
4279
the @var{expptr} pointer. For example, the number 3.1416 would be returned as
4280
string @nicode{"31416"} and exponent 1.
4282
When @var{op} is zero, an empty string is produced and the exponent returned is 0.
4285
A pointer to the result string is returned, being either the allocated block
4286
or the given @var{str}.
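Because the radix point is implicit, an application usually has to place it itself. The following is a minimal sketch of one way to do that; @code{place_radix_point} and @code{put_char} are hypothetical helpers, not part of GMP, and the sketch always writes @samp{.} rather than consulting the locale.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append one character, always keeping room for the terminator. */
static int put_char(char *out, size_t outsize, size_t *pos, char c)
{
    if (*pos + 1 >= outsize)
        return -1;
    out[(*pos)++] = c;
    return 0;
}

/* Hypothetical helper, not part of GMP.  DIGITS is a digit string in the
   form returned by mpf_get_str (optional leading '-', empty for zero) and
   EXP is the value stored through expptr.  Writes a positional string
   such as "3.1416" to OUT.  Returns 0 on success, -1 if OUT is too small. */
int place_radix_point(char *out, size_t outsize, const char *digits, long exp)
{
    size_t pos = 0, n, i;
    int err = 0;

    assert(outsize >= 1);
    if (*digits == '-') {
        err |= put_char(out, outsize, &pos, '-');
        digits++;
    }
    n = strlen(digits);
    if (n == 0) {                          /* empty string means zero */
        err |= put_char(out, outsize, &pos, '0');
    } else if (exp <= 0) {                 /* 0.000ddd */
        long z;
        err |= put_char(out, outsize, &pos, '0');
        err |= put_char(out, outsize, &pos, '.');
        for (z = exp; z < 0 && !err; z++)
            err |= put_char(out, outsize, &pos, '0');
        for (i = 0; i < n && !err; i++)
            err |= put_char(out, outsize, &pos, digits[i]);
    } else {                               /* ddd.ddd or ddd000 */
        for (i = 0; (i < n || i < (size_t) exp) && !err; i++) {
            if (i == (size_t) exp)
                err |= put_char(out, outsize, &pos, '.');
            err |= put_char(out, outsize, &pos, i < n ? digits[i] : '0');
        }
    }
    out[pos] = '\0';
    return err ? -1 : 0;
}
```

With the digit string @nicode{"31416"} and exponent 1 from the example above, the helper produces @nicode{"3.1416"}.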
4290
@node Float Arithmetic, Float Comparison, Converting Floats, Floating-point Functions
4291
@comment node-name, next, previous, up
4292
@section Arithmetic Functions
4293
@cindex Float arithmetic functions
4294
@cindex Arithmetic functions
4296
@deftypefun void mpf_add (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
4297
@deftypefunx void mpf_add_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
4298
Set @var{rop} to @math{@var{op1} + @var{op2}}.
4301
@deftypefun void mpf_sub (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
4302
@deftypefunx void mpf_ui_sub (mpf_t @var{rop}, unsigned long int @var{op1}, mpf_t @var{op2})
4303
@deftypefunx void mpf_sub_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
4304
Set @var{rop} to @var{op1} @minus{} @var{op2}.
4307
@deftypefun void mpf_mul (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
4308
@deftypefunx void mpf_mul_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
4309
Set @var{rop} to @math{@var{op1} @GMPtimes{} @var{op2}}.
4312
Division is undefined if the divisor is zero, and passing a zero divisor to the
4313
divide functions will make these functions intentionally divide by zero. This
4314
lets the user handle arithmetic exceptions in these functions in the same
4315
manner as other arithmetic exceptions.
4317
@deftypefun void mpf_div (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
4318
@deftypefunx void mpf_ui_div (mpf_t @var{rop}, unsigned long int @var{op1}, mpf_t @var{op2})
4319
@deftypefunx void mpf_div_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
4320
@cindex Division functions
4321
Set @var{rop} to @var{op1}/@var{op2}.
4324
@deftypefun void mpf_sqrt (mpf_t @var{rop}, mpf_t @var{op})
4325
@deftypefunx void mpf_sqrt_ui (mpf_t @var{rop}, unsigned long int @var{op})
4326
@cindex Root extraction functions
4327
Set @var{rop} to @m{\sqrt{@var{op}}, the square root of @var{op}}.
4330
@deftypefun void mpf_pow_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
4331
@cindex Exponentiation functions
4332
@cindex Powering functions
4333
Set @var{rop} to @m{@var{op1}^{op2}, @var{op1} raised to the power @var{op2}}.
4336
@deftypefun void mpf_neg (mpf_t @var{rop}, mpf_t @var{op})
4337
Set @var{rop} to @minus{}@var{op}.
4340
@deftypefun void mpf_abs (mpf_t @var{rop}, mpf_t @var{op})
4341
Set @var{rop} to the absolute value of @var{op}.
4344
@deftypefun void mpf_mul_2exp (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
4345
Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to @var{op2}}.
4349
@deftypefun void mpf_div_2exp (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
4350
Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to @var{op2}}.
4354
@node Float Comparison, I/O of Floats, Float Arithmetic, Floating-point Functions
4355
@comment node-name, next, previous, up
4356
@section Comparison Functions
4357
@cindex Float comparison functions
4358
@cindex Comparison functions
4360
@deftypefun int mpf_cmp (mpf_t @var{op1}, mpf_t @var{op2})
4361
@deftypefunx int mpf_cmp_d (mpf_t @var{op1}, double @var{op2})
4362
@deftypefunx int mpf_cmp_ui (mpf_t @var{op1}, unsigned long int @var{op2})
4363
@deftypefunx int mpf_cmp_si (mpf_t @var{op1}, signed long int @var{op2})
4364
Compare @var{op1} and @var{op2}. Return a positive value if @math{@var{op1} >
4365
@var{op2}}, zero if @math{@var{op1} = @var{op2}}, and a negative value if
4366
@math{@var{op1} < @var{op2}}.
4369
@deftypefun int mpf_eq (mpf_t @var{op1}, mpf_t @var{op2}, unsigned long int @var{op3})
4370
Return non-zero if the first @var{op3} bits of @var{op1} and @var{op2} are
4371
equal, zero otherwise. I.e., test if @var{op1} and @var{op2} are approximately equal.
4374
Caution: Currently only whole limbs are compared, and only in an exact
4375
fashion. In the future values like 1000 and 0111 may be considered the same
4376
to 3 bits (on the basis that their difference is that small).
4379
@deftypefun void mpf_reldiff (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
4380
Compute the relative difference between @var{op1} and @var{op2} and store the
4381
result in @var{rop}. This is @math{@GMPabs{@var{op1}-@var{op2}}/@var{op1}}.
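The formula can be modelled on ordinary doubles. This sketch uses a hypothetical function name and is not GMP code; it only evaluates the expression as documented and assumes @var{op1} is non-zero. Note that the divisor is @var{op1} itself, so the result takes the sign of @var{op1}.

```c
#include <assert.h>

/* Model of the documented formula on doubles (hypothetical name, not
   GMP code): |op1 - op2| / op1. */
double reldiff_model(double op1, double op2)
{
    double diff = op1 - op2;

    assert(op1 != 0.0);            /* the formula divides by op1 */
    if (diff < 0.0)
        diff = -diff;              /* absolute value of the difference */
    return diff / op1;
}
```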
4384
@deftypefn Macro int mpf_sgn (mpf_t @var{op})
4386
@cindex Float sign tests
4387
Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
4388
@math{-1} if @math{@var{op} < 0}.
4390
This function is actually implemented as a macro. It evaluates its argument multiple times.
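Why repeated evaluation matters can be seen with any macro that mentions its argument more than once. The macro below is a hypothetical stand-in written for this sketch; it is not GMP's definition of @code{mpf_sgn}.

```c
#include <assert.h>

/* Hypothetical sign macro, NOT GMP's definition of mpf_sgn.  It mentions
   its argument twice, so the argument may be evaluated twice. */
#define SGN_SKETCH(x)  ((x) > 0 ? 1 : ((x) < 0 ? -1 : 0))

static int probe_calls;

/* Returns V and counts how often it has been called. */
static int probe(int v)
{
    probe_calls++;
    return v;
}

/* Number of times the macro evaluated its argument for the value V. */
int evaluations_for(int v)
{
    int sign;

    probe_calls = 0;
    sign = SGN_SKETCH(probe(v));
    assert(sign == (v > 0) - (v < 0));
    return probe_calls;
}
```

An argument with side effects, such as a function call or an increment, should therefore be stored in a variable first and the variable passed to the macro.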
4394
@node I/O of Floats, Miscellaneous Float Functions, Float Comparison, Floating-point Functions
4395
@comment node-name, next, previous, up
4396
@section Input and Output Functions
4397
@cindex Float input and output functions
4398
@cindex Input functions
4399
@cindex Output functions
4400
@cindex I/O functions
4402
Functions that perform input from a stdio stream, and functions that output to
4403
a stdio stream. Passing a @code{NULL} pointer for a @var{stream} argument to
4404
any of these functions will make them read from @code{stdin} and write to
4405
@code{stdout}, respectively.
4407
When using any of these functions, it is a good idea to include @file{stdio.h}
4408
before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
4409
for these functions.
4411
@deftypefun size_t mpf_out_str (FILE *@var{stream}, int @var{base}, size_t @var{n_digits}, mpf_t @var{op})
4412
Print @var{op} to @var{stream}, as a string of digits. Return the number of
4413
bytes written, or if an error occurred, return 0.
4415
The mantissa is prefixed with a @samp{0.} and is in the given @var{base},
4416
which may vary from 2 to 36. An exponent is then printed, separated by an
4417
@samp{e}, or if @var{base} is greater than 10 then by an @samp{@@}. The
4418
exponent is always in decimal. The decimal point follows the current locale,
4419
on systems providing @code{localeconv}.
4421
Up to @var{n_digits} digits will be printed from the mantissa, except that no more
4422
digits than are accurately representable by @var{op} will be printed.
4423
@var{n_digits} can be 0 to select that accurate maximum.
4426
@deftypefun size_t mpf_inp_str (mpf_t @var{rop}, FILE *@var{stream}, int @var{base})
4427
Read a string in base @var{base} from @var{stream}, and put the read float in
4428
@var{rop}. The string is of the form @samp{M@@N} or, if the base is 10 or
4429
less, alternatively @samp{MeN}. @samp{M} is the mantissa and @samp{N} is the
4430
exponent. The mantissa is always in the specified base. The exponent is
4431
either in the specified base or, if @var{base} is negative, in decimal. The
4432
decimal point expected is taken from the current locale, on systems providing @code{localeconv}.
4435
The argument @var{base} may be in the ranges 2 to 36, or @minus{}36 to
4436
@minus{}2. Negative values are used to specify that the exponent is in decimal.
4439
Unlike the corresponding @code{mpz} function, the base will not be determined
4440
from the leading characters of the string if @var{base} is 0. This is so that
4441
numbers like @samp{0.23} are not interpreted as octal.
4443
Return the number of bytes read, or if an error occurred, return 0.
4446
@c @deftypefun void mpf_out_raw (FILE *@var{stream}, mpf_t @var{float})
4447
@c Output @var{float} on stdio stream @var{stream}, in raw binary
4448
@c format. The float is written in a portable format, with 4 bytes of
4449
@c size information, and that many bytes of limbs. Both the size and the
4450
@c limbs are written in decreasing significance order.
4453
@c @deftypefun void mpf_inp_raw (mpf_t @var{float}, FILE *@var{stream})
4454
@c Input from stdio stream @var{stream} in the format written by
4455
@c @code{mpf_out_raw}, and put the result in @var{float}.
4459
@node Miscellaneous Float Functions, , I/O of Floats, Floating-point Functions
4460
@comment node-name, next, previous, up
4461
@section Miscellaneous Functions
4462
@cindex Miscellaneous float functions
4463
@cindex Float miscellaneous functions
4465
@deftypefun void mpf_ceil (mpf_t @var{rop}, mpf_t @var{op})
4466
@deftypefunx void mpf_floor (mpf_t @var{rop}, mpf_t @var{op})
4467
@deftypefunx void mpf_trunc (mpf_t @var{rop}, mpf_t @var{op})
4468
Set @var{rop} to @var{op} rounded to an integer. @code{mpf_ceil} rounds to the
4469
next higher integer, @code{mpf_floor} to the next lower, and @code{mpf_trunc}
4470
to the integer towards zero.
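The three rounding directions differ only for non-integers, and @code{mpf_floor} and @code{mpf_trunc} differ only for negative values. A sketch on doubles, with hypothetical names and not GMP code, assuming the value is small enough to convert to @code{long} exactly:

```c
#include <assert.h>

/* Hypothetical models on doubles, not GMP code.  They assume |x| is
   small enough for the conversion to long to be exact. */

/* Round towards zero: conversion to an integer type drops the fraction. */
double trunc_model(double x)
{
    return (double) (long) x;
}

/* Round down: step one below the truncated value if truncation went up. */
double floor_model(double x)
{
    double t = trunc_model(x);
    return t > x ? t - 1.0 : t;
}

/* Round up: step one above the truncated value if truncation went down. */
double ceil_model(double x)
{
    double t = trunc_model(x);
    return t < x ? t + 1.0 : t;
}
```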
4473
@deftypefun int mpf_integer_p (mpf_t @var{op})
4474
Return non-zero if @var{op} is an integer.
4477
@deftypefun int mpf_fits_ulong_p (mpf_t @var{op})
4478
@deftypefunx int mpf_fits_slong_p (mpf_t @var{op})
4479
@deftypefunx int mpf_fits_uint_p (mpf_t @var{op})
4480
@deftypefunx int mpf_fits_sint_p (mpf_t @var{op})
4481
@deftypefunx int mpf_fits_ushort_p (mpf_t @var{op})
4482
@deftypefunx int mpf_fits_sshort_p (mpf_t @var{op})
4483
Return non-zero if @var{op} would fit in the respective C data type, when
4484
truncated to an integer.
4487
@deftypefun void mpf_urandomb (mpf_t @var{rop}, gmp_randstate_t @var{state}, unsigned long int @var{nbits})
4488
Generate a uniformly distributed random float in @var{rop}, such that @math{0
4489
@le{} @var{rop} < 1}, with @var{nbits} significant bits in the mantissa.
4491
The variable @var{state} must be initialized by calling one of the
4492
@code{gmp_randinit} functions (@pxref{Random State Initialization}) before
4493
invoking this function.
4496
@deftypefun void mpf_random2 (mpf_t @var{rop}, mp_size_t @var{max_size}, mp_exp_t @var{exp})
4497
Generate a random float of at most @var{max_size} limbs, with long strings of
4498
zeros and ones in the binary representation. The exponent of the number is in
4499
the interval @minus{}@var{exp} to @var{exp} (in limbs). This function is
4500
useful for testing functions and algorithms, since this kind of random
number has proven to be more likely to trigger corner-case bugs. Negative
4502
random numbers are generated when @var{max_size} is negative.
4505
@c @deftypefun size_t mpf_size (mpf_t @var{op})
4506
@c Return the size of @var{op} measured in number of limbs. If @var{op} is
4507
@c zero, the returned value will be zero. (@xref{Nomenclature}, for an
4508
@c explanation of the concept @dfn{limb}.)
4510
@c @strong{This function is obsolete. It will disappear from future GMP
4515
@node Low-level Functions, Random Number Functions, Floating-point Functions, Top
4516
@comment node-name, next, previous, up
4517
@chapter Low-level Functions
4518
@cindex Low-level functions
4520
This chapter describes low-level GMP functions, used to implement the
4521
high-level GMP functions, but also intended for time-critical user code.
4523
These functions start with the prefix @code{mpn_}.
4525
@c 1. Some of these function clobber input operands.
4528
The @code{mpn} functions are designed to be as fast as possible, @strong{not}
4529
to provide a coherent calling interface. The different functions have somewhat
4530
similar interfaces, but there are variations that make them hard to use. These
4531
functions do as little as possible apart from the real multiple precision
4532
computation, so that no time is spent on things that not all callers need.
4534
A source operand is specified by a pointer to the least significant limb and a
4535
limb count. A destination operand is specified by just a pointer. It is the
4536
responsibility of the caller to ensure that the destination has enough space
4537
for storing the result.
4539
With this way of specifying operands, it is possible to perform computations on
4540
subranges of an argument, and store the result into a subrange of a destination.
4543
A common requirement for all functions is that each source area needs at least
4544
one limb. No size argument may be zero. Unless otherwise stated, in-place
4545
operations are allowed where source and destination are the same, but not where
4546
they only partly overlap.
4548
The @code{mpn} functions are the base for the implementation of the
4549
@code{mpz_}, @code{mpf_}, and @code{mpq_} functions.
4551
This example adds the number beginning at @var{s1p} and the number beginning at
4552
@var{s2p} and writes the sum at @var{destp}. All areas have @var{n} limbs.
4555
@example
cy = mpn_add_n (destp, s1p, s2p, n);
@end example
4559
In the notation used here, a source operand is identified by the pointer to
4560
the least significant limb, and the limb count in braces. For example,
4561
@{@var{s1p}, @var{s1n}@}.
4563
@deftypefun mp_limb_t mpn_add_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
4564
Add @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the @var{n}
4565
least significant limbs of the result to @var{rp}. Return carry, either 0 or 1.
4568
This is the lowest-level function for addition. It is the preferred function
4569
for addition, since it is written in assembly for most CPUs. For addition of
4570
a variable to itself (i.e., @var{s1p} equals @var{s2p}), use @code{mpn_lshift}
4571
with a count of 1 for optimal speed.
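What the function computes can be modelled in portable C. The sketch below uses 32-bit limbs and a hypothetical name; it is not GMP's implementation, which works on @code{mp_limb_t} and is typically written in assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable model with 32-bit limbs (hypothetical name, not GMP's
   implementation).  Limbs are stored least significant first.  Adds
   {s1p, n} and {s2p, n}, writes n limbs to rp and returns the carry. */
uint32_t add_n_model(uint32_t *rp, const uint32_t *s1p,
                     const uint32_t *s2p, size_t n)
{
    uint32_t carry = 0;
    size_t i;

    assert(n >= 1);                          /* sizes may not be zero */
    for (i = 0; i < n; i++) {
        uint64_t sum = (uint64_t) s1p[i] + s2p[i] + carry;
        rp[i] = (uint32_t) sum;              /* low 32 bits of the sum */
        carry = (uint32_t) (sum >> 32);      /* 0 or 1 */
    }
    return carry;
}
```

Each limb is read before the result limb at the same index is written, so the model also works in place, matching the overlap rule stated for the @code{mpn} functions.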
4574
@deftypefun mp_limb_t mpn_add_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
4575
Add @{@var{s1p}, @var{n}@} and @var{s2limb}, and write the @var{n} least
4576
significant limbs of the result to @var{rp}. Return carry, either 0 or 1.
4579
@deftypefun mp_limb_t mpn_add (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
4580
Add @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
4581
@var{s1n} least significant limbs of the result to @var{rp}. Return carry, either 0 or 1.
4584
This function requires that @var{s1n} is greater than or equal to @var{s2n}.
4587
@deftypefun mp_limb_t mpn_sub_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
4588
Subtract @{@var{s2p}, @var{n}@} from @{@var{s1p}, @var{n}@}, and write the
4589
@var{n} least significant limbs of the result to @var{rp}. Return borrow, either 0 or 1.
4592
This is the lowest-level function for subtraction. It is the preferred
4593
function for subtraction, since it is written in assembly for most CPUs.
4596
@deftypefun mp_limb_t mpn_sub_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
4597
Subtract @var{s2limb} from @{@var{s1p}, @var{n}@}, and write the @var{n} least
4598
significant limbs of the result to @var{rp}. Return borrow, either 0 or 1.
4601
@deftypefun mp_limb_t mpn_sub (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
4602
Subtract @{@var{s2p}, @var{s2n}@} from @{@var{s1p}, @var{s1n}@}, and write the
4603
@var{s1n} least significant limbs of the result to @var{rp}. Return borrow, either 0 or 1.
4606
This function requires that @var{s1n} is greater than or equal to @var{s2n}.
4610
@deftypefun void mpn_mul_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
4611
Multiply @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the
4612
2*@var{n}-limb result to @var{rp}.
4614
The destination has to have space for 2*@var{n} limbs, even if the product's
4615
most significant limb is zero. No overlap is permitted between the
4616
destination and either source.
4619
@deftypefun mp_limb_t mpn_mul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
4620
Multiply @{@var{s1p}, @var{n}@} by @var{s2limb}, and write the @var{n} least
4621
significant limbs of the product to @var{rp}. Return the most significant
4622
limb of the product. @{@var{s1p}, @var{n}@} and @{@var{rp}, @var{n}@} are
4623
allowed to overlap provided @math{@var{rp} @le{} @var{s1p}}.
4625
This is a low-level function that is a building block for general
4626
multiplication as well as other operations in GMP. It is written in assembly for most CPUs.
4629
Don't call this function if @var{s2limb} is a power of 2; use @code{mpn_lshift}
4630
with a count equal to the base-2 logarithm of @var{s2limb} instead, for optimal speed.
4633
@deftypefun mp_limb_t mpn_addmul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
4634
Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and add the @var{n} least
4635
significant limbs of the product to @{@var{rp}, @var{n}@} and write the result
4636
to @var{rp}. Return the most significant limb of the product, plus carry-out from the addition.
4639
This is a low-level function that is a building block for general
4640
multiplication as well as other operations in GMP. It is written in assembly for most CPUs.
4644
@deftypefun mp_limb_t mpn_submul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
4645
Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and subtract the @var{n}
4646
least significant limbs of the product from @{@var{rp}, @var{n}@} and write the
4647
result to @var{rp}. Return the most significant limb of the product, plus
4648
borrow-out from the subtraction.
4650
This is a low-level function that is a building block for general
4651
multiplication and division as well as other operations in GMP. It is written
4652
in assembly for most CPUs.
4655
@deftypefun mp_limb_t mpn_mul (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
4656
Multiply @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
4657
result to @var{rp}. Return the most significant limb of the result.
4659
The destination has to have space for @var{s1n} + @var{s2n} limbs, even if the
4660
result might be one limb smaller.
4662
This function requires that @var{s1n} is greater than or equal to
4663
@var{s2n}. The destination must be distinct from both input operands.
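How the single-limb routines serve as building blocks can be sketched with a schoolbook multiplication. All three functions below are portable models with 32-bit limbs and hypothetical names; GMP's own @code{mpn_mul} selects among several algorithms and is not implemented this way in general.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* rp[0..n-1] = s1p[0..n-1] * limb; returns the most significant limb. */
static uint32_t mul_1_model(uint32_t *rp, const uint32_t *s1p,
                            size_t n, uint32_t limb)
{
    uint64_t carry = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        uint64_t t = (uint64_t) s1p[i] * limb + carry;
        rp[i] = (uint32_t) t;
        carry = t >> 32;
    }
    return (uint32_t) carry;
}

/* rp[0..n-1] += s1p[0..n-1] * limb; returns the high limb plus carry-out. */
static uint32_t addmul_1_model(uint32_t *rp, const uint32_t *s1p,
                               size_t n, uint32_t limb)
{
    uint64_t carry = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        /* (2^32-1)^2 + 2*(2^32-1) = 2^64-1, so this cannot overflow. */
        uint64_t t = (uint64_t) s1p[i] * limb + rp[i] + carry;
        rp[i] = (uint32_t) t;
        carry = t >> 32;
    }
    return (uint32_t) carry;
}

/* Schoolbook product of {s1p, s1n} and {s2p, s2n} written to
   {rp, s1n + s2n}; returns the most significant limb of the result.
   rp must not overlap either source. */
uint32_t mul_model(uint32_t *rp, const uint32_t *s1p, size_t s1n,
                   const uint32_t *s2p, size_t s2n)
{
    size_t j;

    assert(s1n >= s2n && s2n >= 1);
    rp[s1n] = mul_1_model(rp, s1p, s1n, s2p[0]);
    for (j = 1; j < s2n; j++)
        rp[s1n + j] = addmul_1_model(rp + j, s1p, s1n, s2p[j]);
    return rp[s1n + s2n - 1];
}
```

The model writes all @var{s1n} + @var{s2n} destination limbs, including a zero high limb when the product is one limb shorter.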
4666
@deftypefun void mpn_tdiv_qr (mp_limb_t *@var{qp}, mp_limb_t *@var{rp}, mp_size_t @var{qxn}, const mp_limb_t *@var{np}, mp_size_t @var{nn}, const mp_limb_t *@var{dp}, mp_size_t @var{dn})
4667
Divide @{@var{np}, @var{nn}@} by @{@var{dp}, @var{dn}@} and put the quotient
4668
at @{@var{qp}, @var{nn}@minus{}@var{dn}+1@} and the remainder at @{@var{rp},
4669
@var{dn}@}. The quotient is rounded towards 0.
4671
No overlap is permitted between arguments. @var{nn} must be greater than or
4672
equal to @var{dn}. The most significant limb of @var{dp} must be non-zero.
4673
The @var{qxn} operand must be zero.
4674
@comment FIXME: Relax overlap requirements!
4677
@deftypefun mp_limb_t mpn_divrem (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n})
4678
[This function is obsolete. Please call @code{mpn_tdiv_qr} instead for best performance.]
4681
Divide @{@var{rs2p}, @var{rs2n}@} by @{@var{s3p}, @var{s3n}@}, and write the
4682
quotient at @var{r1p}, with the exception of the most significant limb, which
4683
is returned. The remainder replaces the dividend at @var{rs2p}; it will be
4684
@var{s3n} limbs long (i.e., as many limbs as the divisor).
4686
In addition to an integer quotient, @var{qxn} fraction limbs are developed, and
4687
stored after the integral limbs. For most usages, @var{qxn} will be zero.
4689
It is required that @var{rs2n} is greater than or equal to @var{s3n}. It is
4690
required that the most significant bit of the divisor is set.
4692
If the quotient is not needed, pass @var{rs2p} + @var{s3n} as @var{r1p}. Aside
4693
from that special case, no overlap between arguments is permitted.
4695
Return the most significant limb of the quotient, either 0 or 1.
4697
The area at @var{r1p} needs to be @var{rs2n} @minus{} @var{s3n} + @var{qxn} limbs large.
4701
@deftypefn Function mp_limb_t mpn_divrem_1 (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, @w{mp_limb_t *@var{s2p}}, mp_size_t @var{s2n}, mp_limb_t @var{s3limb})
4702
@deftypefnx Macro mp_limb_t mpn_divmod_1 (mp_limb_t *@var{r1p}, mp_limb_t *@var{s2p}, @w{mp_size_t @var{s2n}}, @w{mp_limb_t @var{s3limb}})
4703
Divide @{@var{s2p}, @var{s2n}@} by @var{s3limb}, and write the quotient at
4704
@var{r1p}. Return the remainder.
4706
The integer quotient is written to @{@var{r1p}+@var{qxn}, @var{s2n}@} and in
4707
addition @var{qxn} fraction limbs are developed and written to @{@var{r1p},
4708
@var{qxn}@}. Either or both @var{s2n} and @var{qxn} can be zero. For most
4709
usages, @var{qxn} will be zero.
4711
@code{mpn_divmod_1} exists for upward source compatibility and is simply a
4712
macro calling @code{mpn_divrem_1} with a @var{qxn} of 0.
4714
The areas at @var{r1p} and @var{s2p} have to be identical or completely
4715
separate, not partially overlapping.
4718
@deftypefun mp_limb_t mpn_divmod (mp_limb_t *@var{r1p}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n})
4719
[This function is obsolete. Please call @code{mpn_tdiv_qr} instead for best performance.]
4723
@deftypefn Macro mp_limb_t mpn_divexact_by3 (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}})
4724
@deftypefnx Function mp_limb_t mpn_divexact_by3c (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}}, mp_limb_t @var{carry})
4725
Divide @{@var{sp}, @var{n}@} by 3, expecting it to divide exactly, and writing
4726
the result to @{@var{rp}, @var{n}@}. If 3 divides exactly, the return value is
4727
zero and the result is the quotient. If not, the return value is non-zero and
4728
the result won't be anything useful.
4730
@code{mpn_divexact_by3c} takes an initial carry parameter, which can be the
4731
return value from a previous call, so a large calculation can be done piece by
4732
piece from low to high. @code{mpn_divexact_by3} is simply a macro calling
4733
@code{mpn_divexact_by3c} with a 0 carry parameter.
4735
These routines use a multiply-by-inverse and will be faster than
4736
@code{mpn_divrem_1} on CPUs with fast multiplication but slow division.
4738
The source @math{a}, result @math{q}, size @math{n}, initial carry @math{i},
4739
and return value @math{c} satisfy @m{cb^n+a-i=3q, c*b^n + a-i = 3*q}, where
4740
@m{b=2\GMPraise{@code{mp\_bits\_per\_limb}}, b=2^mp_bits_per_limb}. The
4741
return @math{c} is always 0, 1 or 2, and the initial carry @math{i} must also
4742
be 0, 1 or 2 (these are both borrows really). When @math{c=0} clearly
4743
@math{q=(a-i)/3}. When @m{c \neq 0, c!=0}, the remainder @math{(a-i) @bmod{}
4744
3} is given by @math{3-c}, because @math{b @equiv{} 1 @bmod{} 3} (when
4745
@code{mp_bits_per_limb} is even, which is always so currently).
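The multiply-by-inverse technique and the carry relation can be sketched for 32-bit limbs, so that @math{b = 2^32}. The function name is hypothetical and the code is a model, not GMP's source; @code{0xAAAAAAAB} is the inverse of 3 modulo @math{2^32}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model with 32-bit limbs (b = 2^32); hypothetical name, not GMP's code.
   Writes q to {rp, n} and returns c such that c*b^n + a - i = 3*q, where
   a = {sp, n} and i = carry. */
uint32_t divexact_by3c_model(uint32_t *rp, const uint32_t *sp,
                             size_t n, uint32_t carry)
{
    size_t i;

    assert(n >= 1 && carry <= 2);
    for (i = 0; i < n; i++) {
        uint32_t s = sp[i];
        uint32_t l = (uint32_t) (s - carry);          /* wraps modulo b */
        uint32_t borrow = l > s;                      /* 1 if it wrapped */
        uint32_t q = (uint32_t) (l * 0xAAAAAAABu);    /* 3*q == l (mod b) */

        rp[i] = q;
        /* 3*q = l + k*b with k = floor(3*q / b); k is found by comparing
           q against ceil(b/3) and ceil(2*b/3). */
        carry = borrow + (q >= 0x55555556u) + (q >= 0xAAAAAAABu);
    }
    return carry;
}
```

Dividing 10 with a zero initial carry returns 2 in this model, and @math{3-2 = 1} is indeed 10 mod 3, as the relation above predicts.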
4748
@deftypefun mp_limb_t mpn_mod_1 (mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t @var{s2limb})
4749
Divide @{@var{s1p}, @var{s1n}@} by @var{s2limb}, and return the remainder.
4750
@var{s1n} can be zero.
4753
@deftypefun mp_limb_t mpn_bdivmod (mp_limb_t *@var{rp}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n}, unsigned long int @var{d})
4754
This function puts the low
4755
@math{@GMPfloor{@var{d}/@nicode{mp\_bits\_per\_limb}}} limbs of @var{q} =
4756
@{@var{s1p}, @var{s1n}@}/@{@var{s2p}, @var{s2n}@} mod @m{2^d,2^@var{d}} at
4757
@var{rp}, and returns the high @var{d} mod @code{mp_bits_per_limb} bits of @var{q}.
4760
@{@var{s1p}, @var{s1n}@} - @var{q} * @{@var{s2p}, @var{s2n}@} mod @m{2
4761
\GMPraise{@var{s1n}*@code{mp\_bits\_per\_limb}},
4762
2^(@var{s1n}*@nicode{mp\_bits\_per\_limb})} is placed at @var{s1p}. Since the
4763
low @math{@GMPfloor{@var{d}/@nicode{mp\_bits\_per\_limb}}} limbs of this
4764
difference are zero, it is possible to overwrite the low limbs at @var{s1p}
4765
with this difference, provided @math{@var{rp} @le{} @var{s1p}}.
4767
This function requires that @math{@var{s1n} * @nicode{mp\_bits\_per\_limb}
4768
@ge{} @var{d}}, and that @{@var{s2p}, @var{s2n}@} is odd.
4770
@strong{This interface is preliminary. It might change incompatibly in future revisions.}
4774
@deftypefun mp_limb_t mpn_lshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count})
4775
Shift @{@var{sp}, @var{n}@} left by @var{count} bits, and write the result to
4776
@{@var{rp}, @var{n}@}. The bits shifted out at the left are returned in the
4777
least significant @var{count} bits of the return value (the rest of the return value is zero).
4780
@var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1. The
4781
regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided
4782
@math{@var{rp} @ge{} @var{sp}}.
4784
This function is written in assembly for most CPUs.
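A portable model shows why the overlap rule is @math{@var{rp} @ge{} @var{sp}}: a left shift can proceed from the most significant limb downwards, so a limb is never overwritten before it has been read. The sketch uses 32-bit limbs and a hypothetical name; it is not GMP's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable model with 32-bit limbs (hypothetical name, not GMP's code).
   Shifts {sp, n} left by count bits into {rp, n} and returns the bits
   shifted out at the top, in the low count bits of the return value. */
uint32_t lshift_model(uint32_t *rp, const uint32_t *sp,
                      size_t n, unsigned int count)
{
    uint32_t out;
    size_t i;

    assert(n >= 1 && count >= 1 && count <= 31);
    out = sp[n - 1] >> (32 - count);
    for (i = n - 1; i > 0; i--)          /* high to low */
        rp[i] = (uint32_t) (sp[i] << count) | (sp[i - 1] >> (32 - count));
    rp[0] = (uint32_t) (sp[0] << count);
    return out;
}
```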
4787
@deftypefun mp_limb_t mpn_rshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count})
4788
Shift @{@var{sp}, @var{n}@} right by @var{count} bits, and write the result to
4789
@{@var{rp}, @var{n}@}. The bits shifted out at the right are returned in the
4790
most significant @var{count} bits of the return value (the rest of the return value is zero).
4793
@var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1. The
4794
regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided
4795
@math{@var{rp} @le{} @var{sp}}.
4797
This function is written in assembly for most CPUs.
4800
@deftypefun int mpn_cmp (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
4801
Compare @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@} and return a
4802
positive value if @math{@var{s1} > @var{s2}}, 0 if they are equal, or a
4803
negative value if @math{@var{s1} < @var{s2}}.
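Since limbs are stored least significant first, a comparison has to start at the highest index. A minimal model with 32-bit limbs and a hypothetical name, not GMP's code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable model with 32-bit limbs (hypothetical name, not GMP's code).
   Returns 1, 0 or -1; GMP only promises positive, zero or negative. */
int cmp_model(const uint32_t *s1p, const uint32_t *s2p, size_t n)
{
    assert(n >= 1);
    while (n-- > 0) {                    /* most significant limb first */
        if (s1p[n] != s2p[n])
            return s1p[n] > s2p[n] ? 1 : -1;
    }
    return 0;
}
```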
4806
@deftypefun mp_size_t mpn_gcd (mp_limb_t *@var{rp}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
4807
Set @{@var{rp}, @var{retval}@} to the greatest common divisor of @{@var{s1p},
4808
@var{s1n}@} and @{@var{s2p}, @var{s2n}@}. The result can be up to @var{s2n}
4809
limbs; the return value is the actual number produced. Both source operands are destroyed.
4812
@{@var{s1p}, @var{s1n}@} must have at least as many bits as @{@var{s2p},
4813
@var{s2n}@}. @{@var{s2p}, @var{s2n}@} must be odd. Both operands must have
4814
non-zero most significant limbs. No overlap is permitted between @{@var{s1p},
4815
@var{s1n}@} and @{@var{s2p}, @var{s2n}@}.
4818
@deftypefun mp_limb_t mpn_gcd_1 (const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t @var{s2limb})
4819
Return the greatest common divisor of @{@var{s1p}, @var{s1n}@} and
4820
@var{s2limb}. Both operands must be non-zero.
4823
@deftypefun mp_size_t mpn_gcdext (mp_limb_t *@var{r1p}, mp_limb_t *@var{r2p}, mp_size_t *@var{r2n}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
4824
Calculate the greatest common divisor of @{@var{s1p}, @var{s1n}@} and
4825
@{@var{s2p}, @var{s2n}@}. Store the gcd at @{@var{r1p}, @var{retval}@} and
4826
the first cofactor at @{@var{r2p}, *@var{r2n}@}, with *@var{r2n} negative if
4827
the cofactor is negative. @var{r1p} and @var{r2p} should each have room for
4828
@math{@var{s1n}+1} limbs, but the return value and value stored through
4829
@var{r2n} indicate the actual number produced.
4831
@math{@{@var{s1p}, @var{s1n}@} @ge{} @{@var{s2p}, @var{s2n}@}} is required,
4832
and both must be non-zero. The regions @{@var{s1p}, @math{@var{s1n}+1}@} and
4833
@{@var{s2p}, @math{@var{s2n}+1}@} are destroyed (i.e. the operands plus an
4834
extra limb past the end of each).
4836
The cofactor @var{r2} will satisfy @m{r_2 s_1 + k s_2 = r_1, @var{r2}*@var{s1}
4837
+ @var{k}*@var{s2} = @var{r1}}. The second cofactor @var{k} is not calculated
4838
but can easily be obtained from @m{(r_1 - r_2 s_1) / s_2, (@var{r1} -
4839
@var{r2}*@var{s1}) / @var{s2}}.
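
The cofactor relation can be illustrated with ordinary machine integers. The following is only a sketch on single 64-bit words, not the @code{mpn_gcdext} implementation; the names @code{gcdext_sketch} and @code{second_cofactor} are hypothetical, and the operands are assumed small enough that no intermediate product overflows.

```c
#include <assert.h>
#include <stdint.h>

/* Single-word extended Euclid.  Returns r1 = gcd(s1, s2) and stores in
   *r2 a first cofactor such that r2*s1 + k*s2 = r1 for some integer k.
   Requires s1 >= s2 > 0, mirroring the operand conditions above. */
int64_t gcdext_sketch(int64_t s1, int64_t s2, int64_t *r2)
{
    assert(s1 >= s2 && s2 > 0);
    int64_t old_r = s1, r = s2;   /* remainder sequence */
    int64_t old_x = 1, x = 0;     /* coefficients of s1 */
    while (r != 0) {
        int64_t q = old_r / r;
        int64_t t = old_r - q * r;  old_r = r;  r = t;
        t = old_x - q * x;          old_x = x;  x = t;
    }
    *r2 = old_x;
    return old_r;
}

/* Recover the second cofactor k = (r1 - r2*s1) / s2. */
int64_t second_cofactor(int64_t r1, int64_t r2, int64_t s1, int64_t s2)
{
    return (r1 - r2 * s1) / s2;   /* the division is exact */
}
```

As with @code{mpn_gcdext}, the first cofactor may come out negative, and the second is obtained afterwards by one multiplication and one exact division.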
4842
@deftypefun mp_size_t mpn_sqrtrem (mp_limb_t *@var{r1p}, mp_limb_t *@var{r2p}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
4843
Compute the square root of @{@var{sp}, @var{n}@} and put the result at
4844
@{@var{r1p}, @math{@GMPceil{@var{n}/2}}@} and the remainder at @{@var{r2p},
4845
@var{retval}@}. @var{r2p} needs space for @var{n} limbs, but the return value
4846
indicates how many are produced.
4848
The most significant limb of @{@var{sp}, @var{n}@} must be non-zero. The
4849
areas @{@var{r1p}, @math{@GMPceil{@var{n}/2}}@} and @{@var{sp}, @var{n}@} must
4850
be completely separate. The areas @{@var{r2p}, @var{n}@} and @{@var{sp},
4851
@var{n}@} must be either identical or completely separate.
4853
If the remainder is not wanted then @var{r2p} can be @code{NULL}, and in this
4854
case the return value is zero or non-zero according to whether the remainder
4855
would have been zero or non-zero.
4857
A return value of zero indicates a perfect square. See also
4858
@code{mpz_perfect_square_p}.
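
The root and remainder relationship can be sketched on a single 64-bit word. This is an illustrative bit-by-bit method under that assumption, not the algorithm @code{mpn_sqrtrem} uses; @code{sqrtrem_sketch} is a hypothetical name.

```c
#include <stdint.h>

/* Return floor(sqrt(s)) and store the remainder s - root*root in *rem.
   A zero remainder means s is a perfect square. */
uint64_t sqrtrem_sketch(uint64_t s, uint64_t *rem)
{
    uint64_t root = 0;
    uint64_t n = s;
    uint64_t bit = (uint64_t) 1 << 62;   /* highest power of 4 in a word */

    while (bit > n)                      /* skip powers of 4 above s */
        bit >>= 2;
    while (bit != 0) {
        if (n >= root + bit) {
            n -= root + bit;
            root = (root >> 1) + bit;
        } else {
            root >>= 1;
        }
        bit >>= 2;
    }
    *rem = n;
    return root;
}
```

For example, 17 gives root 4 with remainder 1, while 16 gives root 4 with remainder 0.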
4861
@deftypefun mp_size_t mpn_get_str (unsigned char *@var{str}, int @var{base}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n})
4862
Convert @{@var{s1p}, @var{s1n}@} to a raw unsigned char array at @var{str} in
4863
base @var{base}, and return the number of characters produced. There may be
4864
leading zeros in the string. The string is not in ASCII; to convert it to
4865
printable format, add the ASCII codes for @samp{0} or @samp{A}, depending on
4866
the base and range. @var{base} can vary from 2 to 256.
4868
The most significant limb of the input @{@var{s1p}, @var{s1n}@} must be
4869
non-zero. The input @{@var{s1p}, @var{s1n}@} is clobbered, except when
4870
@var{base} is a power of 2, in which case it's unchanged.
4872
The area at @var{str} has to have space for the largest possible number
4873
represented by a @var{s1n} long limb array, plus one extra character.
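
The conversion from raw digit values to printable characters might look like the following sketch. The helper name @code{digits_to_ascii} is hypothetical, and it assumes an ASCII character set, a base of at most 36, and upper case letters for digits above 9.

```c
#include <assert.h>
#include <stddef.h>

/* Convert raw digit values (0 to base-1) in place to ASCII characters.
   The caller appends any null terminator itself. */
void digits_to_ascii(unsigned char *str, size_t len, int base)
{
    assert(base >= 2 && base <= 36);
    for (size_t i = 0; i < len; i++) {
        unsigned char d = str[i];
        assert(d < base);
        str[i] = (unsigned char) (d < 10 ? '0' + d : 'A' + (d - 10));
    }
}
```

Such a helper would be applied to the first @var{retval} bytes at @var{str} after the conversion returns.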
4876
@deftypefun mp_size_t mpn_set_str (mp_limb_t *@var{rp}, const unsigned char *@var{str}, size_t @var{strsize}, int @var{base})
4877
Convert bytes @{@var{str},@var{strsize}@} in the given @var{base} to limbs at @{@var{rp}, @var{retval}@}.
4880
@math{@var{str}[0]} is the most significant byte and
4881
@math{@var{str}[@var{strsize}-1]} is the least significant. Each byte should
4882
be a value in the range 0 to @math{@var{base}-1}, not an ASCII character.
4883
@var{base} can vary from 2 to 256.
4885
The return value is the number of limbs written to @var{rp}. If the most
4886
significant input byte is non-zero then the high limb at @var{rp} will be
4887
non-zero, and only that exact number of limbs will be required there.
4889
If the most significant input byte is zero then there may be high zero limbs
4890
written to @var{rp} and included in the return value.
4892
@var{strsize} must be at least 1, and no overlap is permitted between
4893
@{@var{str},@var{strsize}@} and the result at @var{rp}.
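
The byte order and digit encoding expected at @var{str} can be sketched by accumulating the digits into a single machine word. This is only a model of the input format, with a hypothetical name @code{digits_to_word}; it does not check for overflow and is not how @code{mpn_set_str} works internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Accumulate raw digit values, most significant first, into one word.
   Each str[i] is a value in 0..base-1, not an ASCII character. */
uint64_t digits_to_word(const unsigned char *str, size_t strsize, int base)
{
    assert(strsize >= 1 && base >= 2 && base <= 256);
    uint64_t value = 0;
    for (size_t i = 0; i < strsize; i++) {
        assert(str[i] < base);
        value = value * (uint64_t) base + str[i];
    }
    return value;
}
```

The bytes 1, 2, 3 in base 10 therefore denote 123, and leading zero bytes do not change the value.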
4896
@deftypefun {unsigned long int} mpn_scan0 (const mp_limb_t *@var{s1p}, unsigned long int @var{bit})
4897
Scan @var{s1p} from bit position @var{bit} for the next clear bit.
4899
It is required that there be a clear bit within the area at @var{s1p} at or
4900
beyond bit position @var{bit}, so that the function has something to return.
4903
@deftypefun {unsigned long int} mpn_scan1 (const mp_limb_t *@var{s1p}, unsigned long int @var{bit})
4904
Scan @var{s1p} from bit position @var{bit} for the next set bit.
4906
It is required that there be a set bit within the area at @var{s1p} at or
4907
beyond bit position @var{bit}, so that the function has something to return.
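
The scanning behaviour, and the reason a matching bit must exist, can be seen in a simple sketch. It assumes 64-bit limbs with no nails and uses hypothetical names; the loops have no end-of-array test, just as the functions above are given no size.

```c
#include <stdint.h>

#define SK_WORD_BITS 64

/* Index of the first set bit at or after position bit. */
unsigned long scan1_sketch(const uint64_t *s1p, unsigned long bit)
{
    while (((s1p[bit / SK_WORD_BITS] >> (bit % SK_WORD_BITS)) & 1) == 0)
        bit++;
    return bit;
}

/* Index of the first clear bit at or after position bit. */
unsigned long scan0_sketch(const uint64_t *s1p, unsigned long bit)
{
    while (((s1p[bit / SK_WORD_BITS] >> (bit % SK_WORD_BITS)) & 1) != 0)
        bit++;
    return bit;
}
```

If no matching bit exists, loops like these run past the end of the array, which is why the requirement is placed on the caller.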
4910
@deftypefun void mpn_random (mp_limb_t *@var{r1p}, mp_size_t @var{r1n})
4911
@deftypefunx void mpn_random2 (mp_limb_t *@var{r1p}, mp_size_t @var{r1n})
4912
Generate a random number of length @var{r1n} and store it at @var{r1p}. The
4913
most significant limb is always non-zero. @code{mpn_random} generates
4914
uniformly distributed limb data, @code{mpn_random2} generates long strings of
4915
zeros and ones in the binary representation.
4917
@code{mpn_random2} is intended for testing the correctness of the @code{mpn} routines.
4921
@deftypefun {unsigned long int} mpn_popcount (const mp_limb_t *@var{s1p}, mp_size_t @var{n})
4922
Count the number of set bits in @{@var{s1p}, @var{n}@}.
4925
@deftypefun {unsigned long int} mpn_hamdist (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
4926
Compute the hamming distance between @{@var{s1p}, @var{n}@} and @{@var{s2p},
4927
@var{n}@}, which is the number of bit positions where the two operands have
4928
different bit values.
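
The hamming distance is the population count of the exclusive-or of the two operands. A minimal sketch on arrays of 64-bit words, with hypothetical names and no claim to match the library's optimized code, is:

```c
#include <stddef.h>
#include <stdint.h>

/* Number of set bits in {p, n}. */
unsigned long popcount_sketch(const uint64_t *p, size_t n)
{
    unsigned long count = 0;
    for (size_t i = 0; i < n; i++)
        for (uint64_t w = p[i]; w != 0; w &= w - 1)   /* clear lowest set bit */
            count++;
    return count;
}

/* Number of bit positions where {a, n} and {b, n} differ. */
unsigned long hamdist_sketch(const uint64_t *a, const uint64_t *b, size_t n)
{
    unsigned long count = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t diff = a[i] ^ b[i];
        count += popcount_sketch(&diff, 1);
    }
    return count;
}
```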
4931
@deftypefun int mpn_perfect_square_p (const mp_limb_t *@var{s1p}, mp_size_t @var{n})
4932
Return non-zero iff @{@var{s1p}, @var{n}@} is a perfect square.
4940
@strong{Everything in this section is highly experimental and may disappear or
4941
be subject to incompatible changes in a future version of GMP.}
4943
Nails are an experimental feature whereby a few bits are left unused at the
4944
top of each @code{mp_limb_t}. This can significantly improve carry handling on some processors.
4947
All the @code{mpn} functions accepting limb data will expect the nail bits to
4948
be zero on entry, and will return data with the nails similarly all zero.
4949
This applies both to limb vectors and to single limb arguments.
4951
Nails can be enabled by configuring with @samp{--enable-nails}. By default
4952
the number of bits will be chosen according to what suits the host processor,
4953
but a particular number can be selected with @samp{--enable-nails=N}.
4955
At the mpn level, a nail build is neither source nor binary compatible with a
4956
non-nail build, strictly speaking. But programs acting on limbs only through
4957
the mpn functions are likely to work equally well with either build, and
4958
judicious use of the definitions below should make any program compatible with
4959
either build, at the source level.
4961
For the higher level routines, meaning @code{mpz} etc, a nail build should be
4962
fully source and binary compatible with a non-nail build.
4964
@defmac GMP_NAIL_BITS
4965
@defmacx GMP_NUMB_BITS
4966
@defmacx GMP_LIMB_BITS
4967
@code{GMP_NAIL_BITS} is the number of nail bits, or 0 when nails are not in
4968
use. @code{GMP_NUMB_BITS} is the number of data bits in a limb.
4969
@code{GMP_LIMB_BITS} is the total number of bits in an @code{mp_limb_t}. In
all cases

@example
GMP_LIMB_BITS == GMP_NAIL_BITS + GMP_NUMB_BITS
@end example
4977
@defmac GMP_NAIL_MASK
4978
@defmacx GMP_NUMB_MASK
4979
Bit masks for the nail and number parts of a limb. @code{GMP_NAIL_MASK} is 0
4980
when nails are not in use.
4982
@code{GMP_NAIL_MASK} is not often needed, since the nail part can be obtained
4983
with @code{x >> GMP_NUMB_BITS}, and that means one less large constant, which
4984
can help various RISC chips.
4987
@defmac GMP_NUMB_MAX
4988
The maximum value that can be stored in the number part of a limb. This is
4989
the same as @code{GMP_NUMB_MASK}, but can be used for clarity when doing
4990
comparisons rather than bit-wise operations.
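
How these quantities relate, and why spare top bits ease carry handling, can be sketched with stand-in definitions. The @code{SK_} macros below are hypothetical and assume a 64-bit limb with 4 nail bits; a real program would use the @code{GMP_} macros from the library header instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: 64-bit limb, 4 nail bits, 60 data bits. */
#define SK_LIMB_BITS 64
#define SK_NAIL_BITS 4
#define SK_NUMB_BITS (SK_LIMB_BITS - SK_NAIL_BITS)
#define SK_NUMB_MASK (~(uint64_t) 0 >> SK_NAIL_BITS)
#define SK_NAIL_MASK (~SK_NUMB_MASK)
#define SK_NUMB_MAX  SK_NUMB_MASK

/* Add two limbs whose nails are zero.  The sum cannot overflow the
   word, so the carry simply collects in the nail bits. */
uint64_t add_with_nails(uint64_t a, uint64_t b, uint64_t *carry)
{
    assert((a & SK_NAIL_MASK) == 0 && (b & SK_NAIL_MASK) == 0);
    uint64_t sum = a + b;
    *carry = sum >> SK_NUMB_BITS;   /* nail part, no large mask needed */
    return sum & SK_NUMB_MASK;      /* result with the nail zeroed again */
}
```

Note that the carry is extracted with a shift by the number of data bits, the same way the nail part is obtained in the text above.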
4993
The term ``nails'' comes from finger or toe nails, which are at the ends of a
4994
limb (arm or leg). ``numb'' is short for number, but is also how the
4995
developers felt after trying for a long time to come up with sensible names for these things.
4998
In the future (the distant future most likely) a non-zero nail might be
4999
permitted, giving non-unique representations for numbers in a limb vector.
5000
This would help vector processors since carries would only ever need to
5001
propagate one or two limbs.
5004
@node Random Number Functions, Formatted Output, Low-level Functions, Top
5005
@chapter Random Number Functions
5006
@cindex Random number functions
5008
Sequences of pseudo-random numbers in GMP are generated using a variable of
5009
type @code{gmp_randstate_t}, which holds an algorithm selection and a current
5010
state. Such a variable must be initialized by a call to one of the
5011
@code{gmp_randinit} functions, and can be seeded with one of the
5012
@code{gmp_randseed} functions.
5014
The functions actually generating random numbers are described in @ref{Integer
5015
Random Numbers}, and @ref{Miscellaneous Float Functions}.
5017
The older style random number functions don't accept a @code{gmp_randstate_t}
5018
parameter but instead share a global variable of that type. They use a
5019
default algorithm and are currently not seeded (though perhaps that will
5020
change in the future). The new functions accepting a @code{gmp_randstate_t}
5021
are recommended for applications that care about randomness.
5024
@menu
* Random State Initialization::
* Random State Seeding::
@end menu
5028
@node Random State Initialization, Random State Seeding, Random Number Functions, Random Number Functions
5029
@section Random State Initialization
5030
@cindex Random number state
5032
@deftypefun void gmp_randinit_default (gmp_randstate_t @var{state})
5033
Initialize @var{state} with a default algorithm. This will be a compromise
5034
between speed and randomness, and is recommended for applications with no
5035
special requirements.
5038
@deftypefun void gmp_randinit_lc_2exp (gmp_randstate_t @var{state}, mpz_t @var{a}, @w{unsigned long @var{c}}, @w{unsigned long @var{m2exp}})
5039
Initialize @var{state} with a linear congruential algorithm @m{X = (@var{a}X +
5040
@var{c}) @bmod 2^{m2exp}, X = (@var{a}*X + @var{c}) mod 2^@var{m2exp}}.
5042
The low bits of @math{X} in this algorithm are not very random. The least
5043
significant bit will have a period no more than 2, and the second bit no more
5044
than 4, etc. For this reason only the high half of each @math{X} is actually used.
5047
When a random number of more than @math{@var{m2exp}/2} bits is to be
5048
generated, multiple iterations of the recurrence are used and the results concatenated.
5052
@deftypefun int gmp_randinit_lc_2exp_size (gmp_randstate_t @var{state}, unsigned long @var{size})
5053
Initialize @var{state} for a linear congruential algorithm as per
5054
@code{gmp_randinit_lc_2exp}. @var{a}, @var{c} and @var{m2exp} are selected
5055
from a table, chosen so that @var{size} bits (or more) of each @math{X} will
5056
be used, i.e. @math{@var{m2exp}/2 @ge{} @var{size}}.
5058
If successful the return value is non-zero. If @var{size} is bigger than the
5059
table data provides then the return value is zero. The maximum @var{size}
5060
currently supported is 128.
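
The linear congruential recurrence, and the reason only the high half of each @math{X} is taken, can be sketched with 64-bit arithmetic, which corresponds to @var{m2exp} of 64 and @var{size} of 32. The multiplier and increment below are illustrative constants chosen for this sketch, not values from the library's table, and the function names are hypothetical.

```c
#include <stdint.h>

/* Illustrative constants, both odd; not GMP's table values. */
#define LC_A UINT64_C(6364136223846793005)
#define LC_C UINT64_C(1442695040888963407)

/* One step of X = (a*X + c) mod 2^64.  Unsigned wraparound performs
   the reduction modulo 2^64. */
uint64_t lc_step(uint64_t x)
{
    return LC_A * x + LC_C;
}

/* Advance the state and return only its high 32 bits. */
uint32_t lc_next_high(uint64_t *state)
{
    *state = lc_step(*state);
    return (uint32_t) (*state >> 32);
}
```

With an odd multiplier and an odd increment the lowest bit of the state simply alternates, so it repeats with period 2, and the low two bits repeat with period 4. Discarding the low half hides these short cycles.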
5063
@c Although gmp_randinit, gmp_errno and related constants are obsolete, we
5064
@c still put @findex entries for them, since they're still documented and
5065
@c someone might be looking them up when perusing old application code.
5067
@deftypefun void gmp_randinit (gmp_randstate_t @var{state}, @w{gmp_randalg_t @var{alg}}, ...)
5068
@strong{This function is obsolete.}
5070
@findex GMP_RAND_ALG_LC
5071
@findex GMP_RAND_ALG_DEFAULT
5072
Initialize @var{state} with an algorithm selected by @var{alg}. The only
5073
choice is @code{GMP_RAND_ALG_LC}, which is @code{gmp_randinit_lc_2exp_size}
5074
described above. A third parameter of type @code{unsigned long} is required;
5075
this is the @var{size} for that function. @code{GMP_RAND_ALG_DEFAULT} or 0
5076
are the same as @code{GMP_RAND_ALG_LC}.
5078
@c For reference, this is the only place gmp_errno has been documented, and
5079
@c due to being non thread safe we won't be adding to its uses.
5081
@findex GMP_ERROR_UNSUPPORTED_ARGUMENT
5082
@findex GMP_ERROR_INVALID_ARGUMENT
5083
@code{gmp_randinit} sets bits in the global variable @code{gmp_errno} to
5084
indicate an error: @code{GMP_ERROR_UNSUPPORTED_ARGUMENT} if @var{alg} is
5085
unsupported, or @code{GMP_ERROR_INVALID_ARGUMENT} if the @var{size} parameter
5086
is too big. It may be noted this error reporting is not thread safe (a good
5087
reason to use @code{gmp_randinit_lc_2exp_size} instead).
5090
@c Not yet in the library.
5092
@deftypefun void gmp_randinit_lc (gmp_randstate_t @var{state}, mpz_t @var{a}, unsigned long int @var{c}, mpz_t @var{m})
5093
Initialize @var{state} for a linear congruential scheme @m{X = (@var{a}X +
5094
@var{c}) @bmod @var{m}, X = (@var{a}*X + @var{c}) mod @var{m}}.
5098
@deftypefun void gmp_randclear (gmp_randstate_t @var{state})
5099
Free all memory occupied by @var{state}.
5103
@node Random State Seeding, , Random State Initialization, Random Number Functions
5104
@section Random State Seeding
5105
@cindex Random number seeding
5107
@deftypefun void gmp_randseed (gmp_randstate_t @var{state}, mpz_t @var{seed})
5108
@deftypefunx void gmp_randseed_ui (gmp_randstate_t @var{state}, @w{unsigned long int @var{seed}})
5109
Set an initial seed value into @var{state}.
5111
The size of a seed determines how many different sequences of random numbers
it's possible to generate. The ``quality'' of the seed is the randomness
5113
of a given seed compared to the previous seed used, and this affects the
5114
randomness of separate number sequences. The method for choosing a seed is
5115
critical if the generated numbers are to be used for important applications,
5116
such as generating cryptographic keys.
5118
Traditionally the system time has been used to seed, but care needs to be
5119
taken with this. If an application seeds often and the resolution of the
5120
system clock is low, then the same sequence of numbers might be repeated.
5121
Also, the system time is quite easy to guess, so if unpredictability is
5122
required then it should definitely not be the only source for the seed value.
5123
On some systems there's a special device @file{/dev/random} which provides
5124
random data better suited for use as a seed.
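
Reading a seed from such a device might be done along the following lines. This is a sketch using only the standard C library; the name @code{read_seed} is hypothetical, and the device path is passed in by the caller since its availability and blocking behaviour vary between systems.

```c
#include <stdio.h>

/* Read sizeof(unsigned long) bytes from path into *seed.
   Returns 0 on success, -1 if the file cannot be opened or the read
   comes up short. */
int read_seed(const char *path, unsigned long *seed)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    setvbuf(fp, NULL, _IONBF, 0);   /* don't drain more than is needed */
    size_t got = fread(seed, 1, sizeof *seed, fp);
    fclose(fp);
    return got == sizeof *seed ? 0 : -1;
}
```

On success the value could be handed to @code{gmp_randseed_ui}. A caller would still need a fallback for systems where the device is absent.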
5128
@node Formatted Output, Formatted Input, Random Number Functions, Top
5129
@chapter Formatted Output
5130
@cindex Formatted output
5131
@cindex @code{printf} formatted output
5134
@menu
* Formatted Output Strings::
* Formatted Output Functions::
* C++ Formatted Output::
@end menu
5139
@node Formatted Output Strings, Formatted Output Functions, Formatted Output, Formatted Output
5140
@section Format Strings
5142
@code{gmp_printf} and friends accept format strings similar to the standard C
5143
@code{printf} (@pxref{Formatted Output,,,libc,The GNU C Library Reference
5144
Manual}). A format specification is of the form
5147
@example
% [flags] [width] [.[precision]] [type] conv
@end example
5150
GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t}
5151
and @code{mpf_t} respectively, and @samp{N} for an @code{mp_limb_t} array.
5152
@samp{Z}, @samp{Q} and @samp{N} behave like integers. @samp{Q} will print a
5153
@samp{/} and a denominator, if needed. @samp{F} behaves like a float. For
example,

@example
mpz_t z;
gmp_printf ("%s is an mpz %Zd\n", "here", z);

mpq_t q;
gmp_printf ("a hex rational: %#40Qx\n", q);

mpf_t f;
int   n;
gmp_printf ("fixed point mpf %.*Ff with %d digits\n", n, f, n);

const mp_limb_t *ptr;
mp_size_t       size;
gmp_printf ("limb array %Nx\n", ptr, size);
@end example
5172
For @samp{N} the limbs are expected least significant first, as per the
5173
@code{mpn} functions (@pxref{Low-level Functions}). A negative size can be
5174
given to print the value as a negative.
5176
All the standard C @code{printf} types behave the same as the C library
5177
@code{printf}, and can be freely intermixed with the GMP extensions. In the
5178
current implementation the standard parts of the format string are simply
5179
handed to @code{printf} and only the GMP extensions handled directly.
5181
The flags accepted are as follows. GLIBC style @nisamp{'} is only for the
5182
standard C types (not the GMP types), and only if the C library supports it.
5185
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
5186
@item @nicode{0} @tab pad with zeros (rather than spaces)
5187
@item @nicode{#} @tab show the base with @samp{0x}, @samp{0X} or @samp{0}
5188
@item @nicode{+} @tab always show a sign
5189
@item (space) @tab show a space or a @samp{-} sign
5190
@item @nicode{'} @tab group digits, GLIBC style (not GMP types)
@end multitable
5194
The optional width and precision can be given as a number within the format
5195
string, or as a @samp{*} to take an extra parameter of type @code{int}, the
5196
same as the standard @code{printf}.
5198
The standard types accepted are as follows. @samp{h} and @samp{l} are
5199
portable, the rest will depend on the compiler (or include files) for the type
5200
and the C library for the output.
5203
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
5204
@item @nicode{h} @tab @nicode{short}
5205
@item @nicode{hh} @tab @nicode{char}
5206
@item @nicode{j} @tab @nicode{intmax_t} or @nicode{uintmax_t}
5207
@item @nicode{l} @tab @nicode{long} or @nicode{wchar_t}
5208
@item @nicode{ll} @tab @nicode{long long}
5209
@item @nicode{L} @tab @nicode{long double}
5210
@item @nicode{q} @tab @nicode{quad_t} or @nicode{u_quad_t}
5211
@item @nicode{t} @tab @nicode{ptrdiff_t}
5212
@item @nicode{z} @tab @nicode{size_t}
@end multitable

The GMP types are
5220
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
5221
@item @nicode{F} @tab @nicode{mpf_t}, float conversions
5222
@item @nicode{Q} @tab @nicode{mpq_t}, integer conversions
5223
@item @nicode{N} @tab @nicode{mp_limb_t} array, integer conversions
5224
@item @nicode{Z} @tab @nicode{mpz_t}, integer conversions
@end multitable
5228
The conversions accepted are as follows. @samp{a} and @samp{A} are always
5229
supported for @code{mpf_t} but depend on the C library for standard C float
5230
types. @samp{m} and @samp{p} depend on the C library.
5233
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
5234
@item @nicode{a} @nicode{A} @tab hex floats, C99 style
5235
@item @nicode{c} @tab character
5236
@item @nicode{d} @tab decimal integer
5237
@item @nicode{e} @nicode{E} @tab scientific format float
5238
@item @nicode{f} @tab fixed point float
5239
@item @nicode{i} @tab same as @nicode{d}
5240
@item @nicode{g} @nicode{G} @tab fixed or scientific float
5241
@item @nicode{m} @tab @code{strerror} string, GLIBC style
5242
@item @nicode{n} @tab store characters written so far
5243
@item @nicode{o} @tab octal integer
5244
@item @nicode{p} @tab pointer
5245
@item @nicode{s} @tab string
5246
@item @nicode{u} @tab unsigned integer
5247
@item @nicode{x} @nicode{X} @tab hex integer
@end multitable
5251
@samp{o}, @samp{x} and @samp{X} are unsigned for the standard C types, but for
5252
types @samp{Z}, @samp{Q} and @samp{N} they are signed. @samp{u} is not
5253
meaningful for @samp{Z}, @samp{Q} and @samp{N}.
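
The difference between signed and unsigned hex output can be sketched with a plain @code{long}. The helper @code{format_signed_hex} is hypothetical and uses only the standard @code{snprintf}; it prints a sign followed by the magnitude, which is the style described here for the GMP types, whereas a standard @samp{%lx} on the same value would show the two's complement bit pattern.

```c
#include <stdio.h>

/* Format v in hex as an optional minus sign followed by the magnitude.
   Returns what snprintf returns. */
int format_signed_hex(char *buf, size_t size, long v)
{
    /* Negate in unsigned arithmetic so the most negative value is safe. */
    unsigned long mag = v < 0 ? 0UL - (unsigned long) v : (unsigned long) v;
    return snprintf(buf, size, "%s%lx", v < 0 ? "-" : "", mag);
}
```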
5255
@samp{n} can be used with any type, even the GMP types.
5257
Other types or conversions that might be accepted by the C library
5258
@code{printf} cannot be used through @code{gmp_printf}, this includes for
5259
instance extensions registered with GLIBC @code{register_printf_function}.
5260
Also currently there's no support for POSIX @samp{$} style numbered arguments
5261
(perhaps this will be added in the future).
5263
The precision field has its usual meaning for integer @samp{Z} and float
@samp{F} types, but is currently undefined for @samp{Q} and should not be used
with that type.
5267
@code{mpf_t} conversions only ever generate as many digits as can be
5268
accurately represented by the operand, the same as @code{mpf_get_str} does.
5269
Zeros will be used if necessary to pad to the requested precision. This
5270
happens even for an @samp{f} conversion of an @code{mpf_t} which is an
5271
integer, for instance @math{2^@W{1024}} in an @code{mpf_t} of 128 bits
5272
precision will only produce about 40 digits, then pad with zeros to the
5273
decimal point. An empty precision field like @samp{%.Fe} or @samp{%.Ff} can
5274
be used to specifically request just the significant digits.
5276
The decimal point character (or string) is taken from the current locale
5277
settings on systems which provide @code{localeconv} (@pxref{Locales,,Locales
5278
and Internationalization,libc,The GNU C Library Reference Manual}). The C
5279
library will normally do the same for standard float output.
5281
The format string is only interpreted as plain @code{char}s, multibyte
5282
characters are not recognised. Perhaps this will change in the future.
5285
@node Formatted Output Functions, C++ Formatted Output, Formatted Output Strings, Formatted Output
5288
Each of the following functions is similar to the corresponding C library
5289
function. The basic @code{printf} forms take a variable argument list. The
5290
@code{vprintf} forms take an argument pointer, see @ref{Variadic
5291
Functions,,,libc,The GNU C Library Reference Manual}, or @samp{man 3 va_start}.
5294
It should be emphasised that if a format string is invalid, or the arguments
5295
don't match what the format specifies, then the behaviour of any of these
5296
functions will be unpredictable. GCC format string checking is not available,
5297
since it doesn't recognise the GMP extensions.
5299
The file based functions @code{gmp_printf} and @code{gmp_fprintf} will return
5300
@math{-1} to indicate a write error. All the functions can return @math{-1}
5301
if the C library @code{printf} variant in use returns @math{-1}, but this
5302
shouldn't normally occur.
5304
@deftypefun int gmp_printf (const char *@var{fmt}, ...)
5305
@deftypefunx int gmp_vprintf (const char *@var{fmt}, va_list @var{ap})
5306
Print to the standard output @code{stdout}. Return the number of characters
5307
written, or @math{-1} if an error occurred.
5310
@deftypefun int gmp_fprintf (FILE *@var{fp}, const char *@var{fmt}, ...)
5311
@deftypefunx int gmp_vfprintf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap})
5312
Print to the stream @var{fp}. Return the number of characters written, or
5313
@math{-1} if an error occurred.
5316
@deftypefun int gmp_sprintf (char *@var{buf}, const char *@var{fmt}, ...)
5317
@deftypefunx int gmp_vsprintf (char *@var{buf}, const char *@var{fmt}, va_list @var{ap})
5318
Form a null-terminated string in @var{buf}. Return the number of characters
5319
written, excluding the terminating null.
5321
No overlap is permitted between the space at @var{buf} and the string @var{fmt}.
5324
These functions are not recommended, since there's no protection against
5325
exceeding the space available at @var{buf}.
5328
@deftypefun int gmp_snprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, ...)
5329
@deftypefunx int gmp_vsnprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, va_list @var{ap})
5330
Form a null-terminated string in @var{buf}. No more than @var{size} bytes
5331
will be written. To get the full output, @var{size} must be enough for the
5332
string and null-terminator.
5334
The return value is the total number of characters which ought to have been
5335
produced, excluding the terminating null. If @math{@var{retval} @ge{}
5336
@var{size}} then the actual output has been truncated to the first
5337
@math{@var{size}-1} characters, and a null appended.
5339
No overlap is permitted between the region @{@var{buf},@var{size}@} and the @var{fmt} string.
5342
Notice the return value is in ISO C99 @code{snprintf} style. This is so even
5343
if the C library @code{vsnprintf} is the older GLIBC 2.0.x style.
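
The C99 return convention lends itself to a measure-then-allocate idiom, sketched here with the standard @code{snprintf} on a @code{long}. The helper names are hypothetical, and whether a null buffer with zero @var{size} is accepted by @code{gmp_snprintf} is an assumption to check before adapting this.

```c
#include <stdio.h>
#include <stdlib.h>

/* True when a C99-style return value shows the output did not fit. */
int output_truncated(int retval, size_t size)
{
    return retval < 0 || (size_t) retval >= size;
}

/* Format value into a freshly allocated string, or return NULL. */
char *format_long_alloc(long value)
{
    int need = snprintf(NULL, 0, "%ld", value);   /* length only */
    if (need < 0)
        return NULL;
    char *buf = malloc((size_t) need + 1);        /* room for the null */
    if (buf == NULL)
        return NULL;
    snprintf(buf, (size_t) need + 1, "%ld", value);
    return buf;
}
```

Formatting 123456 into a 4 byte buffer, for instance, returns 6 and leaves the string @samp{123}, so comparing the return value against the buffer size detects the truncation.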
5346
@deftypefun int gmp_asprintf (char **@var{pp}, const char *@var{fmt}, ...)
5347
@deftypefunx int gmp_vasprintf (char **@var{pp}, const char *@var{fmt}, va_list @var{ap})
5348
Form a null-terminated string in a block of memory obtained from the current
5349
memory allocation function (@pxref{Custom Allocation}). The block will be the
5350
size of the string and null-terminator. Put the address of the block in
5351
*@var{pp}. Return the number of characters produced, excluding the null-terminator.
5354
Unlike the C library @code{asprintf}, @code{gmp_asprintf} doesn't return
5355
@math{-1} if there's no more memory available; it lets the current allocation
5356
function handle that.
5359
@deftypefun int gmp_obstack_printf (struct obstack *@var{ob}, const char *@var{fmt}, ...)
5360
@deftypefunx int gmp_obstack_vprintf (struct obstack *@var{ob}, const char *@var{fmt}, va_list @var{ap})
5361
Append to the current obstack object, in the same style as
5362
@code{obstack_printf}. Return the number of characters written. A
5363
null-terminator is not written.
5365
@var{fmt} cannot be within the current obstack object, since the object might move.
5368
These functions are available only when the C library provides the obstack
5369
feature, which probably means only on GNU systems, see
5370
@ref{Obstacks,,,libc,The GNU C Library Reference Manual}.
5374
@node C++ Formatted Output, , Formatted Output Functions, Formatted Output
5375
@section C++ Formatted Output
5376
@cindex C++ @code{ostream} output
5377
@cindex @code{ostream} output
5379
The following functions are provided in @file{libgmpxx}, which is built if C++
5380
support is enabled (@pxref{Build Options}). Prototypes are available from @code{<gmp.h>}.
5383
@deftypefun ostream& operator<< (ostream& @var{stream}, mpz_t @var{op})
5384
Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
5385
@code{ios::width} is reset to 0 after output, the same as the standard
5386
@code{ostream operator<<} routines do.
5388
In hex or octal, @var{op} is printed as a signed number, the same as for
5389
decimal. This is unlike the standard @code{operator<<} routines on @code{int}
5390
etc, which instead give two's complement.
5393
@deftypefun ostream& operator<< (ostream& @var{stream}, mpq_t @var{op})
5394
Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
5395
@code{ios::width} is reset to 0 after output, the same as the standard
5396
@code{ostream operator<<} routines do.
5398
Output will be a fraction like @samp{5/9}, or if the denominator is 1 then
5399
just a plain integer like @samp{123}.
5401
In hex or octal, @var{op} is printed as a signed value, the same as for
5402
decimal. If @code{ios::showbase} is set then a base indicator is shown on
5403
both the numerator and denominator (if the denominator is required).
5406
@deftypefun ostream& operator<< (ostream& @var{stream}, mpf_t @var{op})
5407
Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
5408
@code{ios::width} is reset to 0 after output, the same as the standard
5409
@code{ostream operator<<} routines do. The decimal point follows the current
5410
locale, on systems providing @code{localeconv}.
5412
Hex and octal are supported, unlike the standard @code{operator<<} on
5413
@code{double}. The mantissa will be in hex or octal, the exponent will be in
5414
decimal. For hex the exponent delimiter is an @samp{@@}. This is as per @code{mpf_out_str}.
5417
@code{ios::showbase} is supported, and will put a base on the mantissa, for
5418
example hex @samp{0x1.8} or @samp{0x0.8}, or octal @samp{01.4} or @samp{00.4}.
5419
This last form is slightly strange, but at least differentiates itself from decimal.
5423
These operators mean that GMP types can be printed in the usual C++ way, for
example,

@example
mpz_t  z;
int    n;
...
cout << "iteration " << n << " value " << z << "\n";
@end example
5433
But note that @code{ostream} output (and @code{istream} input, @pxref{C++
5434
Formatted Input}) is the only overloading available for the GMP types and that
5435
for instance using @code{+} with an @code{mpz_t} will have unpredictable
5436
results. For classes with overloading, see @ref{C++ Class Interface}.
5439
@node Formatted Input, C++ Class Interface, Formatted Output, Top
5440
@chapter Formatted Input
5441
@cindex Formatted input
5442
@cindex @code{scanf} formatted input
5445
@menu
* Formatted Input Strings::
* Formatted Input Functions::
* C++ Formatted Input::
@end menu
5451
@node Formatted Input Strings, Formatted Input Functions, Formatted Input, Formatted Input
5452
@section Formatted Input Strings
5454
@code{gmp_scanf} and friends accept format strings similar to the standard C
5455
@code{scanf} (@pxref{Formatted Input,,,libc,The GNU C Library Reference
5456
Manual}). A format specification is of the form
5459
@example
% [flags] [width] [type] conv
@end example
5462
GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t}
5463
and @code{mpf_t} respectively. @samp{Z} and @samp{Q} behave like integers.
5464
@samp{Q} will read a @samp{/} and a denominator, if present. @samp{F} behaves like a float.
5467
GMP variables don't require an @code{&} when passed to @code{gmp_scanf}, since
5468
they're already ``call-by-reference''. For example,
5471
@example
/* to read say "a(5) = 1234" */
int   n;
mpz_t z;
gmp_scanf ("a(%d) = %Zd\n", &n, z);

mpq_t q1, q2;
gmp_sscanf ("0377 + 0x10/0x11", "%Qi + %Qi", q1, q2);

/* to read say "topleft (1.55,-2.66)" */
mpf_t x, y;
char  buf[32];
gmp_scanf ("%31s (%Ff,%Ff)", buf, x, y);
@end example
5485
All the standard C @code{scanf} types behave the same as in the C library
5486
@code{scanf}, and can be freely intermixed with the GMP extensions. In the
5487
current implementation the standard parts of the format string are simply
5488
handed to @code{scanf} and only the GMP extensions handled directly.
5490
The flags accepted are as follows. @samp{a} and @samp{'} will depend on
5491
support from the C library, and @samp{'} cannot be used with GMP types.
5494
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
5495
@item @nicode{*} @tab read but don't store
5496
@item @nicode{a} @tab allocate a buffer (string conversions)
5497
@item @nicode{'} @tab group digits, GLIBC style (not GMP types)
@end multitable
5501
The standard types accepted are as follows. @samp{h} and @samp{l} are
5502
portable, the rest will depend on the compiler (or include files) for the type
5503
and the C library for the input.
5506
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
5507
@item @nicode{h} @tab @nicode{short}
5508
@item @nicode{hh} @tab @nicode{char}
5509
@item @nicode{j} @tab @nicode{intmax_t} or @nicode{uintmax_t}
5510
@item @nicode{l} @tab @nicode{long int}, @nicode{double} or @nicode{wchar_t}
5511
@item @nicode{ll} @tab @nicode{long long}
5512
@item @nicode{L} @tab @nicode{long double}
5513
@item @nicode{q} @tab @nicode{quad_t} or @nicode{u_quad_t}
5514
@item @nicode{t} @tab @nicode{ptrdiff_t}
5515
@item @nicode{z} @tab @nicode{size_t}
@end multitable

The GMP types are
5523
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
5524
@item @nicode{F} @tab @nicode{mpf_t}, float conversions
5525
@item @nicode{Q} @tab @nicode{mpq_t}, integer conversions
5526
@item @nicode{Z} @tab @nicode{mpz_t}, integer conversions
@end multitable
5530
The conversions accepted are as follows. @samp{p} and @samp{[} will depend on
5531
support from the C library, the rest are standard.
5534
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
5535
@item @nicode{c} @tab character or characters
5536
@item @nicode{d} @tab decimal integer
5537
@item @nicode{e} @nicode{E} @nicode{f} @nicode{g} @nicode{G} @tab float
5539
@item @nicode{i} @tab integer with base indicator
5540
@item @nicode{n} @tab characters read so far
5541
@item @nicode{o} @tab octal integer
5542
@item @nicode{p} @tab pointer
5543
@item @nicode{s} @tab string of non-whitespace characters
5544
@item @nicode{u} @tab decimal integer
5545
@item @nicode{x} @nicode{X} @tab hex integer
5546
@item @nicode{[} @tab string of characters in a set
@end multitable
5550
@samp{e}, @samp{E}, @samp{f}, @samp{g} and @samp{G} are identical, they all
5551
read either fixed point or scientific format, and either @samp{e} or @samp{E}
5552
for the exponent in scientific format.
5554
@samp{x} and @samp{X} are identical; both accept upper and lower case hexadecimal digits.
5557
@samp{o}, @samp{u}, @samp{x} and @samp{X} all read positive or negative
5558
values. For the standard C types these are described as ``unsigned''
5559
conversions, but that merely affects certain overflow handling, negatives are
5560
still allowed (per @code{strtoul}, @pxref{Parsing of Integers,, Parsing of
5561
Integers, libc, The GNU C Library Reference Manual}). For GMP types there are
5562
no overflows, so @samp{d} and @samp{u} are identical.
5564
@samp{Q} type reads the numerator and (optional) denominator as given. If the
5565
value might not be in canonical form then @code{mpq_canonicalize} must be
5566
called before using it in any calculations (@pxref{Rational Number Functions}).
5569
@samp{Qi} will read a base specification separately for the numerator and
5570
denominator. For example @samp{0x10/11} would be 16/11, whereas
5571
@samp{0x10/0x11} would be 16/17.
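
The separate base detection for numerator and denominator can be modelled with the standard @code{strtol} and a base argument of 0. The function @code{parse_rational_i} is a hypothetical sketch on @code{long} values; it makes no overflow checks and does no canonicalization.

```c
#include <stdlib.h>

/* Parse "num" or "num/den".  Each part carries its own base prefix:
   "0x" for hex, a leading "0" for octal, otherwise decimal.
   Returns 0 on success, -1 on malformed input. */
int parse_rational_i(const char *s, long *num, long *den)
{
    char *end;

    *num = strtol(s, &end, 0);
    if (end == s)
        return -1;              /* no digits in the numerator */
    *den = 1;
    if (*end == '/') {
        const char *d = end + 1;
        *den = strtol(d, &end, 0);
        if (end == d)
            return -1;          /* slash without a denominator */
    }
    return 0;
}
```

Because the prefix is examined afresh after the @samp{/}, a hex numerator does not make the denominator hex.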
5573
@samp{n} can be used with any of the types above, even the GMP types.
5574
@samp{*} to suppress assignment is allowed, though the field would then do
nothing at all.
5577
Other conversions or types that might be accepted by the C library
5578
@code{scanf} cannot be used through @code{gmp_scanf}.
5580
Whitespace is read and discarded before a field, except for @samp{c} and
5581
@samp{[} conversions.
5583
For float conversions, the decimal point character (or string) expected is
5584
taken from the current locale settings on systems which provide
5585
@code{localeconv} (@pxref{Locales,,Locales and Internationalization,libc,The
5586
GNU C Library Reference Manual}). The C library will normally do the same for
5587
standard float input.
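
An application can inspect the decimal point string that will be expected. The
sketch below uses only the standard @code{localeconv}; the function name
@code{current_decimal_point} is an assumption made for the example.

```cpp
#include <cassert>
#include <clocale>
#include <string>

// Return the decimal point string of the current locale, as reported
// by localeconv.  A program that never calls setlocale runs in the
// "C" locale, where this is ".".
std::string current_decimal_point ()
{
  return std::localeconv ()->decimal_point;
}
```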
5589
The format string is only interpreted as plain @code{char}s, multibyte
5590
characters are not recognised. Perhaps this will change in the future.
5593
@node Formatted Input Functions, C++ Formatted Input, Formatted Input Strings, Formatted Input
5594
@section Formatted Input Functions
5596
Each of the following functions is similar to the corresponding C library
5597
function. The plain @code{scanf} forms take a variable argument list. The
5598
@code{vscanf} forms take an argument pointer, see @ref{Variadic
5599
Functions,,,libc,The GNU C Library Reference Manual}, or @samp{man 3
va_start}.
5602
It should be emphasised that if a format string is invalid, or the arguments
5603
don't match what the format specifies, then the behaviour of any of these
5604
functions will be unpredictable. GCC format string checking is not available,
5605
since it doesn't recognise the GMP extensions.
5607
No overlap is permitted between the @var{fmt} string and any of the results
produced.
5610
@deftypefun int gmp_scanf (const char *@var{fmt}, ...)
5611
@deftypefunx int gmp_vscanf (const char *@var{fmt}, va_list @var{ap})
5612
Read from the standard input @code{stdin}.
5615
@deftypefun int gmp_fscanf (FILE *@var{fp}, const char *@var{fmt}, ...)
5616
@deftypefunx int gmp_vfscanf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap})
5617
Read from the stream @var{fp}.
5620
@deftypefun int gmp_sscanf (const char *@var{s}, const char *@var{fmt}, ...)
5621
@deftypefunx int gmp_vsscanf (const char *@var{s}, const char *@var{fmt}, va_list @var{ap})
5622
Read from a null-terminated string @var{s}.
5625
The return value from each of these functions is the same as the standard C99
5626
@code{scanf}, namely the number of fields successfully parsed and stored.
5627
@samp{%n} fields and fields read but suppressed by @samp{*} don't count
5628
towards the return value.
5630
If end of file or file error, or end of string, is reached when a match is
5631
required, and when no previous non-suppressed fields have matched, then the
5632
return value is EOF instead of 0. A match is required for a literal character
5633
in the format string or a field other than @samp{%n}. Whitespace in the
5634
format string is only an optional match and won't induce an EOF in this
5635
fashion. Leading whitespace read and discarded for a field doesn't count as a
match.
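
Since the return value rules are those of the standard @code{scanf}, they can
be tried out with plain @code{sscanf} and @code{int} fields. The following is a
sketch under that assumption; @code{count_fields} and @code{scan_one_int} are
names invented for the example.

```cpp
#include <cassert>
#include <cstdio>

// One stored field, one suppressed field and a %n.  Only the stored
// field counts towards the return value.
int count_fields (const char *input)
{
  int a = 0, n = 0;
  return std::sscanf (input, "%d %*d%n", &a, &n);
}

// Return value when a single %d is requested from the given input.
int scan_one_int (const char *input)
{
  int a = 0;
  return std::sscanf (input, "%d", &a);
}
```

@code{count_fields ("12 34")} returns 1. @code{scan_one_int} returns
@code{EOF} for an empty or all-whitespace string, since the end of the string
is reached before anything has matched, but returns 0 for @samp{abc}, where
input is available and simply doesn't match.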
5639
@node C++ Formatted Input, , Formatted Input Functions, Formatted Input
5640
@section C++ Formatted Input
5641
@cindex C++ @code{istream} input
5642
@cindex @code{istream} input
5644
The following functions are provided in @file{libgmpxx}, which is built only
5645
if C++ support is enabled (@pxref{Build Options}). Prototypes are available
5646
from @code{<gmp.h>}.
5648
@deftypefun istream& operator>> (istream& @var{stream}, mpz_t @var{rop})
5649
Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.
5652
@deftypefun istream& operator>> (istream& @var{stream}, mpq_t @var{rop})
5653
Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.
5655
An integer like @samp{123} will be read, or a fraction like @samp{5/9}. If
5656
the fraction is not in canonical form then @code{mpq_canonicalize} must be
5657
called (@pxref{Rational Number Functions}).
5660
@deftypefun istream& operator>> (istream& @var{stream}, mpf_t @var{rop})
5661
Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.
5663
Hex or octal floats are not supported, but might be in the future.
5666
These operators mean that GMP types can be read in the usual C++ way.
5675
But note that @code{istream} input (and @code{ostream} output, @pxref{C++
5676
Formatted Output}) is the only overloading available for the GMP types and
5677
that for instance using @code{+} with an @code{mpz_t} will have unpredictable
5678
results. For classes with overloading, see @ref{C++ Class Interface}.
5682
@node C++ Class Interface, BSD Compatible Functions, Formatted Input, Top
5683
@chapter C++ Class Interface
5684
@cindex C++ Interface
5686
This chapter describes the C++ class based interface to GMP.
5688
All GMP C language types and functions can be used in C++ programs, since
5689
@file{gmp.h} has @code{extern "C"} qualifiers, but the class interface offers
5690
overloaded functions and operators which may be more convenient.
5692
Due to the implementation of this interface, a reasonably recent C++ compiler
5693
is required, one supporting namespaces, partial specialization of templates
5694
and member templates. For GCC this means version 2.91 or later.
5696
@strong{Everything described in this chapter is to be considered preliminary
5697
and might be subject to incompatible changes if some unforeseen difficulty
reveals itself.}
5701
* C++ Interface General::
5702
* C++ Interface Integers::
5703
* C++ Interface Rationals::
5704
* C++ Interface Floats::
5705
* C++ Interface MPFR::
5706
* C++ Interface Random Numbers::
5707
* C++ Interface Limitations::
5711
@node C++ Interface General, C++ Interface Integers, C++ Class Interface, C++ Class Interface
5712
@section C++ Interface General
5715
All the C++ classes and functions are available with
@code{#include <gmpxx.h>}.
5722
Programs should be linked with the @file{libgmpxx} and @file{libgmp}
5723
libraries. For example,

@example
g++ mycxxprog.cc -lgmpxx -lgmp
@end example
5730
The classes defined are
5732
@deftp Class mpz_class
5733
@deftpx Class mpq_class
5734
@deftpx Class mpf_class
5737
The standard operators and various standard functions are overloaded to allow
5738
arithmetic with these classes. For example,
5749
@example
mpz_class a = 1234, b ("-5678");
mpz_class c = a + b;
cout << "sum is " << c << "\n";
cout << "absolute value is " << abs(c) << "\n";
@end example
5756
An important feature of the implementation is that an expression like
5757
@code{a=b+c} results in a single call to the corresponding @code{mpz_add},
5758
without using a temporary for the @code{b+c} part. Expressions which by their
5759
nature imply intermediate values, like @code{a=b*c+d*e}, still use temporaries
though.
5762
The classes can be freely intermixed in expressions, as can the classes and
5763
the standard types @code{long}, @code{unsigned long} and @code{double}.
5764
Smaller types like @code{int} or @code{float} can also be intermixed, since
5765
C++ will promote them.
5767
Note that @code{bool} is not accepted directly, but must be explicitly cast to
5768
an @code{int} first. This is because C++ will automatically convert any
5769
pointer to a @code{bool}, so if GMP accepted @code{bool} it would make all
5770
sorts of invalid class and pointer combinations compile but almost certainly
5771
not do anything sensible.
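
The pointer conversion in question is easy to reproduce. The sketch below uses
a hypothetical class @code{accepts_bool}, which is not part of GMP, to show
what would be let through if @code{bool} were accepted.

```cpp
#include <cassert>

// Hypothetical class that, unlike the GMP classes, accepts bool.
struct accepts_bool
{
  long value;
  accepts_bool (bool b) : value (b) {}
};

// A pointer converts silently to bool, so this compiles even though
// the caller almost certainly did not mean "true".
long from_pointer (const char *p)
{
  accepts_bool x (p);
  return x.value;
}
```

@code{from_pointer ("1234")} returns 1, not 1234: the string is never looked
at, only the fact that the pointer is non-null.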
5773
Conversions back from the classes to standard C++ types aren't done
5774
automatically, instead member functions like @code{get_si} are provided (see
5775
the following sections for details).
5777
Also there are no automatic conversions from the classes to the corresponding
5778
GMP C types, instead a reference to the underlying C object can be obtained
5779
with the following functions,
5781
@deftypefun mpz_t mpz_class::get_mpz_t ()
5782
@deftypefunx mpq_t mpq_class::get_mpq_t ()
5783
@deftypefunx mpf_t mpf_class::get_mpf_t ()
5786
These can be used to call a C function which doesn't have a C++ class
5787
interface. For example to set @code{a} to the GCD of @code{b} and @code{c},
5792
mpz_gcd (a.get_mpz_t(), b.get_mpz_t(), c.get_mpz_t());
5795
In the other direction, a class can be initialized from the corresponding GMP
5796
C type, or assigned to if an explicit constructor is used. In both cases this
5797
makes a copy of the value, it doesn't create any sort of association. For
example,
5802
// ... init and calculate z ...
5808
There are no namespace setups in @file{gmpxx.h}, all types and functions are
5809
simply put into the global namespace. This is what @file{gmp.h} has done in
5810
the past, and continues to do for compatibility. The extras provided by
5811
@file{gmpxx.h} follow GMP naming conventions and are unlikely to clash with
anything.
5815
@node C++ Interface Integers, C++ Interface Rationals, C++ Interface General, C++ Class Interface
5816
@section C++ Interface Integers
5818
@deftypefun void mpz_class::mpz_class (type @var{n})
5819
Construct an @code{mpz_class}. All the standard C++ types may be used, except
5820
@code{long long} and @code{long double}, and all the GMP C++ classes can be
5821
used. Any necessary conversion follows the corresponding C function, for
5822
example @code{double} follows @code{mpz_set_d} (@pxref{Assigning Integers}).
5825
@deftypefun void mpz_class::mpz_class (mpz_t @var{z})
5826
Construct an @code{mpz_class} from an @code{mpz_t}. The value in @var{z} is
5827
copied into the new @code{mpz_class}, there won't be any permanent association
5828
between it and @var{z}.
5831
@deftypefun void mpz_class::mpz_class (const char *@var{s})
5832
@deftypefunx void mpz_class::mpz_class (const char *@var{s}, int @var{base})
@deftypefunx void mpz_class::mpz_class (const string& @var{s})
@deftypefunx void mpz_class::mpz_class (const string& @var{s}, int @var{base})
Construct an @code{mpz_class} converted from a string using
@code{mpz_set_str} (@pxref{Assigning Integers}). If the @var{base} is not
5837
given then 0 is used.
5840
@deftypefun mpz_class operator/ (mpz_class @var{a}, mpz_class @var{d})
5841
@deftypefunx mpz_class operator% (mpz_class @var{a}, mpz_class @var{d})
5842
Divisions involving @code{mpz_class} round towards zero, as per the
5843
@code{mpz_tdiv_q} and @code{mpz_tdiv_r} functions (@pxref{Integer Division}).
5844
This is the same as the C99 @code{/} and @code{%} operators.
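
The difference between the two rounding conventions only shows up when the
operands have opposite signs. The following sketch works on plain @code{long}
values rather than @code{mpz_class}; the four function names are made up for
the illustration.

```cpp
#include <cassert>

// Quotient and remainder rounded towards zero: what the built-in
// operators give, and what the text above says mpz_class follows.
long trunc_quot (long a, long d) { return a / d; }
long trunc_rem  (long a, long d) { return a % d; }

// Floor rounding for comparison (the convention of the mpz_fdiv
// family): the quotient rounds towards minus infinity and the
// remainder takes the sign of the divisor.
long floor_quot (long a, long d)
{
  long q = a / d;
  if ((a % d != 0) && ((a < 0) != (d < 0)))
    q--;
  return q;
}
long floor_rem (long a, long d) { return a - floor_quot (a, d) * d; }
```

For @math{-7} divided by 2, truncation gives quotient @math{-3} and remainder
@math{-1}, whereas floor rounding gives quotient @math{-4} and remainder 1.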
5846
The @code{mpz_fdiv@dots{}} or @code{mpz_cdiv@dots{}} functions can always be called
5847
directly if desired. For example,
5852
mpz_fdiv_q (q.get_mpz_t(), a.get_mpz_t(), d.get_mpz_t());
5856
@deftypefun mpz_class abs (mpz_class @var{op1})
5857
@deftypefunx int cmp (mpz_class @var{op1}, type @var{op2})
5858
@deftypefunx int cmp (type @var{op1}, mpz_class @var{op2})
5859
@deftypefunx double mpz_class::get_d (void)
5860
@deftypefunx long mpz_class::get_si (void)
5861
@deftypefunx {unsigned long} mpz_class::get_ui (void)
5863
@deftypefunx bool mpz_class::fits_sint_p (void)
5864
@deftypefunx bool mpz_class::fits_slong_p (void)
5865
@deftypefunx bool mpz_class::fits_sshort_p (void)
5867
@deftypefunx bool mpz_class::fits_uint_p (void)
5868
@deftypefunx bool mpz_class::fits_ulong_p (void)
5869
@deftypefunx bool mpz_class::fits_ushort_p (void)
5871
@deftypefunx int sgn (mpz_class @var{op})
5872
@deftypefunx mpz_class sqrt (mpz_class @var{op})
5873
These functions provide a C++ class interface to the corresponding GMP C
routines.
5876
@code{cmp} can be used with any of the classes or the standard C++ types,
5877
except @code{long long} and @code{long double}.
5881
Overloaded operators for combinations of @code{mpz_class} and @code{double}
5882
are provided for completeness, but it should be noted that if the given
5883
@code{double} is not an integer then the way any rounding is done is currently
5884
unspecified. The rounding might take place at the start, in the middle, or at
5885
the end of the operation, and it might change in the future.
5887
Conversions between @code{mpz_class} and @code{double}, however, are defined
5888
to follow the corresponding C functions @code{mpz_get_d} and @code{mpz_set_d}.
5889
And comparisons are always made exactly, as per @code{mpz_cmp_d}.
5892
@node C++ Interface Rationals, C++ Interface Floats, C++ Interface Integers, C++ Class Interface
5893
@section C++ Interface Rationals
5895
In all the following constructors, if a fraction is given then it should be in
5896
canonical form, or if not then @code{mpq_class::canonicalize} must be called.
5898
@deftypefun void mpq_class::mpq_class (type @var{op})
5899
@deftypefunx void mpq_class::mpq_class (integer @var{num}, integer @var{den})
5900
Construct an @code{mpq_class}. The initial value can be a single value of any
5901
type, or a pair of integers (@code{mpz_class} or standard C++ integer types)
5902
representing a fraction, except that @code{long long} and @code{long double}
5903
are not supported.
5912
@deftypefun void mpq_class::mpq_class (mpq_t @var{q})
5913
Construct an @code{mpq_class} from an @code{mpq_t}. The value in @var{q} is
5914
copied into the new @code{mpq_class}, there won't be any permanent association
5915
between it and @var{q}.
5918
@deftypefun void mpq_class::mpq_class (const char *@var{s})
5919
@deftypefunx void mpq_class::mpq_class (const char *@var{s}, int @var{base})
@deftypefunx void mpq_class::mpq_class (const string& @var{s})
@deftypefunx void mpq_class::mpq_class (const string& @var{s}, int @var{base})
Construct an @code{mpq_class} converted from a string using
@code{mpq_set_str} (@pxref{Initializing Rationals}). If the @var{base} is
5924
not given then 0 is used.
5927
@deftypefun void mpq_class::canonicalize ()
5928
Put an @code{mpq_class} into canonical form, as per @ref{Rational Number
5929
Functions}. All arithmetic operators require their operands in canonical
5930
form, and will return results in canonical form.
5933
@deftypefun mpq_class abs (mpq_class @var{op})
5934
@deftypefunx int cmp (mpq_class @var{op1}, type @var{op2})
5935
@deftypefunx int cmp (type @var{op1}, mpq_class @var{op2})
5937
@deftypefunx double mpq_class::get_d (void)
5938
@deftypefunx int sgn (mpq_class @var{op})
5939
These functions provide a C++ class interface to the corresponding GMP C
routines.
5942
@code{cmp} can be used with any of the classes or the standard C++ types,
5943
except @code{long long} and @code{long double}.
5946
@deftypefun {mpz_class&} mpq_class::get_num ()
5947
@deftypefunx {mpz_class&} mpq_class::get_den ()
5948
Get a reference to an @code{mpz_class} which is the numerator or denominator
5949
of an @code{mpq_class}. This can be used both for read and write access. If
5950
the object returned is modified, it modifies the original @code{mpq_class}.
5952
If direct manipulation might produce a non-canonical value, then
5953
@code{mpq_class::canonicalize} must be called before further operations.
5956
@deftypefun mpz_t mpq_class::get_num_mpz_t ()
5957
@deftypefunx mpz_t mpq_class::get_den_mpz_t ()
5958
Get a reference to the underlying @code{mpz_t} numerator or denominator of an
5959
@code{mpq_class}. This can be passed to C functions expecting an
5960
@code{mpz_t}. Any modifications made to the @code{mpz_t} will modify the
5961
original @code{mpq_class}.
5963
If direct manipulation might produce a non-canonical value, then
5964
@code{mpq_class::canonicalize} must be called before further operations.
5967
@deftypefun istream& operator>> (istream& @var{stream}, mpq_class& @var{rop})
5968
Read @var{rop} from @var{stream}, using its @code{ios} formatting settings,
5969
the same as @code{mpq_t operator>>} (@pxref{C++ Formatted Input}).
5971
If the @var{rop} read might not be in canonical form then
5972
@code{mpq_class::canonicalize} must be called.
5976
@node C++ Interface Floats, C++ Interface MPFR, C++ Interface Rationals, C++ Class Interface
5977
@section C++ Interface Floats
5979
When an expression requires the use of temporary intermediate @code{mpf_class}
5980
values, like @code{f=g*h+x*y}, those temporaries will have the same precision
5981
as the destination @code{f}. Explicit constructors can be used if this
doesn't suit.
5984
@deftypefun {} mpf_class::mpf_class (type @var{op})
5985
@deftypefunx {} mpf_class::mpf_class (type @var{op}, unsigned long @var{prec})
5986
Construct an @code{mpf_class}. Any standard C++ type can be used, except
5987
@code{long long} and @code{long double}, and any of the GMP C++ classes can be
used.
5990
If @var{prec} is given, the initial precision is that value, in bits. If
5991
@var{prec} is not given, then the initial precision is determined by the type
5992
of @var{op} given. An @code{mpz_class}, @code{mpq_class}, string, or C++
5993
builtin type will give the default @code{mpf} precision (@pxref{Initializing
5994
Floats}). An @code{mpf_class} or expression will give the precision of that
5995
value. The precision of a binary expression is the higher of the two
operands.

@example
mpf_class f(1.5);        // default precision
mpf_class f(1.5, 500);   // 500 bits (at least)
mpf_class f(x);          // precision of x
mpf_class f(abs(x));     // precision of x
mpf_class f(-g, 1000);   // 1000 bits (at least)
mpf_class f(x+y);        // greater of precisions of x and y
@end example
6008
@deftypefun {mpf_class&} mpf_class::operator= (type @var{op})
6009
Convert and store the given @var{op} value to an @code{mpf_class} object. The
6010
same types are accepted as for the constructors above.
6012
Note that @code{operator=} only stores a new value, it doesn't copy or change
6013
the precision of the destination, instead the value is truncated if necessary.
6014
This is the same as @code{mpf_set} etc. Note in particular this means for
6015
@code{mpf_class} a copy constructor is not the same as a default constructor
plus assignment.

@example
mpf_class x (y);   // x created with precision of y

mpf_class x;       // x created with default precision
x = y;             // value truncated to that precision
@end example
6025
Applications using templated code may need to be careful about the assumptions
6026
the code makes in this area, when working with @code{mpf_class} values of
6027
various different or non-default precisions. For instance implementations of
6028
the standard @code{complex} template have been seen in both styles above,
6029
though of course @code{complex} is normally only actually specified for use
6030
with the builtin float types.
6033
@deftypefun mpf_class abs (mpf_class @var{op})
6034
@deftypefunx mpf_class ceil (mpf_class @var{op})
6035
@deftypefunx int cmp (mpf_class @var{op1}, type @var{op2})
6036
@deftypefunx int cmp (type @var{op1}, mpf_class @var{op2})
6038
@deftypefunx mpf_class floor (mpf_class @var{op})
6039
@deftypefunx mpf_class hypot (mpf_class @var{op1}, mpf_class @var{op2})
6040
@deftypefunx double mpf_class::get_d (void)
6041
@deftypefunx long mpf_class::get_si (void)
6042
@deftypefunx {unsigned long} mpf_class::get_ui (void)
6044
@deftypefunx bool mpf_class::fits_sint_p (void)
6045
@deftypefunx bool mpf_class::fits_slong_p (void)
6046
@deftypefunx bool mpf_class::fits_sshort_p (void)
6048
@deftypefunx bool mpf_class::fits_uint_p (void)
6049
@deftypefunx bool mpf_class::fits_ulong_p (void)
6050
@deftypefunx bool mpf_class::fits_ushort_p (void)
6052
@deftypefunx int sgn (mpf_class @var{op})
6053
@deftypefunx mpf_class sqrt (mpf_class @var{op})
6054
@deftypefunx mpf_class trunc (mpf_class @var{op})
6055
These functions provide a C++ class interface to the corresponding GMP C
routines.
6058
@code{cmp} can be used with any of the classes or the standard C++ types,
6059
except @code{long long} and @code{long double}.
6061
The accuracy provided by @code{hypot} is not currently guaranteed.
6064
@deftypefun {unsigned long int} mpf_class::get_prec ()
6065
@deftypefunx void mpf_class::set_prec (unsigned long @var{prec})
6066
@deftypefunx void mpf_class::set_prec_raw (unsigned long @var{prec})
6067
Get or set the current precision of an @code{mpf_class}.
6069
The restrictions described for @code{mpf_set_prec_raw} (@pxref{Initializing
6070
Floats}) apply to @code{mpf_class::set_prec_raw}. Note in particular that the
6071
@code{mpf_class} must be restored to its allocated precision before being
destroyed. This must be done by application code, there's no automatic
mechanism for it.
6077
@node C++ Interface MPFR, C++ Interface Random Numbers, C++ Interface Floats, C++ Class Interface
6078
@section C++ Interface MPFR
6080
The C++ class interface to MPFR is provided if MPFR is enabled (@pxref{Build
6081
Options}). This interface must be regarded as preliminary and possibly
6082
subject to incompatible changes in the future, since MPFR itself is
6083
preliminary. All definitions can be obtained with
6093
@deftp Class mpfr_class
6097
which behaves similarly to @code{mpf_class} (@pxref{C++ Interface Floats}).
6100
@node C++ Interface Random Numbers, C++ Interface Limitations, C++ Interface MPFR, C++ Class Interface
6101
@section C++ Interface Random Numbers
6103
@deftp Class gmp_randclass
6104
The C++ class interface to the GMP random number functions uses
6105
@code{gmp_randclass} to hold an algorithm selection and current state, as per
6106
@code{gmp_randstate_t}.
6109
@deftypefun {} gmp_randclass::gmp_randclass (void (*@var{randinit}) (gmp_randstate_t, ...), ...)
6110
Construct a @code{gmp_randclass}, using a call to the given @var{randinit}
6111
function (@pxref{Random State Initialization}). The arguments expected are
6112
the same as @var{randinit}, but with @code{mpz_class} instead of @code{mpz_t}.
6116
@example
gmp_randclass r1 (gmp_randinit_default);
gmp_randclass r2 (gmp_randinit_lc_2exp_size, 32);
gmp_randclass r3 (gmp_randinit_lc_2exp, a, c, m2exp);
@end example
6121
@code{gmp_randinit_lc_2exp_size} can fail if the size requested is too big,
6122
the behaviour of @code{gmp_randclass::gmp_randclass} is undefined in this case
6123
(perhaps this will change in the future).
6126
@deftypefun {} gmp_randclass::gmp_randclass (gmp_randalg_t @var{alg}, ...)
6127
Construct a @code{gmp_randclass} using the same parameters as
6128
@code{gmp_randinit} (@pxref{Random State Initialization}). This function is
6129
obsolete and the above @var{randinit} style should be preferred.
6132
@deftypefun void gmp_randclass::seed (unsigned long int @var{s})
6133
@deftypefunx void gmp_randclass::seed (mpz_class @var{s})
6134
Seed a random number generator. @xref{Random Number Functions}, for how
6135
to choose a good seed.
6138
@deftypefun mpz_class gmp_randclass::get_z_bits (unsigned long @var{bits})
6139
@deftypefunx mpz_class gmp_randclass::get_z_bits (mpz_class @var{bits})
6140
Generate a random integer with a specified number of bits.
6143
@deftypefun mpz_class gmp_randclass::get_z_range (mpz_class @var{n})
6144
Generate a random integer in the range 0 to @math{@var{n}-1} inclusive.
6147
@deftypefun mpf_class gmp_randclass::get_f ()
6148
@deftypefunx mpf_class gmp_randclass::get_f (unsigned long @var{prec})
6149
Generate a random float @var{f} in the range @math{0 <= @var{f} < 1}. @var{f}
6150
will be to @var{prec} bits precision, or if @var{prec} is not given then to
6151
the precision of the destination. For example,
6156
mpf_class f (0, 512); // 512 bits precision
6157
f = r.get_f(); // random number, 512 bits
6163
@node C++ Interface Limitations, , C++ Interface Random Numbers, C++ Class Interface
6164
@section C++ Interface Limitations
6167
@item @code{mpq_class} and Templated Reading
6168
A generic piece of template code probably won't know that @code{mpq_class}
6169
requires a @code{canonicalize} call if inputs read with @code{operator>>}
6170
might be non-canonical. This can lead to incorrect results.
6172
@code{operator>>} behaves as it does for reasons of efficiency. A
6173
canonicalize can be quite time consuming on large operands, and is best
6174
avoided if it's not necessary.
6176
But this potential difficulty reduces the usefulness of @code{mpq_class}.
6177
Perhaps a mechanism to tell @code{operator>>} what to do will be adopted in
6178
the future, maybe a preprocessor define, a global flag, or an @code{ios} flag
6179
pressed into service. Or maybe, at the risk of inconsistency, the
6180
@code{mpq_class} @code{operator>>} could canonicalize and leave @code{mpq_t}
6181
@code{operator>>} not doing so, for use on those occasions when that's
6182
acceptable. Send feedback or alternate ideas to @email{bug-gmp@@gnu.org}.
6185
Subclassing the GMP C++ classes works, but is not currently recommended.
6187
Expressions involving subclasses resolve correctly (or seem to), but in normal
6188
C++ fashion the subclass doesn't inherit constructors and assignments.
6189
There are many of those in the GMP classes, and a good way to reestablish them
6190
in a subclass is not yet provided.
6192
@item Templated Expressions
6194
A subtle difficulty exists when using expressions together with
6195
application-defined template functions. Consider the following, with @code{T}
6196
intended to be some numeric type,
6200
T fun (const T &, const T &);
6204
When used with, say, plain @code{mpz_class} variables, it works fine: @code{T}
6205
is resolved as @code{mpz_class}.
6208
mpz_class f(1), g(2);
6213
But when one of the arguments is an expression, it doesn't work.
6216
mpz_class f(1), g(2), h(3);
6217
fun (f, g+h); // Bad
6220
This is because @code{g+h} ends up being a certain expression template type
6221
internal to @code{gmpxx.h}, which the C++ template resolution rules are unable
6222
to automatically convert to @code{mpz_class}. The workaround is simply to add
an explicit cast.
6226
mpz_class f(1), g(2), h(3);
6227
fun (f, mpz_class(g+h)); // Good
6230
Similarly, within @code{fun} it may be necessary to cast an expression to type
6231
@code{T} when calling a templated @code{fun2}.
6237
fun2 (f, f+g); // Bad
6243
fun2 (f, T(f+g)); // Good
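
The deduction failure can be reproduced without GMP. The sketch below is a toy
model only: @code{num} stands in for @code{mpz_class} and @code{sum_expr} for
the internal expression type, and neither name nor the layout is taken from
@file{gmpxx.h}.

```cpp
#include <cassert>
#include <type_traits>

struct num;                       // stand-in for mpz_class

// Stand-in for the internal expression type: it only records the
// operands, and is evaluated when converted to num.
struct sum_expr
{
  const num &a, &b;
  operator num () const;
};

struct num
{
  long v;
  explicit num (long x = 0) : v (x) {}
};

inline sum_expr::operator num () const { return num (a.v + b.v); }

inline sum_expr operator+ (const num &a, const num &b)
{
  sum_expr e = { a, b };
  return e;
}

template <class T>
T fun (const T &x, const T &y) { return T (x.v * y.v); }

// g+h is not a num, which is why fun (f, g+h) cannot deduce T.
static_assert (!std::is_same<decltype (num () + num ()), num>::value,
               "operator+ yields an expression object");

long demo ()
{
  num f (1), g (2), h (3);
  // fun (f, g+h);               // error: conflicting deductions for T
  return fun (f, num (g+h)).v;   // explicit conversion, T = num
}
```

Template argument deduction never considers the user-defined conversion from
@code{sum_expr} to @code{num}, so the conversion has to be written out at the
call.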
6249
@node BSD Compatible Functions, Custom Allocation, C++ Class Interface, Top
6250
@comment node-name, next, previous, up
6251
@chapter Berkeley MP Compatible Functions
6252
@cindex Berkeley MP compatible functions
6253
@cindex BSD MP compatible functions
6255
These functions are intended to be fully compatible with the Berkeley MP
6256
library which is available on many BSD derived U*ix systems. The
6257
@samp{--enable-mpbsd} option must be used when building GNU MP to make these
6258
available (@pxref{Installing GMP}).
6260
The original Berkeley MP library has a usage restriction: you cannot use the
6261
same variable as both source and destination in a single function call. The
6262
compatible functions in GNU MP do not share this restriction---inputs and
6263
outputs may overlap.
6265
It is not recommended that new programs be written using these functions.
6266
Apart from the incomplete set of functions, the interface for initializing
6267
@code{MINT} objects is more error prone, and the @code{pow} function collides
6268
with @code{pow} in @file{libm.a}.
6271
Include the header @file{mp.h} to get the definition of the necessary types and
6272
functions. If you are on a BSD derived system, make sure to include GNU
6273
@file{mp.h} if you are going to link the GNU @file{libmp.a} to your program.
6274
This means that you probably need to give the @samp{-I<dir>} option to the
6275
compiler, where @samp{<dir>} is the directory where you have GNU @file{mp.h}.
6277
@deftypefun {MINT *} itom (signed short int @var{initial_value})
6278
Allocate an integer consisting of a @code{MINT} object and dynamic limb space.
6279
Initialize the integer to @var{initial_value}. Return a pointer to the
@code{MINT} object.
6283
@deftypefun {MINT *} xtom (char *@var{initial_value})
6284
Allocate an integer consisting of a @code{MINT} object and dynamic limb space.
6285
Initialize the integer from @var{initial_value}, a hexadecimal,
6286
null-terminated C string. Return a pointer to the @code{MINT} object.
6289
@deftypefun void move (MINT *@var{src}, MINT *@var{dest})
6290
Set @var{dest} to @var{src} by copying. Both variables must be previously
initialized.
6294
@deftypefun void madd (MINT *@var{src_1}, MINT *@var{src_2}, MINT *@var{destination})
6295
Add @var{src_1} and @var{src_2} and put the sum in @var{destination}.
6298
@deftypefun void msub (MINT *@var{src_1}, MINT *@var{src_2}, MINT *@var{destination})
6299
Subtract @var{src_2} from @var{src_1} and put the difference in
@var{destination}.
6303
@deftypefun void mult (MINT *@var{src_1}, MINT *@var{src_2}, MINT *@var{destination})
6304
Multiply @var{src_1} and @var{src_2} and put the product in @var{destination}.
6307
@deftypefun void mdiv (MINT *@var{dividend}, MINT *@var{divisor}, MINT *@var{quotient}, MINT *@var{remainder})
6308
@deftypefunx void sdiv (MINT *@var{dividend}, signed short int @var{divisor}, MINT *@var{quotient}, signed short int *@var{remainder})
6309
Set @var{quotient} to @var{dividend}/@var{divisor}, and @var{remainder} to
6310
@var{dividend} mod @var{divisor}. The quotient is rounded towards zero; the
6311
remainder has the same sign as the dividend unless it is zero.
6313
Some implementations of these functions work differently---or not at all---for
negative arguments.
6317
@deftypefun void msqrt (MINT *@var{op}, MINT *@var{root}, MINT *@var{remainder})
6318
Set @var{root} to @m{\lfloor\sqrt{@var{op}}\rfloor, the truncated integer part
6319
of the square root of @var{op}}, like @code{mpz_sqrt}. Set @var{remainder} to
6320
@m{(@var{op} - @var{root}^2), @var{op}@minus{}@var{root}*@var{root}}, i.e.
6321
zero if @var{op} is a perfect square.
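
The relation between the two results can be illustrated on a machine word. This
is only a sketch of the arithmetic, using a deliberately naive search;
@code{root_rem} and @code{sqrt_rem} are hypothetical names and have nothing to
do with how @code{msqrt} is implemented.

```cpp
#include <cassert>

struct root_rem { unsigned long root; unsigned long rem; };

// Truncated integer square root and remainder of a machine word,
// found by a simple linear search (fine for a demonstration only).
// The test r+1 <= op/(r+1) is (r+1)*(r+1) <= op without overflow.
root_rem sqrt_rem (unsigned long op)
{
  unsigned long r = 0;
  while (r + 1 <= op / (r + 1))
    r++;
  root_rem result = { r, op - r * r };
  return result;
}
```

For 17 this gives root 4 and remainder 1; for 16 it gives root 4 and remainder
0.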
6323
If @var{root} and @var{remainder} are the same variable, the results are
undefined.
6327
@deftypefun void pow (MINT *@var{base}, MINT *@var{exp}, MINT *@var{mod}, MINT *@var{dest})
6328
Set @var{dest} to (@var{base} raised to @var{exp}) modulo @var{mod}.
6330
Note that the name @code{pow} clashes with @code{pow} from the standard C math
6331
library (@pxref{Exponents and Logarithms,, Exponentiation and Logarithms,
6332
libc, The GNU C Library Reference Manual}). An application will only be able
6333
to use one or the other.
6336
@deftypefun void rpow (MINT *@var{base}, signed short int @var{exp}, MINT *@var{dest})
6337
Set @var{dest} to @var{base} raised to @var{exp}.
6340
@deftypefun void gcd (MINT *@var{op1}, MINT *@var{op2}, MINT *@var{res})
6341
Set @var{res} to the greatest common divisor of @var{op1} and @var{op2}.
6344
@deftypefun int mcmp (MINT *@var{op1}, MINT *@var{op2})
6345
Compare @var{op1} and @var{op2}. Return a positive value if @var{op1} >
6346
@var{op2}, zero if @var{op1} = @var{op2}, and a negative value if @var{op1} <
@var{op2}.
6350
@deftypefun void min (MINT *@var{dest})
6351
Input a decimal string from @code{stdin}, and put the read integer in
6352
@var{dest}. SPC and TAB are allowed in the number string, and are ignored.
6355
@deftypefun void mout (MINT *@var{src})
6356
Output @var{src} to @code{stdout}, as a decimal string. Also output a newline.
6359
@deftypefun {char *} mtox (MINT *@var{op})
6360
Convert @var{op} to a hexadecimal string, and return a pointer to the string.
6361
The returned string is allocated using the default memory allocation function,
6362
@code{malloc} by default. It will be @code{strlen(str)+1} bytes, that being
6363
exactly enough for the string and null-terminator.
6366
@deftypefun void mfree (MINT *@var{op})
6367
De-allocate the space used by @var{op}. @strong{This function should only be
6368
passed a value returned by @code{itom} or @code{xtom}.}
6372
@node Custom Allocation, Language Bindings, BSD Compatible Functions, Top
6373
@comment node-name, next, previous, up
6374
@chapter Custom Allocation
6375
@cindex Custom allocation
6376
@cindex Memory allocation
6377
@cindex Allocation of memory
6379
By default GMP uses @code{malloc}, @code{realloc} and @code{free} for memory
6380
allocation, and if they fail GMP prints a message to the standard error output
6381
and terminates the program.
6383
Alternate functions can be specified to allocate memory in a different way or
6384
to have a different error action on running out of memory.
6386
This feature is available in the Berkeley compatibility library (@pxref{BSD
6387
Compatible Functions}) as well as the main GMP library.
6389
@deftypefun void mp_set_memory_functions (@* void *(*@var{alloc_func_ptr}) (size_t), @* void *(*@var{realloc_func_ptr}) (void *, size_t, size_t), @* void (*@var{free_func_ptr}) (void *, size_t))
6390
Replace the current allocation functions with those given by the arguments. If
an argument
6391
is @code{NULL}, the corresponding default function is used.
6393
These functions will be used for all memory allocation done by GMP, apart from
6394
temporary space from @code{alloca} if that function is available and GMP is
6395
configured to use it (@pxref{Build Options}).
6397
@strong{Be sure to call @code{mp_set_memory_functions} only when there are no
6398
active GMP objects allocated using the previous memory functions! Usually
6399
that means calling it before any other GMP function.}
The functions supplied should fit the following declarations:

@deftypefun {void *} allocate_function (size_t @var{alloc_size})
Return a pointer to newly allocated space with at least @var{alloc_size}
bytes.
@end deftypefun

@deftypefun {void *} reallocate_function (void *@var{ptr}, size_t @var{old_size}, size_t @var{new_size})
Resize a previously allocated block @var{ptr} of @var{old_size} bytes to be
@var{new_size} bytes.

The block may be moved if necessary or if desired, and in that case the
smaller of @var{old_size} and @var{new_size} bytes must be copied to the new
location. The return value is a pointer to the resized block, that being the
new location if moved or just @var{ptr} if not.

@var{ptr} is never @code{NULL}, it's always a previously allocated block.
@var{new_size} may be bigger or smaller than @var{old_size}.
@end deftypefun

@deftypefun void deallocate_function (void *@var{ptr}, size_t @var{size})
De-allocate the space pointed to by @var{ptr}.

@var{ptr} is never @code{NULL}, it's always a previously allocated block of
@var{size} bytes.
@end deftypefun

A @dfn{byte} here means the unit used by the @code{sizeof} operator.

The @var{old_size} parameter to @var{reallocate_function} and the @var{size}
parameter to @var{deallocate_function} are passed for convenience, but of
course can be ignored if not needed. The default functions using
@code{malloc} and friends for instance don't use them.

No error return is allowed from any of these functions, if they return then
they must have performed the specified operation. In particular note that
@var{allocate_function} or @var{reallocate_function} mustn't return
@code{NULL}.

Getting a different fatal error action is a good use for custom allocation
functions, for example giving a graphical dialog rather than the default print
to @code{stderr}. How much is possible when genuinely out of memory is
another question though.

There's currently no defined way for the allocation functions to recover from
an error such as out of memory, they must terminate program execution. A
@code{longjmp} or throwing a C++ exception will have undefined results. This
may change in the future.

GMP may use allocated blocks to hold pointers to other allocated blocks. This
will limit the assumptions a conservative garbage collection scheme can make.

Since the default GMP allocation uses @code{malloc} and friends, those
functions will be linked in even if the first thing a program does is an
@code{mp_set_memory_functions}. It's necessary to change the GMP sources if
this is a problem.

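As an illustration of the three declarations above, the following is a minimal sketch of replacement functions that print their own message and terminate on failure. The names @code{my_alloc}, @code{my_realloc}, @code{my_free} and @code{die_out_of_memory} are hypothetical, and treating a zero size as one byte is an assumption made here so that a @code{NULL} from @code{malloc} always means failure.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: report and terminate, since the allocation
   functions are not allowed to return on error.  */
static void
die_out_of_memory (size_t size)
{
  fprintf (stderr, "out of memory (wanted %lu bytes)\n",
           (unsigned long) size);
  abort ();
}

/* Fits the allocate_function declaration.  */
void *
my_alloc (size_t alloc_size)
{
  void *p = malloc (alloc_size != 0 ? alloc_size : 1);
  if (p == NULL)
    die_out_of_memory (alloc_size);
  return p;
}

/* Fits reallocate_function; old_size is ignored because realloc
   does not need it.  */
void *
my_realloc (void *ptr, size_t old_size, size_t new_size)
{
  void *p;
  (void) old_size;
  assert (ptr != NULL);        /* always a previously allocated block */
  p = realloc (ptr, new_size != 0 ? new_size : 1);
  if (p == NULL)
    die_out_of_memory (new_size);
  return p;
}

/* Fits deallocate_function; size is ignored because free does not
   need it.  */
void
my_free (void *ptr, size_t size)
{
  (void) size;
  assert (ptr != NULL);
  free (ptr);
}
```

A program would install these with @code{mp_set_memory_functions (my_alloc, my_realloc, my_free)} before calling any other GMP function.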
6460
@node Language Bindings, Algorithms, Custom Allocation, Top
6461
@chapter Language Bindings
6463
The following packages and projects offer access to GMP from languages other
6464
than C, though perhaps with varying levels of functionality and efficiency.
6466
@c GNUstep Base Library @uref{http://www.gnustep.org} (version 0.9.1) is
6467
@c intending to use GMP for its NSDecimal class, which would be an Objective
6468
@c C binding for GMP. Has some configure stuff ready, but no code.
6470
@c @spaceuref{U} is the same as @uref{U}, but with a couple of extra spaces
6471
@c in tex, just to separate the URL from the preceding text a bit.
6473
@macro spaceuref {U}
6478
@macro spaceuref {U}
6488
GMP C++ class interface, @pxref{C++ Class Interface} @* Straightforward
6489
interface, expression templates to eliminate temporaries.
6491
ALP @spaceuref{http://www.inria.fr/saga/logiciels/ALP} @* Linear algebra and
6492
polynomials using templates.
6494
Arithmos @spaceuref{http://win-www.uia.ac.be/u/cant/arithmos} @* Rationals
6495
with infinities and square roots.
6497
CLN @spaceuref{http://www.ginac.de/CLN/} @* High level classes for arithmetic.
6499
LiDIA @spaceuref{http://www.informatik.tu-darmstadt.de/TI/LiDIA} @* A C++
6500
library for computational number theory.
6502
Linbox @spaceuref{http://www.linalg.org} @* Sparse vectors and matrices.
6504
NTL @spaceuref{http://www.shoup.net/ntl} @* A C++ number theory library.
6510
Omni F77 @spaceuref{http://phase.hpcc.jp/Omni/home.html} @* Arbitrary
6517
Glasgow Haskell Compiler @spaceuref{http://www.haskell.org/ghc}
6523
Kaffe @spaceuref{http://www.kaffe.org}
6525
Kissme @spaceuref{http://kissme.sourceforge.net}
6531
GNU Common Lisp @spaceuref{http://www.gnu.org/software/gcl/gcl.html} @* In the
6532
process of switching to GMP for bignums.
6534
Librep @spaceuref{http://librep.sourceforge.net}
6536
@c FIXME: When there's a stable release with gmp support, just refer to it
6537
@c rather than bothering to talk about betas.
6538
XEmacs (21.5.18 beta and up) @spaceuref{http://www.xemacs.org} @* Optional
6539
big integers, rationals and floats using GMP.
6545
GNU m4 betas @spaceuref{http://www.seindal.dk/rene/gnu} @* Optionally provides
6546
an arbitrary precision @code{mpeval}.
6552
MLton compiler @spaceuref{http://www.mlton.org}
6555
@item Objective Caml
6558
MLGMP @spaceuref{http://www.di.ens.fr/~monniaux/programmes.html.en}
6560
Numerix @spaceuref{http://pauillac.inria.fr/~quercia/} @* Optionally using
6567
Mozart @spaceuref{http://www.mozart-oz.org}
6573
GNU Pascal Compiler @spaceuref{http://www.gnu-pascal.de} @* GMP unit.
6575
Numerix @spaceuref{http://pauillac.inria.fr/~quercia/} @* For Free Pascal,
6576
optionally using GMP.
6582
GMP module, see @file{demos/perl} in the GMP sources.
6584
Math::GMP @spaceuref{http://www.cpan.org} @* Compatible with Math::BigInt, but
6585
not as many functions as the GMP module above.
6587
Math::BigInt::GMP @spaceuref{http://www.cpan.org} @* Plug Math::GMP into
6588
normal Math::BigInt operations.
6595
mpz module in the standard distribution, @uref{http://pike.ida.liu.se/}
6602
SWI Prolog @spaceuref{http://www.swi.psy.uva.nl/projects/SWI-Prolog} @*
6603
Arbitrary precision floats.
6609
mpz module in the standard distribution, @uref{http://www.python.org}
6611
GMPY @uref{http://gmpy.sourceforge.net}
6617
GNU Guile (upcoming 1.8)
6618
@spaceuref{http://www.gnu.org/software/guile/guile.html}
6620
RScheme @spaceuref{http://www.rscheme.org}
6622
STklos @spaceuref{http://kaolin.unice.fr/STklos}
6628
GNU Smalltalk @spaceuref{http://www.smalltalk.org/versions/GNUSmalltalk.html}
6634
Axiom @uref{http://savannah.nongnu.org/projects/axiom} @* Computer algebra
6637
DrGenius @spaceuref{http://drgenius.seul.org} @* Geometry system and
6638
mathematical programming language.
6640
GiNaC @spaceuref{http://www.ginac.de} @* C++ computer algebra using CLN.
6642
GOO @spaceuref{http://www.googoogaga.org/} @* Dynamic object oriented
6645
Maxima @uref{http://www.ma.utexas.edu/users/wfs/maxima.html} @* Macsyma
6646
computer algebra using GCL.
6648
Q @spaceuref{http://www.musikwissenschaft.uni-mainz.de/~ag/q} @* Equational
6651
Regina @spaceuref{http://regina.sourceforge.net} @* Topological calculator.
6653
Yacas @spaceuref{http://www.xs4all.nl/~apinkus/yacas.html} @* Yet another
6654
computer algebra system.

@node Algorithms, Internals, Language Bindings, Top
@chapter Algorithms

This chapter is an introduction to some of the algorithms used for various GMP
operations. The code is likely to be hard to understand without knowing
something about the algorithms.

Some GMP internals are mentioned, but applications that expect to be
compatible with future GMP releases should take care to use only the
documented functions.

@menu
* Multiplication Algorithms::
* Division Algorithms::
* Greatest Common Divisor Algorithms::
* Powering Algorithms::
* Root Extraction Algorithms::
* Radix Conversion Algorithms::
* Other Algorithms::
* Assembler Coding::
@end menu


@node Multiplication Algorithms, Division Algorithms, Algorithms, Algorithms
@section Multiplication
@cindex Multiplication algorithms

N@cross{}N limb multiplications and squares are done using one of four
algorithms, as the size N increases.

@multitable {KaratsubaMMM} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item Algorithm @tab Threshold
@item Basecase @tab (none)
@item Karatsuba @tab @code{MUL_KARATSUBA_THRESHOLD}
@item Toom-3 @tab @code{MUL_TOOM3_THRESHOLD}
@item FFT @tab @code{MUL_FFT_THRESHOLD}
@end multitable

Similarly for squaring, with the @code{SQR} thresholds. Note though that the
FFT is only used if GMP is configured with @samp{--enable-fft}, @pxref{Build
Options}.

N@cross{}M multiplications of operands with different sizes above
@code{MUL_KARATSUBA_THRESHOLD} are currently done by splitting into M@cross{}M
pieces. The Karatsuba and Toom-3 routines then operate only on equal size
operands. This is not very efficient, and is slated for improvement in the
future.

@menu
* Basecase Multiplication::
* Karatsuba Multiplication::
* Toom-Cook 3-Way Multiplication::
* FFT Multiplication::
* Other Multiplication::
@end menu


@node Basecase Multiplication, Karatsuba Multiplication, Multiplication Algorithms, Multiplication Algorithms
@subsection Basecase Multiplication

Basecase N@cross{}M multiplication is a straightforward rectangular set of
cross-products, the same as long multiplication done by hand and for that
reason sometimes known as the schoolbook or grammar school method. This is an
@m{O(NM),O(N*M)} algorithm. See Knuth section 4.3.1 algorithm M
(@pxref{References}), and the @file{mpn/generic/mul_basecase.c} code.

Assembler implementations of @code{mpn_mul_basecase} are essentially the same
as the generic C code, but have all the usual assembler tricks and
obscurities introduced for speed.

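The rectangular set of cross-products can be sketched as below. This is only an illustration, assuming 32-bit limbs stored least significant first and a 64-bit accumulator; @code{sketch_mul_basecase} is a made-up name and is not the code in @file{mpn/generic/mul_basecase.c}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Schoolbook multiply: {rp, un+vn} = {up, un} * {vp, vn}.
   Limbs are 32 bits, least significant limb first.  rp must not
   overlap either input.  */
void
sketch_mul_basecase (uint32_t *rp, const uint32_t *up, size_t un,
                     const uint32_t *vp, size_t vn)
{
  size_t i, j;

  assert (un >= 1 && vn >= 1);

  for (i = 0; i < un + vn; i++)
    rp[i] = 0;

  /* One row of un cross-products for each limb of v.  */
  for (j = 0; j < vn; j++)
    {
      uint64_t carry = 0;
      for (i = 0; i < un; i++)
        {
          /* product + existing limb + carry is at most 2^64-1,
             so it cannot overflow the accumulator */
          uint64_t t = (uint64_t) up[i] * vp[j] + rp[i + j] + carry;
          rp[i + j] = (uint32_t) t;
          carry = t >> 32;
        }
      rp[j + un] = (uint32_t) carry;   /* this limb was still zero */
    }
}
```

The two nested loops make the @m{O(NM),O(N*M)} cost directly visible: every limb of one operand meets every limb of the other exactly once.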
A square can be done in roughly half the time of a multiply, by using the fact
that the cross products above and below the diagonal are the same. A triangle
of products below the diagonal is formed, doubled (left shift by one bit), and
then the products on the diagonal added. This can be seen in
@file{mpn/generic/sqr_basecase.c}. Again the assembler implementations take
essentially the same approach.

@example
@group
     u0  u1  u2  u3  u4
   +---+---+---+---+---+
u0 | d |   |   |   |   |
   +---+---+---+---+---+
u1 |   | d |   |   |   |
   +---+---+---+---+---+
u2 |   |   | d |   |   |
   +---+---+---+---+---+
u3 |   |   |   | d |   |
   +---+---+---+---+---+
u4 |   |   |   |   | d |
   +---+---+---+---+---+
@end group
@end example

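The triangle, double and diagonal steps for a square can be sketched as follows. This is an illustration only, assuming 32-bit limbs stored least significant first; @code{sketch_sqr_basecase} is a hypothetical name and the code is not taken from @file{mpn/generic/sqr_basecase.c}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Square {up, n} into {rp, 2n}.  rp must not overlap up.  */
void
sketch_sqr_basecase (uint32_t *rp, const uint32_t *up, size_t n)
{
  size_t i, j;
  uint32_t shifted_out = 0;
  uint64_t carry = 0;

  assert (n >= 1);

  for (i = 0; i < 2 * n; i++)
    rp[i] = 0;

  /* 1. Triangle of cross-products u[i]*u[j] for i < j.  */
  for (j = 1; j < n; j++)
    {
      uint64_t c = 0;
      for (i = 0; i < j; i++)
        {
          uint64_t t = (uint64_t) up[i] * up[j] + rp[i + j] + c;
          rp[i + j] = (uint32_t) t;
          c = t >> 32;
        }
      rp[2 * j] = (uint32_t) c;
    }

  /* 2. Double the triangle, a left shift by one bit.  */
  for (i = 0; i < 2 * n; i++)
    {
      uint32_t top = rp[i] >> 31;
      rp[i] = (rp[i] << 1) | shifted_out;
      shifted_out = top;
    }

  /* 3. Add the diagonal products u[i]^2 at limb offset 2*i.  */
  for (i = 0; i < n; i++)
    {
      uint64_t d = (uint64_t) up[i] * up[i];
      uint64_t lo = (uint64_t) rp[2 * i] + (uint32_t) d + carry;
      uint64_t hi = (uint64_t) rp[2 * i + 1] + (d >> 32) + (lo >> 32);
      rp[2 * i] = (uint32_t) lo;
      rp[2 * i + 1] = (uint32_t) hi;
      carry = hi >> 32;
    }
  assert (carry == 0 && shifted_out == 0);   /* result fits in 2n limbs */
}
```

Only about half as many limb multiplies are done as in a general N@cross{}N product, at the price of the extra shift and diagonal passes.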
In practice squaring isn't a full 2@cross{} faster than multiplying, it's
usually around 1.5@cross{}. Less than 1.5@cross{} probably indicates
@code{mpn_sqr_basecase} wants improving on that CPU.

On some CPUs @code{mpn_mul_basecase} can be faster than the generic C
@code{mpn_sqr_basecase}. @code{SQR_BASECASE_THRESHOLD} is the size at which
to use @code{mpn_sqr_basecase}, this will be zero if that routine should be
used always.


@node Karatsuba Multiplication, Toom-Cook 3-Way Multiplication, Basecase Multiplication, Multiplication Algorithms
@subsection Karatsuba Multiplication

The Karatsuba multiplication algorithm is described in Knuth section 4.3.3
part A, and various other textbooks. A brief description is given here.

The inputs @math{x} and @math{y} are treated as each split into two parts of
equal length (or the most significant part one limb shorter if N is odd).

@example
@group
 high              low
+----------+----------+
|    x1    |    x0    |
+----------+----------+

+----------+----------+
|    y1    |    y0    |
+----------+----------+
@end group
@end example

Let @math{b} be the power of 2 where the split occurs, ie.@: if @ms{x,0} is
@math{k} limbs (@ms{y,0} the same) then
@m{b=2\GMPraise{$k*$@code{mp\_bits\_per\_limb}}, b=2^(k*mp_bits_per_limb)}.
With that @m{x=x_1b+x_0,x=x1*b+x0} and @m{y=y_1b+y_0,y=y1*b+y0}, and the
following holds,

@display
@m{xy = (b^2+b)x_1y_1 - b(x_1-x_0)(y_1-y_0) + (b+1)x_0y_0,
x*y = (b^2+b)*x1*y1 - b*(x1-x0)*(y1-y0) + (b+1)*x0*y0}
@end display

This formula means doing only three multiplies of (N/2)@cross{}(N/2) limbs,
whereas a basecase multiply of N@cross{}N limbs is equivalent to four
multiplies of (N/2)@cross{}(N/2). The factors @math{(b^2+b)} etc represent
the positions where the three products must be added.

@example
@group
 high                              low
+--------+--------+ +--------+--------+
|      x1*y1      | |      x0*y0      |
+--------+--------+ +--------+--------+
          +--------+--------+
      add |      x1*y1      |
          +--------+--------+
          +--------+--------+
      add |      x0*y0      |
          +--------+--------+
          +--------+--------+
      sub | (x1-x0)*(y1-y0) |
          +--------+--------+
@end group
@end example

The term @m{(x_1-x_0)(y_1-y_0),(x1-x0)*(y1-y0)} is best calculated as an
absolute value, and the sign used to choose to add or subtract. Notice the
sum @m{\mathop{\rm high}(x_0y_0)+\mathop{\rm low}(x_1y_1),
high(x0*y0)+low(x1*y1)} occurs twice, so it's possible to do @m{5k,5*k} limb
additions, rather than @m{6k,6*k}, but in GMP extra function call overheads
outweigh the saving.

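A single level of the formula can be tried out on machine words. The sketch below assumes 32-bit operands split into 16-bit halves, so that @math{b=2^16}, and handles the middle term as an absolute value plus a sign as described. The name @code{sketch_karatsuba32} is made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* One level of Karatsuba: a 32x32 -> 64 bit product from three
   16x16 bit multiplies.  */
uint64_t
sketch_karatsuba32 (uint32_t x, uint32_t y)
{
  uint32_t x1 = x >> 16, x0 = x & 0xFFFF;
  uint32_t y1 = y >> 16, y0 = y & 0xFFFF;

  uint64_t hi = (uint64_t) x1 * y1;            /* x1*y1 */
  uint64_t lo = (uint64_t) x0 * y0;            /* x0*y0 */

  /* |x1-x0| * |y1-y0|, with the sign of the true product kept apart */
  uint32_t dx = x1 >= x0 ? x1 - x0 : x0 - x1;
  uint32_t dy = y1 >= y0 ? y1 - y0 : y0 - y1;
  int negative = (x1 >= x0) != (y1 >= y0);
  uint64_t mid = (uint64_t) dx * dy;

  /* (b^2+b)*x1*y1 + (b+1)*x0*y0, with b = 2^16 */
  uint64_t r = (hi << 32) + (hi << 16) + (lo << 16) + lo;

  /* - b*(x1-x0)*(y1-y0): add when the middle term is negative */
  if (negative)
    r += mid << 16;
  else
    r -= mid << 16;
  return r;
}
```

Unsigned 64-bit arithmetic wraps modulo @math{2^64}, and the true product is below that, so intermediate wraparound is harmless here. A multi-limb version has to propagate the corresponding carries and borrows explicitly.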
Squaring is similar to multiplying, but with @math{x=y} the formula reduces to
an equivalent with three squares,

@display
@m{x^2 = (b^2+b)x_1^2 - b(x_1-x_0)^2 + (b+1)x_0^2,
x^2 = (b^2+b)*x1^2 - b*(x1-x0)^2 + (b+1)*x0^2}
@end display

The final result is accumulated from those three squares the same way as for
the three multiplies above. The middle term @m{(x_1-x_0)^2,(x1-x0)^2} is now
always non-negative.

A similar formula for both multiplying and squaring can be constructed with a
middle term @m{(x_1+x_0)(y_1+y_0),(x1+x0)*(y1+y0)}. But those sums can exceed
@math{k} limbs, leading to more carry handling and additions than the form
above.

Karatsuba multiplication is asymptotically an @math{O(N^@W{1.585})} algorithm,
the exponent being @m{\log3/\log2,log(3)/log(2)}, representing 3 multiplies
each @math{1/2} the size of the inputs. This is a big improvement over the
basecase multiply at @math{O(N^2)} and the advantage soon overcomes the extra
additions Karatsuba performs. @code{MUL_KARATSUBA_THRESHOLD} can be as little
as 10 limbs. The @code{SQR} threshold is usually about twice the @code{MUL}.

The basecase algorithm will take a time of the form @m{M(N) = aN^2 + bN + c,
M(N) = a*N^2 + b*N + c} and the Karatsuba algorithm @m{K(N) = 3M(N/2) + dN +
e, K(N) = 3*M(N/2) + d*N + e}, which expands to @m{K(N) = {3\over4} aN^2 +
{3\over2} bN + 3c + dN + e, K(N) = 3/4*a*N^2 + 3/2*b*N + 3*c + d*N + e}. The
factor @m{3\over4, 3/4} for @math{a} means per-crossproduct speedups in the
basecase code will increase the threshold since they benefit @math{M(N)} more
than @math{K(N)}. And conversely the @m{3\over2, 3/2} for @math{b} means
linear style speedups of @math{b} will decrease the threshold since they
benefit @math{K(N)} more than @math{M(N)}. The latter can be seen for
instance when adding an optimized @code{mpn_sqr_diagonal} to
@code{mpn_sqr_basecase}. Of course all speedups reduce total time, and in
that sense the algorithm thresholds are merely of academic interest.

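The effect of the constants on the crossover can be explored numerically with the two time formulas. The following sketch simply searches for the first size at which the modelled @math{K(N)} is below the modelled @math{M(N)}. The function names are hypothetical, and the constants used in the usage note are arbitrary illustration values, not measurements of any CPU.

```c
#include <assert.h>

/* M(N) = a*N^2 + b*N + c, the modelled basecase time.  */
static double
model_basecase (double a, double b, double c, double n)
{
  return a * n * n + b * n + c;
}

/* K(N) = 3*M(N/2) + d*N + e, one level of Karatsuba over basecase.  */
static double
model_karatsuba (double a, double b, double c, double d, double e,
                 double n)
{
  return 3.0 * model_basecase (a, b, c, n / 2.0) + d * n + e;
}

/* Smallest N in 1..limit where the Karatsuba model is faster,
   or 0 if there is none in that range.  */
unsigned
model_threshold (double a, double b, double c, double d, double e,
                 unsigned limit)
{
  unsigned n;
  assert (a > 0.0);
  for (n = 1; n <= limit; n++)
    if (model_karatsuba (a, b, c, d, e, n) < model_basecase (a, b, c, n))
      return n;
  return 0;
}
```

With @math{a=1}, @math{b=10}, @math{c=50}, @math{d=20}, @math{e=100} the model crosses over at 108. Halving @math{a} moves it up to 208, and halving @math{b} instead moves it down to 99.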

@node Toom-Cook 3-Way Multiplication, FFT Multiplication, Karatsuba Multiplication, Multiplication Algorithms
@subsection Toom-Cook 3-Way Multiplication

The Karatsuba formula is the simplest case of a general approach to splitting
inputs that leads to both Toom-Cook and FFT algorithms. A description of
Toom-Cook can be found in Knuth section 4.3.3, with an example 3-way
calculation after Theorem A. The 3-way form used in GMP is described here.

The operands are each considered split into 3 pieces of equal length (or the
most significant part 1 or 2 limbs shorter than the others).

@example
@group
 high                         low
+----------+----------+----------+
|    x2    |    x1    |    x0    |
+----------+----------+----------+

+----------+----------+----------+
|    y2    |    y1    |    y0    |
+----------+----------+----------+
@end group
@end example

These parts are treated as the coefficients of two polynomials

@display
@m{X(t) = x_2t^2 + x_1t + x_0,
X(t) = x2*t^2 + x1*t + x0}
@m{Y(t) = y_2t^2 + y_1t + y_0,
Y(t) = y2*t^2 + y1*t + y0}
@end display

Again let @math{b} equal the power of 2 which is the size of the @ms{x,0},
@ms{x,1}, @ms{y,0} and @ms{y,1} pieces, ie.@: if they're @math{k} limbs each
then @m{b=2\GMPraise{$k*$@code{mp\_bits\_per\_limb}},
b=2^(k*mp_bits_per_limb)}. With this @math{x=X(b)} and @math{y=Y(b)}.

Let a polynomial @m{W(t)=X(t)Y(t),W(t)=X(t)*Y(t)} and suppose its coefficients
are

@display
@m{W(t) = w_4t^4 + w_3t^3 + w_2t^2 + w_1t + w_0,
W(t) = w4*t^4 + w3*t^3 + w2*t^2 + w1*t + w0}
@end display

The @m{w_i,w[i]} are going to be determined, and when they are they'll give
the final result using @math{w=W(b)}, since
@m{xy=X(b)Y(b)=W(b),x*y=X(b)*Y(b)=W(b)}. The coefficients will be roughly
@math{b^2} each, and the final @math{W(b)} will be an addition like,

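@example
@group
 high                                         low
+-------+-------+
|       w4      |
+-------+-------+
        +-------+-------+
        |       w3      |
        +-------+-------+
                +-------+-------+
                |       w2      |
                +-------+-------+
                        +-------+-------+
                        |       w1      |
                        +-------+-------+
                                +-------+-------+
                                |       w0      |
                                +-------+-------+
@end group
@end example
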
The @m{w_i,w[i]} coefficients could be formed by a simple set of cross
products, like @m{w_4=x_2y_2,w4=x2*y2}, @m{w_3=x_2y_1+x_1y_2,w3=x2*y1+x1*y2},
@m{w_2=x_2y_0+x_1y_1+x_0y_2,w2=x2*y0+x1*y1+x0*y2} etc, but this would need all
nine @m{x_iy_j,x[i]*y[j]} for @math{i,j=0,1,2}, and would be equivalent merely
to a basecase multiply. Instead the following approach is used.

@math{X(t)} and @math{Y(t)} are evaluated and multiplied at 5 points, giving
values of @math{W(t)} at those points. The points used can be chosen in
various ways, but in GMP the following are used

@multitable {@m{t=\infty,t=inf}M} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item Point @tab Value
@item @math{t=0} @tab @m{x_0y_0,x0*y0}, which gives @ms{w,0} immediately
@item @math{t=2} @tab @m{(4x_2+2x_1+x_0)(4y_2+2y_1+y_0),(4*x2+2*x1+x0)*(4*y2+2*y1+y0)}
@item @math{t=1} @tab @m{(x_2+x_1+x_0)(y_2+y_1+y_0),(x2+x1+x0)*(y2+y1+y0)}
@item @m{t={1\over2},t=1/2} @tab @m{(x_2+2x_1+4x_0)(y_2+2y_1+4y_0),(x2+2*x1+4*x0)*(y2+2*y1+4*y0)}
@item @m{t=\infty,t=inf} @tab @m{x_2y_2,x2*y2}, which gives @ms{w,4} immediately
@end multitable

At @m{t={1\over2},t=1/2} the value calculated is actually
@m{16X({1\over2})Y({1\over2}), 16*X(1/2)*Y(1/2)}, giving a value for
@m{16W({1\over2}),16*W(1/2)}, and this is always an integer. At
@m{t=\infty,t=inf} the value is actually @m{\lim_{t\to\infty} {X(t)Y(t)\over
t^4}, X(t)*Y(t)/t^4 in the limit as t approaches infinity}, but it's much
easier to think of as simply @m{x_2y_2,x2*y2} giving @ms{w,4} immediately
(much like @m{x_0y_0,x0*y0} at @math{t=0} gives @ms{w,0} immediately).

Now each of the points substituted into
@m{W(t)=w_4t^4+\cdots+w_0,W(t)=w4*t^4+@dots{}+w0} gives a linear combination
of the @m{w_i,w[i]} coefficients, and the value of those combinations has just
been calculated.

@example
@group
     W(0) =                                 w0
16*W(1/2) =    w4 + 2*w3 + 4*w2 + 8*w1 + 16*w0
     W(1) =    w4 +   w3 +   w2 +   w1 +    w0
     W(2) = 16*w4 + 8*w3 + 4*w2 + 2*w1 +    w0
   W(inf) =    w4
@end group
@end example

This is a set of five equations in five unknowns, and some elementary linear
algebra quickly isolates each @m{w_i,w[i]}, by subtracting multiples of one
equation from another.

In the code the set of five values @math{W(0)},@dots{},@m{W(\infty),W(inf)}
will represent those certain linear combinations. By adding or subtracting
one from another as necessary, values which are each @m{w_i,w[i]} alone are
arrived at. This involves only a few subtractions of small multiples (some of
which are powers of 2), and so is fast. A couple of divisions remain by
powers of 2 and one division by 3 (or by 6 rather), and that last uses the
special @code{mpn_divexact_by3} (@pxref{Exact Division}).

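The evaluate, multiply and interpolate sequence can be tried on small numbers. The sketch below uses plain @code{int64_t} values in place of multi-limb pieces, and one possible elimination order for the five equations. That order is an assumption chosen for clarity and is not claimed to be the sequence used in the GMP code. @code{sketch_toom3} is a made-up name.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply X(t) = x[2]*t^2 + x[1]*t + x[0] by Y(t) (same layout),
   giving the coefficients w[0..4] of W(t) from five pointwise
   multiplies.  Inputs must be small enough not to overflow.  */
void
sketch_toom3 (int64_t w[5], const int64_t x[3], const int64_t y[3])
{
  /* Evaluate and multiply at t = 0, 1/2 (scaled by 16), 1, 2, inf.  */
  int64_t W0    = x[0] * y[0];
  int64_t Whalf = (x[2] + 2*x[1] + 4*x[0]) * (y[2] + 2*y[1] + 4*y[0]);
  int64_t W1    = (x[2] + x[1] + x[0]) * (y[2] + y[1] + y[0]);
  int64_t W2    = (4*x[2] + 2*x[1] + x[0]) * (4*y[2] + 2*y[1] + y[0]);
  int64_t Winf  = x[2] * y[2];

  /* Remove the known w0 and w4 contributions.  */
  int64_t a = Whalf - Winf - 16*W0;    /* 2*w3 + 4*w2 + 8*w1 */
  int64_t b = W1 - Winf - W0;          /*   w3 +   w2 +   w1 */
  int64_t c = W2 - 16*Winf - W0;       /* 8*w3 + 4*w2 + 2*w1 */

  int64_t s, d;

  /* Both divisions below are exact.  */
  assert ((a + c) % 2 == 0 && (c - a) % 6 == 0);
  s = (a + c) / 2 - 4*b;               /* w3 + w1 */
  d = (c - a) / 6;                     /* w3 - w1, the division by 6 */

  w[0] = W0;
  w[1] = (s - d) / 2;
  w[2] = b - s;
  w[3] = (s + d) / 2;
  w[4] = Winf;
}
```

For example @math{X(t)=t^2+2t+3} times @math{Y(t)=4t^2+5t+6} gives the coefficients 18, 27, 28, 13, 4 from low to high, the same as the nine cross products would.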
In the code the values @ms{w,4}, @ms{w,2} and @ms{w,0} are formed in the
destination with pointers @code{E}, @code{C} and @code{A}, and @ms{w,3} and
@ms{w,1} in temporary space @code{D} and @code{B} are added to them. There
are extra limbs @code{tD}, @code{tC} and @code{tB} at the high end of
@ms{w,3}, @ms{w,2} and @ms{w,1} which are handled separately. The final
addition then is as follows.

@example
@group
 high                                         low
+-------+-------+-------+-------+-------+-------+
|       E       |       C       |       A       |
+-------+-------+-------+-------+-------+-------+
         +------+-------++------+-------+
         |      D       ||      B       |
         +------+-------++------+-------+
      --      --      --
     |tD|    |tC|    |tB|
      --      --      --
@end group
@end example

The conversion of @math{W(t)} values to the coefficients is interpolation. A
polynomial of degree 4 like @math{W(t)} is uniquely determined by values known
at 5 different points. The points can be chosen to make the linear equations
come out with a convenient set of steps for isolating the @m{w_i,w[i]}.

In @file{mpn/generic/mul_n.c} the @code{interpolate3} routine performs the
interpolation. The open-coded one-pass version may be a bit hard to
understand, the steps performed can be better seen in the @code{USE_MORE_MPN}
version.

Squaring follows the same procedure as multiplication, but there's only one
@math{X(t)} and it's evaluated at 5 points, and those values squared to give
values of @math{W(t)}. The interpolation is then identical, and in fact the
same @code{interpolate3} subroutine is used for both squaring and multiplying.

Toom-3 is asymptotically @math{O(N^@W{1.465})}, the exponent being
@m{\log5/\log3,log(5)/log(3)}, representing 5 recursive multiplies of 1/3 the
original size. This is an improvement over Karatsuba at @math{O(N^@W{1.585})},
though Toom-Cook does more work in the evaluation and interpolation and so it
only realizes its advantage above a certain size.

Near the crossover between Toom-3 and Karatsuba there's generally a range of
sizes where the difference between the two is small.
@code{MUL_TOOM3_THRESHOLD} is a somewhat arbitrary point in that range and
successive runs of the tune program can give different values due to small
variations in measuring. A graph of time versus size for the two shows the
effect, see @file{tune/README}.

At the fairly small sizes where the Toom-3 thresholds occur it's worth
remembering that the asymptotic behaviour for Karatsuba and Toom-3 can't be
expected to make accurate predictions, due of course to the big influence of
all sorts of overheads, and the fact that only a few recursions of each are
being performed. Even at large sizes there's a good chance machine dependent
effects like cache architecture will mean actual performance deviates from
what might be predicted.

The formula given above for the Karatsuba algorithm has an equivalent for
Toom-3 involving only five multiplies, but this would be complicated and
unenlightening.

An alternate view of Toom-3 can be found in Zuras (@pxref{References}), using
a vector to represent the @math{x} and @math{y} splits and a matrix
multiplication for the evaluation and interpolation stages. The matrix
inverses are not meant to be actually used, and they have elements with values
much greater than in fact arise in the interpolation steps. The diagram shown
for the 3-way is attractive, but again doesn't have to be implemented that way
and for example with a bit of rearrangement just one division by 6 can be
done.


@node FFT Multiplication, Other Multiplication, Toom-Cook 3-Way Multiplication, Multiplication Algorithms
@subsection FFT Multiplication

At large to very large sizes a Fermat style FFT multiplication is used,
following Sch@"onhage and Strassen (@pxref{References}). Descriptions of FFTs
in various forms can be found in many textbooks, for instance Knuth section
4.3.3 part C or Lipson chapter IX. A brief description of the form used in
GMP is given here.

The multiplication done is @m{xy \bmod 2^N+1, x*y mod 2^N+1}, for a given
@math{N}. A full product @m{xy,x*y} is obtained by choosing @m{N \ge
\mathop{\rm bits}(x)+\mathop{\rm bits}(y), N>=bits(x)+bits(y)} and padding
@math{x} and @math{y} with high zero limbs. The modular product is the native
form for the algorithm, so padding to get a full product is unavoidable.

The algorithm follows a split, evaluate, pointwise multiply, interpolate and
combine similar to that described above for Karatsuba and Toom-3. A @math{k}
parameter controls the split, with an FFT-@math{k} splitting into @math{2^k}
pieces of @math{M=N/2^k} bits each. @math{N} must be a multiple of
@m{2^k\times@code{mp\_bits\_per\_limb}, (2^k)*@nicode{mp_bits_per_limb}} so
the split falls on limb boundaries, avoiding bit shifts in the split and
combine stages.

The evaluations, pointwise multiplications, and interpolation, are all done
modulo @m{2^{N'}+1, 2^N'+1} where @math{N'} is @math{2M+k+3} rounded up to a
multiple of @math{2^k} and of @code{mp_bits_per_limb}. The results of
interpolation will be the following negacyclic convolution of the input
pieces, and the choice of @math{N'} ensures these sums aren't truncated.

@tex
$$ w_n = \sum_{{i+j = b2^k+n}\atop{b=0,1}} (-1)^b x_i y_j $$
@end tex
@ifnottex

@example
           ---
           \         b
w[n] =     /     (-1) * x[i] * y[j]
           ---
       i+j==b*2^k+n
          b=0,1
@end example

@end ifnottex

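The negacyclic convolution can be written out directly as a double loop, which is useful for checking what the transforms are meant to produce. The sketch below is such a direct @math{O(n^2)} evaluation on small integers. It is not an FFT, and @code{sketch_negacyclic} is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* w[n] = sum of (-1)^b * x[i] * y[j] over i+j == b*len + n, b = 0,1,
   where len is the number of pieces (2^k in the FFT).  */
void
sketch_negacyclic (int64_t *w, const int64_t *x, const int64_t *y,
                   size_t len)
{
  size_t i, j;

  assert (len >= 1);

  for (i = 0; i < len; i++)
    w[i] = 0;

  for (i = 0; i < len; i++)
    for (j = 0; j < len; j++)
      {
        if (i + j < len)
          w[i + j] += x[i] * y[j];          /* b = 0 */
        else
          w[i + j - len] -= x[i] * y[j];    /* b = 1, wraps negated */
      }
}
```

The terms that wrap around come back with a minus sign, which is what working modulo a number of the form @math{2^N+1} gives, since there @math{2^N} is congruent to @math{-1}.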
The points used for the evaluation are @math{g^i} for @math{i=0} to
@math{2^k-1} where @m{g=2^{2N'/2^k}, g=2^(2N'/2^k)}. @math{g} is a
@m{2^k,2^k'}th root of unity mod @m{2^{N'}+1,2^N'+1}, which produces necessary
cancellations at the interpolation stage, and it's also a power of 2 so the
fast Fourier transforms used for the evaluation and interpolation do only
shifts, adds and negations.

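The root of unity property is easy to check on a toy size. The sketch below assumes a modulus small enough for 64-bit arithmetic, for instance @math{N'=16} and @math{k=3}, giving @math{g=2^4=16} modulo @math{2^16+1}. Both function names are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* base^exp mod m by square and multiply.  m must be below 2^32 so
   that every product fits in 64 bits.  */
uint64_t
sketch_powmod (uint64_t base, unsigned exp, uint64_t m)
{
  uint64_t r = 1;

  assert (m > 1 && m < ((uint64_t) 1 << 32));

  base %= m;
  while (exp != 0)
    {
      if (exp & 1)
        r = r * base % m;
      base = base * base % m;
      exp >>= 1;
    }
  return r;
}

/* g = 2^(2*N'/2^k).  2^k must divide 2*N', and the result is kept
   below 2^32 for use with sketch_powmod.  */
uint64_t
sketch_fft_root (unsigned nprime, unsigned k)
{
  unsigned shift;

  assert (k < 31 && (2 * nprime) % (1u << k) == 0);
  shift = 2 * nprime / (1u << k);
  assert (shift < 32);
  return (uint64_t) 1 << shift;
}
```

With those toy values @math{g^8} is 1 and @math{g^4} is @math{2^16}, which is @math{-1} modulo @math{2^16+1}, so @math{g} has order exactly 8.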
The pointwise multiplications are done modulo @m{2^{N'}+1, 2^N'+1} and either
recurse into a further FFT or use a plain multiplication (Toom-3, Karatsuba or
basecase), whichever is optimal at the size @math{N'}. The interpolation is
an inverse fast Fourier transform. The resulting set of sums of @m{x_iy_j,
x[i]*y[j]} are added at appropriate offsets to give the final result.

Squaring is the same, but @math{x} is the only input so it's one transform at
the evaluate stage and the pointwise multiplies are squares. The
interpolation is the same.

For a mod @math{2^N+1} product, an FFT-@math{k} is an @m{O(N^{k/(k-1)}),
O(N^(k/(k-1)))} algorithm, the exponent representing @math{2^k} recursed
modular multiplies each @m{1/2^{k-1},1/2^(k-1)} the size of the original.
Each successive @math{k} is an asymptotic improvement, but overheads mean each
is only faster at bigger and bigger sizes. In the code, @code{MUL_FFT_TABLE}
and @code{SQR_FFT_TABLE} are the thresholds where each @math{k} is used. Each
new @math{k} effectively swaps some multiplying for some shifts, adds and
overheads.

A mod @math{2^N+1} product can be formed with a normal
@math{N@cross{}N@rightarrow{}2N} bit multiply plus a subtraction, so an FFT
and Toom-3 etc can be compared directly. A @math{k=4} FFT at
@math{O(N^@W{1.333})} can be expected to be the first faster than Toom-3 at
@math{O(N^@W{1.465})}. In practice this is what's found, with
@code{MUL_FFT_MODF_THRESHOLD} and @code{SQR_FFT_MODF_THRESHOLD} being between
300 and 1000 limbs, depending on the CPU. So far it's been found that only
very large FFTs recurse into pointwise multiplies above these sizes.

When an FFT is to give a full product, the change of @math{N} to @math{2N}
doesn't alter the theoretical complexity for a given @math{k}, but for the
purposes of considering where an FFT might be first used it can be assumed
that the FFT is recursing into a normal multiply and that on that basis it's
doing @math{2^k} recursed multiplies each @m{1/2^{k-2},1/2^(k-2)} the size of
the inputs, making it @m{O(N^{k/(k-2)}), O(N^(k/(k-2)))}. This would mean
@math{k=7} at @math{O(N^@W{1.4})} would be the first FFT faster than Toom-3.
In practice @code{MUL_FFT_THRESHOLD} and @code{SQR_FFT_THRESHOLD} have been
found to be in the @math{k=8} range, somewhere between 3000 and 10000 limbs.

The way @math{N} is split into @math{2^k} pieces and then @math{2M+k+3} is
rounded up to a multiple of @math{2^k} and @code{mp_bits_per_limb} means that
when @math{2^k@ge{}@nicode{mp\_bits\_per\_limb}} the effective @math{N} is a
multiple of @m{2^{2k-1},2^(2k-1)} bits. The @math{+k+3} means some values of
@math{N} just under such a multiple will be rounded to the next. The
complexity calculations above assume that a favourable size is used, meaning
one which isn't padded through rounding, and it's also assumed that the extra
@math{+k+3} bits are negligible at typical FFT sizes.

The practical effect of the @m{2^{2k-1},2^(2k-1)} constraint is to introduce a
step-effect into measured speeds. For example @math{k=8} will round @math{N}
up to a multiple of 32768 bits, so for a 32-bit limb there'll be 512 limb
groups of sizes for which @code{mpn_mul_n} runs at the same speed. Or for
@math{k=9} groups of 2048 limbs, @math{k=10} groups of 8192 limbs, etc. In
practice it's been found each @math{k} is used at quite small multiples of its
size constraint and so the step effect is quite noticeable in a time versus
size graph.

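The group sizes quoted can be reproduced with a little arithmetic. The sketch below assumes, as for a full product, that @math{N} covers the bits of both operands, so that a step of @m{2^{2k-1},2^(2k-1)} bits in @math{N} is half that many bits per operand. The function name is hypothetical.

```c
#include <assert.h>

/* Width, in limbs per operand, of the groups of sizes that share one
   effective N for an FFT-k giving a full product.  Assumes
   2^k >= bits_per_limb, so N is a multiple of 2^(2k-1) bits.  */
unsigned long
sketch_fft_step_limbs (unsigned k, unsigned bits_per_limb)
{
  unsigned long n_multiple;

  assert (k >= 1 && 2 * k - 1 < 32);
  assert (bits_per_limb > 0 && (1ul << k) >= bits_per_limb);

  n_multiple = 1ul << (2 * k - 1);           /* bits in N */
  return n_multiple / (2ul * bits_per_limb); /* limbs per operand */
}
```

For 32-bit limbs this gives 512 at @math{k=8}, 2048 at @math{k=9} and 8192 at @math{k=10}, each step being four times the previous one.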
The threshold determinations currently measure at the mid-points of size
steps, but this is sub-optimal since at the start of a new step it can happen
that it's better to go back to the previous @math{k} for a while. Something
more sophisticated for @code{MUL_FFT_TABLE} and @code{SQR_FFT_TABLE} will be
needed.

7392
@node Other Multiplication, , FFT Multiplication, Multiplication Algorithms
7393
@subsection Other Multiplication
7395
The 3-way Toom-Cook algorithm described above (@pxref{Toom-Cook 3-Way
7396
Multiplication}) generalizes to split into an arbitrary number of pieces, as
7397
per Knuth section 4.3.3 algorithm C. This is not currently used, though it's
7398
possible a Toom-4 might fit in between Toom-3 and the FFTs. The notes here
7399
are merely for interest.
7401
In general a split into @math{r+1} pieces is made, and evaluations and
7402
pointwise multiplications done at @m{2r+1,2*r+1} points. A 4-way split does 7
7403
pointwise multiplies, 5-way does 9, etc. Asymptotically an @math{(r+1)}-way
7404
algorithm is @m{O(N^{\log(2r+1)/\log(r+1)}), O(N^(log(2*r+1)/log(r+1)))}.  Only
7405
the pointwise multiplications count towards big-@math{O} complexity, but the
7406
time spent in the evaluate and interpolate stages grows with @math{r} and has
7407
a significant practical impact, with the asymptotic advantage of each @math{r}
7408
realized only at bigger and bigger sizes. The overheads grow as
7409
@m{O(Nr),O(N*r)}, whereas in an @math{r=2^k} FFT they grow only as @m{O(N \log r), O(N*log(r))}.
7412
Knuth algorithm C evaluates at points 0,1,2,@dots{},@m{2r,2*r}, but exercise 4
7413
uses @math{-r},@dots{},0,@dots{},@math{r} and the latter saves some small
7414
multiplies in the evaluate stage (or rather trades them for additions), and
7415
has a further saving of nearly half the interpolate steps. The idea is to
7416
separate odd and even final coefficients and then perform algorithm C steps C7
7417
and C8 on them separately. The divisors at step C7 become @math{j^2} and the
7418
multipliers at C8 become @m{2tj-j^2,2*t*j-j^2}.
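
As an illustration only, an @math{(r+1)}-way multiply using the symmetric
points @math{-r},@dots{},@math{r} can be sketched in Python.  This is not
code from GMP: the name @code{toom_multiply} and its parameters are made up
for the example, and the interpolation uses Lagrange's formula over exact
fractions rather than Knuth's steps C7 and C8, which keeps the sketch short
at the cost of speed.

```python
from fractions import Fraction

def toom_multiply(x, y, r, piece_bits):
    # Requires 0 <= x, y < 2^(piece_bits*(r+1)).
    # Split x and y into r+1 pieces, least significant piece first.
    mask = (1 << piece_bits) - 1
    xs = [(x >> (piece_bits * i)) & mask for i in range(r + 1)]
    ys = [(y >> (piece_bits * i)) & mask for i in range(r + 1)]
    points = list(range(-r, r + 1))            # the 2r+1 evaluation points

    def evaluate(coeffs, t):
        return sum(c * t**i for i, c in enumerate(coeffs))

    # The 2r+1 pointwise multiplications.
    values = [evaluate(xs, t) * evaluate(ys, t) for t in points]

    # Interpolate the degree 2r product polynomial (Lagrange form).
    coeffs = [Fraction(0)] * (2 * r + 1)
    for j, tj in enumerate(points):
        basis = [Fraction(1)]                  # coefficients, low first
        for m, tm in enumerate(points):
            if m == j:
                continue
            scale = Fraction(1, tj - tm)
            new = [Fraction(0)] * (len(basis) + 1)
            for i, c in enumerate(basis):      # multiply by (t - tm) * scale
                new[i] -= c * tm * scale
                new[i + 1] += c * scale
            basis = new
        for i, c in enumerate(basis):
            coeffs[i] += values[j] * c

    # Recombine the (integer valued) coefficients at piece boundaries.
    return sum(int(c) << (piece_bits * i) for i, c in enumerate(coeffs))
```

A 4-way split is @code{toom_multiply(x, y, 3, piece_bits)} and performs 7
pointwise multiplies, matching the count given above.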
7420
Splitting odd and even parts through positive and negative points can be
7421
thought of as using @math{-1} as a square root of unity. If a 4th root of
7422
unity was available then a further split and speedup would be possible, but no
7423
such root exists for plain integers. Going to complex integers with
7424
@m{i=\sqrt{-1}, i=sqrt(-1)} doesn't help, essentially because in cartesian
7425
form it takes three real multiplies to do a complex multiply. The existence
7426
of @m{2^k,2^k'}th roots of unity in a suitable ring or field lets the fast
7427
Fourier transform keep splitting and get to @m{O(N \log r), O(N*log(r))}.
7429
Floating point FFTs use complex numbers approximating Nth roots of unity.
7430
Some processors have special support for such FFTs. But these are not used in
7431
GMP since it's very difficult to guarantee an exact result (to some number of
7432
bits). An occasional difference of 1 in the last bit might not matter to a
7433
typical signal processing algorithm, but is of course of vital importance to GMP.
7437
@node Division Algorithms, Greatest Common Divisor Algorithms, Multiplication Algorithms, Algorithms
7438
@section Division Algorithms
7439
@cindex Division algorithms

@menu
* Single Limb Division::
* Basecase Division::
* Divide and Conquer Division::
* Exact Division::
* Exact Remainder::
* Small Quotient Division::
@end menu


@node Single Limb Division, Basecase Division, Division Algorithms, Division Algorithms
7452
@subsection Single Limb Division
7454
N@cross{}1 division is implemented using repeated 2@cross{}1 divisions from
7455
high to low, either with a hardware divide instruction or a multiplication by
7456
inverse, whichever is best on a given CPU.
7458
The multiply by inverse follows section 8 of ``Division by Invariant Integers
7459
using Multiplication'' by Granlund and Montgomery (@pxref{References}) and is
7460
implemented as @code{udiv_qrnnd_preinv} in @file{gmp-impl.h}. The idea is to
7461
have a fixed-point approximation to @math{1/d} (see @code{invert_limb}) and
7462
then multiply by the high limb (plus one bit) of the dividend to get a
7463
quotient @math{q}. With @math{d} normalized (high bit set), @math{q} is no
7464
more than 1 too small. Subtracting @m{qd,q*d} from the dividend gives a
7465
remainder, and reveals whether @math{q} or @math{q+1} is correct.
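
A simplified Python sketch of the idea follows, assuming a 32-bit limb.  It
is not the @code{udiv_qrnnd_preinv} macro itself: the function names are
illustrative, the ``plus one bit'' refinement is omitted, and so the estimate
may need more than one upward correction.

```python
LIMB_BITS = 32
B = 1 << LIMB_BITS

def invert_limb(d):
    # Fixed-point reciprocal floor((B*B - 1) / d) - B, for normalized d.
    assert d >> (LIMB_BITS - 1) == 1
    return (B * B - 1) // d - B

def div_2by1_preinv(nh, nl, d, dinv):
    # Divide the two-limb value nh*B + nl by d, requiring nh < d.
    n = nh * B + nl
    q = nh + ((nh * dinv) >> LIMB_BITS)    # estimate, never too big
    r = n - q * d
    while r >= d:                          # step the estimate upwards
        q += 1
        r -= d
    return q, r
```

The inverse is computed once per divisor and then reused for every limb of
the dividend, which is why its cost is only recovered on longer inputs.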
7467
The result is a division done with two multiplications and four or five
7468
arithmetic operations. On CPUs with low latency multipliers this can be much
7469
faster than a hardware divide, though the cost of calculating the inverse at
7470
the start may mean it's only better on inputs bigger than say 4 or 5 limbs.
7472
When a divisor must be normalized, either for the generic C
7473
@code{__udiv_qrnnd_c} or the multiply by inverse, the division performed is
7474
actually @m{a2^k,a*2^k} by @m{d2^k,d*2^k} where @math{a} is the dividend and
7475
@math{k} is the power necessary to have the high bit of @m{d2^k,d*2^k} set.
7476
The bit shifts for the dividend are usually accomplished ``on the fly''
7477
meaning by extracting the appropriate bits at each step. Done this way the
7478
quotient limbs come out aligned ready to store. When only the remainder is
7479
wanted, an alternative is to take the dividend limbs unshifted and calculate
7480
@m{r = a \bmod d2^k, r = a mod d*2^k} followed by an extra final step @m{r2^k
7481
\bmod d2^k, r*2^k mod d*2^k}. This can help on CPUs with poor bit shifts or
7484
The multiply by inverse can be done two limbs at a time. The calculation is
7485
basically the same, but the inverse is two limbs and the divisor treated as if
7486
padded with a low zero limb. This means more work, since the inverse will
7487
need a 2@cross{}2 multiply, but the four 1@cross{}1s to do that are
7488
independent and can therefore be done partly or wholly in parallel. Likewise
7489
for a 2@cross{}1 calculating @m{qd,q*d}. The net effect is to process two
7490
limbs with roughly the same two multiplies worth of latency that one limb at a
7491
time gives. This extends to 3 or 4 limbs at a time, though the extra work to
7492
apply the inverse will almost certainly soon reach the limits of multiplier throughput.
7495
A similar approach in reverse can be taken to process just half a limb at a
7496
time if the divisor is only a half limb. In this case the 1@cross{}1 multiply
7497
for the inverse effectively becomes two @m{{1\over2}\times1, (1/2)x1} for each
7498
limb, which can be a saving on CPUs with a fast half limb multiply, or in fact
7499
if the only multiply is a half limb, and especially if it's not pipelined.
7502
@node Basecase Division, Divide and Conquer Division, Single Limb Division, Division Algorithms
7503
@subsection Basecase Division
7505
Basecase N@cross{}M division is like long division done by hand, but in base
7506
@m{2\GMPraise{@code{mp\_bits\_per\_limb}}, 2^mp_bits_per_limb}. See Knuth
7507
section 4.3.1 algorithm D, and @file{mpn/generic/sb_divrem_mn.c}.
7509
Briefly stated, while the dividend remains larger than the divisor, a high
7510
quotient limb is formed and the M@cross{}1 product @m{qd,q*d} subtracted at
7511
the top end of the dividend. With a normalized divisor (most significant bit
7512
set), each quotient limb can be formed with a 2@cross{}1 division and a
7513
1@cross{}1 multiplication plus some subtractions. The 2@cross{}1 division is
7514
by the high limb of the divisor and is done either with a hardware divide or a
7515
multiply by inverse (the same as in @ref{Single Limb Division}) whichever is
7516
faster. Such a quotient is sometimes one too big, requiring an addback of the
7517
divisor, but that happens rarely.
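
The loop structure can be sketched in Python as below, again assuming a
32-bit limb.  This is a teaching sketch, not @file{sb_divrem_mn.c}: the
running remainder is held in a Python integer rather than a limb vector, the
trial quotient uses only the top limb of the divisor (so by Knuth 4.3.1
Theorem B it can be up to two too big), and the correction is made by
decrementing before the subtraction instead of by an addback afterwards.

```python
LIMB_BITS = 32
B = 1 << LIMB_BITS

def basecase_divrem(a_limbs, d_limbs):
    # Limbs are least significant first.  The top limb of the divisor
    # must have its high bit set (normalized).
    m = len(d_limbs)
    d = sum(limb << (LIMB_BITS * i) for i, limb in enumerate(d_limbs))
    d_top = d_limbs[-1]
    assert d_top >> (LIMB_BITS - 1) == 1
    rem = 0
    quotient = []
    for limb in reversed(a_limbs):              # high to low
        rem = rem * B + limb
        # Trial quotient from the high limbs of rem and the top limb of d.
        q = min((rem >> (LIMB_BITS * (m - 1))) // d_top, B - 1)
        while q * d > rem:                      # trial quotient too big
            q -= 1
        rem -= q * d
        quotient.append(q)
    quotient.reverse()                          # least significant first
    return quotient, rem
```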
7519
With Q=N@minus{}M being the number of quotient limbs, this is an
7520
@m{O(QM),O(Q*M)} algorithm and will run at a speed similar to a basecase
7521
Q@cross{}M multiplication, differing in fact only in the extra multiply and
7522
divide for each of the Q quotient limbs.
7525
@node Divide and Conquer Division, Exact Division, Basecase Division, Division Algorithms
7526
@subsection Divide and Conquer Division
7528
For divisors larger than @code{DIV_DC_THRESHOLD}, division is done by dividing.
7529
Or to be precise by a recursive divide and conquer algorithm based on work by
7530
Moenck and Borodin, Jebelean, and Burnikel and Ziegler (@pxref{References}).
7532
The algorithm consists essentially of recognising that a 2N@cross{}N division
7533
can be done with the basecase division algorithm (@pxref{Basecase Division}),
7534
but using N/2 limbs as a base, not just a single limb. This way the
7535
multiplications that arise are (N/2)@cross{}(N/2) and can take advantage of
7536
Karatsuba and higher multiplication algorithms (@pxref{Multiplication
7537
Algorithms}). The ``digits'' of the quotient are formed by recursive
7538
N@cross{}(N/2) divisions.
7540
If the (N/2)@cross{}(N/2) multiplies are done with a basecase multiplication
7541
then the work is about the same as a basecase division, but with more function
7542
call overheads and with some subtractions separated from the multiplies.
7543
These overheads mean that it's only when N/2 is above
7544
@code{MUL_KARATSUBA_THRESHOLD} that divide and conquer is of use.
7546
@code{DIV_DC_THRESHOLD} is based on the divisor size N, so it will be somewhere
7547
above twice @code{MUL_KARATSUBA_THRESHOLD}, but how much above depends on the
7548
CPU. An optimized @code{mpn_mul_basecase} can lower @code{DIV_DC_THRESHOLD} a
7549
little by offering a ready-made advantage over repeated @code{mpn_submul_1} calls.
7552
Divide and conquer is asymptotically @m{O(M(N)\log N),O(M(N)*log(N))} where
7553
@math{M(N)} is the time for an N@cross{}N multiplication done with FFTs. The
7554
actual time is a sum over multiplications of the recursed sizes, as can be
7555
seen near the end of section 2.2 of Burnikel and Ziegler. For example, within
7556
the Toom-3 range, divide and conquer is @m{2.63M(N), 2.63*M(N)}. With higher
7557
algorithms the @math{M(N)} term improves and the multiplier tends to @m{\log
7558
N, log(N)}. In practice, at moderate to large sizes, a 2N@cross{}N division
7559
is about 2 to 4 times slower than an N@cross{}N multiplication.
7561
Newton's method used for division is asymptotically @math{O(M(N))} and should
7562
therefore be superior to divide and conquer, but it's believed this would only
7563
be for large to very large N.
7566
@node Exact Division, Exact Remainder, Divide and Conquer Division, Division Algorithms
7567
@subsection Exact Division
7569
A so-called exact division is when the dividend is known to be an exact
7570
multiple of the divisor. Jebelean's exact division algorithm uses this
7571
knowledge to make some significant optimizations (@pxref{References}).
7573
The idea can be illustrated in decimal for example with 368154 divided by
7574
543. Because the low digit of the dividend is 4, the low digit of the
7575
quotient must be 8. This is arrived at from @m{4 \mathord{\times} 7 \bmod 10,
7576
4*7 mod 10}, using the fact 7 is the modular inverse of 3 (the low digit of
7577
the divisor), since @m{3 \mathord{\times} 7 \mathop{\equiv} 1 \bmod 10, 3*7
7578
@equiv{} 1 mod 10}. So @m{8\mathord{\times}543 = 4344,8*543=4344} can be
7579
subtracted from the dividend leaving 363810.  Notice the low digit has become zero.
7582
The procedure is repeated at the second digit, with the next quotient digit 7
7583
(@m{1 \mathord{\times} 7 \bmod 10, 7 @equiv{} 1*7 mod 10}), subtracting
7584
@m{7\mathord{\times}543 = 3801,7*543=3801}, leaving 325800. And finally at
7585
the third digit with quotient digit 6 (@m{8 \mathord{\times} 7 \bmod 10, 8*7
7586
mod 10}), subtracting @m{6\mathord{\times}543 = 3258,6*543=3258} leaving 0.
7587
So the quotient is 678.
7589
Notice however that the multiplies and subtractions don't need to extend past
7590
the low three digits of the dividend, since that's enough to determine the
7591
three quotient digits. For the last quotient digit no subtraction is needed
7592
at all. On a 2N@cross{}N division like this one, only about half the work of
7593
a normal basecase division is necessary.
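
The decimal example can be reproduced with a few lines of Python.  The
function below is a sketch with an invented name; it works in any base
coprime to the divisor and, for simplicity, performs the full subtraction at
each step rather than the truncated one just described.

```python
def exact_divide(a, d, base, num_digits):
    # Requires gcd(d, base) == 1 and a an exact multiple of d, with the
    # quotient fitting in num_digits digits.
    inverse = pow(d % base, -1, base)     # Python 3.8+; 7 for d=543, base=10
    quotient = 0
    for i in range(num_digits):
        q_digit = (a % base) * inverse % base
        a = (a - q_digit * d) // base     # low digit is now zero, drop it
        quotient += q_digit * base**i
    return quotient
```

Calling @code{exact_divide(368154, 543, 10, 3)} forms the digits 8, 7 and 6
in that order and returns 678.  With @code{base} set to @code{2**32} the same
loop forms one quotient limb per step.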
7595
For an N@cross{}M exact division producing Q=N@minus{}M quotient limbs, the
7596
saving over a normal basecase division is in two parts. Firstly, each of the
7597
Q quotient limbs needs only one multiply, not a 2@cross{}1 divide and
7598
multiply. Secondly, the crossproducts are reduced when @math{Q>M} to
7599
@m{QM-M(M+1)/2,Q*M-M*(M+1)/2}, or when @math{Q@le{}M} to @m{Q(Q-1)/2,
7600
Q*(Q-1)/2}. Notice the savings are complementary. If Q is big then many
7601
divisions are saved, or if Q is small then the crossproducts reduce to a small number.
7604
The modular inverse used is calculated efficiently by @code{modlimb_invert} in
7605
@file{gmp-impl.h}. This does four multiplies for a 32-bit limb, or six for a
7606
64-bit limb. @file{tune/modlinv.c} has some alternate implementations that
7607
might suit processors better at bit twiddling than multiplying.
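
The multiply counts are consistent with a Newton iteration that starts from
an 8-bit inverse and doubles the number of correct bits at each step, two
multiplies per step.  The Python sketch below works on that assumption; the
8-bit starting value is obtained with @code{pow} where a real implementation
would presumably use a small table, and the multiply counter exists only for
the illustration.

```python
def modlimb_invert(d, limb_bits=32):
    # Inverse of odd d modulo 2^limb_bits by Newton iteration.
    assert d & 1
    inv = pow(d, -1, 256)          # stands in for an 8-bit table lookup
    bits = 8
    multiplies = 0
    while bits < limb_bits:
        # If d*inv == 1 mod 2^bits then d*inv*(2 - d*inv) == 1 mod 2^(2*bits).
        inv = (2 * inv - inv * inv * d) % (1 << limb_bits)
        multiplies += 2
        bits *= 2
    return inv, multiplies
```

Under these assumptions a 32-bit limb takes two steps (four multiplies) and a
64-bit limb three steps (six multiplies).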
7609
The sub-quadratic exact division described by Jebelean in ``Exact Division
7610
with Karatsuba Complexity'' is not currently implemented. It uses a
7611
rearrangement similar to the divide and conquer for normal division
7612
(@pxref{Divide and Conquer Division}), but operating from low to high. A
7613
further possibility not currently implemented is ``Bidirectional Exact Integer
7614
Division'' by Krandick and Jebelean which forms quotient limbs from both the
7615
high and low ends of the dividend, and can halve once more the number of
7616
crossproducts needed in a 2N@cross{}N division.
7618
A special case exact division by 3 exists in @code{mpn_divexact_by3},
7619
supporting Toom-3 multiplication and @code{mpq} canonicalizations. It forms
7620
quotient digits with a multiply by the modular inverse of 3 (which is
7621
@code{0xAA..AAB}) and uses two comparisons to determine a borrow for the next
7622
limb. The multiplications don't need to be on the dependent chain, as long as
7623
the effect of the borrows is applied. Only a few optimized assembler
7624
implementations currently exist.
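
A limb-by-limb sketch in Python, assuming 32-bit limbs, shows how the two
comparisons arise.  Multiplying a quotient limb @math{q} back by 3 overflows
the limb zero, one or two times, and comparing @math{q} against one third and
two thirds of the limb range tells which.  The function and constant names
are illustrative and this is not the GMP source.

```python
LIMB_BITS = 32
B = 1 << LIMB_BITS
INVERSE_3 = 0xAAAAAAAB          # 3 * INVERSE_3 == 1 mod 2^32
ONE_THIRD = 0x55555556          # ceil(B / 3)
TWO_THIRDS = 0xAAAAAAAB         # ceil(2 * B / 3)

def divexact_by3(limbs):
    # limbs: least significant first.  Returns (quotient limbs, final borrow).
    out = []
    borrow = 0
    for s in limbs:
        l = (s - borrow) % B
        borrow = 1 if l > s else 0          # the subtraction wrapped around
        q = (l * INVERSE_3) % B
        out.append(q)
        # 3*q exceeds the limb range once or twice; two comparisons tell.
        borrow += (q >= ONE_THIRD) + (q >= TWO_THIRDS)
    return out, borrow
```

The final borrow is zero exactly when the input was a multiple of 3.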
7627
@node Exact Remainder, Small Quotient Division, Exact Division, Division Algorithms
7628
@subsection Exact Remainder
7630
If the exact division algorithm is done with a full subtraction at each stage
7631
and the dividend isn't a multiple of the divisor, then low zero limbs are
produced but with a remainder in the high limbs.  For dividend @math{a},
divisor @math{d}, quotient @math{q}, and @m{b = 2
\GMPraise{@code{mp\_bits\_per\_limb}}, b = 2^mp_bits_per_limb}, this
remainder @math{r} is of the form

@tex
$$ a = qd + r b^n $$
@end tex
@ifnottex
@example
a = q*d + r*b^n
@end example
@end ifnottex

@math{n} represents the number of zero limbs produced by the subtractions,
7647
that being the number of limbs produced for @math{q}. @math{r} will be in the
7648
range @math{0@le{}r<d} and can be viewed as a remainder, but one shifted up by
7649
a factor of @math{b^n}.
7651
Carrying out full subtractions at each stage means the same number of cross
7652
products must be done as a normal division, but there's still some single limb
7653
divisions saved. When @math{d} is a single limb some simplifications arise,
7654
providing good speedups on a number of processors.
7656
@code{mpn_bdivmod}, @code{mpn_divexact_by3}, @code{mpn_modexact_1_odd} and the
7657
@code{redc} function in @code{mpz_powm} differ subtly in how they return
7658
@math{r}, leading to some negations in the above formula, but all are
7659
essentially the same.
7661
Clearly @math{r} is zero when @math{a} is a multiple of @math{d}, and this
7662
leads to divisibility or congruence tests which are potentially more efficient
7663
than a normal division.
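
A Python sketch of an exact remainder used as a divisibility test follows.
It adopts one particular sign convention, returning a non-negative value
@math{c} with @math{a = q*d - c*b^n}, which is the negated form of the
formula above; the function name and this choice of convention are
assumptions made for the example only.

```python
def exact_remainder(a, d, limb_bits, n):
    # Requires d odd and 0 <= a < 2^(limb_bits*n).
    # Returns (q, c) with a == q*d - c * 2^(limb_bits*n) and 0 <= c < d.
    b = 1 << limb_bits
    inverse = pow(d % b, -1, b)         # Python 3.8+
    q = 0
    for i in range(n):
        q_limb = (a % b) * inverse % b
        a = (a - q_limb * d) // b       # full subtraction, low limb is zero
        q += q_limb << (limb_bits * i)
    return q, -a

def is_divisible(a, d, limb_bits, n):
    return exact_remainder(a, d, limb_bits, n)[1] == 0
```

No trial quotients or 2@cross{}1 divisions are needed, only one multiply by
the inverse per limb plus the multiply and subtract.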
7665
The factor of @math{b^n} on @math{r} can be ignored in a GCD when @math{d} is
7666
odd, hence the use of @code{mpn_bdivmod} in @code{mpn_gcd}, and the use of
7667
@code{mpn_modexact_1_odd} by @code{mpn_gcd_1} and @code{mpz_kronecker_ui} etc
7668
(@pxref{Greatest Common Divisor Algorithms}).
7670
Montgomery's REDC method for modular multiplications uses operands of the form
7671
@m{xb^{-n}, x*b^-n} and @m{yb^{-n}, y*b^-n} and on calculating @m{(xb^{-n})
7672
(yb^{-n}), (x*b^-n)*(y*b^-n)} uses the factor of @math{b^n} in the exact
7673
remainder to reach a product in the same form @m{(xy)b^{-n}, (x*y)*b^-n}
7674
(@pxref{Modular Powering Algorithm}).
7676
Notice that @math{r} generally gives no useful information about the ordinary
7677
remainder @math{a @bmod d} since @math{b^n @bmod d} could be anything. If
7678
however @math{b^n @equiv{} 1 @bmod d}, then @math{r} is the negative of the
7679
ordinary remainder. This occurs whenever @math{d} is a factor of
7680
@math{b^n-1}, as for example with 3 in @code{mpn_divexact_by3}. Other such
7681
factors include 5, 17 and 257, but no particular use has been found for this.
7684
@node Small Quotient Division, , Exact Remainder, Division Algorithms
7685
@subsection Small Quotient Division
7687
An N@cross{}M division where the number of quotient limbs Q=N@minus{}M is
7688
small can be optimized somewhat.
7690
An ordinary basecase division normalizes the divisor by shifting it to make
7691
the high bit set, shifting the dividend accordingly, and shifting the
7692
remainder back down at the end of the calculation. This is wasteful if only a
7693
few quotient limbs are to be formed. Instead a division of just the top
7694
@m{\rm2Q,2*Q} limbs of the dividend by the top Q limbs of the divisor can be
7695
used to form a trial quotient. This requires only those limbs normalized, not
7696
the whole of the divisor and dividend.
7698
A multiply and subtract then applies the trial quotient to the M@minus{}Q
7699
unused limbs of the divisor and N@minus{}Q dividend limbs (which includes Q
7700
limbs remaining from the trial quotient division). The starting trial
7701
quotient can be 1 or 2 too big, but all cases of 2 too big and most cases of 1
7702
too big are detected by first comparing the most significant limbs that will
7703
arise from the subtraction. An addback is done if the quotient still turns
7704
out to be 1 too big.
7706
This whole procedure is essentially the same as one step of the basecase
7707
algorithm done in a Q limb base, though with the trial quotient test done only
7708
with the high limbs, not an entire Q limb ``digit'' product. The correctness
7709
of this weaker test can be established by following the argument of Knuth
7710
section 4.3.1 exercise 20 but with the @m{v_2 \GMPhat q > b \GMPhat r
7711
+ u_2, v2*q>b*r+u2} condition appropriately relaxed.
7715
@node Greatest Common Divisor Algorithms, Powering Algorithms, Division Algorithms, Algorithms
7716
@section Greatest Common Divisor
7717
@cindex Greatest common divisor algorithms
7727
@node Binary GCD, Accelerated GCD, Greatest Common Divisor Algorithms, Greatest Common Divisor Algorithms
7728
@subsection Binary GCD
7730
At small sizes GMP uses an @math{O(N^2)} binary style GCD. This is described
7731
in many textbooks, for example Knuth section 4.5.2 algorithm B. It simply
consists of successively reducing operands @math{a} and @math{b} using
@math{@gcd{}(a,b) = @gcd{}(@min{}(a,b),@abs{}(a-b))}, together with the fact
that if @math{a} and @math{b} are first made odd then @math{@abs{}(a-b)} is
even and its factors of two can be discarded.
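
The reduction just described can be sketched in Python as follows.  This is
the textbook form for positive integers, not the tuned @code{mpn} code, and
the function name is illustrative.

```python
def binary_gcd(a, b):
    # Binary GCD for positive integers a and b.
    shift = 0
    while (a | b) & 1 == 0:         # strip common factors of two
        a >>= 1
        b >>= 1
        shift += 1
    while a & 1 == 0:               # make a odd
        a >>= 1
    while b & 1 == 0:               # make b odd
        b >>= 1
    # Both odd: gcd(a,b) = gcd(min(a,b), abs(a-b)), and abs(a-b) is even.
    while a != b:
        a, b = min(a, b), abs(a - b)
        while b & 1 == 0:
            b >>= 1
    return a << shift
```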
7737
Variants like letting @math{a-b} become negative and doing a different next
7738
step are of interest only as far as they suit particular CPUs, since on small
7739
operands it's machine dependent factors that determine performance.
7741
The Euclidean GCD algorithm, as per Knuth algorithms E and A, reduces using
7742
@math{a @bmod b} but this has so far been found to be slower everywhere. One
7743
reason the binary method does well is that the implied quotient at each step
7744
is usually small, so often only one or two subtractions are needed to get the
7745
same effect as a division. Quotients 1, 2 and 3 for example occur 67.7% of
7746
the time, see Knuth section 4.5.3 Theorem E.
7748
When the implied quotient is large, meaning @math{b} is much smaller than
7749
@math{a}, then a division is worthwhile. This is the basis for the initial
7750
@math{a @bmod b} reductions in @code{mpn_gcd} and @code{mpn_gcd_1} (the latter
7751
for both N@cross{}1 and 1@cross{}1 cases). But after that initial reduction,
7752
big quotients occur too rarely to make it worth checking for them.
7755
@node Accelerated GCD, Extended GCD, Binary GCD, Greatest Common Divisor Algorithms
7756
@subsection Accelerated GCD
7758
For sizes above @code{GCD_ACCEL_THRESHOLD}, GMP uses the Accelerated GCD
7759
algorithm described independently by Weber and Jebelean (the latter as the
7760
``Generalized Binary'' algorithm), @pxref{References}. This algorithm is
7761
still @math{O(N^2)}, but is much faster than the binary algorithm since it
7762
does fewer multi-precision operations. It consists of alternating the
7763
@math{k}-ary reduction by Sorenson, and a ``dmod'' exact remainder reduction.
7765
For operands @math{u} and @math{v} the @math{k}-ary reduction replaces
7766
@math{u} with @m{nv-du,n*v-d*u} where @math{n} and @math{d} are single limb
7767
values chosen to give two trailing zero limbs on that value, which can be
7768
stripped. @math{n} and @math{d} are calculated using an algorithm similar to
7769
half of a two limb GCD (see @code{find_a} in @file{mpn/generic/gcd.c}).
7771
When @math{u} and @math{v} differ in size by more than a certain number of
7772
bits, a dmod is performed to zero out bits at the low end of the larger. It
7773
consists of an exact remainder style division applied to an appropriate number
7774
of bits (@pxref{Exact Division}, and @pxref{Exact Remainder}). This is faster
7775
than a @math{k}-ary reduction but useful only when the operands differ in
7776
size. There's a dmod after each @math{k}-ary reduction, and if the dmod
7777
leaves the operands still differing in size then it's repeated.
7779
The @math{k}-ary reduction step can introduce spurious factors into the GCD
7780
calculated, and these are eliminated at the end by taking GCDs with the
7781
original inputs @math{@gcd{}(u,@gcd{}(v,g))} using the binary algorithm.
7782
Since @math{g} is almost always small this takes very little time.
7784
At small sizes the algorithm needs a good implementation of @code{find_a}. At
7785
larger sizes it's dominated by @code{mpn_addmul_1} applying @math{n} and @math{d}.
7789
@node Extended GCD, Jacobi Symbol, Accelerated GCD, Greatest Common Divisor Algorithms
7790
@subsection Extended GCD
7792
The extended GCD calculates @math{@gcd{}(a,b)} and also cofactors @math{x} and
7793
@math{y} satisfying @m{ax+by=\gcd(a@C{}b), a*x+b*y=gcd(a@C{}b)}. Lehmer's
7794
multi-step improvement of the extended Euclidean algorithm is used. See Knuth
7795
section 4.5.2 algorithm L, and @file{mpn/generic/gcdext.c}. This is an
7796
@math{O(N^2)} algorithm.
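
For reference, the cofactor relation can be illustrated with the plain
extended Euclidean algorithm in Python.  This sketch takes one quotient per
step and does not include Lehmer's multi-step batching; it only shows what
the cofactors are.

```python
def extended_gcd(a, b):
    # Returns (g, x, y) with a*x + b*y == g == gcd(a, b).
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b:
        q = a // b
        a, b = b, a - q * b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0
```

Lehmer's improvement replaces most of the full-size quotient steps with
single or double limb calculations, whose accumulated multipliers are then
applied to the full operands.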
7798
The multipliers at each step are found using single limb calculations for
7799
sizes up to @code{GCDEXT_THRESHOLD}, or double limb calculations above that.
7800
The single limb code is faster but doesn't produce full-limb multipliers,
7801
hence not making full use of the @code{mpn_addmul_1} calls.
7803
When a CPU has a data-dependent multiplier, meaning one which is faster on
7804
operands with fewer bits, the extra work in the double-limb calculation might
7805
only save some looping overheads, leading to a large @code{GCDEXT_THRESHOLD}.
7807
Currently the single limb calculation doesn't optimize for the small quotients
7808
that often occur, and this can lead to unusually low values of
7809
@code{GCDEXT_THRESHOLD}, depending on the CPU.
7811
An analysis of double-limb calculations can be found in ``A Double-Digit
7812
Lehmer-Euclid Algorithm'' by Jebelean (@pxref{References}). The code in GMP
7813
was developed independently.
7815
It should be noted that when a double limb calculation is used, it's used for
7816
the whole of that GCD; it doesn't fall back to single limb part way through.
7817
This is because as the algorithm proceeds, the inputs @math{a} and @math{b}
7818
are reduced, but the cofactors @math{x} and @math{y} grow, so the multipliers
7819
at each step are applied to a roughly constant total number of limbs.
7822
@node Jacobi Symbol, , Extended GCD, Greatest Common Divisor Algorithms
7823
@subsection Jacobi Symbol
7825
@code{mpz_jacobi} and @code{mpz_kronecker} are currently implemented with a
7826
simple binary algorithm similar to that described for the GCDs (@pxref{Binary
7827
GCD}). They're not very fast when both inputs are large. Lehmer's multi-step
7828
improvement or a binary based multi-step algorithm is likely to be better.
7830
When one operand fits a single limb, and that includes @code{mpz_kronecker_ui}
7831
and friends, an initial reduction is done with either @code{mpn_mod_1} or
7832
@code{mpn_modexact_1_odd}, followed by the binary algorithm on a single limb.
7833
The binary algorithm is well suited to a single limb, and the whole
7834
calculation in this case is quite efficient.
7836
In all the routines sign changes for the result are accumulated using some bit
7837
twiddling, avoiding table lookups or conditional jumps.
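
A Python sketch of a binary style Jacobi symbol with the sign held in a
single bit is shown below.  It follows the standard textbook rules and is not
taken from the GMP sources; the function name and the particular bit
expressions are choices made for the example.

```python
def jacobi(a, n):
    # Jacobi symbol (a/n) for odd positive n.
    assert n > 0 and n & 1
    a %= n
    sign_bit = 0                    # the result is (-1)^sign_bit
    while a:
        while a & 1 == 0:
            a >>= 1
            # (2/n) is -1 exactly when n is 3 or 5 mod 8, which is
            # bit 1 of n ^ (n >> 1).
            sign_bit ^= (n ^ (n >> 1)) >> 1 & 1
        # Reciprocity: the sign flips when a and n are both 3 mod 4,
        # which is bit 1 of a & n.
        sign_bit ^= (a & n) >> 1 & 1
        a, n = n % a, a
    return 0 if n != 1 else 1 - 2 * sign_bit
```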
7841
@node Powering Algorithms, Root Extraction Algorithms, Greatest Common Divisor Algorithms, Algorithms
7842
@section Powering Algorithms
7843
@cindex Powering algorithms

@menu
* Normal Powering Algorithm::
* Modular Powering Algorithm::
@end menu


@node Normal Powering Algorithm, Modular Powering Algorithm, Powering Algorithms, Powering Algorithms
7852
@subsection Normal Powering
7854
Normal @code{mpz} or @code{mpf} powering uses a simple binary algorithm,
7855
successively squaring and then multiplying by the base when a 1 bit is seen in
7856
the exponent, as per Knuth section 4.6.3. The ``left to right''
7857
variant described there is used rather than algorithm A, since it's just as
7858
easy and can be done with somewhat less temporary memory.
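
The left to right scan can be sketched in Python as follows; the function
name is illustrative.

```python
def power_left_to_right(base, exponent):
    # exponent >= 0; its bits are scanned from most significant downwards.
    result = 1
    for i in reversed(range(exponent.bit_length())):
        result *= result                    # square
        if (exponent >> i) & 1:
            result *= base                  # multiply on a 1 bit
    return result
```

Only the running result and the base are live at any time, which is the
source of the memory saving over a variant that also keeps a running power of
the base.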
7861
@node Modular Powering Algorithm, , Normal Powering Algorithm, Powering Algorithms
7862
@subsection Modular Powering
7864
Modular powering is implemented using a @math{2^k}-ary sliding window
7865
algorithm, as per ``Handbook of Applied Cryptography'' algorithm 14.85
7866
(@pxref{References}). @math{k} is chosen according to the size of the
7867
exponent. Larger exponents use larger values of @math{k}, the choice being
7868
made to minimize the average number of multiplications that must supplement the squaring.
7871
The modular multiplies and squares use either a simple division or the REDC
7872
method by Montgomery (@pxref{References}). REDC is a little faster,
7873
essentially saving N single limb divisions in a fashion similar to an exact
7874
remainder (@pxref{Exact Remainder}). The current REDC has some limitations.
7875
It's only @math{O(N^2)} so above @code{POWM_THRESHOLD} division becomes faster
7876
and is used. It doesn't attempt to detect small bases, but rather always uses
7877
a REDC form, which is usually a full size operand. And lastly it's only
7878
applied to odd moduli.
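
A minimal Python sketch of Montgomery reduction is given below.  It uses the
usual textbook convention in which operands are held multiplied by
@math{R = b^n}, which may differ in sign or direction from the convention in
GMP's internal @code{redc}; all names are illustrative.

```python
def montgomery_setup(modulus, limb_bits, n):
    # modulus must be odd and below R = 2^(limb_bits*n).
    r = 1 << (limb_bits * n)
    neg_inverse = (-pow(modulus, -1, r)) % r      # Python 3.8+
    return r, neg_inverse

def redc(t, modulus, r, neg_inverse):
    # Returns t / r mod modulus for 0 <= t < r * modulus.  Division by the
    # modulus is replaced by multiplies, a mask and a shift.
    m = (t % r) * neg_inverse % r
    u = (t + m * modulus) // r                    # exact: low n limbs are zero
    return u - modulus if u >= modulus else u

def montgomery_mulmod(x, y, modulus, limb_bits, n):
    r, neg_inverse = montgomery_setup(modulus, limb_bits, n)
    xm = (x * r) % modulus                        # convert to REDC form
    ym = (y * r) % modulus
    zm = redc(xm * ym, modulus, r, neg_inverse)   # product, still REDC form
    return redc(zm, modulus, r, neg_inverse)      # convert back
```

In a powering loop the conversions into and out of REDC form are done once,
and every multiply and square in between uses only @code{redc}.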
7881
@node Root Extraction Algorithms, Radix Conversion Algorithms, Powering Algorithms, Algorithms
7882
@section Root Extraction Algorithms
7883
@cindex Root extraction algorithms

@menu
* Square Root Algorithm::
* Nth Root Algorithm::
* Perfect Square Algorithm::
* Perfect Power Algorithm::
@end menu


@node Square Root Algorithm, Nth Root Algorithm, Root Extraction Algorithms, Root Extraction Algorithms
7894
@subsection Square Root
7896
Square roots are taken using the ``Karatsuba Square Root'' algorithm by Paul
7897
Zimmermann (@pxref{References}). This is expressed in a divide and conquer
7898
form, but as noted in the paper it can also be viewed as a discrete variant of Newton's method.
7901
In the Karatsuba multiplication range this is an @m{O({3\over2}
7902
M(N/2)),O(1.5*M(N/2))} algorithm, where @math{M(n)} is the time to multiply
7903
two numbers of @math{n} limbs. In the FFT multiplication range this grows to
7904
a bound of @m{O(6 M(N/2)),O(6*M(N/2))}. In practice a factor of about 1.5 to
7905
1.8 is found in the Karatsuba and Toom-3 ranges, growing to 2 or 3 in the FFT range.
7908
The algorithm does all its calculations in integers and the resulting
7909
@code{mpn_sqrtrem} is used for both @code{mpz_sqrt} and @code{mpf_sqrt}.
7910
The extended precision given by @code{mpf_sqrt_ui} is obtained by
7911
padding with zero limbs.
7914
@node Nth Root Algorithm, Perfect Square Algorithm, Square Root Algorithm, Root Extraction Algorithms
7915
@subsection Nth Root
7917
Integer Nth roots are taken using Newton's method with the following
7918
iteration, where @math{A} is the input and @math{n} is the root to be taken.

@tex
$$a_{i+1} = {1\over n} \left({A \over a_i^{n-1}} + (n-1)a_i \right)$$
@end tex
@ifnottex
@example
         1         A
a[i+1] = - * ( --------- + (n-1)*a[i] )
         n     a[i]^(n-1)
@end example
@end ifnottex

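
The iteration can be sketched in Python with integer arithmetic throughout.
This is an illustration with an invented function name, and it differs from
the implementation in one stated respect: the starting value is simply a
power of two at or above the true root, not a bitwise refined approximation.

```python
def integer_nth_root(A, n):
    # floor(A^(1/n)) for A >= 0 and n >= 1.
    if A == 0:
        return 0
    # 2^ceil(bits/n) is at or above the true root.
    a = 1 << -(-A.bit_length() // n)
    while True:
        nxt = (A // a**(n - 1) + (n - 1) * a) // n
        if nxt >= a:                # no longer decreasing: a is the root
            return a
        a = nxt
```

Started from above, the iterates decrease monotonically and never drop below
the floor of the true root, so the first non-decreasing step identifies it.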
The initial approximation @m{a_1,a[1]} is generated bitwise by successively
7932
powering a trial root with or without new 1 bits, aiming to be just above the
7933
true root. The iteration converges quadratically when started from a good
7934
approximation. When @math{n} is large more initial bits are needed to get
7935
good convergence.  The current implementation is not particularly well optimized.
7939
@node Perfect Square Algorithm, Perfect Power Algorithm, Nth Root Algorithm, Root Extraction Algorithms
7940
@subsection Perfect Square
7942
@code{mpz_perfect_square_p} is able to quickly exclude most non-squares by
7943
checking whether the input is a quadratic residue modulo some small integers.
7945
The first test is modulo 256 which means simply examining the least
7946
significant byte. Only 44 different values occur as the low byte of a square,
7947
so 82.8% of non-squares can be immediately excluded. Similar tests modulo
7948
primes from 3 to 29 exclude 99.5% of those remaining, or if a limb is 64 bits
7949
then primes up to 53 are used, excluding 99.99%. A single N@cross{}1
7950
remainder using @code{PP} from @file{gmp-impl.h} quickly gives all these remainders.
7953
A square root must still be taken for any value that passes the residue tests,
7954
to verify it's really a square and not one of the 0.086% (or 0.000156% for 64
7955
bits) non-squares that get through. @xref{Square Root Algorithm}.
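
The low byte test and the final verification can be sketched in Python as
below.  Only the modulo 256 stage is shown; the tests modulo small primes are
omitted, and the names are illustrative.

```python
import math

# The possible low bytes of a square.
SQUARE_LOW_BYTES = frozenset(x * x % 256 for x in range(256))

def is_perfect_square(n):
    if n < 0:
        return False
    if n % 256 not in SQUARE_LOW_BYTES:     # cheap residue test first
        return False
    root = math.isqrt(n)                    # Python 3.8+
    return root * root == n
```

The set built here has 44 members, in agreement with the count given above.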
7958
@node Perfect Power Algorithm, , Perfect Square Algorithm, Root Extraction Algorithms
7959
@subsection Perfect Power
7961
Detecting perfect powers is required by some factorization algorithms.
7962
Currently @code{mpz_perfect_power_p} is implemented using repeated Nth root
7963
extractions, though naturally only prime roots need to be considered.
7964
(@xref{Nth Root Algorithm}.)
7966
If a prime divisor @math{p} with multiplicity @math{e} can be found, then only
7967
roots which are divisors of @math{e} need to be considered, much reducing the
7968
work necessary. To this end divisibility by a set of small primes is checked.
7971
@node Radix Conversion Algorithms, Other Algorithms, Root Extraction Algorithms, Algorithms
7972
@section Radix Conversion
7973
@cindex Radix conversion algorithms
7975
Radix conversions are less important than other algorithms. A program
7976
dominated by conversions should probably use a different data representation.
7984
@node Binary to Radix, Radix to Binary, Radix Conversion Algorithms, Radix Conversion Algorithms
7985
@subsection Binary to Radix
7987
Conversions from binary to a power-of-2 radix use a simple and fast
7988
@math{O(N)} bit extraction algorithm.
7990
Conversions from binary to other radices use one of two algorithms. Sizes
7991
below @code{GET_STR_PRECOMPUTE_THRESHOLD} use a basic @math{O(N^2)} method.
7992
Repeated divisions by @math{b^n} are made, where @math{b} is the radix and
7993
@math{n} is the biggest power that fits in a limb. But instead of simply
7994
using the remainder @math{r} from such divisions, an extra divide step is done
7995
to give a fractional limb representing @math{r/b^n}. The digits of @math{r}
7996
can then be extracted using multiplications by @math{b} rather than divisions.
7997
Special case code is provided for decimal, allowing multiplications by 10 to
7998
optimize to shifts and adds.
8000
Above @code{GET_STR_PRECOMPUTE_THRESHOLD} a sub-quadratic algorithm is used.
8001
For an input @math{t}, powers @m{b^{n2^i},b^(n*2^i)} of the radix are
8002
calculated, until a power between @math{t} and @m{\sqrt{t},sqrt(t)} is
8003
reached. @math{t} is then divided by that largest power, giving a quotient
8004
which is the digits above that power, and a remainder which is those below.
8005
These two parts are in turn divided by the second highest power, and so on
8006
recursively. When a piece has been divided down to less than
@code{GET_STR_DC_THRESHOLD} limbs, the basecase algorithm described above is
used.
8010
The advantage of this algorithm is that big divisions can make use of the
8011
sub-quadratic divide and conquer division (@pxref{Divide and Conquer
8012
Division}), and big divisions tend to have less overhead than lots of
8013
separate single limb divisions anyway. But in any case the cost of
8014
calculating the powers @m{b^{n2^i},b^(n*2^i)} must first be overcome.
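The recursive splitting can be modelled in a few lines of Python. This is only a sketch of the scheme described above: the names @code{to_radix}, @code{basecase} and @code{split} are invented, Python's built-in @code{divmod} stands in for the big divisions, and the recursion here always goes down to single @math{n}-digit groups rather than stopping at a tuned threshold.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def basecase(t, b, width):
    """Quadratic base case: peel off width digits with repeated division."""
    out = []
    for _ in range(width):
        t, d = divmod(t, b)
        out.append(DIGITS[d])
    return "".join(reversed(out))


def to_radix(t, b=10, n=19):
    """Convert t >= 0 to base b by recursive splitting on b**(n * 2**i)."""
    powers = [b ** n]
    while powers[-1] * powers[-1] <= t:
        powers.append(powers[-1] * powers[-1])
    # Now t < powers[-1]**2, so the largest power is above sqrt(t).

    def split(x, i, width):
        # Invariant: x < b**width, where width == n * 2**(i + 1).
        if i < 0:
            return basecase(x, b, width)
        q, r = divmod(x, powers[i])      # digits above and below that power
        return split(q, i - 1, width // 2) + split(r, i - 1, width // 2)

    s = split(t, len(powers) - 1, n * 2 ** len(powers))
    return s.lstrip("0") or "0"
```

The cost of building @code{powers} is paid once at the top, which is the overhead the text refers to.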
8016
@code{GET_STR_PRECOMPUTE_THRESHOLD} and @code{GET_STR_DC_THRESHOLD} represent
8017
the same basic thing, the point where it becomes worth doing a big division to
8018
cut the input in half. @code{GET_STR_PRECOMPUTE_THRESHOLD} includes the cost
8019
of calculating the radix power required, whereas @code{GET_STR_DC_THRESHOLD}
8020
assumes that's already available, which is the case when recursing.
8022
Since the base case produces digits from least to most significant but they
must be stored from most to least, it's necessary to calculate in advance
how many digits there will be, or at least be sure not to underestimate that.
For GMP the number of input bits is multiplied by @code{chars_per_bit_exactly}
from @code{mp_bases}, rounding up. The result is either correct or one too
big.
8029
Examining some of the high bits of the input could increase the chance of
8030
getting the exact number of digits, but an exact result every time would not
8031
be practical, since in general the difference between numbers 100@dots{} and
8032
99@dots{} is only in the last few bits and the work to identify 99@dots{}
8033
might well be almost as much as a full conversion.
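A small Python check of the digit count estimate follows. The function @code{estimate_digits} is a hypothetical name, and the floating point ratio computed here only plays the role of @code{chars_per_bit_exactly}; it is not read from @code{mp_bases}.

```python
import math


def estimate_digits(t, b=10):
    """Upper estimate of the number of base-b digits in t > 0."""
    chars_per_bit = math.log(2) / math.log(b)   # digits contributed per bit
    return math.ceil(t.bit_length() * chars_per_bit)
```

For example 999 and 1000 both have 10 bits and both get an estimate of 4 digits, which is exact for 1000 and one too big for 999.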
8035
@code{mpf_get_str} doesn't currently use the algorithm described here. It
multiplies or divides by a power of @math{b} to move the radix point to
just above the highest non-zero digit (or at worst one above that location),
then multiplies by @math{b^n} to bring out digits. This is @math{O(N^2)} and
is certainly not optimal.
8041
The @math{r/b^n} scheme described above for using multiplications to bring out
8042
digits might be useful for more than a single limb. Some brief experiments
8043
with it on the base case when recursing didn't give a noticeable improvement,
8044
but perhaps that was only due to the implementation. Something similar would
8045
work for the sub-quadratic divisions too, though there would be the cost of
8046
calculating a bigger radix power.
8048
Another possible improvement for the sub-quadratic part would be to arrange
8049
for radix powers that balanced the sizes of quotient and remainder produced,
8050
ie. the highest power would be a @m{b^{nk},b^(n*k)} approximately equal to
@m{\sqrt{t},sqrt(t)}, not restricted to a @math{2^i} factor. That ought to
smooth out a graph of times against sizes, but may or may not be a net
speedup.
8056
@node Radix to Binary, , Binary to Radix, Radix Conversion Algorithms
8057
@subsection Radix to Binary
8059
Conversions from a power-of-2 radix into binary use a simple and fast
8060
@math{O(N)} bitwise concatenation algorithm.
8062
Conversions from other radices use one of two algorithms. Sizes below
8063
@code{SET_STR_THRESHOLD} use a basic @math{O(N^2)} method. Groups of @math{n}
8064
digits are converted to limbs, where @math{n} is the biggest power of the base
8065
@math{b} which will fit in a limb, then those groups are accumulated into the
8066
result by multiplying by @math{b^n} and adding. This saves multi-precision
8067
operations, as per Knuth section 4.4 part E (@pxref{References}). Some
8068
special case code is provided for decimal, giving the compiler a chance to
8069
optimize multiplications by 10.
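The grouping of digits into limbs might be pictured as below. This is a Python sketch with invented names, where the inner loop stands for single-limb work and each pass of the outer loop for one multi-precision multiply and add.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def from_radix_basecase(s, b=10, n=19):
    """Convert digit string s in base b to an integer, n digits per group."""
    result = 0
    pos = 0
    width = len(s) % n or n              # the first group may be shorter
    while pos < len(s):
        limb = 0
        for ch in s[pos:pos + width]:    # single-limb work on one group
            limb = limb * b + DIGITS.index(ch)
        result = result * b ** width + limb   # one big step per group
        pos += width
        width = n
    return result
```

With @math{n} digits per group the number of multi-precision steps drops by a factor of @math{n} compared with handling one digit at a time.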
8071
Above @code{SET_STR_THRESHOLD} a sub-quadratic algorithm is used. First
8072
groups of @math{n} digits are converted into limbs. Then adjacent limbs are
8073
combined into limb pairs with @m{xb^n+y,x*b^n+y}, where @math{x} and @math{y}
8074
are the limbs. Adjacent limb pairs are combined into quads similarly with
8075
@m{xb^{2n}+y,x*b^(2n)+y}. This continues until a single block remains, that
being the result.
8078
The advantage of this method is that the multiplications for each @math{x} are
8079
big blocks, allowing Karatsuba and higher algorithms to be used. But the cost
8080
of calculating the powers @m{b^{n2^i},b^(n*2^i)} must be overcome.
8081
@code{SET_STR_THRESHOLD} usually ends up quite big, around 5000 digits, and on
8082
some processors much bigger still.
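The pairwise combination of blocks can be sketched in Python as follows. The name @code{from_radix_dc} is invented, Python's @code{int} is used as a shortcut to convert each @math{n}-digit group, and padding with a zero block on the left is just one simple way to cope with an odd number of blocks.

```python
def from_radix_dc(s, b=10, n=19):
    """Convert digit string s in base b by pairwise combination of blocks."""
    groups = max(1, -(-len(s) // n))           # ceiling of len(s) / n
    s = s.zfill(groups * n)                    # pad to whole groups
    blocks = [int(s[i:i + n], b) for i in range(0, len(s), n)]
    power = b ** n                             # b**(n * 2**i) at level i
    while len(blocks) > 1:
        if len(blocks) % 2:
            blocks.insert(0, 0)                # give every block a partner
        blocks = [x * power + y for x, y in zip(blocks[0::2], blocks[1::2])]
        power *= power                         # square for the next level
    return blocks[0]
```

At each level the multiplications are between operands of about equal size, which is what lets the faster multiplication algorithms come into play.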
8084
@code{SET_STR_THRESHOLD} is based on the input digits (and tuned for decimal),
8085
though it might be better based on a limb count, so as to be independent of
8086
the base. But that sort of count isn't used by the base case and so would
8087
need some sort of initial calculation or estimate.
8089
The main reason @code{SET_STR_THRESHOLD} is so much bigger than the
8090
corresponding @code{GET_STR_PRECOMPUTE_THRESHOLD} is that @code{mpn_mul_1} is
8091
much faster than @code{mpn_divrem_1} (often by a factor of 10, or more).
8095
@node Other Algorithms, Assembler Coding, Radix Conversion Algorithms, Algorithms
8096
@section Other Algorithms
8099
@menu
* Factorial Algorithm::
* Binomial Coefficients Algorithm::
* Fibonacci Numbers Algorithm::
* Lucas Numbers Algorithm::
@end menu
8106
@node Factorial Algorithm, Binomial Coefficients Algorithm, Other Algorithms, Other Algorithms
8107
@subsection Factorial
8109
Factorials @math{n!} are calculated by a simple product from @math{1} to
8110
@math{n}, but arranged into certain sub-products.
8112
First as many factors as fit in a limb are accumulated, then two of those
8113
multiplied to give a 2-limb product. When two 2-limb products are ready
8114
they're multiplied to a 4-limb product, and when two 4-limbs are ready they're
8115
multiplied to an 8-limb product, etc. A stack of outstanding products is
8116
built up, with two of the same size multiplied together when ready.
8118
Arranging for multiplications to have operands the same (or nearly the same)
8119
size means the Karatsuba and higher multiplication algorithms can be used.
8120
And even on sizes below the Karatsuba threshold an N@cross{}N multiply will
8121
give a basecase multiply more to work on.
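A Python model of the stack of sub-products is given below. It is a sketch only: @code{factorial}, @code{push} and the @code{limb_bits} parameter are names chosen for the example, and the stack entries carry an explicit level number so that only products of matching size are multiplied.

```python
def factorial(n, limb_bits=64):
    """n! via limb-sized sub-products combined pairwise on a stack."""
    stack = []                       # entries are (level, product)

    def push(x):
        level = 0
        # Two products of the same level are ready: multiply them together.
        while stack and stack[-1][0] == level:
            x *= stack.pop()[1]
            level += 1
        stack.append((level, x))

    acc = 1
    for i in range(2, n + 1):
        if (acc * i).bit_length() > limb_bits:
            push(acc)                # accumulator is as full as a limb allows
            acc = i
        else:
            acc *= i
    result = acc
    while stack:                     # finish off whatever is outstanding
        result *= stack.pop()[1]
    return result
```

The stack behaves like a binary counter, so at most about @math{log2} of the number of limb-sized pieces are outstanding at any time.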
8123
An obvious improvement not currently implemented would be to strip factors of
8124
2 from the products and apply them at the end with a bit shift. Another
8125
possibility would be to determine the prime factorization of the result (which
8126
can be done easily), and use a powering method, at each stage squaring then
8127
multiplying in those primes with a 1 in their exponent at that point. The
8128
advantage would be some multiplies turned into squares.
8131
@node Binomial Coefficients Algorithm, Fibonacci Numbers Algorithm, Factorial Algorithm, Other Algorithms
8132
@subsection Binomial Coefficients
8134
Binomial coefficients @m{\left({n}\atop{k}\right), C(n@C{}k)} are calculated
8135
by first arranging @math{k @le{} n/2} using @m{\left({n}\atop{k}\right) =
8136
\left({n}\atop{n-k}\right), C(n@C{}k) = C(n@C{}n-k)} if necessary, and then
8137
evaluating the following product simply from @math{i=2} to @math{i=k}.
8139
@tex
$$ \left({n}\atop{k}\right) = (n-k+1) \prod_{i=2}^{k} {{n-k+i} \over i} $$
@end tex
@ifnottex
@example
@group
                      k  (n-k+i)
C(n,k) =  (n-k+1) * prod -------
                     i=2    i
@end group
@end example
@end ifnottex
8150
It's easy to show that each denominator @math{i} will divide the product so
8151
far, so the exact division algorithm is used (@pxref{Exact Division}).
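The running product and its exact divisions can be checked with a short Python sketch. The function name is made up, and an ordinary integer division stands in for the exact division algorithm.

```python
def binomial(n, k):
    """C(n, k) by the running product, dividing exactly at each step."""
    if k < 0 or k > n:
        return 0
    k = min(k, n - k)                # arrange k <= n/2
    if k == 0:
        return 1
    result = n - k + 1
    for i in range(2, k + 1):
        result *= n - k + i
        assert result % i == 0       # i always divides the product so far
        result //= i                 # so this division leaves no remainder
    return result
```

After step @math{i} the value held is itself a binomial coefficient, namely C(n-k+i,i), which is why each division comes out exact.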
8153
The numerators @math{n-k+i} and denominators @math{i} are first accumulated
8154
into as many as fit in a limb, to save multi-precision operations, though for
8155
@code{mpz_bin_ui} this applies only to the divisors, since @math{n} is an
8156
@code{mpz_t} and @math{n-k+i} in general won't fit in a limb at all.
8158
An obvious improvement would be to strip factors of 2 from each multiplier and
8159
divisor and count them separately, to be applied with a bit shift at the end.
8160
Factors of 3 and perhaps 5 could even be handled similarly. Another
8161
possibility, if @math{n} is not too big, would be to determine the prime
8162
factorization of the result based on the factorials involved, and power up
8163
those primes appropriately. This would help most when @math{k} is near
@math{n/2}.
8167
@node Fibonacci Numbers Algorithm, Lucas Numbers Algorithm, Binomial Coefficients Algorithm, Other Algorithms
8168
@subsection Fibonacci Numbers
8170
The Fibonacci functions @code{mpz_fib_ui} and @code{mpz_fib2_ui} are designed
8171
for calculating isolated @m{F_n,F[n]} or @m{F_n,F[n]},@m{F_{n-1},F[n-1]}
values efficiently.
8174
For small @math{n}, a table of single limb values in @code{__gmp_fib_table} is
8175
used. On a 32-bit limb this goes up to @m{F_{47},F[47]}, or on a 64-bit limb
8176
up to @m{F_{93},F[93]}. For convenience the table starts at @m{F_{-1},F[-1]}.
8178
Beyond the table, values are generated with a binary powering algorithm,
8179
calculating a pair @m{F_n,F[n]} and @m{F_{n-1},F[n-1]} working from high to
8180
low across the bits of @math{n}. The formulas used are
8183
@tex
$$\eqalign{
  F_{2k+1} &= 4F_k^2 - F_{k-1}^2 + 2(-1)^k \cr
  F_{2k-1} &=  F_k^2 + F_{k-1}^2           \cr
  F_{2k}   &= F_{2k+1} - F_{2k-1}
}$$
@end tex
@ifnottex
@example
@group
F[2k+1] = 4*F[k]^2 - F[k-1]^2 + 2*(-1)^k
F[2k-1] =   F[k]^2 + F[k-1]^2

F[2k] = F[2k+1] - F[2k-1]
@end group
@end example
@end ifnottex
8198
At each step, @math{k} is the high @math{b} bits of @math{n}. If the next bit
8199
of @math{n} is 0 then @m{F_{2k},F[2k]},@m{F_{2k-1},F[2k-1]} is used, or if
8200
it's a 1 then @m{F_{2k+1},F[2k+1]},@m{F_{2k},F[2k]} is used, and the process
8201
repeated until all bits of @math{n} are incorporated. Notice these formulas
8202
require just two squares per bit of @math{n}.
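The bit-by-bit powering can be written out in Python as below. This is a sketch with an invented name, starting from @math{k=1} at the top bit of @math{n} rather than from a table entry.

```python
def fib_pair(n):
    """Return (F[n], F[n-1]) for n >= 1, scanning the bits of n from the top."""
    f, f1 = 1, 0                     # (F[1], F[0]); k = 1 is the top bit of n
    k = 1
    for bit in bin(n)[3:]:           # the remaining bits, high to low
        sq, sq1 = f * f, f1 * f1     # the two squares for this bit
        f2k_plus = 4 * sq - sq1 + (2 if k % 2 == 0 else -2)   # F[2k+1]
        f2k_minus = sq + sq1                                   # F[2k-1]
        f2k = f2k_plus - f2k_minus                             # F[2k]
        if bit == "1":
            f, f1, k = f2k_plus, f2k, 2 * k + 1
        else:
            f, f1, k = f2k, f2k_minus, 2 * k
    return f, f1
```

Apart from the two squares, each step needs only additions, subtractions and a multiplication by 4.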
8204
It'd be possible to handle the first few @math{n} above the single limb table
8205
with simple additions, using the defining Fibonacci recurrence @m{F_{k+1} =
8206
F_k + F_{k-1}, F[k+1]=F[k]+F[k-1]}, but this is not done since it usually
8207
turns out to be faster for only about 10 or 20 values of @math{n}, and
8208
including a block of code for just those doesn't seem worthwhile. If they
8209
really mattered it'd be better to extend the data table.
8211
Using a table avoids lots of calculations on small numbers, and makes small
8212
@math{n} go fast. A bigger table would make more small @math{n} go fast, it's
8213
just a question of balancing size against desired speed. For GMP the code is
8214
kept compact, with the emphasis primarily on a good powering algorithm.
8216
@code{mpz_fib2_ui} returns both @m{F_n,F[n]} and @m{F_{n-1},F[n-1]}, but
8217
@code{mpz_fib_ui} is only interested in @m{F_n,F[n]}. In this case the last
8218
step of the algorithm can become one multiply instead of two squares. One of
8219
the following two formulas is used, according as @math{n} is odd or even.
8222
@tex
$$\eqalign{
  F_{2k}   &= F_k (F_k + 2F_{k-1}) \cr
  F_{2k+1} &= (2F_k + F_{k-1}) (2F_k - F_{k-1}) + 2(-1)^k
}$$
@end tex
@ifnottex
@example
@group
F[2k]   = F[k]*(F[k]+2F[k-1])

F[2k+1] = (2F[k]+F[k-1])*(2F[k]-F[k-1]) + 2*(-1)^k
@end group
@end example
@end ifnottex
8235
@m{F_{2k+1},F[2k+1]} here is the same as above, just rearranged to be a
8236
multiply. For interest, the @m{2(-1)^k, 2*(-1)^k} term both here and above
8237
can be applied just to the low limb of the calculation, without a carry or
8238
borrow into further limbs, which saves some code size. See comments with
8239
@code{mpz_fib_ui} and the internal @code{mpn_fib2_ui} for how this is done.
8242
@node Lucas Numbers Algorithm, , Fibonacci Numbers Algorithm, Other Algorithms
8243
@subsection Lucas Numbers
8245
@code{mpz_lucnum2_ui} derives a pair of Lucas numbers from a pair of Fibonacci
8246
numbers with the following simple formulas.
8249
@tex
$$\eqalign{
  L_k     &=  F_k + 2F_{k-1} \cr
  L_{k-1} &= 2F_k -  F_{k-1}
}$$
@end tex
@ifnottex
@example
@group
L[k]   =   F[k] + 2*F[k-1]
L[k-1] = 2*F[k] -   F[k-1]
@end group
@end example
@end ifnottex
8261
@code{mpz_lucnum_ui} is only interested in @m{L_n,L[n]}, and some work can be
8262
saved. Trailing zero bits on @math{n} can be handled with a single square
each.
@tex
$$ L_{2k} = L_k^2 - 2(-1)^k $$
@end tex
@ifnottex
@example
L[2k] = L[k]^2 - 2*(-1)^k
@end example
@end ifnottex
8274
And the lowest 1 bit can be handled with one multiply of a pair of Fibonacci
8275
numbers, similar to what @code{mpz_fib_ui} does.
8277
@tex
$$ L_{2k+1} = 5F_{k-1} (2F_k + F_{k-1}) - 4(-1)^k $$
@end tex
@ifnottex
@example
L[2k+1] = 5*F[k-1]*(2*F[k]+F[k-1]) - 4*(-1)^k
@end example
@end ifnottex
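The two Lucas formulas just given can be exercised together in Python. In this sketch @code{lucnum} and @code{fib_pair_slow} are invented names, and the Fibonacci pair is produced by plain additions purely to keep the example short; it stands in for the powering code.

```python
def fib_pair_slow(k):
    """(F[k], F[k-1]) by plain addition, starting from F[0]=0 and F[-1]=1."""
    f, f1 = 0, 1
    for _ in range(k):
        f, f1 = f + f1, f
    return f, f1


def lucnum(n):
    """L[n] for n >= 0: one multiply for the lowest 1 bit, then squares."""
    if n == 0:
        return 2
    zeros = (n & -n).bit_length() - 1      # trailing zero bits of n
    m = n >> zeros                         # odd part of n, equal to 2k+1
    k = m >> 1
    fk, fk1 = fib_pair_slow(k)
    sign = 1 if k % 2 == 0 else -1         # (-1)^k
    l = 5 * fk1 * (2 * fk + fk1) - 4 * sign        # L[2k+1]
    for _ in range(zeros):
        # L[2j] = L[j]^2 - 2*(-1)^j, where j is the current index m
        l = l * l - 2 * (-1 if m % 2 else 1)
        m *= 2
    return l
```

Only the first squaring sees an odd index, so after that the correction term is always a subtraction of 2.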
8288
@node Assembler Coding, , Other Algorithms, Algorithms
8289
@section Assembler Coding
8291
The assembler subroutines in GMP are the most significant source of speed at
8292
small to moderate sizes. At larger sizes algorithm selection becomes more
8293
important, but of course speedups in low level routines will still speed up
8294
everything proportionally.
8296
Carry handling and widening multiplies that are important for GMP can't be
8297
easily expressed in C. GCC @code{asm} blocks help a lot and are provided in
8298
@file{longlong.h}, but hand coding low level routines invariably offers a
8299
speedup over generic C by a factor of anything from 2 to 10.
8302
@menu
* Assembler Code Organisation::
* Assembler Basics::
* Assembler Carry Propagation::
* Assembler Cache Handling::
* Assembler Floating Point::
* Assembler SIMD Instructions::
* Assembler Software Pipelining::
* Assembler Loop Unrolling::
@end menu
8313
@node Assembler Code Organisation, Assembler Basics, Assembler Coding, Assembler Coding
8314
@subsection Code Organisation
8316
The various @file{mpn} subdirectories contain machine-dependent code, written
8317
in C or assembler. The @file{mpn/generic} subdirectory contains default code,
8318
used when there's no machine-specific version of a particular file.
8320
Each @file{mpn} subdirectory is for an ISA family. Generally 32-bit and
8321
64-bit variants in a family cannot share code and will have separate
8322
directories. Within a family further subdirectories may exist for CPU
variants.
8326
@node Assembler Basics, Assembler Carry Propagation, Assembler Code Organisation, Assembler Coding
8327
@subsection Assembler Basics
8329
@code{mpn_addmul_1} and @code{mpn_submul_1} are the most important routines
8330
for overall GMP performance. All multiplications and divisions come down to
8331
repeated calls to these. @code{mpn_add_n}, @code{mpn_sub_n},
8332
@code{mpn_lshift} and @code{mpn_rshift} are next most important.
8334
On some CPUs assembler versions of the internal functions
8335
@code{mpn_mul_basecase} and @code{mpn_sqr_basecase} give significant speedups,
8336
mainly through avoiding function call overheads. They can also potentially
8337
make better use of a wide superscalar processor.
8339
The restrictions on overlaps between sources and destinations
8340
(@pxref{Low-level Functions}) are designed to facilitate a variety of
8341
implementations. For example, knowing @code{mpn_add_n} won't have partly
8342
overlapping sources and destination means reading can be done far ahead of
8343
writing on superscalar processors, and loops can be vectorized on a vector
8344
processor, depending on the carry handling.
8347
@node Assembler Carry Propagation, Assembler Cache Handling, Assembler Basics, Assembler Coding
8348
@subsection Carry Propagation
8350
The problem that presents most challenges in GMP is propagating carries from
8351
one limb to the next. In functions like @code{mpn_addmul_1} and
8352
@code{mpn_add_n}, carries are the only dependencies between limb operations.
8354
On processors with carry flags, a straightforward CISC style @code{adc} is
8355
generally best. AMD K6 @code{mpn_addmul_1} however is an example of an
8356
unusual set of circumstances where a branch works out better.
8358
On RISC processors generally an add and compare for overflow is used. This
8359
sort of thing can be seen in @file{mpn/generic/aors_n.c}. Some carry
8360
propagation schemes require 4 instructions, meaning at least 4 cycles per
8361
limb, but other schemes may use just 1 or 2. On wide superscalar processors
8362
performance may be completely determined by the number of dependent
8363
instructions between carry-in and carry-out for each limb.
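The add and compare idiom can be modelled in Python by masking to the limb size. This is an illustrative sketch, not a copy of @file{mpn/generic/aors_n.c}; the names @code{add_n}, @code{LIMB_BITS} and @code{MASK} are chosen for the example and a 64-bit limb is assumed.

```python
LIMB_BITS = 64
MASK = (1 << LIMB_BITS) - 1


def add_n(up, vp):
    """Add equal-length little-endian limb lists; return (sum limbs, carry)."""
    rp, carry = [], 0
    for u, v in zip(up, vp):
        s = (u + v) & MASK           # wrapping add, as a machine add would do
        c1 = s < u                   # it overflowed iff the sum wrapped below u
        r = (s + carry) & MASK
        c2 = r < s                   # adding the carry-in may wrap too
        rp.append(r)
        carry = int(c1 or c2)        # c1 and c2 are never both set
    return rp, carry
```

The chain from carry-in to carry-out in each pass is an add, a compare and an or, and that chain is the dependency the text is concerned with.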
8365
On vector processors good use can be made of the fact that a carry bit only
8366
very rarely propagates more than one limb. When adding a single bit to a
8367
limb, there's only a carry out if that limb was @code{0xFF...FF} which on
8368
random data will be only 1 in @m{2\GMPraise{@code{mp\_bits\_per\_limb}},
8369
2^mp_bits_per_limb}. @file{mpn/cray/add_n.c} is an example of this, it adds
8370
all limbs in parallel, adds one set of carry bits in parallel and then only
8371
rarely needs to fall through to a loop propagating further carries.
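The limb-parallel scheme might be modelled as follows in Python, with list comprehensions standing in for vector operations. This is a sketch under the assumption of 64-bit limbs, with invented names, and the fall-through loop is written as a repeat of the same parallel step rather than as separate scalar code.

```python
LIMB_BITS = 64
MASK = (1 << LIMB_BITS) - 1


def add_n_parallel(up, vp):
    """Limb-parallel addition; return (sum limbs, carry out)."""
    # Step 1: all limb sums at once, ignoring carries between limbs.
    sums = [(u + v) & MASK for u, v in zip(up, vp)]
    carries = [int(s < u) for s, u in zip(sums, up)]
    carry_out = 0
    # Step 2: move each carry bit up one limb and add, again all at once.
    # On random data a second pass of this loop is very rarely needed.
    while any(carries):
        carry_out |= carries[-1]             # carry leaving the top limb
        incoming = [0] + carries[:-1]
        sums = [(s + c) & MASK for s, c in zip(sums, incoming)]
        # Adding a single bit carries out only if the limb became zero.
        carries = [int(c and s == 0) for s, c in zip(sums, incoming)]
    return sums, carry_out
```

A further pass happens only when a carry lands on a limb that was all ones, which is the rare case mentioned above.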
8373
On the x86s, GCC (as of version 2.95.2) doesn't generate particularly good code
8374
for the RISC style idioms that are necessary to handle carry bits in
8375
C. Often conditional jumps are generated where @code{adc} or @code{sbb} forms
8376
would be better. And so unfortunately almost any loop involving carry bits
8377
needs to be coded in assembler for best results.
8380
@node Assembler Cache Handling, Assembler Floating Point, Assembler Carry Propagation, Assembler Coding
8381
@subsection Cache Handling
8383
GMP aims to perform well both on operands that fit entirely in L1 cache and
those which don't.
8386
Basic routines like @code{mpn_add_n} or @code{mpn_lshift} are often used on
8387
large operands, so L2 and main memory performance is important for them.
8388
@code{mpn_mul_1} and @code{mpn_addmul_1} are mostly used for multiply and
8389
square basecases, so L1 performance matters most for them, unless assembler
8390
versions of @code{mpn_mul_basecase} and @code{mpn_sqr_basecase} exist, in
8391
which case the remaining uses are mostly for larger operands.
8393
For L2 or main memory operands, memory access times will almost certainly be
8394
more than the calculation time. The aim therefore is to maximize memory
8395
throughput, by starting a load of the next cache line while processing the
8396
contents of the previous one. Clearly this is only possible if the chip has a
8397
lock-up free cache or some sort of prefetch instruction. Most current chips
8398
have both these features.
8400
Prefetching sources combines well with loop unrolling, since a prefetch can be
8401
initiated once per unrolled loop (or more than once if the loop covers more
8402
than one cache line).
8404
On CPUs without write-allocate caches, prefetching destinations will ensure
8405
individual stores don't go further down the cache hierarchy, which would limit
bandwidth. Of course for calculations which are slow anyway, like
8407
@code{mpn_divrem_1}, write-throughs might be fine.
8409
The distance ahead to prefetch will be determined by memory latency versus
8410
throughput. The aim of course is to have data arriving continuously, at peak
8411
throughput. Some CPUs have limits on the number of fetches or prefetches in
progress.
8414
If a special prefetch instruction doesn't exist then a plain load can be used,
8415
but in that case care must be taken not to attempt to read past the end of an
8416
operand, since that might produce a segmentation violation.
8418
Some CPUs or systems have hardware that detects sequential memory accesses and
8419
initiates suitable cache movements automatically, making life easy.
8422
@node Assembler Floating Point, Assembler SIMD Instructions, Assembler Cache Handling, Assembler Coding
8423
@subsection Floating Point
8425
Floating point arithmetic is used in GMP for multiplications on CPUs with poor
8426
integer multipliers. It's mostly useful for @code{mpn_mul_1},
8427
@code{mpn_addmul_1} and @code{mpn_submul_1} on 64-bit machines, and
8428
@code{mpn_mul_basecase} on both 32-bit and 64-bit machines.
8430
With IEEE 53-bit double precision floats, integer multiplications producing up
8431
to 53 bits will give exact results. Breaking a 64@cross{}64 multiplication
8432
into eight 16@cross{}@math{32@rightarrow{}48} bit pieces is convenient. With
8433
some care though six 21@cross{}@math{32@rightarrow{}53} bit products can be
8434
used, if one of the lower two 21-bit pieces also uses the sign bit.
8436
For the @code{mpn_mul_1} family of functions on a 64-bit machine, the
8437
invariant single limb is split at the start, into 3 or 4 pieces. Inside the
8438
loop, the bignum operand is split into 32-bit pieces. Fast conversion of
8439
these unsigned 32-bit pieces to floating point is highly machine-dependent.
8440
In some cases, reading the data into the integer unit, zero-extending to
64 bits, then transferring to the floating point unit via memory is the
only option.
8444
Converting partial products back to 64-bit limbs is usually best done as a
8445
signed conversion. Since all values are smaller than @m{2^{53},2^53}, signed
8446
and unsigned are the same, but most processors lack unsigned conversions.
8450
Here is a diagram showing 16@cross{}32 bit products for an @code{mpn_mul_1} or
8451
@code{mpn_addmul_1} with a 64-bit limb. The single limb operand V is split
8452
into four 16-bit parts. The multi-limb operand U is split in the loop into
two 32-bit parts.
8456
@example
@group
                +---+---+---+---+
                |v48|v32|v16|v00|    V operand
                +---+---+---+---+

                +-------+-------+
            x   |  u32  |  u00  |    U operand (one limb)
                +-------+-------+

---------------------------------

                    +-----------+
                    | u00 x v00 |    p00    48-bit products
                    +-----------+
                +-----------+
                | u00 x v16 |        p16
                +-----------+
            +-----------+
            | u00 x v32 |            p32
            +-----------+
        +-----------+
        | u00 x v48 |                p48
        +-----------+
            +-----------+
            | u32 x v00 |            r32
            +-----------+
        +-----------+
        | u32 x v16 |                r48
        +-----------+
    +-----------+
    | u32 x v32 |                    r64
    +-----------+
+-----------+
| u32 x v48 |                        r80
+-----------+
@end group
@end example
8553
@math{p32} and @math{r32} can be summed using floating-point addition, and
8554
likewise @math{p48} and @math{r48}. @math{p00} and @math{p16} can be summed
8555
with @math{r64} and @math{r80} from the previous iteration.
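That the 16@cross{}32 bit pieces really are exact in double precision can be confirmed with a few lines of Python, assuming Python floats are IEEE doubles as on all common platforms. The function name is invented, and the final summation is done with Python integers rather than with the staged floating point additions described above.

```python
def mul_64x64_via_doubles(u, v):
    """128-bit product of two 64-bit limbs from 16x32 bit double products."""
    assert 0 <= u < 1 << 64 and 0 <= v < 1 << 64
    total = 0
    for ushift in (0, 32):                   # u00 gives the p's, u32 the r's
        upiece = float((u >> ushift) & 0xFFFFFFFF)
        for vshift in (0, 16, 32, 48):       # v00, v16, v32, v48
            vpiece = float((v >> vshift) & 0xFFFF)
            prod = upiece * vpiece           # below 2**48, so exact in 53 bits
            total += int(prod) << (ushift + vshift)
    return total
```

The eight products correspond to p00, p16, p32, p48 and r32, r48, r64, r80 in the diagram, each placed at the bit offset its name indicates.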
8557
For each loop then, four 49-bit quantities are transferred to the integer unit,
aligned as follows,
8561
@example
@group
|-----64bits----|-----64bits----|
                   +------------+
                   | p00 + r64' |    i00
                   +------------+
               +------------+
               | p16 + r80' |        i16
               +------------+
           +------------+
           | p32 + r32  |            i32
           +------------+
       +------------+
       | p48 + r48  |                i48
       +------------+
@end group
@end example
8621
The challenge then is to sum these efficiently and add in a carry limb,
8622
generating a low 64-bit result limb and a high 33-bit carry limb (@math{i48}
8623
extends 33 bits into the high half).
8626
@node Assembler SIMD Instructions, Assembler Software Pipelining, Assembler Floating Point, Assembler Coding
8627
@subsection SIMD Instructions
8629
The single-instruction multiple-data support in current microprocessors is
8630
aimed at signal processing algorithms where each data point can be treated
8631
more or less independently. There's generally not much support for
8632
propagating the sort of carries that arise in GMP.
8634
SIMD multiplications of say four 16@cross{}16 bit multiplies only do as much
8635
work as one 32@cross{}32 from GMP's point of view, and need some shifts and
8636
adds besides. But of course if say the SIMD form is fully pipelined and uses
8637
less instruction decoding then it may still be worthwhile.
8639
On the 80x86 chips, MMX has so far found a use in @code{mpn_rshift} and
8640
@code{mpn_lshift} since it allows 64-bit operations, and is used in a special
8641
case for 16-bit multipliers in the P55 @code{mpn_mul_1}. 3DNow and SSE
8642
haven't found a use so far.
8645
@node Assembler Software Pipelining, Assembler Loop Unrolling, Assembler SIMD Instructions, Assembler Coding
8646
@subsection Software Pipelining
8648
Software pipelining consists of scheduling instructions around the branch
8649
point in a loop. For example a loop taking a checksum of an array of limbs
8650
might have a load and an add, but the load wouldn't be for that add, rather
8651
for the one next time around the loop. Each load then is effectively
8652
scheduled back in the previous iteration, allowing latency to be hidden.
8654
Naturally this is wanted only when doing things like loads or multiplies that
8655
take a few cycles to complete, and only where a CPU has multiple functional
8656
units so that other work can be done while waiting.
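The checksum example can be written out to show the schedule, here in Python. Python of course hides no latency, so this sketch only shows where the load sits relative to the add that uses it; the names are made up.

```python
def checksum_pipelined(limbs, limb_bits=64):
    """Sum of limbs modulo 2**limb_bits, with each load one iteration ahead."""
    mask = (1 << limb_bits) - 1
    total = 0
    if not limbs:
        return total
    loaded = limbs[0]                      # prologue: first load, before the loop
    for i in range(1, len(limbs)):
        nxt = limbs[i]                     # load for the add next time round
        total = (total + loaded) & mask    # add the value loaded last time round
        loaded = nxt
    total = (total + loaded) & mask        # epilogue: last add, no further load
    return total
```

The prologue and epilogue are the price of the scheme: one load before the loop and one add after it.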
8658
A pipeline with several stages will have a data value in progress at each
8659
stage and each loop iteration moves them along one stage. This is like
juggling.
8662
Within the loop some moves between registers may be necessary to have the
8663
right values in the right places for each iteration. Loop unrolling can help
8664
this, with each unrolled block able to use different registers for different
8665
values, even if some shuffling is still needed just before going back to the
top of the loop.
8669
@node Assembler Loop Unrolling, , Assembler Software Pipelining, Assembler Coding
8670
@subsection Loop Unrolling
8672
Loop unrolling consists of replicating code so that several limbs are
8673
processed in each loop. At a minimum this reduces loop overheads by a
8674
corresponding factor, but it can also allow better register usage, for example
8675
alternately using one register combination and then another. Judicious use of
8676
@command{m4} macros can help avoid lots of duplication in the source code.
8678
Unrolling is commonly done to a power of 2 multiple so the number of unrolled
8679
loops and the number of remaining limbs can be calculated with a shift and
8680
mask. But other multiples can be used too, just by subtracting each @var{n}
8681
limbs processed from a counter and waiting for less than @var{n} remaining (or
8682
offsetting the counter by @var{n} so it goes negative when there's less than
@var{n} remaining).
8685
The limbs not a multiple of the unrolling can be handled in various ways, for
example

@itemize @bullet
@item
A simple loop at the end (or the start) to process the excess. Care will be
wanted that it isn't too much slower than the unrolled part.

@item
A set of binary tests, for example after an 8-limb unrolling, test for 4 more
limbs to process, then a further 2 more or not, and finally 1 more or not.
This will probably take more code space than a simple loop.

@item
A @code{switch} statement, providing separate code for each possible excess,
for example an 8-limb unrolling would have separate code for 0 remaining, 1
remaining, etc, up to 7 remaining. This might take a lot of code, but may be
the best way to optimize all cases in combination with a deep pipelined loop.

@item
A computed jump into the middle of the loop, thus making the first iteration
handle the excess. This should make times smoothly increase with size, which
is attractive, but setups for the jump and adjustments for pointers can be
tricky and could become quite difficult in combination with deep pipelining.
@end itemize
8711
One way to write the setups and finishups for a pipelined unrolled loop is
8712
simply to duplicate the loop at the start and the end, then delete
8713
instructions at the start which have no valid antecedents, and delete
8714
instructions at the end whose results are unwanted. Sizes not a multiple of
8715
the unrolling can then be handled as desired.
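A 4-way unrolling with the counts taken by shift and mask, and a simple loop for the excess, looks like this when written in Python. It is a sketch with an invented function name, showing only the structure and not any speed benefit.

```python
def sum_limbs_unrolled(limbs):
    """Sum a list of limbs using a 4-way unrolled loop plus a tail loop."""
    n = len(limbs)
    blocks = n >> 2                  # number of unrolled iterations, by shift
    excess = n & 3                   # limbs left over, by mask
    total = 0
    i = 0
    for _ in range(blocks):          # four limbs per pass
        total += limbs[i] + limbs[i + 1] + limbs[i + 2] + limbs[i + 3]
        i += 4
    for _ in range(excess):          # simple loop to process the excess
        total += limbs[i]
        i += 1
    return total
```

For 11 limbs this makes two unrolled passes and then three passes of the tail loop.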
8718
@node Internals, Contributors, Algorithms, Top
@chapter Internals
8721
@strong{This chapter is provided only for informational purposes and the
8722
various internals described here may change in future GMP releases.
8723
Applications expecting to be compatible with future releases should use only
8724
the documented interfaces described in previous chapters.}
8727
@menu
* Integer Internals::
* Rational Internals::
* Float Internals::
* Raw Output Internals::
* C++ Interface Internals::
@end menu
8734
@node Integer Internals, Rational Internals, Internals, Internals
8735
@section Integer Internals
8737
@code{mpz_t} variables represent integers using sign and magnitude, in space
8738
dynamically allocated and reallocated. The fields are as follows.
8741
@table @asis
@item @code{_mp_size}
The number of limbs, or the negative of that when representing a negative
integer. Zero is represented by @code{_mp_size} set to zero, in which case
the @code{_mp_d} data is unused.

@item @code{_mp_d}
A pointer to an array of limbs which is the magnitude. These are stored
``little endian'' as per the @code{mpn} functions, so @code{_mp_d[0]} is the
least significant limb and @code{_mp_d[ABS(_mp_size)-1]} is the most
significant. Whenever @code{_mp_size} is non-zero, the most significant limb
is non-zero.

Currently there's always at least one limb allocated, so for instance
@code{mpz_set_ui} never needs to reallocate, and @code{mpz_get_ui} can fetch
@code{_mp_d[0]} unconditionally (though its value is then only wanted if
@code{_mp_size} is non-zero).

@item @code{_mp_alloc}
@code{_mp_alloc} is the number of limbs currently allocated at @code{_mp_d},
and naturally @code{_mp_alloc >= ABS(_mp_size)}. When an @code{mpz} routine
is about to (or might be about to) increase @code{_mp_size}, it checks
@code{_mp_alloc} to see whether there's enough space, and reallocates if not.
@code{MPZ_REALLOC} is generally used for this.
@end table
8766
The various bitwise logical functions like @code{mpz_and} behave as if
8767
negative values were twos complement. But sign and magnitude is always used
8768
internally, and necessary adjustments are made during the calculations.
8769
Sometimes this isn't pretty, but sign and magnitude are best for other
routines.
8772
Some internal temporary variables are set up with @code{MPZ_TMP_INIT} and these
8773
have @code{_mp_d} space obtained from @code{TMP_ALLOC} rather than the memory
8774
allocation functions. Care is taken to ensure that these are big enough that
8775
no reallocation is necessary (since it would have unpredictable consequences).
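The sign and magnitude layout described in this section can be modelled in Python as a size and a little endian list of limbs. This is a sketch assuming 64-bit limbs; @code{to_fields} and @code{from_fields} are invented names, and allocation (@code{_mp_alloc}) is not modelled, so zero gets an empty list here.

```python
LIMB_BITS = 64


def to_fields(x):
    """Return (size, limbs) modelling _mp_size and the data at _mp_d."""
    mag, limbs = abs(x), []
    while mag:
        limbs.append(mag & ((1 << LIMB_BITS) - 1))   # least significant first
        mag >>= LIMB_BITS
    size = len(limbs) if x >= 0 else -len(limbs)     # sign lives in the size
    return size, limbs


def from_fields(size, limbs):
    """Inverse of to_fields; only the first abs(size) limbs are significant."""
    mag = sum(limb << (LIMB_BITS * i)
              for i, limb in enumerate(limbs[:abs(size)]))
    return mag if size >= 0 else -mag
```

Because the loop stops once the magnitude is exhausted, the most significant limb stored is always non-zero, matching the rule given for @code{_mp_d}.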
8778
@node Rational Internals, Float Internals, Integer Internals, Internals
8779
@section Rational Internals
8781
@code{mpq_t} variables represent rationals using an @code{mpz_t} numerator and
8782
denominator (@pxref{Integer Internals}).
8784
The canonical form adopted is denominator positive (and non-zero), no common
8785
factors between numerator and denominator, and zero uniquely represented as
0/1.
8788
It's believed that casting out common factors at each stage of a calculation
8789
is best in general. A GCD is an @math{O(N^2)} operation so it's better to do
8790
a few small ones immediately than to delay and have to do a big one later.
8791
Knowing the numerator and denominator have no common factors can be used for
8792
example in @code{mpq_mul} to make only two cross GCDs necessary, not four.
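The saving from canonical inputs can be seen in a small Python sketch. The function below borrows the name @code{mpq_mul} only for illustration, takes plain integers rather than @code{mpq_t} values, and is not claimed to mirror the GMP source.

```python
from math import gcd


def mpq_mul(a, b, c, d):
    """(a/b) * (c/d) for canonical inputs; returns a canonical (num, den)."""
    # a/b and c/d are already in lowest terms, so common factors can only
    # occur crosswise, between a and d and between c and b.
    g1 = gcd(a, d)
    g2 = gcd(c, b)
    num = (a // g1) * (c // g2)
    den = (b // g2) * (d // g1)
    return num, den
```

For instance @code{mpq_mul(-3, 4, 2, 9)} gives @code{(-1, 6)} without any GCD ever being taken of the full products.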

This general approach to common factors is badly sub-optimal in the presence
of simple factorizations or little prospect for cancellation, but GMP has no
way to know when this will occur. As per @ref{Efficiency}, that's left to
applications. The @code{mpq_t} framework might still suit, with
@code{mpq_numref} and @code{mpq_denref} for direct access to the numerator and
denominator, or of course @code{mpz_t} variables can be used directly.


@node Float Internals, Raw Output Internals, Rational Internals, Internals
@section Float Internals

Efficient calculation is the primary aim of GMP floats and the use of whole
limbs and simple rounding facilitates this.

@code{mpf_t} floats have a variable precision mantissa and a single machine
word signed exponent. The mantissa is represented using sign and magnitude.

@example
   most                   least
significant            significant
   limb                   limb

                            _mp_d
 |---- _mp_exp --->           |
  _____ _____ _____ _____ _____
 |_____|_____|_____|_____|_____|
                   . <------------ radix point

  <-------- _mp_size --------->
@end example

The fields are as follows.

@table @asis
@item @code{_mp_size}
The number of limbs currently in use, or the negative of that when
representing a negative value. Zero is represented by @code{_mp_size} and
@code{_mp_exp} both set to zero, and in that case the @code{_mp_d} data is
unused. (In the future @code{_mp_exp} might be undefined when representing
zero.)

@item @code{_mp_prec}
The precision of the mantissa, in limbs. In any calculation the aim is to
produce @code{_mp_prec} limbs of result (the most significant being non-zero).

@item @code{_mp_d}
A pointer to the array of limbs which is the absolute value of the mantissa.
These are stored ``little endian'' as per the @code{mpn} functions, so
@code{_mp_d[0]} is the least significant limb and
@code{_mp_d[ABS(_mp_size)-1]} the most significant.

The most significant limb is always non-zero, but there are no other
restrictions on its value, in particular the highest 1 bit can be anywhere
within the limb.

@code{_mp_prec+1} limbs are allocated to @code{_mp_d}, the extra limb being
for convenience (see below). There are no reallocations during a calculation,
only in a change of precision with @code{mpf_set_prec}.

@item @code{_mp_exp}
The exponent, in limbs, determining the location of the implied radix point.
Zero means the radix point is just above the most significant limb. Positive
values mean a radix point offset towards the lower limbs and hence a value
@math{@ge{} 1}, as for example in the diagram above. Negative exponents mean
a radix point further above the highest limb.

Naturally the exponent can be any value, it doesn't have to fall within the
limbs as the diagram shows, it can be a long way above or a long way below.
Limbs other than those included in the @code{@{_mp_d,_mp_size@}} data
are treated as zero.
@end table
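
How the fields combine into a value can be shown with a small self-contained
C++ sketch. It is a model only: @code{ToyFloat} and @code{to_double} are
hypothetical names, limbs are assumed to be 32 bits, and the result is
returned as a @code{double}, so it is exact only for small examples. Limb
@code{i} carries weight @math{B^(i-n+e)}, where @math{B} is the limb base,
@math{n} is @code{ABS(_mp_size)} and @math{e} is @code{_mp_exp}.

@verbatim
```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Toy model of the mpf_t fields (hypothetical; 32-bit limbs assumed).
struct ToyFloat {
  int size;                      // limbs in use, negative for a negative value
  long exp;                      // exponent in limbs
  std::vector<std::uint32_t> d;  // limbs, least significant first
};

// Value of x as a double; limb i has weight 2^(32 * (i - n + exp)).
double to_double(const ToyFloat &x) {
  int n = std::abs(x.size);
  int e = static_cast<int>(x.exp);
  double v = 0.0;
  for (int i = 0; i < n; ++i)
    v += std::ldexp(static_cast<double>(x.d[i]), 32 * (i - n + e));
  return x.size < 0 ? -v : v;
}
```
@end verbatim

For instance two limbs @code{0x80000000} (low) and @code{1} (high) with an
exponent of 1 put the radix point between the limbs, so the value is 1.5.
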

The following various points should be noted.

@table @asis
@item Low Zeros
The least significant limbs @code{_mp_d[0]} etc can be zero, though such low
zeros can always be ignored. Routines likely to produce low zeros check and
avoid them to save time in subsequent calculations, but for most routines
they're quite unlikely and aren't checked.

@item Mantissa Size Range
The @code{_mp_size} count of limbs in use can be less than @code{_mp_prec} if
the value can be represented in fewer. This means low precision values or
small integers stored in a high precision @code{mpf_t} can still be operated
on efficiently.

@code{_mp_size} can also be greater than @code{_mp_prec}. Firstly a value is
allowed to use all of the @code{_mp_prec+1} limbs available at @code{_mp_d},
and secondly when @code{mpf_set_prec_raw} lowers @code{_mp_prec} it leaves
@code{_mp_size} unchanged and so the size can be arbitrarily bigger than
@code{_mp_prec}.

@item Rounding
All rounding is done on limb boundaries. Calculating @code{_mp_prec} limbs
with the high non-zero will ensure the application requested minimum precision
is obtained.

The use of simple ``trunc'' rounding towards zero is efficient, since there's
no need to examine extra limbs and increment or decrement.

@item Bit Shifts
Since the exponent is in limbs, there are no bit shifts in basic operations
like @code{mpf_add} and @code{mpf_mul}. When differing exponents are
encountered all that's needed is to adjust pointers to line up the relevant
limbs.

Of course @code{mpf_mul_2exp} and @code{mpf_div_2exp} will require bit shifts,
but the choice is between an exponent in limbs which requires shifts there, or
one in bits which requires them almost everywhere else.

@item Use of @code{_mp_prec+1} Limbs
The extra limb on @code{_mp_d} (@code{_mp_prec+1} rather than just
@code{_mp_prec}) helps when an @code{mpf} routine might get a carry from its
operation. @code{mpf_add} for instance will do an @code{mpn_add} of
@code{_mp_prec} limbs. If there's no carry then that's the result, but if
there is a carry then it's stored in the extra limb of space and
@code{_mp_size} becomes @code{_mp_prec+1}.

Whenever @code{_mp_prec+1} limbs are held in a variable, the low limb is not
needed for the intended precision, only the @code{_mp_prec} high limbs. But
zeroing it out or moving the rest down is unnecessary. Subsequent routines
reading the value will simply take the high limbs they need, and this will be
@code{_mp_prec} if their target has that same precision. This is no more than
a pointer adjustment, and must be checked anyway since the destination
precision can be different from the sources.

Copy functions like @code{mpf_set} will retain a full @code{_mp_prec+1} limbs
if available. This ensures that a variable which has @code{_mp_size} equal to
@code{_mp_prec+1} will get its full exact value copied. Strictly speaking
this is unnecessary since only @code{_mp_prec} limbs are needed for the
application's requested precision, but it's considered that an @code{mpf_set}
from one variable into another of the same precision ought to produce an exact
copy.

@item Application Precisions
@code{__GMPF_BITS_TO_PREC} converts an application requested precision to an
@code{_mp_prec}. The value in bits is rounded up to a whole limb then an
extra limb is added since the most significant limb of @code{_mp_d} is only
guaranteed to be non-zero and therefore might contain only one bit.

@code{__GMPF_PREC_TO_BITS} does the reverse conversion, and removes the extra
limb from @code{_mp_prec} before converting to bits. The net effect of
reading back with @code{mpf_get_prec} is simply the precision rounded up to a
multiple of @code{mp_bits_per_limb}.

Note that the extra limb added here for the high only being non-zero is in
addition to the extra limb allocated to @code{_mp_d}. For example with a
32-bit limb, an application request for 250 bits will be rounded up to 8
limbs, then an extra added for the high being only non-zero, giving an
@code{_mp_prec} of 9. @code{_mp_d} then gets 10 limbs allocated. Reading
back with @code{mpf_get_prec} will take @code{_mp_prec}, subtract 1 limb and
multiply by 32, giving 256 bits.

Strictly speaking, the fact the high limb has at least one bit means that a
float with, say, 3 limbs of 32 bits each will be holding at least 65 bits, but
for the purposes of @code{mpf_t} it's considered simply to be 64 bits, a nice
multiple of the limb size.
@end table
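
The precision arithmetic described above can be checked with a short
self-contained C++ sketch. The functions @code{bits_to_prec},
@code{prec_to_bits} and @code{limbs_allocated} are hypothetical stand-ins
written from the description in this section; the real
@code{__GMPF_BITS_TO_PREC} and @code{__GMPF_PREC_TO_BITS} macros may differ
in detail.

@verbatim
```cpp
#include <cassert>

// Round the requested bits up to whole limbs, then add one limb because the
// high limb is only guaranteed non-zero (it may hold a single bit).
long bits_to_prec(long bits, int bits_per_limb) {
  long whole_limbs = (bits + bits_per_limb - 1) / bits_per_limb;
  return whole_limbs + 1;
}

// Reverse conversion: drop the extra limb, then convert limbs to bits.
long prec_to_bits(long prec, int bits_per_limb) {
  return (prec - 1) * bits_per_limb;
}

// Limbs allocated at _mp_d: one more than the precision, to hold a carry.
long limbs_allocated(long prec) {
  return prec + 1;
}
```
@end verbatim

With a 32-bit limb, @code{bits_to_prec(250, 32)} is 9,
@code{limbs_allocated(9)} is 10 and @code{prec_to_bits(9, 32)} is 256,
agreeing with the worked example above.
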

@node Raw Output Internals, C++ Interface Internals, Float Internals, Internals
@section Raw Output Internals

@code{mpz_out_raw} uses the following format.

@example
+------+------------------------+
| size |       data bytes       |
+------+------------------------+
@end example

The size is 4 bytes written most significant byte first, being the number of
subsequent data bytes, or the twos complement negative of that when a negative
integer is represented. The data bytes are the absolute value of the integer,
written most significant byte first.

The most significant data byte is always non-zero, so the output is the same
on all systems, irrespective of limb size.

In GMP 1, leading zero bytes were written to pad the data bytes to a multiple
of the limb size. @code{mpz_inp_raw} will still accept this, for
compatibility.

The use of ``big endian'' for both the size and data fields is deliberate: it
makes the data easy to read in a hex dump of a file. Unfortunately it also
means that the limb data must be reversed when reading or writing, so neither
a big endian nor little endian system can just read and write @code{_mp_d}.
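
The byte layout can be illustrated with a self-contained C++ sketch that
encodes a machine integer in the format described in this section. It is not
the @code{mpz_out_raw} source: @code{encode_raw} is a hypothetical helper, a
64-bit integer stands in for an @code{mpz_t}, and zero is taken to have a
size of zero and no data bytes, which follows from the most significant data
byte always being non-zero.

@verbatim
```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode v as a 4-byte big endian size (negated, twos complement, for a
// negative v) followed by the magnitude bytes, most significant first,
// with no leading zero bytes.
std::vector<std::uint8_t> encode_raw(std::int64_t v) {
  std::uint64_t u = static_cast<std::uint64_t>(v);
  std::uint64_t mag = v < 0 ? 0 - u : u;

  // Collect magnitude bytes, skipping high zero bytes.
  std::vector<std::uint8_t> data;
  for (int shift = 56; shift >= 0; shift -= 8) {
    std::uint8_t byte = static_cast<std::uint8_t>(mag >> shift);
    if (!data.empty() || byte != 0)
      data.push_back(byte);
  }

  std::int32_t size = static_cast<std::int32_t>(data.size());
  if (v < 0) size = -size;
  std::uint32_t s = static_cast<std::uint32_t>(size);

  // Size field first, then the data bytes.
  std::vector<std::uint8_t> out;
  for (int shift = 24; shift >= 0; shift -= 8)
    out.push_back(static_cast<std::uint8_t>(s >> shift));
  out.insert(out.end(), data.begin(), data.end());
  return out;
}
```
@end verbatim

Under these assumptions @minus{}258 is encoded as the size bytes
@code{FF FF FF FE} followed by the data bytes @code{01 02}.
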

@node C++ Interface Internals, , Raw Output Internals, Internals
@section C++ Interface Internals

A system of expression templates is used to ensure something like @code{a=b+c}
turns into a simple call to @code{mpz_add} etc. For @code{mpf_class} and
@code{mpfr_class} the scheme also ensures the precision of the final
destination is used for any temporaries within a statement like
@code{f=w*x+y*z}. These are important features which a naive implementation
cannot provide.

A simplified description of the scheme follows. The true scheme is
complicated by the fact that expressions have different return types. For
detailed information, refer to the source code.

To perform an operation, say, addition, we first define a ``function object''
evaluating it,

@example
struct __gmp_binary_plus
@{
  static void eval(mpf_t f, mpf_t g, mpf_t h) @{ mpf_add(f, g, h); @}
@};
@end example

@noindent
And an ``additive expression'' object,

@example
__gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >
operator+(const mpf_class &f, const mpf_class &g)
@{
  return __gmp_expr
    <__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >(f, g);
@}
@end example

The seemingly redundant @code{__gmp_expr<__gmp_binary_expr<...>>} is used to
encapsulate any possible kind of expression into a single template type. In
fact even @code{mpf_class} etc are @code{typedef} specializations of
@code{__gmp_expr}.

Next we define assignment of @code{__gmp_expr} to @code{mpf_class}.

@example
template <class T>
mpf_class & mpf_class::operator=(const __gmp_expr<T> &expr)
@{
  expr.eval(this->get_mpf_t(), this->precision());
  return *this;
@}

template <class Op>
void __gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, Op> >::eval
(mpf_t f, unsigned long int precision)
@{
  Op::eval(f, expr.val1.get_mpf_t(), expr.val2.get_mpf_t());
@}
@end example

@noindent
where @code{expr.val1} and @code{expr.val2} are references to the expression's
operands (here @code{expr} is the @code{__gmp_binary_expr} stored within the
@code{__gmp_expr}).

This way, the expression is actually evaluated only at the time of assignment,
when the required precision (that of @code{f}) is known. Furthermore the
target @code{mpf_t} is now available, thus we can call @code{mpf_add} directly
with @code{f} as the output argument.

Compound expressions are handled by defining operators taking subexpressions
as their arguments, like this:

@example
template <class T, class U>
__gmp_expr
<__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> >
operator+(const __gmp_expr<T> &expr1, const __gmp_expr<U> &expr2)
@{
  return __gmp_expr
    <__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> >
    (expr1, expr2);
@}
@end example

And the corresponding specializations of @code{__gmp_expr::eval}:

@example
template <class T, class U, class Op>
void __gmp_expr
<__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, Op> >::eval
(mpf_t f, unsigned long int precision)
@{
  // declare two temporaries
  mpf_class temp1(expr.val1, precision), temp2(expr.val2, precision);
  Op::eval(f, temp1.get_mpf_t(), temp2.get_mpf_t());
@}
@end example

The expression is thus recursively evaluated to any level of complexity and
all subexpressions are evaluated to the precision of @code{f}.
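
The overall idea can be tried out with a self-contained toy in standard C++.
It is a sketch under stated assumptions and not the @file{gmpxx.h}
implementation: everything in namespace @code{toy} is hypothetical, a
@code{double} stands in for @code{mpf_t}, and ``precision'' means a number of
fractional binary digits kept by truncation. As in the scheme above, nothing
is computed until assignment, and every intermediate result is produced at the
precision of the destination.

@verbatim
```cpp
#include <cassert>
#include <cmath>

namespace toy {

// Truncate v towards zero to `bits` fractional binary digits.
inline double truncate_to(double v, int bits) {
  return std::ldexp(std::trunc(std::ldexp(v, bits)), -bits);
}

// Function objects performing the actual operations.
struct Plus  { static double eval(double a, double b) { return a + b; } };
struct Times { static double eval(double a, double b) { return a * b; } };

// Unevaluated binary expression; it only holds references to its operands,
// so it must be consumed within the statement that created it.
template <class L, class R, class Op>
struct BinExpr {
  const L &lhs;
  const R &rhs;
  // Evaluate recursively, each result truncated to the requested precision.
  double eval(int bits) const {
    return truncate_to(Op::eval(lhs.eval(bits), rhs.eval(bits)), bits);
  }
};

// Toy number: a value plus the precision used when it is assigned to.
struct Num {
  double value;
  int bits;
  Num(double v, int b) : value(v), bits(b) {}
  double eval(int) const { return value; }
  // Evaluation happens here, with the destination's precision.
  template <class L, class R, class Op>
  Num &operator=(const BinExpr<L, R, Op> &e) {
    value = e.eval(bits);
    return *this;
  }
};

// Operators only build expression objects; no arithmetic is done yet.
template <class L, class R>
BinExpr<L, R, Plus> operator+(const L &l, const R &r) { return {l, r}; }

template <class L, class R>
BinExpr<L, R, Times> operator*(const L &l, const R &r) { return {l, r}; }

}  // namespace toy
```
@end verbatim

Assigning @code{w * x + y * z} to a @code{toy::Num} with 2 fractional bits
truncates the product @code{w * x} to 2 bits before the addition, whereas
assigning the same expression to one with 4 bits keeps 4, so the same source
expression can give different results depending only on the destination.
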

@node Contributors, References, Internals, Top
@comment node-name, next, previous, up
@appendix Contributors
@cindex Contributors

Torbjorn Granlund wrote the original GMP library and is still developing and
maintaining it. Several other individuals and organizations have contributed
to GMP in various ways. Here is a list in chronological order:

Gunnar Sjoedin and Hans Riesel helped with mathematical problems in early
versions of the library.

Richard Stallman contributed to the interface design and revised the first
version of this manual.

Brian Beuning and Doug Lea helped with testing of early versions of the
library and made creative suggestions.

John Amanatides of York University in Canada contributed the function
@code{mpz_probab_prime_p}.

Paul Zimmermann of Inria sparked the development of GMP 2, with his
comparisons between bignum packages.

Ken Weber (Kent State University, Universidade Federal do Rio Grande do Sul)
contributed @code{mpz_gcd}, @code{mpz_divexact}, @code{mpn_gcd}, and
@code{mpn_bdivmod}, partially supported by CNPq (Brazil) grant 301314194-2.

Per Bothner of Cygnus Support helped to set up GMP to use Cygnus' configure.
He has also made valuable suggestions and tested numerous intermediary
releases.

Joachim Hollman was involved in the design of the @code{mpf} interface, and in
the @code{mpz} design revisions for version 2.

Bennet Yee contributed the initial versions of @code{mpz_jacobi} and
@code{mpz_legendre}.

Andreas Schwab contributed the files @file{mpn/m68k/lshift.S} and
@file{mpn/m68k/rshift.S} (now in @file{.asm} form).

The development of the floating point functions of GNU MP 2 was supported in
part by the ESPRIT-BRA (Basic Research Activities) 6846 project POSSO
(POlynomial System SOlving).

GNU MP 2 was finished and released by SWOX AB, SWEDEN, in cooperation with the
IDA Center for Computing Sciences, USA.

Robert Harley of Inria, France and David Seal of ARM, England, suggested clever
improvements for population count.

Robert Harley also wrote highly optimized Karatsuba and 3-way Toom
multiplication functions for GMP 3. He also contributed the ARM assembly
code.

Torsten Ekedahl of the Mathematics department of Stockholm University provided
significant inspiration during several phases of the GMP development. His
mathematical expertise helped improve several algorithms.

Paul Zimmermann wrote the Divide and Conquer division code, the REDC code, the
REDC-based @code{mpz_powm} code, the FFT multiply code, and the Karatsuba
square root. The ECMNET project Paul is organizing was a driving force behind
many of the optimizations in GMP 3.

Linus Nordberg wrote the new configure system based on autoconf and
implemented the new random functions.

Kent Boortz made the Macintosh port.

Kevin Ryde worked on a number of things: optimized x86 code, m4 asm macros,
parameter tuning, speed measuring, the configure system, function inlining,
divisibility tests, bit scanning, Jacobi symbols, Fibonacci and Lucas number
functions, printf and scanf functions, perl interface, demo expression parser,
the algorithms chapter in the manual, @file{gmpasm-mode.el}, and various
miscellaneous improvements elsewhere.

Steve Root helped write the optimized alpha 21264 assembly code.

Gerardo Ballabio wrote the @file{gmpxx.h} C++ class interface and the C++
@code{istream} input routines.

GNU MP 4.0 was finished and released by Torbjorn Granlund and Kevin Ryde.
Torbjorn's work was partially funded by the IDA Center for Computing Sciences,
USA.

(This list is chronological, not ordered by significance. If you have
contributed to GMP but are not listed above, please tell @email{tege@@swox.com}
about the omission!)

Thanks go to Hans Thorsen for donating an SGI system for the GMP test system
environment.


@node References, GNU Free Documentation License, Contributors, Top
@comment node-name, next, previous, up
@appendix References

@c FIXME: In tex, the @uref's are unhyphenated, which is good for clarity,
@c but being long words they upset paragraph formatting (the preceding line
@c can get badly stretched). Would like an conditional @* style line break
@c if the uref is too long to fit on the last line of the paragraph, but it's
@c not clear how to do that. For now explicit @texlinebreak{}s are used on
@c paragraphs that come out bad.

@section Books

@itemize @bullet
@item
Jonathan M. Borwein and Peter B. Borwein, ``Pi and the AGM: A Study in
Analytic Number Theory and Computational Complexity'', Wiley, John & Sons,
1998.

@item
Henri Cohen, ``A Course in Computational Algebraic Number Theory'', Graduate
Texts in Mathematics number 138, Springer-Verlag, 1993.
@texlinebreak{} @uref{http://www.math.u-bordeaux.fr/~cohen}

@item
Donald E. Knuth, ``The Art of Computer Programming'', volume 2,
``Seminumerical Algorithms'', 3rd edition, Addison-Wesley, 1998.
@texlinebreak{} @uref{http://www-cs-faculty.stanford.edu/~knuth/taocp.html}

@item
John D. Lipson, ``Elements of Algebra and Algebraic Computing'',
The Benjamin Cummings Publishing Company Inc, 1981.

@item
Alfred J. Menezes, Paul C. van Oorschot and Scott A. Vanstone, ``Handbook of
Applied Cryptography'', @uref{http://www.cacr.math.uwaterloo.ca/hac/}

@item
Richard M. Stallman, ``Using and Porting GCC'', Free Software Foundation, 1999,
available online @uref{http://www.gnu.org/software/gcc/onlinedocs/}, and in
the GCC package @uref{ftp://ftp.gnu.org/gnu/gcc/}
@end itemize

@section Papers

@itemize @bullet
@item
Yves Bertot, Nicolas Magaud and Paul Zimmermann, ``A Proof of GMP Square
Root'', Journal of Automated Reasoning, volume 29, 2002, pp.@: 225-252. Also
available online as INRIA Research Report 4475, June 2001,
@uref{http://www.inria.fr/rrrt/rr-4475.html}

@item
Christoph Burnikel and Joachim Ziegler, ``Fast Recursive Division'',
Max-Planck-Institut fuer Informatik Research Report MPI-I-98-1-022, @texlinebreak{}
@uref{http://data.mpi-sb.mpg.de/internet/reports.nsf/NumberView/1998-1-022}

@item
Torbjorn Granlund and Peter L. Montgomery, ``Division by Invariant Integers
using Multiplication'', in Proceedings of the SIGPLAN PLDI'94 Conference, June
1994. Also available @uref{ftp://ftp.cwi.nl/pub/pmontgom/divcnst.psa4.gz}

@item
Peter L. Montgomery, ``Modular Multiplication Without Trial Division'', in
Mathematics of Computation, volume 44, number 170, April 1985.

@item
Tudor Jebelean,
``An algorithm for exact division'',
Journal of Symbolic Computation,
volume 15, 1993, pp.@: 169-180.
Research report version available @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-35.ps.gz}

@item
Tudor Jebelean, ``Exact Division with Karatsuba Complexity - Extended
Abstract'', RISC-Linz technical report 96-31, @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-31.ps.gz}

@item
Tudor Jebelean, ``Practical Integer Division with Karatsuba Complexity'',
ISSAC 97, pp.@: 339-341. Technical report available @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-29.ps.gz}

@item
Tudor Jebelean, ``A Generalization of the Binary GCD Algorithm'', ISSAC 93,
pp.@: 111-116. Technical report version available @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1993/93-01.ps.gz}

@item
Tudor Jebelean, ``A Double-Digit Lehmer-Euclid Algorithm for Finding the GCD
of Long Integers'', Journal of Symbolic Computation, volume 19, 1995,
pp.@: 145-157. Technical report version also available @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-69.ps.gz}

@item
Werner Krandick and Tudor Jebelean, ``Bidirectional Exact Integer Division'',
Journal of Symbolic Computation, volume 21, 1996, pp.@: 441-455. Early
technical report version also available
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1994/94-50.ps.gz}

@item
R. Moenck and A. Borodin, ``Fast Modular Transforms via Division'',
Proceedings of the 13th Annual IEEE Symposium on Switching and Automata
Theory, October 1972, pp.@: 90-96. Reprinted as ``Fast Modular Transforms'',
Journal of Computer and System Sciences, volume 8, number 3, June 1974,
pp.@: 366-386.

@item
Arnold Sch@"onhage and Volker Strassen, ``Schnelle Multiplikation grosser
Zahlen'', Computing 7, 1971, pp.@: 281-292.

@item
Kenneth Weber, ``The accelerated integer GCD algorithm'',
ACM Transactions on Mathematical Software,
volume 21, number 1, March 1995, pp.@: 111-122.

@item
Paul Zimmermann, ``Karatsuba Square Root'', INRIA Research Report 3805,
November 1999, @uref{http://www.inria.fr/RRRT/RR-3805.html}

@item
Paul Zimmermann, ``A Proof of GMP Fast Division and Square Root
Implementations'', @texlinebreak{}
@uref{http://www.loria.fr/~zimmerma/papers/proof-div-sqrt.ps.gz}

@item
Dan Zuras, ``On Squaring and Multiplying Large Integers'', ARITH-11: IEEE
Symposium on Computer Arithmetic, 1993, pp.@: 260-271. Reprinted as ``More
on Multiplying and Squaring Large Integers'', IEEE Transactions on Computers,
volume 43, number 8, August 1994, pp.@: 899-908.
@end itemize


@node GNU Free Documentation License, Concept Index, References, Top
@appendix GNU Free Documentation License
@cindex GNU Free Documentation License

@node Concept Index, Function Index, GNU Free Documentation License, Top
@comment node-name, next, previous, up
@unnumbered Concept Index

@node Function Index, , Concept Index, Top
@comment node-name, next, previous, up
@unnumbered Function and Type Index

@c compile-command: "make gmp.info"