\input texinfo @c -*-texinfo-*-
@settitle GNU MP @value{VERSION}
@comment %**end of header
@c Texinfo version 4 or up will be needed to process this into .info files.
@c The edition number is in three places and the month/year in one, all taken
@c from version.texi. version.texi is created when you configure with
@c --enable-maintainer-mode, and is included in a distribution made with
@c "cindex" entries have been made for function categories and programming
@c topics. Minutiae like particular systems and processors mentioned in
@c various places have been left out so as not to bury important topics under
@c a lot of junk. "mpn" functions aren't in the concept index because a
@c beginner looking for "GCD" or something is only going to be confused by
@c pointers to low level routines.
@dircategory GNU libraries
* gmp: (gmp). GNU Multiple Precision Arithmetic Library.
@node Top, Copying, (dir), (dir)
This manual describes how to install and use the GNU multiple precision
arithmetic library, version @value{VERSION}.
@c use the new format for titles
@subtitle The GNU Multiple Precision Arithmetic Library
@subtitle Edition @value{EDITION}
@subtitle @value{UPDATED}
@author by Torbj@"orn Granlund, Swox AB
@email{tege@@swox.com}
@c Include the Distribution inside the titlepage so
@c that headings are turned off.
\global\baselineskip=13pt
@vskip 0pt plus 1filll
@c Ensure copyright stuff gets into info and html output.
Copyright 1991, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002
Free Software Foundation, Inc.
Permission is granted to copy, distribute and/or modify this document under
the terms of the GNU Free Documentation License, Version 1.1 or any later
version published by the Free Software Foundation; with no Invariant Sections,
with the Front-Cover Texts being "A GNU Manual", and with the Back-Cover Texts
being "You have freedom to copy and modify this GNU Manual, like GNU
software". A copy of the license is included in @ref{GNU Free Documentation
@c Don't bother with contents for "makeinfo --html", the menus seem adequate.
* Copying:: GMP Copying Conditions (LGPL).
* Introduction to GMP:: Brief introduction to GNU MP.
* Installing GMP:: How to configure and compile the GMP library.
* GMP Basics:: What every GMP user should know.
* Reporting Bugs:: How to usefully report bugs.
* Integer Functions:: Functions for arithmetic on signed integers.
* Rational Number Functions:: Functions for arithmetic on rational numbers.
* Floating-point Functions:: Functions for arithmetic on floats.
* Low-level Functions:: Fast functions for natural numbers.
* Random Number Functions:: Functions for generating random numbers.
* Formatted Output:: @code{printf} style output.
* Formatted Input:: @code{scanf} style input.
* C++ Class Interface:: Class wrappers around GMP types.
* BSD Compatible Functions:: All functions found in BSD MP.
* Custom Allocation:: How to customize the internal allocation.
* Language Bindings:: Using GMP from other languages.
* Algorithms:: What happens behind the scenes.
* Internals:: How values are represented behind the scenes.
* Contributors:: Who brings you this library?
* References:: Some useful papers and books to read.
* GNU Free Documentation License::
@c @m{T,N} is $T$ in tex or @math{N} otherwise. This is an easy way to give
@c different forms for math in tex and info. Commas in N or T don't work,
@c but @C{} can be used instead. \, works in info but not in tex.
@c @ma{E} is $E$ for tex or @math{E} otherwise. This suits expressions which
@c want $$ rather than @math{} in tex, for example @ma{N^2}.
@c @ms{V,N} is $V_N$ in tex or just vn otherwise. This suits simple
@c subscripts like @ms{x,0}.
@tex$\V\_{\N\}$@end tex
@c @nicode{S} is plain S in info, or @code{S} elsewhere. This can be used
@c when the quotes that @code{} gives in info aren't wanted, but the
@c fontification in tex or html is wanted. Doesn't work as @nicode{'\\0'}
@c though (gives two backslashes in tex).
@c @nisamp{S} is plain S in info, or @samp{S} elsewhere. This can be used
@c when the quotes that @samp{} gives in info aren't wanted, but the
@c fontification in tex or html is wanted.
@c Usage: @GMPtimes{}
@c Give either \times or the word "times".
\gdef\GMPtimes{\times}
@c Usage: @GMPmultiply{}
@c Give * in info, or nothing in tex.
@c Give either |x| in tex, or abs(x) in info or html.
@c Usage: @GMPfloor{x}
@c Give either \lfloor x\rfloor in tex, or floor(x) in info or html.
\gdef\GMPfloor#1{\lfloor #1\rfloor}
@c Usage: @GMPceil{x}
@c Give either \lceil x\rceil in tex, or ceil(x) in info or html.
\gdef\GMPceil#1{\lceil #1 \rceil}
@c Math operators already available in tex, made available in info too.
@c For example @bmod{} can be used in both tex and info.
@c New math operators.
@c @abs{} can be used in both tex and info, or just \abs in tex.
\gdef\abs{\mathop{\rm abs}}
@c @cross{} is a \times symbol in tex, or an "x" in info. In tex it works
@c inside or outside $ $.
\gdef\cross{\ifmmode\times\else$\times$\fi}
@c @times{} made available as a "*" in info and html (already works in tex).
@c Like @w{} but working in math mode too.
\gdef\W#1{\ifmmode{#1}\else\w{#1}\fi}
@c Usage: \GMPdisplay{text}
@c Put the given text in an @display style indent, but without turning off
@c paragraph reflow etc.
\advance\leftskip by \lispnarrowing
@c A new \hat that will work in math mode, unlike the texinfo redefined
\gdef\GMPhat{\mathaccent"705E}
@c Usage: \GMPraise{text}
@c For use in a $ $ math expression as an alternative to "^". This is good
@c for @code{} in an exponent, since there seems to be no superscript font
\gdef\GMPraise#1{\mskip0.5\thinmuskip\hbox{\raise0.8ex\hbox{#1}}}
@c Usage: @texlinebreak{}
@c A line break as per @*, but only in tex.
@c Usage: @maybepagebreak
@c Allow tex to insert a page break, if it feels the urge.
@c Normally blocks of @deftypefun/funx are kept together, which can lead to
@c some poor page break positioning if it's a big block, like the sets of
@c division functions etc.
\gdef\maybepagebreak{\penalty0}
@macro maybepagebreak
@node Copying, Introduction to GMP, Top, Top
@comment node-name, next, previous, up
@unnumbered GNU MP Copying Conditions
@cindex Copying conditions
@cindex Conditions for copying GNU MP
@cindex License conditions
This library is @dfn{free}; this means that everyone is free to use it and
free to redistribute it on a free basis. The library is not in the public
domain; it is copyrighted and there are restrictions on its distribution, but
these restrictions are designed to permit everything that a good cooperating
citizen would want to do. What is not allowed is to try to prevent others
from further sharing any version of this library that they might get from
Specifically, we want to make sure that you have the right to give away copies
of the library, that you receive source code or else can get it if you want
it, that you can change this library or use pieces of it in new free programs,
and that you know you can do these things.@refill
To make sure that everyone has such rights, we have to forbid you to deprive
anyone else of these rights. For example, if you distribute copies of the GNU
MP library, you must give the recipients all the rights that you have. You
must make sure that they, too, receive or can get the source code. And you
must tell them their rights.@refill
Also, for our own protection, we must make certain that everyone finds out
that there is no warranty for the GNU MP library. If it is modified by
someone else and passed on, we want their recipients to know that what they
have is not what we distributed, so that any problems introduced by others
will not reflect on our reputation.@refill
The precise conditions of the license for the GNU MP library are found in the
Lesser General Public License version 2.1 that accompanies the source code,
see @file{COPYING.LIB}. Certain demonstration programs are provided under the
terms of the plain General Public License version 2, see @file{COPYING}.
@node Introduction to GMP, Installing GMP, Copying, Top
@comment node-name, next, previous, up
@chapter Introduction to GNU MP
GNU MP is a portable library written in C for arbitrary precision arithmetic
on integers, rational numbers, and floating-point numbers. It aims to provide
the fastest possible arithmetic for all applications that need higher
precision than is directly supported by the basic C types.
Many applications use just a few hundred bits of precision; but some
applications may need thousands or even millions of bits. GMP is designed to
give good performance for both, by choosing algorithms based on the sizes of
the operands, and by carefully keeping the overhead at a minimum.
The speed of GMP is achieved by using full words as the basic arithmetic type,
by using sophisticated algorithms, by including carefully optimized assembly
code for the most common inner loops for many different CPUs, and by a general
emphasis on speed (as opposed to simplicity or elegance).
There is carefully optimized assembly code for these CPUs:
@cindex CPUs supported
DEC Alpha 21064, 21164, and 21264,
AMD K6, K6-2 and Athlon,
Hitachi SuperH and SH-2,
HPPA 1.0, 1.1 and 2.0,
Intel Pentium, Pentium Pro/II/III, Pentium 4, generic x86,
Motorola MC68000, MC68020, MC88100, and MC88110,
Motorola/IBM PowerPC 32 and 64,
SPARCv7, SuperSPARC, generic SPARCv8, UltraSPARC,
Some optimizations also for
There is a mailing list for GMP users. To join it, send an email to
@email{gmp-request@@swox.com} with the word @samp{subscribe} in the message
@strong{body} (not in the subject line).
For up-to-date information on GMP, please see the GMP web pages at
@uref{http://swox.com/gmp/}
@cindex Latest version of GMP
@cindex Anonymous FTP of latest version
@cindex FTP of latest version
The latest version of the library is available at
@uref{ftp://ftp.gnu.org/gnu/gmp}
Many sites around the world mirror @samp{ftp.gnu.org}; please use a mirror
near you. See @uref{http://www.gnu.org/order/ftp.html} for a full list.
@section How to use this Manual
@cindex About this manual
Everyone should read @ref{GMP Basics}. If you need to install the library
yourself, then read @ref{Installing GMP}. If you have a system with multiple
ABIs, then read @ref{ABI and ISA}, for the compiler options that must be used
The rest of the manual can be used for later reference, although it is
probably a good idea to glance through it.
@node Installing GMP, GMP Basics, Introduction to GMP, Top
@comment node-name, next, previous, up
@chapter Installing GMP
@cindex Installing GMP
@cindex Configuring GMP
GMP has an autoconf/automake/libtool based configuration system. On a
Unix-like system a basic build can be done with
Some self-tests can be run with
And you can install (under @file{/usr/local} by default) with
If you experience problems, please report them to @email{bug-gmp@@gnu.org}.
See @ref{Reporting Bugs}, for information on what to include in useful bug
* Notes for Package Builds::
* Notes for Particular Systems::
* Known Build Problems::
@node Build Options, ABI and ISA, Installing GMP, Installing GMP
@section Build Options
@cindex Build options
All the usual autoconf configure options are available; run @samp{./configure
--help} for a summary. The file @file{INSTALL.autoconf} has some generic
installation information too.
@item Non-Unix Systems
@samp{configure} requires various Unix-like tools. On an MS-DOS system
Cygwin, DJGPP or MINGW can be used. See
@uref{http://www.cygnus.com/cygwin}
@uref{http://www.delorie.com/djgpp}
@uref{http://www.mingw.org}
The @file{macos} directory contains an unsupported port to MacOS 9 on Power
Macintosh. Note that MacOS X ``Darwin'' can use the normal
It might be possible to build without the help of @samp{configure}; certainly
all the code is there, but unfortunately you'll be on your own.
@item Build Directory
To compile in a separate build directory, @command{cd} to that directory, and
prefix the configure command with the path to the GMP source directory. For
/my/sources/gmp-@value{VERSION}/configure
Not all @samp{make} programs have the necessary features (@code{VPATH}) to
support this. In particular, SunOS and Solaris @command{make} have bugs that
make them unable to build in a separate directory. Use GNU @command{make}
@item @option{--disable-shared}, @option{--disable-static}
By default both shared and static libraries are built (where possible), but
one or other can be disabled. Shared libraries result in smaller executables
and permit code sharing between separate running processes, but on some CPUs
are slightly slower, having a small cost on each function call.
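
As a minimal sketch, a build producing only the static library might be
configured as follows. Which kind of library to disable is only an
illustration here, and depends on the application.

@example
# Build only the static library, skipping the shared one
./configure --disable-shared
make
@end example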
@item Native Compilation, @option{--build=CPU-VENDOR-OS}
For normal native compilation, the system can be specified with
@samp{--build}. By default @samp{./configure} uses the output from running
@samp{./config.guess}. On some systems @samp{./config.guess} can determine
the exact CPU type, on others it will be necessary to give it explicitly. For
./configure --build=ultrasparc-sun-solaris2.7
In all cases the @samp{OS} part is important, since it controls how libtool
generates shared libraries. Running @samp{./config.guess} is the simplest way
to see what it should be, if you don't know already.
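
A quick way to inspect the guess before configuring might look like the
following sketch. The second command just hands the guessed value back
explicitly, which is expected to be equivalent to the default behaviour; it
is shown only to make clear where the value comes from.

@example
# Print the CPU-VENDOR-OS triplet that configure would use by default
./config.guess

# Pass the same value explicitly
./configure --build="$(./config.guess)"
@end example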
@item Cross Compilation, @option{--host=CPU-VENDOR-OS}
When cross-compiling, the system used for compiling is given by @samp{--build}
and the system where the library will run is given by @samp{--host}. For
example when using a FreeBSD Athlon system to build GNU/Linux m68k binaries,
./configure --build=athlon-pc-freebsd3.5 --host=m68k-mac-linux-gnu
Compiler tools are sought first with the host system type as a prefix. For
example @command{m68k-mac-linux-gnu-ranlib} is checked for, then plain
@command{ranlib}. This makes it possible for a set of cross-compiling tools
to co-exist with native tools. The prefix is the argument to @samp{--host},
and this can be an alias, such as @samp{m68k-linux}. But note that tools
don't have to be set up this way; it's enough to just have a @env{PATH} with a
suitable cross-compiling @command{cc} etc.
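
For instance, assuming the cross tools happen to be installed under the
shorter alias, the same cross-compile might be requested as sketched here.
With this @samp{--host}, a tool such as @command{m68k-linux-ranlib} would be
looked for before plain @command{ranlib}.

@example
# The --host argument doubles as the prefix for cross tools
./configure --build=athlon-pc-freebsd3.5 --host=m68k-linux
@end example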
Compiling for a different CPU in the same family as the build system is a form
of cross-compilation, though very possibly this would merely be with special
options on a native compiler. In any case @samp{./configure} avoids depending
on being able to run code on the build system, which is important when
creating binaries for a newer CPU since they very possibly won't run on the
Currently a warning is given unless an explicit @samp{--build} is used when
cross-compiling, because it may not be possible to correctly guess the build
system type if the @env{PATH} has only a cross-compiling @command{cc}.
Note that the @samp{--target} option is not appropriate for GMP. It's for use
when building compiler tools, with @samp{--host} being where they will run,
and @samp{--target} what they'll produce code for. Ordinary programs or
libraries like GMP are only interested in the @samp{--host} part, being where
they'll run. (Some past versions of GMP used @samp{--target} incorrectly.)
In general, if you want a library that runs as fast as possible, you should
configure GMP for the exact CPU type your system uses. However, this may mean
the binaries won't run on older members of the family, and might run slower on
other members, older or newer. The best idea is always to build GMP for the
exact machine type you intend to run it on.
The following CPUs have specific support. See @file{configure.in} for details
of what code and compiler options they select.
@c Keep this formatting, it's easy to read and it can be grepped to
@c automatically test that CPUs listed get through ./config.sub
CPUs not listed will use generic C code.
@item Generic C Build
If some of the assembly code causes problems, or if otherwise desired, the
generic C code can be selected with CPU @samp{none}. For example,
./configure --build=none-unknown-freebsd3.5
Note that this will run quite slowly, but it should be portable and should at
least make it possible to get something running if all else fails.
On some systems GMP supports multiple ABIs (application binary interfaces),
meaning data type sizes and calling conventions. By default GMP chooses the
best ABI available, but a particular ABI can be selected. For example
./configure --build=mips64-sgi-irix6 ABI=n32
See @ref{ABI and ISA}, for the available choices on relevant CPUs, and what
applications need to do.
@item @option{CC}, @option{CFLAGS}
By default the C compiler used is chosen from among some likely candidates,
with @command{gcc} normally preferred if it's present. The usual
@samp{CC=whatever} can be passed to @samp{./configure} to choose something
For some systems, default compiler flags are set based on the CPU and
compiler. The usual @samp{CFLAGS="-whatever"} can be passed to
@samp{./configure} to use something different or to set good flags for systems
GMP doesn't otherwise know.
The @samp{CC} and @samp{CFLAGS} used are printed during @samp{./configure},
and can be found in each generated @file{Makefile}. This is the easiest way
to check the defaults when considering changing or adding something.
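
A sketch of overriding the compiler and flags, then checking what was
recorded, is given below. The flag value is only a placeholder, and the
@command{grep} pattern assumes the usual automake @samp{NAME = value} layout
in the generated @file{Makefile}.

@example
# Choose the compiler and flags explicitly (example values only)
./configure CC=gcc CFLAGS="-O2"

# Show the CC and CFLAGS lines recorded in the top-level Makefile
grep -E '^(CC|CFLAGS) *=' Makefile
@end example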
Note that when @samp{CC} and @samp{CFLAGS} are specified on a system
supporting multiple ABIs it's important to give an explicit
@samp{ABI=whatever}, since GMP can't determine the ABI just from the flags and
won't be able to select the correct assembler code.
If just @samp{CC} is selected then normal default @samp{CFLAGS} for that
compiler will be used (if GMP recognises it). For example @samp{CC=gcc} can
be used to force the use of GCC, with default flags (and default ABI).
@item @option{CPPFLAGS}
Any flags like @samp{-D} defines or @samp{-I} includes required by the
preprocessor should be set in @samp{CPPFLAGS} rather than @samp{CFLAGS}.
Compiling is done with both @samp{CPPFLAGS} and @samp{CFLAGS}, but
preprocessing uses just @samp{CPPFLAGS}. This distinction is because most
preprocessors won't accept all the flags the compiler does. Preprocessing is
done separately in some configure tests, and in the @samp{ansi2knr} support
@item C++ Support, @option{--enable-cxx}
C++ support in GMP can be enabled with @samp{--enable-cxx}, in which case a
C++ compiler will be required. As a convenience @samp{--enable-cxx=detect}
can be used to enable C++ support only if a compiler can be found. The C++
support consists of a library @file{libgmpxx.la} and header file
A separate @file{libgmpxx.la} has been adopted rather than having C++ objects
within @file{libgmp.la} in order to ensure dynamic linked C programs aren't
bloated by a dependency on the C++ standard library, and to avoid any chance
that the C++ compiler could be required when linking plain C programs.
@file{libgmpxx.la} will use certain internals from @file{libgmp.la} and can
only be expected to work with @file{libgmp.la} from the same GMP version.
Future changes to the relevant internals will be accompanied by renaming, so a
mismatch will cause unresolved symbols rather than perhaps mysterious
In general @file{libgmpxx.la} will be usable only with the C++ compiler that
built it, since name mangling and runtime support are usually incompatible
between different compilers.
@item @option{CXX}, @option{CXXFLAGS}
When C++ support is enabled, the C++ compiler and its flags can be set with
variables @samp{CXX} and @samp{CXXFLAGS} in the usual way. The default for
@samp{CXX} is the first compiler that works from a list of likely candidates,
with @command{g++} normally preferred when available. The default for
@samp{CXXFLAGS} is to try @samp{CFLAGS}, @samp{CFLAGS} without @samp{-g}, then
for @command{g++} either @samp{-g -O2} or @samp{-O2}, or for other compilers
@samp{-g} or nothing. Trying @samp{CFLAGS} this way is convenient when using
@samp{gcc} and @samp{g++} together, since the flags for @samp{gcc} will
usually suit @samp{g++}.
It's important that the C and C++ compilers match, meaning their startup and
runtime support routines are compatible and that they generate code in the
same ABI (if there's a choice of ABIs on the system). @samp{./configure}
isn't currently able to check these things very well itself, so for that
reason @samp{--disable-cxx} is the default, to avoid a build failure due to a
compiler mismatch. Perhaps this will change in the future.
Incidentally, it's normally not good enough to set @samp{CXX} to the same as
@samp{CC}. Although @command{gcc} for instance recognises @file{foo.cc} as
C++ code, only @command{g++} will invoke the linker the right way when
building an executable or shared library from object files.
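
Putting the C++ options together, a configuration with an explicitly matched
compiler pair might look like this sketch. The GCC pairing is just an example;
any C and C++ compilers that are compatible with each other should do.

@example
# Enable C++ support, naming a C and C++ compiler from the same toolchain
./configure --enable-cxx CC=gcc CXX=g++
@end example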
@item Temporary Memory, @option{--enable-alloca=<choice>}
@cindex Stack overflow segfaults
@cindex @code{alloca}
GMP allocates temporary workspace using one of the following three methods,
which can be selected with for instance
@samp{--enable-alloca=malloc-reentrant}.
@samp{alloca} - C library or compiler builtin.
@samp{malloc-reentrant} - the heap, in a re-entrant fashion.
@samp{malloc-notreentrant} - the heap, with global variables.
For convenience, the following choices are also available.
@samp{--disable-alloca} is the same as @samp{--enable-alloca=no}.
@samp{yes} - a synonym for @samp{alloca}.
@samp{no} - a synonym for @samp{malloc-reentrant}.
@samp{reentrant} - @code{alloca} if available, otherwise
@samp{malloc-reentrant}. This is the default.
@samp{notreentrant} - @code{alloca} if available, otherwise
@samp{malloc-notreentrant}.
@code{alloca} is reentrant and fast, and is recommended, but when working with
large numbers it can overflow the available stack space, in which case one of
the two malloc methods will need to be used. Alternatively it might be possible
to increase available stack with @command{limit}, @command{ulimit} or
@code{setrlimit}, or under DJGPP with @command{stubedit} or
@code{@w{_stklen}}. Note that depending on the system the only indication of
stack overflow might be a segmentation violation.
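
The two remedies could be sketched as follows. The program name
@file{myprog} is hypothetical, the stack size is an arbitrary example, and
@samp{ulimit -s} is assumed to be supported by the shell in use (it is in
most Bourne-style shells, where the value is in kilobytes).

@example
# Remedy 1: rebuild GMP so temporary workspace comes from the heap
./configure --enable-alloca=malloc-reentrant
make

# Remedy 2: keep alloca, but raise the stack limit before running
ulimit -s 65536
./myprog
@end example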
@samp{malloc-reentrant} is, as the name suggests, reentrant and thread safe,
but @samp{malloc-notreentrant} is faster and should be used if reentrancy is
The two malloc methods in fact use the memory allocation functions selected by
@code{mp_set_memory_functions}, these being @code{malloc} and friends by
default. @xref{Custom Allocation}.
An additional choice @samp{--enable-alloca=debug} is available, to help when
debugging memory related problems (@pxref{Debugging}).
@item FFT Multiplication, @option{--disable-fft}
By default multiplications are done using Karatsuba, 3-way Toom-Cook, and
Fermat FFT. The FFT is only used on large to very large operands and can be
disabled to save code size if desired.
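
Where code size matters more than speed on very large operands, the build
might be configured as in this short sketch.

@example
# Leave out the FFT multiplication code
./configure --disable-fft
@end example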
@item Berkeley MP, @option{--enable-mpbsd}
The Berkeley MP compatibility library (@file{libmp}) and header file
(@file{mp.h}) are built and installed only if @option{--enable-mpbsd} is used.
@xref{BSD Compatible Functions}.
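
For example, a build that also provides the compatibility library might be
sketched as follows.

@example
# Build and install libmp and mp.h alongside the main library
./configure --enable-mpbsd
make
make install
@end example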
@item MPFR, @option{--enable-mpfr}
The optional MPFR functions are built and installed only if
@option{--enable-mpfr} is used. These are in a separate library
@file{libmpfr.a} and are documented separately too (@pxref{Introduction to
MPFR,, Introduction to MPFR, mpfr, MPFR}).
@item Assertion Checking, @option{--enable-assert}
This option enables some consistency checking within the library. This can be
of use while debugging, @pxref{Debugging}.
@item Execution Profiling, @option{--enable-profiling=prof/gprof}
Profiling support can be enabled either for @command{prof} or @command{gprof}.
This adds @samp{-p} or @samp{-pg} respectively to @samp{CFLAGS}, and for some
systems adds corresponding @code{mcount} calls to the assembler code.
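
A profiling session with @command{gprof} might run along these lines. This is
only a sketch under several assumptions: @file{myprog.c} is a hypothetical
test program, GCC is the compiler, the static library is taken from libtool's
usual @file{.libs} directory in the build tree, and @file{gmon.out} is the
data file @command{gprof} conventionally reads.

@example
# Build GMP with gprof support
./configure --enable-profiling=gprof
make

# Compile and link the program with -pg against the static library
gcc -pg -I. myprog.c .libs/libgmp.a -o myprog

# Run it to produce gmon.out, then view the profile
./myprog
gprof ./myprog gmon.out
@end example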
@item @option{MPN_PATH}
Various assembler versions of mpn subroutines are provided, and, for a given
CPU, a search is made through a path to choose a version of each. For example
@samp{sparcv8} has path @samp{sparc32/v8 sparc32 generic}, which means it
looks first for v8 code, then plain sparc32, and finally falls back on generic
C. Knowledgeable users with special requirements can specify a path with
@samp{MPN_PATH="dir list"}. This will normally be unnecessary because all
sensible paths should be available under one or other CPU.
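
As an illustration only, the following sketch overrides the @samp{sparcv8}
path so that the v8 directory is skipped. The system type shown is an
assumed example, and the directory names are those from the default path
quoted above.

@example
# Use plain sparc32 assembler code, then generic C, ignoring sparc32/v8
./configure --build=sparcv8-sun-solaris2.7 MPN_PATH="sparc32 generic"
@end example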
@item Demonstration Programs
@cindex Demonstration programs
@cindex Example programs
The @file{demos} subdirectory has some sample programs using GMP. These
aren't built or installed, but there's a @file{Makefile} with rules for them.
The document you're now reading is @file{gmp.texi}. The usual automake
targets are available to make PostScript @file{gmp.ps} and/or DVI
HTML can be produced with @samp{makeinfo --html}, see @ref{makeinfo
html,Generating HTML,Generating HTML,texinfo,Texinfo}. Alternatively, use
@samp{texi2html}, see @ref{Top,Texinfo to HTML,About,texi2html,Texinfo To
PDF can be produced with @samp{texi2dvi --pdf} (@pxref{PDF
Output,PDF,,texinfo,Texinfo}) or with @samp{pdftex}.
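
Run directly, the HTML and PDF commands named above might be used as in this
sketch. It assumes the commands are run in the directory holding
@file{gmp.texi}, with @file{version.texi} available there too.

@example
# HTML output
makeinfo --html gmp.texi

# PDF output
texi2dvi --pdf gmp.texi
@end example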
Some supplementary notes can be found in the @file{doc} subdirectory.
@node ABI and ISA, Notes for Package Builds, Build Options, Installing GMP
@cindex Application Binary Interface
@cindex Instruction Set Architecture
ABI (Application Binary Interface) refers to the calling conventions between
functions, meaning what registers are used and what sizes the various C data
types are. ISA (Instruction Set Architecture) refers to the instructions and
registers a CPU has available.
Some 64-bit ISA CPUs have both a 64-bit ABI and a 32-bit ABI defined, the
latter for compatibility with older CPUs in the family. GMP supports some
CPUs like this in both ABIs. In fact within GMP @samp{ABI} means a
combination of chip ABI, plus how GMP chooses to use it. For example in some
32-bit ABIs, GMP may support a limb as either a 32-bit @code{long} or a 64-bit
By default GMP chooses the best ABI available for a given system, and this
generally gives significantly greater speed. But an ABI can be chosen
explicitly to make GMP compatible with other libraries, or particular
application requirements. For example,
In all cases it's vital that all object code used in a given program is
compiled for the same ABI.
Usually a limb is implemented as a @code{long}. When a @code{long long} limb
is used this is encoded in the generated @file{gmp.h}. This is convenient for
applications, but it does mean that @file{gmp.h} will vary, and can't be just
copied around. @file{gmp.h} remains compiler independent though, since all
compilers for a particular ABI will be expected to use the same limb type.
Currently no attempt is made to follow whatever conventions a system has for
installing library or header files built for a particular ABI. This will
probably only matter when installing multiple builds of GMP, and it might be
as simple as configuring with a special @samp{libdir}, or it might require
more than that. Note that builds for different ABIs need to be done separately,
with a fresh @command{./configure} and @command{make} each.
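
One possible layout for two such builds is sketched below, using separate
build directories (which needs a @command{make} with @code{VPATH} support).
The @samp{libdir} locations are hypothetical and would have to follow
whatever the system expects; since @file{gmp.h} differs between ABIs, the
header location may need similar treatment.

@example
# One build tree per ABI, each with its own configure and make
mkdir build-n32 build-64

cd build-n32
../configure --build=mips64-sgi-irix6 ABI=n32 --libdir=/usr/local/lib32
make
cd ..

cd build-64
../configure --build=mips64-sgi-irix6 ABI=64 --libdir=/usr/local/lib64
make
@end example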
@item HPPA 2.0 (@samp{hppa2.0*})
@item @samp{ABI=2.0w}
The 2.0w ABI uses 64-bit limbs and pointers and is available on HP-UX 11 or up
when using @command{cc}. @command{gcc} support for this is in progress.
Applications must be compiled with
@item @samp{ABI=2.0n}
The 2.0n ABI means the 32-bit HPPA 1.0 ABI but with a 64-bit limb using
@code{long long}. This is available on HP-UX 10 or up when using
@command{cc}. No @command{gcc} support is planned for this. Applications
must be compiled with
@item @samp{ABI=1.0}
HPPA 2.0 CPUs can run all HPPA 1.0 and 1.1 code in the 32-bit HPPA 1.0 ABI.
No special compiler options are needed for applications.
All three ABIs are available for CPUs @samp{hppa2.0w} and @samp{hppa2.0}, but
for CPU @samp{hppa2.0n} only 2.0n or 1.0 are allowed.
@item MIPS under IRIX 6 (@samp{mips*-*-irix[6789]})
IRIX 6 supports the n32 and 64 ABIs and always has a 64-bit MIPS 3 or better
CPU. In both these ABIs GMP uses a 64-bit limb. A new enough @command{gcc}
is required (2.95 for instance).
@item @samp{ABI=n32}
The n32 ABI is 32-bit pointers and integers, but with a 64-bit limb using a
@code{long long}. Applications must be compiled with
The 64-bit ABI is 64-bit pointers and integers. Applications must be compiled
Note that MIPS GNU/Linux, as of kernel version 2.2, doesn't have the necessary
support for n32 or 64 and so only gets a 32-bit limb and the MIPS 2 code.
@item PowerPC 64 (@samp{powerpc64*})
@item @samp{ABI=aix64}
The AIX 64 ABI uses 64-bit limbs and pointers and is available on systems
@samp{powerpc64*-*-aix*}. Applications must be compiled (and linked) with
@item @samp{ABI=32L}
This uses the 32-bit ABI but a 64-bit limb using GCC @code{long long} in
64-bit registers. Applications must be compiled with
This is the basic 32-bit PowerPC ABI. No special compiler options are needed
@item Sparc V9 (@samp{sparcv9} and @samp{ultrasparc*})
The 64-bit V9 ABI is available on Solaris 2.7 and up and GNU/Linux. GCC 2.95
or up, or Sun @command{cc} is required. Applications must be compiled with
gcc -m64 -mptr64 -Wa,-xarch=v9 -mcpu=v9
On Solaris 2.6 and earlier, and on Solaris 2.7 with the kernel in 32-bit mode,
only the plain V8 32-bit ABI can be used, since the kernel doesn't save all
registers. GMP still uses as much of the V9 ISA as it can in these
circumstances. No special compiler options are required for applications,
though using something like the following requesting V9 code within the V8 ABI
@command{gcc} 2.8 and earlier only supports @samp{-mv8} though.
Don't be confused by the names of these SPARC @samp{-m} and @samp{-x} options;
they're called @samp{arch} but they effectively control the ABI.
On Solaris 2.7 with the kernel in 32-bit mode, a normal native build will
reject @samp{ABI=64} because the resulting executables won't run.
@samp{ABI=64} can still be built if desired by making it look like a
cross-compile, for example
./configure --build=none --host=sparcv9-sun-solaris2.7 ABI=64
@node Notes for Package Builds, Notes for Particular Systems, ABI and ISA, Installing GMP
@section Notes for Package Builds
@cindex Build notes for binary packaging
@cindex Packaged builds
GMP should present no great difficulties for packaging in a binary
@cindex Libtool versioning
@cindex Shared library versioning
Libtool is used to build the library and @samp{-version-info} is set
appropriately, having started from @samp{3:0:0} in GMP 3.0. The GMP 4 series
will be upwardly binary compatible in each release and will be upwardly binary
compatible with all of the GMP 3 series. Additional function interfaces may
be added in each release, so on systems where libtool versioning is not fully
checked by the loader an auxiliary mechanism may be needed to express that a
dynamically linked application depends on a new enough GMP.
An auxiliary mechanism may also be needed to express that @file{libgmpxx.la}
(from @option{--enable-cxx}, @pxref{Build Options}) requires @file{libgmp.la}
from the same GMP version, since this is not done by the libtool versioning,
nor otherwise. A mismatch will result in unresolved symbols from the linker,
or perhaps the loader.
When building a package for a CPU family, care should be taken to use
@samp{--host} (or @samp{--build}) to choose the least common denominator among
the CPUs which might use the package. For example this might necessitate
@samp{i386} for x86s, or plain @samp{sparc} (meaning V7) for SPARCs.
Users who care about speed will want GMP built for their exact CPU type, to
make use of the available optimizations. Providing a way to suitably rebuild
a package may be useful. This could be as simple as making it possible for a
user to omit @samp{--build} (and @samp{--host}) so @samp{./config.guess} will
detect the CPU. But a way to manually specify a @samp{--build} will be wanted
for systems where @samp{./config.guess} is inexact.
Note that @file{gmp.h} is a generated file, and will be architecture and ABI
@node Notes for Particular Systems, Known Build Problems, Notes for Package Builds, Installing GMP
@section Notes for Particular Systems
@cindex Build notes for particular systems
@c This section is more or less meant for notes about performance or about
@c build problems that have been worked around but might leave a user
@c scratching their head. Fun with different ABIs on a system belongs in the
On systems @samp{*-*-aix[34]*} shared libraries are disabled by default, since
some versions of the native @command{ar} fail on the convenience libraries
used. A shared build can be attempted with
./configure --enable-shared --disable-static
Note that the @samp{--disable-static} is necessary because in a shared build
libtool makes @file{libgmp.a} a symlink to @file{libgmp.so}, apparently for
the benefit of old versions of @command{ld} which only recognise @file{.a},
but unfortunately this is done even if a fully functional @command{ld} is
On systems @samp{arm*-*-*}, versions of GCC up to and including 2.95.3 have a
bug in unsigned division, giving wrong results for some operands. GMP
@samp{./configure} will demand GCC 2.95.4 or later.
@item Microsoft Windows
On systems @samp{*-*-cygwin*}, @samp{*-*-mingw*} and @samp{*-*-pw32*} by
default GMP builds only a static library, but a DLL can be built instead using
./configure --disable-static --enable-shared
Static and DLL libraries can't both be built, since certain export directives
in @file{gmp.h} must be different. @samp{--enable-cxx} cannot be used when
building a DLL, since libtool doesn't currently support C++ DLLs. This might
change in the future.
GCC is recommended for compiling GMP, but the resulting DLL can be used with
any compiler. On mingw only the standard Windows libraries will be needed; on
Cygwin the usual Cygwin runtime will be required.
@item Motorola 68k CPU Types
@samp{m68k} is taken to mean 68000. @samp{m68020} or higher will give a
performance boost on applicable CPUs. @samp{m68360} can be used for CPU32
series chips. @samp{m68302} can be used for ``Dragonball'' series chips,
though this is merely a synonym for @samp{m68000}.
@command{m4} in this release of OpenBSD has a bug in @code{eval} that makes it
unsuitable for @file{.asm} file processing. @samp{./configure} will detect
the problem and either abort or choose another m4 in the @env{PATH}. The bug
is fixed in OpenBSD 2.7, so either upgrade or use GNU m4.
@item Power CPU Types
In GMP, CPU types @samp{power} and @samp{powerpc} will each use instructions
not available on the other, so it's important to choose the right one for the
CPU that will be used. Currently GMP has no assembler code support for using
just the common instruction subset. To get executables that run on both, the
current suggestion is to use the generic C code (CPU @samp{none}), possibly
with appropriate compiler options (like @samp{-mcpu=common} for
@command{gcc}). CPU @samp{rs6000} (which is not a CPU but a family of
workstations) is accepted by @file{config.sub}, but is currently equivalent to
@item Sparc CPU Types
@samp{sparcv8} or @samp{supersparc} on relevant systems will give a
significant performance increase over the V7 code.
@command{/usr/bin/m4} lacks various features needed to process @file{.asm}
files, and instead @samp{./configure} will automatically use
@command{/usr/5bin/m4}, which we believe is always available (if not then use
@samp{i386} selects generic code which will run reasonably well on all x86
@samp{i586}, @samp{pentium} or @samp{pentiummmx} code is good for the intended
P5 Pentium chips, but quite slow when run on Intel P6 class chips (PPro, P-II,
P-III)@. @samp{i386} is a better choice when making binaries that must run on
@samp{pentium4} and an SSE2 capable assembler are important for best results
on Pentium 4. The specific code is for instance roughly a 2@cross{} to
3@cross{} speedup over the generic @samp{i386} code.
@item x86 MMX and SSE2 Code
If the CPU selected has MMX code but the assembler doesn't support it, a
warning is given and non-MMX code is used instead. This will be an inferior
build, since the MMX code that's present is there because it's faster than the
corresponding plain integer code. The same applies to SSE2.
Old versions of @samp{gas} don't support MMX instructions, in particular
version 1.92.3 that comes with FreeBSD 2.2.8 doesn't (and unfortunately
there's no newer assembler for that system).
Solaris 2.6 and 2.7 @command{as} generate incorrect object code for register
to register @code{movq} instructions, and so can't be used for MMX code.
Install a recent @command{gas} if MMX code is wanted on these systems.
@item x86 GCC @samp{-march=pentiumpro}
GCC 2.95.2 and 2.95.3 miscompiled some versions of @file{mpz/powm.c} when
@samp{-march=pentiumpro} was used, so for relevant CPUs that option is only in
the default @env{CFLAGS} for GCC 2.95.4 and up.
@node Known Build Problems, , Notes for Particular Systems, Installing GMP
@section Known Build Problems
@cindex Build problems known
@c This section is more or less meant for known build problems that are not
@c otherwise worked around and require some sort of manual intervention.
You might find more up-to-date information at @uref{http://swox.com/gmp/}.
The DJGPP port of @command{bash} 2.03 is unable to run the @samp{configure}
script; it exits silently, having died writing a preamble to
@file{config.log}. Use @command{bash} 2.04 or higher.
@samp{make all} was found to run out of memory during the final
@file{libgmp.la} link on one system tested, despite having 64MB available. A
separate @samp{make libgmp.la} helped; perhaps recursing into the various
subdirectories uses up memory.
@item GNU binutils @command{strip}
@cindex Stripped libraries
GNU binutils @command{strip} should not be used on the static libraries
@file{libgmp.a} and @file{libmp.a}, neither directly nor via @samp{make
install-strip}. It can be used on the shared libraries @file{libgmp.so} and
@file{libmp.so} though.
Currently (binutils 2.10.0), @command{strip} unpacks an archive then operates
on the files, but GMP contains multiple object files of the same name
(eg. three versions of @file{init.o}), and they overwrite each other, leaving
only the one that happens to be last.
If stripped static libraries are wanted, the suggested workaround is to build
normally, strip the separate object files, and do another @samp{make all} to
rebuild. Alternatively @samp{CFLAGS} with @samp{-g} omitted can always be used
if it's just debugging information which is unwanted.
@item NeXT prior to 3.3
The system compiler on old versions of NeXT was a massacred and old GCC, even
if it called itself @file{cc}. This compiler cannot be used to build GMP; you
need to get a real GCC and install that. (NeXT may have fixed this in
release 3.3 of their system.)
@item POWER and PowerPC
Bugs in GCC 2.7.2 (and 2.6.3) mean it can't be used to compile GMP on POWER or
PowerPC. If you want to use GCC for these machines, get GCC 2.7.2.1 (or
@item Sequent Symmetry
Use the GNU assembler instead of the system assembler, since the latter has
The system @command{sed} prints an error ``Output line too long'' when libtool
builds @file{libgmp.la}. This doesn't seem to cause any obvious ill effects, but
GNU @command{sed} is recommended, to avoid any doubt.
@item Sparc Solaris 2.7 with gcc 2.95.2 in ABI=32
A shared library build of GMP seems to fail in this combination: it builds but
then fails the tests, apparently due to some incorrect data relocations within
@code{gmp_randinit_lc_2exp_size}. The exact cause is unknown;
@samp{--disable-shared} is recommended.
@item Windows DLL test programs
When creating a DLL version of @file{libgmp}, libtool creates wrapper scripts
like @file{t-mul} for programs that would normally be @file{t-mul.exe}, in
order to set up the right library paths etc. This works fine, but the absence
of @file{t-mul.exe} etc causes @command{make} to think they need recompiling
every time, which is an annoyance when re-running a @samp{make check}.
@node GMP Basics, Reporting Bugs, Installing GMP, Top
@comment node-name, next, previous, up
@cindex @file{gmp.h}
All declarations needed to use GMP are collected in the include file
@file{gmp.h}. It is designed to work with both C and C++ compilers.
Note however that prototypes for GMP functions with @code{FILE *} parameters
are only provided if @code{<stdio.h>} is included too.
@strong{Using functions, macros, data types, etc.@: not documented in this
manual is strongly discouraged. If you do so your application is guaranteed
to be incompatible with future versions of GMP.}
* Nomenclature and Types::
* Function Classes::
* Variable Conventions::
* Parameter Conventions::
* Memory Management::
* Useful Macros and Constants::
* Compatibility with older versions::
@node Nomenclature and Types, Function Classes, GMP Basics, GMP Basics
@section Nomenclature and Types
@cindex Nomenclature
@tindex @code{mpz_t}
In this manual, @dfn{integer} usually means a multiple precision integer, as
defined by the GMP library. The C data type for such integers is @code{mpz_t}.
Here are some examples of how to declare such integers:
struct foo @{ mpz_t x, y; @};
@cindex Rational number
@tindex @code{mpq_t}
@dfn{Rational number} means a multiple precision fraction. The C data type
for these fractions is @code{mpq_t}. For example:
@cindex Floating-point number
@tindex @code{mpf_t}
@dfn{Floating-point number}, or @dfn{Float} for short, is an arbitrary precision
mantissa with a limited precision exponent. The C data type for such objects
@tindex @code{mp_limb_t}
A @dfn{limb} means the part of a multi-precision number that fits in a single
machine word. (We chose this word because a limb of the human body is
analogous to a digit, only larger, and containing several digits.) Normally a
limb is 32 or 64 bits. The C data type for a limb is @code{mp_limb_t}.
@node Function Classes, Variable Conventions, Nomenclature and Types, GMP Basics
@section Function Classes
@cindex Function classes
There are six classes of functions in the GMP library:
Functions for signed integer arithmetic, with names beginning with
@code{mpz_}. The associated type is @code{mpz_t}. There are about 150
functions in this class.
Functions for rational number arithmetic, with names beginning with
@code{mpq_}. The associated type is @code{mpq_t}. There are about 40
functions in this class, but the integer functions can be used for arithmetic
on the numerator and denominator separately.
Functions for floating-point arithmetic, with names beginning with
@code{mpf_}. The associated type is @code{mpf_t}. There are about 60
functions in this class.
Functions compatible with Berkeley MP, such as @code{itom}, @code{madd}, and
@code{mult}. The associated type is @code{MINT}.
Fast low-level functions that operate on natural numbers. These are used by
the functions in the preceding groups, and you can also call them directly
from very time-critical user programs. These functions' names begin with
@code{mpn_}. The associated type is an array of @code{mp_limb_t}. There are
about 30 (hard-to-use) functions in this class.
Miscellaneous functions. Functions for setting up custom allocation and
functions for generating random numbers.
@node Variable Conventions, Parameter Conventions, Function Classes, GMP Basics
@section Variable Conventions
@cindex Variable conventions
@cindex Conventions for variables
GMP functions generally have output arguments before input arguments. This
notation is by analogy with the assignment operator. The BSD MP compatibility
functions are exceptions, having the output arguments last.
GMP lets you use the same variable for both input and output in one call. For
example, the main function for integer multiplication, @code{mpz_mul}, can be
used to square @code{x} and put the result back in @code{x} with
Before you can assign to a GMP variable, you need to initialize it by calling
one of the special initialization functions. When you're done with a
variable, you need to clear it out, using one of the functions for that
purpose. Which function to use depends on the type of variable. See the
chapters on integer functions, rational number functions, and floating-point
functions for details.
A variable should only be initialized once, or at least cleared between each
initialization. After a variable has been initialized, it may be assigned to
any number of times.
For efficiency reasons, avoid excessive initializing and clearing. In
general, initialize near the start of a function and clear near the end. For
for (i = 1; i < 100; i++)
mpz_mul (n, @dots{});
mpz_fdiv_q (n, @dots{});
@node Parameter Conventions, Memory Management, Variable Conventions, GMP Basics
@section Parameter Conventions
@cindex Parameter conventions
@cindex Conventions for parameters
When a GMP variable is used as a function parameter, it's effectively a
call-by-reference, meaning if the function stores a value there it will change
the original in the caller.
When a function is going to return a GMP result, it should designate a
parameter that it sets, like the library functions do. More than one value
can be returned by having more than one output parameter, again like the
library functions. A @code{return} of an @code{mpz_t} etc doesn't return the
object, only a pointer, and this is almost certainly not what's wanted.
Here's an example accepting an @code{mpz_t} parameter, doing a calculation,
and storing the result to the indicated parameter.
foo (mpz_t result, mpz_t param, unsigned long n)
mpz_mul_ui (result, param, n);
for (i = 1; i < n; i++)
mpz_add_ui (result, result, i*7);
mpz_init_set_str (n, "123456", 0);
gmp_printf ("%Zd\n", r);
@code{foo} works even if the mainline passes the same variable as both
@code{param} and @code{result}, just like the library functions. But
sometimes this is tricky to arrange, and an application might not want to
bother supporting that sort of thing.
For interest, the GMP types @code{mpz_t} etc are implemented as one-element
arrays of certain structures. This is why declaring a variable creates an
object with the fields GMP needs, but then using it as a parameter passes a
pointer to the object. Note that the actual fields in each @code{mpz_t} etc
are for internal use only and should not be accessed directly by code that
expects to be compatible with future GMP releases.
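
The array mechanism can be illustrated without GMP at all. The sketch below
uses a made-up type @code{num_t}, a stand-in with a single @code{long} field;
the real GMP structures are internal and look different. It only shows why no
@code{&} is needed at the call site and why a store through the parameter is
visible to the caller.

```c
/* Hypothetical stand-in for a GMP-style type: a one-element array of a
   structure.  This is NOT the layout GMP uses. */
struct num_struct { long value; };
typedef struct num_struct num_t[1];

/* As a parameter, num_t is adjusted to "struct num_struct *", so a store
   here changes the caller's object. */
void
num_set (num_t result, long v)
{
  result->value = v;
}

/* Declaring a num_t reserves a whole structure; passing it passes a
   pointer to that structure. */
long
num_demo (void)
{
  num_t n;             /* storage for one struct num_struct */
  num_set (n, 42L);    /* no & needed, the array decays to a pointer */
  return n->value;
}
```

This is also why a function cannot usefully @code{return} such a type: arrays
are not returned by value in C, so results are passed back through output
parameters instead.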
@node Memory Management, Reentrancy, Parameter Conventions, GMP Basics
@section Memory Management
@cindex Memory Management
The GMP types like @code{mpz_t} are small, containing only a couple of sizes,
and pointers to allocated data. Once a variable is initialized, GMP takes
care of all space allocation. Additional space is allocated whenever a
variable doesn't have enough.
@code{mpz_t} and @code{mpq_t} variables never reduce their allocated space.
Normally this is the best policy, since it avoids frequent reallocation.
Applications that need to return memory to the heap at some particular point
can use @code{mpz_realloc2}, or clear variables no longer needed.
@code{mpf_t} variables, in the current implementation, use a fixed amount of
space, determined by the chosen precision and allocated at initialization, so
their size doesn't change.
All memory is allocated using @code{malloc} and friends by default, but this
can be changed, see @ref{Custom Allocation}. Temporary memory on the stack is
also used (via @code{alloca}), but this can be changed at build-time if
desired, see @ref{Build Options}.
@node Reentrancy, Useful Macros and Constants, Memory Management, GMP Basics
@cindex Thread safety
@cindex Multi-threading
GMP is reentrant and thread-safe, with some exceptions:
If configured with @option{--enable-alloca=malloc-notreentrant} (or with
@option{--enable-alloca=notreentrant} when @code{alloca} is not available),
then naturally GMP is not reentrant.
@code{mpf_set_default_prec} and @code{mpf_init} use a global variable for the
selected precision. @code{mpf_init2} can be used instead.
@code{mp_set_memory_functions} uses global variables to store the selected
memory allocation functions.
@code{mpz_random} and the other old random number functions use a global
random state and are hence not reentrant. The newer random number functions
that accept a @code{gmp_randstate_t} parameter can be used instead.
If the memory allocation functions set by a call to
@code{mp_set_memory_functions} (or @code{malloc} and friends by default) are
not reentrant, then GMP will not be reentrant either.
If the standard I/O functions such as @code{fwrite} are not reentrant then the
GMP I/O functions using them will not be reentrant either.
It's safe for two threads to read from the same GMP variable simultaneously,
but it's not safe for one to read while another might be writing, nor for
two threads to write simultaneously. It's not safe for two threads to
generate a random number from the same @code{gmp_randstate_t} simultaneously,
since this involves an update of that variable.
On SCO systems the default @code{<ctype.h>} macros use per-file static
variables and may not be reentrant, depending whether the compiler optimizes
away fetches from them. The GMP text-based input functions are affected.
@node Useful Macros and Constants, Compatibility with older versions, Reentrancy, GMP Basics
@section Useful Macros and Constants
@cindex Useful macros and constants
@deftypevr {Global Constant} {const int} mp_bits_per_limb
@findex mp_bits_per_limb
@cindex Bits per limb
The number of bits per limb.
@defmac __GNU_MP_VERSION
@defmacx __GNU_MP_VERSION_MINOR
@defmacx __GNU_MP_VERSION_PATCHLEVEL
@cindex Version number
@cindex GMP version number
The major and minor GMP version, and patch level, respectively, as integers.
For GMP i.j, these numbers will be i, j, and 0, respectively.
For GMP i.j.k, these numbers will be i, j, and k, respectively.
@deftypevr {Global Constant} {const char * const} gmp_version
The GMP version number, as a null-terminated string, in the form ``i.j'' or
``i.j.k''. This release is @nicode{"@value{VERSION}"}.
@node Compatibility with older versions, Efficiency, Useful Macros and Constants, GMP Basics
@section Compatibility with older versions
@cindex Compatibility with older versions
@cindex Upward compatibility
This version of GMP is upwardly binary compatible with all 3.x versions, and
upwardly compatible at the source level with all 2.x versions, with the
following exceptions.
@code{mpn_gcd} had its source arguments swapped as of GMP 3.0, for consistency
with other @code{mpn} functions.
@code{mpf_get_prec} counted precision slightly differently in GMP 3.0 and
3.0.1, but in 3.1 reverted to the 2.x style.
There are a number of compatibility issues between GMP 1 and GMP 2 that of
course also apply when porting applications from GMP 1 to GMP 4. Please
see the GMP 2 manual for details.
The Berkeley MP compatibility library (@pxref{BSD Compatible Functions}) is
source and binary compatible with the standard @file{libmp}.
@c @item Integer division functions round the result differently. The obsolete
@c functions (@code{mpz_div}, @code{mpz_divmod}, @code{mpz_mdiv},
@c @code{mpz_mdivmod}, etc) now all use floor rounding (i.e., they round the
@c @minus{}infinity).
@c There are a lot of functions for integer division, giving the user better
@c control over the rounding.
@c @item The function @code{mpz_mod} now compute the true @strong{mod} function.
@c @item The functions @code{mpz_powm} and @code{mpz_powm_ui} now use
@c @strong{mod} for reduction.
@c @item The assignment functions for rational numbers do no longer canonicalize
@c their results. In the case a non-canonical result could arise from an
@c assignment, the user need to insert an explicit call to
@c @code{mpq_canonicalize}. This change was made for efficiency.
@c @item Output generated by @code{mpz_out_raw} in this release cannot be read
@c by @code{mpz_inp_raw} in previous releases. This change was made for making
@c the file format truly portable between machines with different word sizes.
@c @item Several @code{mpn} functions have changed. But they were intentionally
@c undocumented in previous releases.
@c @item The functions @code{mpz_cmp_ui}, @code{mpz_cmp_si}, and @code{mpq_cmp_ui}
@c are now implemented as macros, and thereby sometimes evaluate their
@c arguments multiple times.
@c @item The functions @code{mpz_pow_ui} and @code{mpz_ui_pow_ui} now yield 1
@c for 0^0. (In version 1, they yielded 0.)
@c In version 1 of the library, @code{mpq_set_den} handled negative
@c denominators by copying the sign to the numerator. That is no longer done.
@c Pure assignment functions do not canonicalize the assigned variable. It is
@c the responsibility of the user to canonicalize the assigned variable before
@c any arithmetic operations are performed on that variable.
@c Note that this is an incompatible change from version 1 of the library.
@node Efficiency, Debugging, Compatibility with older versions, GMP Basics
@item Small operands
On small operands, the time for function call overheads and memory allocation
can be significant in comparison to actual calculation. This is unavoidable
in a general purpose variable precision library, although GMP attempts to be
as efficient as it can on both large and small operands.
@item Static Linking
On some CPUs, in particular the x86s, the static @file{libgmp.a} should be
used for maximum speed, since the PIC code in the shared @file{libgmp.so} will
have a small overhead on each function call and global data address. For many
programs this will be insignificant, but for long calculations there's a gain
@item Initializing and clearing
Avoid excessive initializing and clearing of variables, since this can be
quite time consuming, especially in comparison to otherwise fast operations
A language interpreter might want to keep a free list or stack of
initialized variables ready for use. It should be possible to integrate
something like that with a garbage collector too.
An @code{mpz_t} or @code{mpq_t} variable used to hold successively increasing
values will have its memory repeatedly @code{realloc}ed, which could be quite
slow or could fragment memory, depending on the C library. If an application
can estimate the final size then @code{mpz_init2} or @code{mpz_realloc2} can
be called to allocate the necessary space from the beginning
(@pxref{Initializing Integers}).
It doesn't matter if a size set with @code{mpz_init2} or @code{mpz_realloc2}
is too small, since all functions will do a further reallocation if necessary.
Badly overestimating memory required will waste space though.
@item @code{2exp} functions
It's up to an application to call functions like @code{mpz_mul_2exp} when
appropriate. General purpose functions like @code{mpz_mul} make no attempt to
identify powers of two or other special forms, because such inputs will
usually be very rare and testing every time would be wasteful.
@item @code{ui} and @code{si} functions
The @code{ui} functions and the small number of @code{si} functions exist for
convenience and should be used where applicable. But if for example an
@code{mpz_t} contains a value that fits in an @code{unsigned long} there's no
need to extract it and call a @code{ui} function, just use the regular @code{mpz}
@item In-Place Operations
@code{mpz_abs}, @code{mpq_abs}, @code{mpf_abs}, @code{mpz_neg}, @code{mpq_neg}
and @code{mpf_neg} are fast when used for in-place operations like
@code{mpz_abs(x,x)}, since in the current implementation only a single field
of @code{x} needs changing. On suitable compilers (GCC for instance) this is
@code{mpz_add_ui}, @code{mpz_sub_ui}, @code{mpf_add_ui} and @code{mpf_sub_ui}
benefit from an in-place operation like @code{mpz_add_ui(x,x,y)}, since
usually only one or two limbs of @code{x} will need to be changed. The same
applies to the full precision @code{mpz_add} etc if @code{y} is small. If
@code{y} is big then cache locality may be helped, but that's all.
@code{mpz_mul} is currently the opposite, a separate destination is slightly
better. A call like @code{mpz_mul(x,x,y)} will, unless @code{y} is only one
limb, make a temporary copy of @code{x} before forming the result. Normally
that copying will only be a tiny fraction of the time for the multiply, so
this is not a particularly important consideration.
@code{mpz_set}, @code{mpq_set}, @code{mpq_set_num}, @code{mpf_set}, etc, make
no attempt to recognise a copy of something to itself, so a call like
@code{mpz_set(x,x)} will be wasteful. Naturally that would never be written
deliberately, but if it might arise from two pointers to the same object then
a test to avoid it might be desirable.
Note that it's never worth introducing extra @code{mpz_set} calls just to get
in-place operations. If a result should go to a particular variable then just
direct it there and let GMP take care of data movement.
@item Divisibility Testing (Small Integers)
@code{mpz_divisible_ui_p} and @code{mpz_congruent_ui_p} are the best functions
for testing whether an @code{mpz_t} is divisible by an individual small
integer. They use an algorithm which is faster than @code{mpz_tdiv_ui}, but
which gives no useful information about the actual remainder, only whether
it's zero (or a particular value).
However when testing divisibility by several small integers, it's best to take
a remainder modulo their product, to save multi-precision operations. For
instance to test whether a number is divisible by any of 23, 29 or 31 take a
remainder modulo @ma{23@times{}29@times{}31 = 20677} and then test that.
The division functions like @code{mpz_tdiv_q_ui} which give a quotient as well
as a remainder are generally a little slower than the remainder-only functions
like @code{mpz_tdiv_ui}. If the quotient is only rarely wanted then it's
probably best to just take a remainder and then go back and calculate the
quotient if and when it's wanted (@code{mpz_divexact_ui} can be used if the
@item Rational Arithmetic
The @code{mpq} functions operate on @code{mpq_t} values with no common factors
in the numerator and denominator. Common factors are checked-for and cast out
as necessary. In general, cancelling factors every time is the best approach
since it minimizes the sizes for subsequent operations.
However, applications that know something about the factorization of the
values they're working with might be able to avoid some of the GCDs used for
canonicalization, or swap them for divisions. For example when multiplying by
a prime it's enough to check for factors of it in the denominator instead of
doing a full GCD. Or when forming a big product it might be known that very
little cancellation will be possible, and so canonicalization can be left to
The @code{mpq_numref} and @code{mpq_denref} macros give access to the
numerator and denominator to do things outside the scope of the supplied
@code{mpq} functions. @xref{Applying Integer Functions}.
The canonical form for rationals allows mixed-type @code{mpq_t} and integer
additions or subtractions to be done directly with multiples of the
denominator. This will be somewhat faster than @code{mpq_add}. For example,
mpz_add (mpq_numref(q), mpq_numref(q), mpq_denref(q));
/* mpq += unsigned long */
mpz_addmul_ui (mpq_numref(q), mpq_denref(q), 123UL);
mpz_submul (mpq_numref(q), mpq_denref(q), z);
@item Number Sequences
Functions like @code{mpz_fac_ui}, @code{mpz_fib_ui} and @code{mpz_bin_uiui}
are designed for calculating isolated values. If a range of values is wanted
it's probably best to call one of them once to get a starting point and iterate from there.
@node Debugging, Profiling, Efficiency, GMP Basics
@item Stack Overflow
Depending on the system, a segmentation violation or bus error might be the
only indication of stack overflow. See @samp{--enable-alloca} choices in
@ref{Build Options}, for how to address this.
The most likely cause of application problems with GMP is heap corruption.
1975
Failing to @code{init} GMP variables will have unpredictable effects, and
1976
corruption arising elsewhere in a program may well affect GMP. Initializing
1977
GMP variables more than once or failing to clear them will cause memory leaks.
1979
In all such cases a malloc debugger is recommended. On a GNU or BSD system
1980
the standard C library @code{malloc} has some diagnostic facilities, see
1981
@ref{Allocation Debugging,,,libc,The GNU C Library Reference Manual}, or
1982
@samp{man 3 malloc}. Other possibilities, in no particular order, include
1985
@uref{http://www.inf.ethz.ch/personal/biere/projects/ccmalloc}
1986
@uref{http://quorum.tamu.edu/jon/gnu} @ (debauch)
1987
@uref{http://dmalloc.com}
1988
@uref{http://www.perens.com/FreeSoftware} @ (electric fence)
1989
@uref{http://packages.debian.org/fda}
1990
@uref{http://www.gnupdate.org/components/leakbug}
1991
@uref{http://people.redhat.com/~otaylor/memprof}
1992
@uref{http://www.cbmamiga.demon.co.uk/mpatrol}
1995
@item Stack Backtraces
1996
On some systems the compiler options GMP uses by default can interfere with
1997
debugging. In particular on x86 and 68k systems @samp{-fomit-frame-pointer}
1998
is used and this generally inhibits stack backtracing. Recompiling without
1999
such options may help while debugging, though the usual caveats about it
2000
potentially moving a memory problem or hiding a compiler bug will apply.
2003
A sample @file{.gdbinit} is included in the distribution, showing how to call
2004
some undocumented dump functions to print GMP variables from within GDB. Note
2005
that these functions shouldn't be used in final application code since they're
2006
undocumented and may be subject to incompatible changes in future versions of
2009
@item Source File Paths
2010
GMP has multiple source files with the same name, in different directories.
2011
For example @file{mpz}, @file{mpq}, @file{mpf} and @file{mpfr} each have an
2012
@file{init.c}. If the debugger can't already determine the right one it may
2013
help to build with absolute paths on each C file. One way to do that is to
2014
use a separate object directory with an absolute path to the source directory.
2018
/my/source/dir/gmp-@value{VERSION}/configure
2021
This works via @code{VPATH}, and might require GNU @command{make}.
2022
Alternatively it might be possible to change the @code{.c.lo} rules
2025
@item Assertion Checking
2026
The build option @option{--enable-assert} is available to add some consistency
2027
checks to the library (see @ref{Build Options}). These are likely to be of
2028
limited value to most applications. Assertion failures are just as likely to
2029
indicate memory corruption as a library or compiler bug.
2031
Applications using the low-level @code{mpn} functions, however, will benefit
2032
from @option{--enable-assert} since it adds checks on the parameters of most
2033
such functions, many of which have subtle restrictions on their usage. Note
2034
however that only the generic C code has checks, not the assembler code, so
2035
CPU @samp{none} should be used for maximum checking.
2037
@item Temporary Memory Checking
2038
The build option @option{--enable-alloca=debug} arranges that each block of
2039
temporary memory in GMP is allocated with a separate call to @code{malloc} (or
2040
the allocation function set with @code{mp_set_memory_functions}).
2042
This can help a malloc debugger detect accesses outside the intended bounds,
2043
or detect memory not released. In a normal build, on the other hand,
2044
temporary memory is allocated in blocks which GMP divides up for its own use,
2045
or may be allocated with a compiler builtin @code{alloca} which will go
2046
nowhere near any malloc debugger hooks.
2048
@item Other Problems
2049
Any suspected bug in GMP itself should be isolated to make sure it's not an
2050
application problem, see @ref{Reporting Bugs}.
2054
@node Profiling, Autoconf, Debugging, GMP Basics
2058
Running a program under a profiler is a good way to find where it's spending
2059
most time and where improvements can be best sought.
2061
Depending on the system, it may be possible to get a flat profile, meaning
2062
simple timer sampling of the program counter, with no special GMP build
2063
options, just a @samp{-p} when compiling the mainline. This is a good way to
2064
ensure minimum interference with normal operation. The necessary symbol type
2065
and size information exists in most of the GMP assembler code.
2067
The @samp{--enable-profiling} build option can be used to add suitable
2068
compiler flags, either for @command{prof} (@samp{-p}) or @command{gprof}
2069
(@samp{-pg}), see @ref{Build Options}. Which of the two is available and what
2070
they do will depend on the system, and possibly on support available in
2071
@file{libc}. For some systems appropriate corresponding @code{mcount} calls
2072
are added to the assembler code too.
2074
On x86 systems @command{prof} gives call counting, so that average time spent
2075
in a function can be determined. @command{gprof}, where supported, adds call
2076
graph construction, so for instance calls to @code{mpn_add_n} from
2077
@code{mpz_add} and from @code{mpz_mul} can be differentiated.
2079
On x86 and 68k systems @samp{-pg} and @samp{-fomit-frame-pointer} are
2080
incompatible, so the latter is not used when @command{gprof} profiling is
2081
selected, which may result in poorer code generation. If @command{prof}
2082
profiling is selected instead it should still be possible to use
2083
@command{gprof}, but only the @samp{gprof -p} flat profile and call counts can
2084
be expected to be valid, not the @samp{gprof -q} call graph.
2087
@node Autoconf, , Profiling, GMP Basics
2089
@cindex Autoconf detections
2091
Autoconf based applications can easily check whether GMP is installed. The
2092
only thing to be noted is that GMP library symbols from version 3 onwards have
2093
prefixes like @code{__gmpz}. The following therefore would be a simple test,
2096
AC_CHECK_LIB(gmp, __gmpz_init)
2099
This just uses the default @code{AC_CHECK_LIB} actions for found or not found,
2100
but an application that must have GMP would want to generate an error if not
2104
AC_CHECK_LIB(gmp, __gmpz_init, , [AC_MSG_ERROR(
2105
[GNU MP not found, see http://swox.com/gmp])])
2108
If functions added in some particular version of GMP are required, then one of
2109
those can be used when checking. For example @code{mpz_mul_si} was added in
2113
AC_CHECK_LIB(gmp, __gmpz_mul_si, , [AC_MSG_ERROR(
2114
[GNU MP not found, or not 3.1 or up, see http://swox.com/gmp])])
2117
An alternative would be to test the version number in @file{gmp.h} using say
2118
@code{AC_EGREP_CPP}. That would make it possible to test the exact version,
2119
if some particular sub-minor release is known to be necessary.
2121
An application that can use either GMP 2 or 3 will need to test for
2122
@code{__gmpz_init} (GMP 3 and up) or @code{mpz_init} (GMP 2), and it's also
2123
worth checking for @file{libgmp2} since Debian GNU/Linux systems used that
2124
name in the past. For example,
2127
AC_CHECK_LIB(gmp, __gmpz_init, ,
2128
[AC_CHECK_LIB(gmp, mpz_init, ,
2129
[AC_CHECK_LIB(gmp2, mpz_init)])])
2132
In general it's suggested that applications should simply demand a new enough
2133
GMP rather than trying to provide supplements for features not available in
2136
Occasionally an application will need or want to know the size of a type at
2137
configuration or preprocessing time, not just with @code{sizeof} in the code.
2138
This can be done in the normal way with @code{mp_limb_t} etc, but GMP 4.0 or
2139
up is best for this, since prior versions needed certain @samp{-D} defines on
2140
systems using a @code{long long} limb. The following would suit Autoconf 2.50
2144
AC_CHECK_SIZEOF(mp_limb_t, , [#include <gmp.h>])
2147
The optional @code{mpfr} functions are provided in a separate
2148
@file{libmpfr.a}, and this might be from GMP with @option{--enable-mpfr} or
2149
from MPFR installed separately. Either way @file{libmpfr} depends on
2150
@file{libgmp}, it doesn't stand alone. Currently only a static
2151
@file{libmpfr.a} will be available, not a shared library, since upward binary
2152
compatibility is not guaranteed.
2155
AC_CHECK_LIB(mpfr, mpfr_add, , [AC_MSG_ERROR(
2156
[Need MPFR either from GNU MP 4 or separate MPFR package.
2157
See http://www.mpfr.org or http://swox.com/gmp])])
2161
@node Reporting Bugs, Integer Functions, GMP Basics, Top
2162
@comment node-name, next, previous, up
2163
@chapter Reporting Bugs
2164
@cindex Reporting bugs
2165
@cindex Bug reporting
2167
If you think you have found a bug in the GMP library, please investigate it
2168
and report it. We have made this library available to you, and it is not too
2169
much to ask you to report the bugs you find.
2171
Before you report a bug, check it's not already addressed in @ref{Known Build
2172
Problems}, or perhaps @ref{Notes for Particular Systems}. You may also want
2173
to check @uref{http://swox.com/gmp/} for patches for this release.
2175
Please include the following in any report,
2179
The GMP version number, and if pre-packaged or patched then say so.
2182
A test program that makes it possible for us to reproduce the bug. Include
2183
instructions on how to run the program.
2186
A description of what is wrong. If the results are incorrect, in what way.
2187
If you get a crash, say so.
2190
If you get a crash, include a stack backtrace from the debugger if it's
2191
informative (@samp{where} in @command{gdb}, or @samp{$C} in @command{adb}).
2194
Please do not send core dumps, executables or @command{strace}s.
2197
The configuration options you used when building GMP, if any.
2200
The name of the compiler and its version. For @command{gcc}, get the version
2201
with @samp{gcc -v}, otherwise perhaps @samp{what `which cc`}, or similar.
2204
The output from running @samp{uname -a}.
2207
The output from running @samp{./config.guess}, and from running
2208
@samp{./configfsf.guess} (might be the same).
2211
If the bug is related to @samp{configure}, then the contents of
2215
If the bug is related to an @file{asm} file not assembling, then the contents
2216
of @file{config.m4} and the offending line or lines from the temporary
2217
@file{mpn/tmp-<file>.s}.
2220
Please make an effort to produce a self-contained report, with something
2221
definite that can be tested or debugged. Vague queries or piecemeal messages
2222
are difficult to act on and don't help the development effort.
2224
It is not uncommon that an observed problem is actually due to a bug in the
2225
compiler; the GMP code tends to explore interesting corners in compilers.
2227
If your bug report is good, we will do our best to help you get a corrected
2228
version of the library; if the bug report is poor, we won't do anything about
2229
it (except maybe ask you to send a better report).
2231
Send your report to: @email{bug-gmp@@gnu.org}.
2233
If you think something in this manual is unclear, or downright incorrect, or if
2234
the language needs to be improved, please send a note to the same address.
2237
@node Integer Functions, Rational Number Functions, Reporting Bugs, Top
2238
@comment node-name, next, previous, up
2239
@chapter Integer Functions
2240
@cindex Integer functions
2242
This chapter describes the GMP functions for performing integer arithmetic.
2243
These functions start with the prefix @code{mpz_}.
2245
GMP integers are stored in objects of type @code{mpz_t}.
2248
* Initializing Integers::
2249
* Assigning Integers::
2250
* Simultaneous Integer Init & Assign::
2251
* Converting Integers::
2252
* Integer Arithmetic::
2253
* Integer Division::
2254
* Integer Exponentiation::
2256
* Number Theoretic Functions::
2257
* Integer Comparisons::
2258
* Integer Logic and Bit Fiddling::
2260
* Integer Random Numbers::
2261
* Miscellaneous Integer Functions::
2264
@node Initializing Integers, Assigning Integers, Integer Functions, Integer Functions
2265
@comment node-name, next, previous, up
2266
@section Initialization Functions
2267
@cindex Integer initialization functions
2268
@cindex Initialization functions
2270
The functions for integer arithmetic assume that all integer objects are
2271
initialized. You do that by calling the function @code{mpz_init}. For
2279
mpz_add (integ, @dots{});
2281
mpz_sub (integ, @dots{});
2283
/* Unless the program is about to exit, do ... */
2288
As you can see, you can store new values any number of times, once an
2289
object is initialized.
2291
@deftypefun void mpz_init (mpz_t @var{integer})
2292
Initialize @var{integer}, and set its value to 0.
2295
@deftypefun void mpz_init2 (mpz_t @var{integer}, unsigned long @var{n})
2296
Initialize @var{integer}, with space for @var{n} bits, and set its value to 0.
2298
@var{n} is only the initial space, @var{integer} will grow automatically in
2299
the normal way, if necessary, for subsequent values stored. @code{mpz_init2}
2300
makes it possible to avoid such reallocations if a maximum size is known in
2304
@deftypefun void mpz_clear (mpz_t @var{integer})
2305
Free the space occupied by @var{integer}. Call this function for all
2306
@code{mpz_t} variables when you are done with them.
2309
@deftypefun void mpz_realloc2 (mpz_t @var{integer}, unsigned long @var{n})
2310
Change the space allocated for @var{integer} to @var{n} bits. The value in
2311
@var{integer} is preserved if it fits, or is set to 0 if not.
2313
This function can be used to increase the space for a variable in order to
2314
avoid repeated automatic reallocations, or to decrease it to give memory back
2318
@deftypefun void mpz_array_init (mpz_t @var{integer_array}[], size_t @var{array_size}, @w{mp_size_t @var{fixed_num_bits}})
2319
This is a special type of initialization. @strong{Fixed} space of
2320
@var{fixed_num_bits} bits is allocated to each of the @var{array_size}
2321
integers in @var{integer_array}.
2323
The space will not be automatically increased, unlike the normal
2324
@code{mpz_init}, but instead an application must ensure it's sufficient for
2325
any value stored. The following space requirements apply to various
2330
@code{mpz_abs}, @code{mpz_neg}, @code{mpz_set}, @code{mpz_set_si} and
2331
@code{mpz_set_ui} need room for the value they store.
2334
@code{mpz_add}, @code{mpz_add_ui}, @code{mpz_sub} and @code{mpz_sub_ui} need
2335
room for the larger of the two operands, plus an extra
2336
@code{mp_bits_per_limb}.
2339
@code{mpz_mul}, @code{mpz_mul_ui} and @code{mpz_mul_si} need room for the sum
2340
of the number of bits in their operands, but each rounded up to a multiple of
2341
@code{mp_bits_per_limb}.
2344
@code{mpz_swap} can be used between two array variables, but not between an
2345
array and a normal variable.
2348
For other functions, or if in doubt, the suggestion is to calculate in a
2349
regular @code{mpz_init} variable and copy the result to an array variable with
2352
@code{mpz_array_init} can reduce memory usage in algorithms that need large
2353
arrays of integers, since it avoids allocating and reallocating lots of small
2354
memory blocks. There is no way to free the storage allocated by this
2355
function. Don't call @code{mpz_clear}!
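The space requirements given above for additions and multiplications are
simple arithmetic on bit counts, and can be sketched as follows. This is an
illustration only: the helper names are hypothetical, and the limb size is
passed in as @code{limb_bits} where a real program would use
@code{mp_bits_per_limb}.

```c
#include <assert.h>

/* Round bits up to a multiple of limb_bits. */
unsigned long round_up_to_limb(unsigned long bits, unsigned long limb_bits)
{
    assert(limb_bits > 0);
    return (bits + limb_bits - 1) / limb_bits * limb_bits;
}

/* Room for an addition or subtraction: the larger of the two operands,
   plus one extra limb. */
unsigned long bits_for_add(unsigned long a_bits, unsigned long b_bits,
                           unsigned long limb_bits)
{
    unsigned long larger = a_bits > b_bits ? a_bits : b_bits;
    return larger + limb_bits;
}

/* Room for a multiplication: the sum of the operand sizes, each first
   rounded up to a whole number of limbs. */
unsigned long bits_for_mul(unsigned long a_bits, unsigned long b_bits,
                           unsigned long limb_bits)
{
    return round_up_to_limb(a_bits, limb_bits)
         + round_up_to_limb(b_bits, limb_bits);
}
```

With 32-bit limbs, for instance, operands of 100 and 50 bits would call for
132 bits for a sum and 192 bits for a product.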
2358
@deftypefun {void *} _mpz_realloc (mpz_t @var{integer}, mp_size_t @var{new_alloc})
2359
Change the space for @var{integer} to @var{new_alloc} limbs. The value in
2360
@var{integer} is preserved if it fits, or is set to 0 if not. The return
2361
value is not useful to applications and should be ignored.
2363
@code{mpz_realloc2} is the preferred way to accomplish allocation changes like
2364
this. @code{mpz_realloc2} and @code{_mpz_realloc} are the same except that
2365
@code{_mpz_realloc} takes the new size in limbs.
2369
@node Assigning Integers, Simultaneous Integer Init & Assign, Initializing Integers, Integer Functions
2370
@comment node-name, next, previous, up
2371
@section Assignment Functions
2372
@cindex Integer assignment functions
2373
@cindex Assignment functions
2375
These functions assign new values to already initialized integers
2376
(@pxref{Initializing Integers}).
2378
@deftypefun void mpz_set (mpz_t @var{rop}, mpz_t @var{op})
2379
@deftypefunx void mpz_set_ui (mpz_t @var{rop}, unsigned long int @var{op})
2380
@deftypefunx void mpz_set_si (mpz_t @var{rop}, signed long int @var{op})
2381
@deftypefunx void mpz_set_d (mpz_t @var{rop}, double @var{op})
2382
@deftypefunx void mpz_set_q (mpz_t @var{rop}, mpq_t @var{op})
2383
@deftypefunx void mpz_set_f (mpz_t @var{rop}, mpf_t @var{op})
2384
Set the value of @var{rop} from @var{op}.
2386
@code{mpz_set_d}, @code{mpz_set_q} and @code{mpz_set_f} truncate @var{op} to
2390
@deftypefun int mpz_set_str (mpz_t @var{rop}, char *@var{str}, int @var{base})
2391
Set the value of @var{rop} from @var{str}, a null-terminated C string in base
2392
@var{base}. White space is allowed in the string, and is simply ignored. The
2393
base may vary from 2 to 36. If @var{base} is 0, the actual base is determined
2394
from the leading characters: if the first two characters are ``0x'' or ``0X'',
2395
hexadecimal is assumed, otherwise if the first character is ``0'', octal is
2396
assumed, otherwise decimal is assumed.
2398
This function returns 0 if the entire string is a valid number in base
2399
@var{base}. Otherwise it returns @minus{}1.
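The base selection described above for a @var{base} of 0 can be sketched as
follows. This is a hypothetical helper written for illustration, not GMP's
own code; it looks only at the leading characters and makes no attempt to
model the handling of a sign or of white space.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: choose the base from the leading characters of str,
   following the rule stated for base == 0. */
int detect_base(const char *str)
{
    assert(str != NULL);
    if (str[0] == '0' && (str[1] == 'x' || str[1] == 'X'))
        return 16;      /* "0x" or "0X" prefix: hexadecimal */
    if (str[0] == '0')
        return 8;       /* leading "0": octal */
    return 10;          /* anything else: decimal */
}
```

Note that under this rule a string such as @nicode{"0755"} is read as octal,
which can be a surprise when decimal input happens to carry leading zeros.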
2401
[It turns out that it is not entirely true that this function ignores
2402
white-space. It does ignore it between digits, but not after a minus sign or
2403
within or after ``0x''. We are considering changing the definition of this
2404
function, making it fail when there is any white-space in the input, since
2405
that makes a lot of sense. Send your opinion of this change to
2406
@email{bug-gmp@@gnu.org}. Do you really want it to accept @nicode{"3 14"} as
2407
meaning 314 as it does now?]
2410
@deftypefun void mpz_swap (mpz_t @var{rop1}, mpz_t @var{rop2})
2411
Swap the values @var{rop1} and @var{rop2} efficiently.
2415
@node Simultaneous Integer Init & Assign, Converting Integers, Assigning Integers, Integer Functions
2416
@comment node-name, next, previous, up
2417
@section Combined Initialization and Assignment Functions
2418
@cindex Initialization and assignment functions
2419
@cindex Integer init and assign
2421
For convenience, GMP provides a parallel series of initialize-and-set functions
2422
which initialize the output and then store the value there. These functions'
2423
names have the form @code{mpz_init_set@dots{}}.
2425
Here is an example of using one:
2430
mpz_init_set_str (pie, "3141592653589793238462643383279502884", 10);
2432
mpz_sub (pie, @dots{});
2439
Once the integer has been initialized by any of the @code{mpz_init_set@dots{}}
2440
functions, it can be used as the source or destination operand for the ordinary
2441
integer functions. Don't use an initialize-and-set function on a variable
2442
already initialized!
2444
@deftypefun void mpz_init_set (mpz_t @var{rop}, mpz_t @var{op})
2445
@deftypefunx void mpz_init_set_ui (mpz_t @var{rop}, unsigned long int @var{op})
2446
@deftypefunx void mpz_init_set_si (mpz_t @var{rop}, signed long int @var{op})
2447
@deftypefunx void mpz_init_set_d (mpz_t @var{rop}, double @var{op})
2448
Initialize @var{rop} with limb space and set the initial numeric value from
2452
@deftypefun int mpz_init_set_str (mpz_t @var{rop}, char *@var{str}, int @var{base})
2453
Initialize @var{rop} and set its value like @code{mpz_set_str} (see its
2454
documentation above for details).
2456
If the string is a correct base @var{base} number, the function returns 0;
2457
if an error occurs it returns @minus{}1. @var{rop} is initialized even if
2458
an error occurs. (I.e., you have to call @code{mpz_clear} for it.)
2462
@node Converting Integers, Integer Arithmetic, Simultaneous Integer Init & Assign, Integer Functions
2463
@comment node-name, next, previous, up
2464
@section Conversion Functions
2465
@cindex Integer conversion functions
2466
@cindex Conversion functions
2468
This section describes functions for converting GMP integers to standard C
2469
types. Functions for converting @emph{to} GMP integers are described in
2470
@ref{Assigning Integers} and @ref{I/O of Integers}.
2472
@deftypefun {unsigned long int} mpz_get_ui (mpz_t @var{op})
2473
Return the least significant part from @var{op}. This function combined with
2474
@* @code{mpz_tdiv_q_2exp(@dots{}, @var{op}, CHAR_BIT*sizeof(unsigned long
2475
int))} can be used to decompose an integer into unsigned longs.
2478
@deftypefun {signed long int} mpz_get_si (mpz_t @var{op})
2479
If @var{op} fits into a @code{signed long int} return the value of @var{op}.
2480
Otherwise return the least significant part of @var{op}, with the same sign
2483
If @var{op} is too large to fit in a @code{signed long int}, the returned
2484
result is probably not very useful. To find out if the value will fit, use
2485
the function @code{mpz_fits_slong_p}.
2488
@deftypefun double mpz_get_d (mpz_t @var{op})
2489
Convert @var{op} to a @code{double}.
2492
@deftypefun double mpz_get_d_2exp (signed long int *@var{exp}, mpz_t @var{op})
2493
Find @var{d} and @var{exp} such that @m{@var{d}\times 2^{exp}, @var{d} times 2
2494
raised to @var{exp}}, with @ma{0.5@le{}@GMPabs{@var{d}}<1}, is a good
2495
approximation to @var{op}.
2498
@deftypefun {char *} mpz_get_str (char *@var{str}, int @var{base}, mpz_t @var{op})
2499
Convert @var{op} to a string of digits in base @var{base}. The base may vary
2502
If @var{str} is @code{NULL}, the result string is allocated using the current
2503
allocation function (@pxref{Custom Allocation}). The block will be
2504
@code{strlen(str)+1} bytes, that being exactly enough for the string and
2507
If @var{str} is not @code{NULL}, it should point to a block of storage large
2508
enough for the result, that being @code{mpz_sizeinbase (@var{op}, @var{base})
2509
+ 2}. The two extra bytes are for a possible minus sign, and the
2512
A pointer to the result string is returned, being either the allocated block,
2513
or the given @var{str}.
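The buffer sizing rule for a caller-supplied @var{str} can be illustrated
with a small analogue working on a C @code{long} instead of an @code{mpz_t}.
This is a sketch under stated assumptions: @code{digits_in_base} plays the
role of @code{mpz_sizeinbase} and @code{long_to_str} the role of
@code{mpz_get_str}; both names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical analogue of mpz_sizeinbase for an unsigned long: the number
   of digits of v in the given base, counting zero as one digit. */
size_t digits_in_base(unsigned long v, int base)
{
    size_t n = 1;
    assert(base >= 2 && base <= 36);
    while (v >= (unsigned long) base) {
        v /= (unsigned long) base;
        n++;
    }
    return n;
}

/* Convert v to a freshly malloc'ed string in the given base.  The buffer is
   digits + 2 bytes: one for a possible minus sign and one for the
   terminating null.  Returns NULL if malloc fails. */
char *long_to_str(long v, int base)
{
    static const char digit[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    /* Negate in unsigned arithmetic so that LONG_MIN is handled too. */
    unsigned long mag = v < 0 ? 0UL - (unsigned long) v : (unsigned long) v;
    size_t len = digits_in_base(mag, base);
    char *buf = malloc(len + 2);
    char *p;

    if (buf == NULL)
        return NULL;
    p = buf;
    if (v < 0)
        *p++ = '-';
    p += len;               /* one past the last digit */
    *p = '\0';
    do {                    /* fill in digits from least significant */
        *--p = digit[mag % (unsigned long) base];
        mag /= (unsigned long) base;
    } while (mag != 0);
    return buf;
}
```

For @minus{}255 in base 16 the digit count is 2, so 4 bytes are allocated and
the result is the string @nicode{"-ff"}.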
2516
@deftypefun mp_limb_t mpz_getlimbn (mpz_t @var{op}, mp_size_t @var{n})
2517
Return limb number @var{n} from @var{op}. The sign of @var{op} is ignored,
2518
just the absolute value is used. The least significant limb is number 0.
2520
@code{mpz_size} can be used to find how many limbs make up @var{op}.
2521
@code{mpz_getlimbn} returns zero if @var{n} is outside the range 0 to
2522
@code{mpz_size(@var{op})-1}.
2527
@node Integer Arithmetic, Integer Division, Converting Integers, Integer Functions
2528
@comment node-name, next, previous, up
2529
@section Arithmetic Functions
2530
@cindex Integer arithmetic functions
2531
@cindex Arithmetic functions
2533
@deftypefun void mpz_add (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
2534
@deftypefunx void mpz_add_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
2535
Set @var{rop} to @ma{@var{op1} + @var{op2}}.
2538
@deftypefun void mpz_sub (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
2539
@deftypefunx void mpz_sub_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
2540
Set @var{rop} to @var{op1} @minus{} @var{op2}.
2543
@deftypefun void mpz_mul (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
2544
@deftypefunx void mpz_mul_si (mpz_t @var{rop}, mpz_t @var{op1}, long int @var{op2})
2545
@deftypefunx void mpz_mul_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
2546
Set @var{rop} to @ma{@var{op1} @GMPtimes{} @var{op2}}.
2549
@deftypefun void mpz_addmul (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
2550
@deftypefunx void mpz_addmul_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
2551
Set @var{rop} to @ma{@var{rop} + @var{op1} @GMPtimes{} @var{op2}}.
2554
@deftypefun void mpz_submul (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
2555
@deftypefunx void mpz_submul_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
2556
Set @var{rop} to @ma{@var{rop} - @var{op1} @GMPtimes{} @var{op2}}.
2559
@deftypefun void mpz_mul_2exp (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
2560
@cindex Bit shift left
2561
Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
2562
@var{op2}}. This operation can also be defined as a left shift by @var{op2}
2566
@deftypefun void mpz_neg (mpz_t @var{rop}, mpz_t @var{op})
2567
Set @var{rop} to @minus{}@var{op}.
2570
@deftypefun void mpz_abs (mpz_t @var{rop}, mpz_t @var{op})
2571
Set @var{rop} to the absolute value of @var{op}.
2576
@node Integer Division, Integer Exponentiation, Integer Arithmetic, Integer Functions
2577
@section Division Functions
2578
@cindex Integer division functions
2579
@cindex Division functions
2581
Division is undefined if the divisor is zero. Passing a zero divisor to the
2582
division or modulo functions (including the modular powering functions
2583
@code{mpz_powm} and @code{mpz_powm_ui}), will cause an intentional division by
2584
zero. This lets a program handle arithmetic exceptions in these functions the
2585
same way as for normal C @code{int} arithmetic.
2587
@c Separate deftypefun groups for cdiv, fdiv and tdiv produce a blank line
2588
@c between each, and seem to let tex do a better job of page breaks than an
2589
@c @sp 1 in the middle of one big set.
2591
@deftypefun void mpz_cdiv_q (mpz_t @var{q}, mpz_t @var{n}, mpz_t @var{d})
2592
@deftypefunx void mpz_cdiv_r (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
2593
@deftypefunx void mpz_cdiv_qr (mpz_t @var{q}, mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
2595
@deftypefunx {unsigned long int} mpz_cdiv_q_ui (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{d}})
2596
@deftypefunx {unsigned long int} mpz_cdiv_r_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}})
2597
@deftypefunx {unsigned long int} mpz_cdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{mpz_t @var{n}}, @w{unsigned long int @var{d}})
2598
@deftypefunx {unsigned long int} mpz_cdiv_ui (mpz_t @var{n}, @w{unsigned long int @var{d}})
2600
@deftypefunx void mpz_cdiv_q_2exp (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{b}})
2601
@deftypefunx void mpz_cdiv_r_2exp (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{b}})
2604
@deftypefun void mpz_fdiv_q (mpz_t @var{q}, mpz_t @var{n}, mpz_t @var{d})
2605
@deftypefunx void mpz_fdiv_r (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
2606
@deftypefunx void mpz_fdiv_qr (mpz_t @var{q}, mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
2608
@deftypefunx {unsigned long int} mpz_fdiv_q_ui (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{d}})
2609
@deftypefunx {unsigned long int} mpz_fdiv_r_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}})
2610
@deftypefunx {unsigned long int} mpz_fdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{mpz_t @var{n}}, @w{unsigned long int @var{d}})
2611
@deftypefunx {unsigned long int} mpz_fdiv_ui (mpz_t @var{n}, @w{unsigned long int @var{d}})
2613
@deftypefunx void mpz_fdiv_q_2exp (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{b}})
2614
@deftypefunx void mpz_fdiv_r_2exp (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{b}})
2617
@deftypefun void mpz_tdiv_q (mpz_t @var{q}, mpz_t @var{n}, mpz_t @var{d})
2618
@deftypefunx void mpz_tdiv_r (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
2619
@deftypefunx void mpz_tdiv_qr (mpz_t @var{q}, mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
2621
@deftypefunx {unsigned long int} mpz_tdiv_q_ui (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{d}})
2622
@deftypefunx {unsigned long int} mpz_tdiv_r_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}})
2623
@deftypefunx {unsigned long int} mpz_tdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{mpz_t @var{n}}, @w{unsigned long int @var{d}})
2624
@deftypefunx {unsigned long int} mpz_tdiv_ui (mpz_t @var{n}, @w{unsigned long int @var{d}})
2626
@deftypefunx void mpz_tdiv_q_2exp (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{b}})
2627
@deftypefunx void mpz_tdiv_r_2exp (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{b}})
2628
@cindex Bit shift right
2631
Divide @var{n} by @var{d}, forming a quotient @var{q} and/or remainder
2632
@var{r}. For the @code{2exp} functions, @m{@var{d}=2^b, @var{d}=2^@var{b}}.
2633
The rounding is in three styles, each suiting different applications.
2637
@code{cdiv} rounds @var{q} up towards @m{+\infty, +infinity}, and @var{r} will
2638
have the opposite sign to @var{d}. The @code{c} stands for ``ceil''.
2641
@code{fdiv} rounds @var{q} down towards @m{-\infty, @minus{}infinity}, and
2642
@var{r} will have the same sign as @var{d}. The @code{f} stands for
2646
@code{tdiv} rounds @var{q} towards zero, and @var{r} will have the same sign
2647
as @var{n}. The @code{t} stands for ``truncate''.
2650
In all cases @var{q} and @var{r} will satisfy
2651
@m{@var{n}=@var{q}@var{d}+@var{r}, @var{n}=@var{q}*@var{d}+@var{r}}, and
2652
@var{r} will satisfy @ma{0@le{}@GMPabs{@var{r}}<@GMPabs{@var{d}}}.
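The three rounding styles can be modelled on plain C @code{long} values as
follows. This is a sketch for illustration rather than GMP code, the function
names are hypothetical, and it relies on C99's guarantee that @code{/} and
@code{%} truncate towards zero.

```c
#include <assert.h>

/* tdiv: C's own / and % round towards zero, so r takes the sign of n. */
void tdiv_qr(long *q, long *r, long n, long d)
{
    assert(d != 0);
    *q = n / d;
    *r = n % d;
}

/* fdiv: step q down when truncation rounded it up, which is when r is
   non-zero and its sign differs from that of d. */
void fdiv_qr(long *q, long *r, long n, long d)
{
    tdiv_qr(q, r, n, d);
    if (*r != 0 && ((*r < 0) != (d < 0))) {
        *q -= 1;
        *r += d;
    }
}

/* cdiv: step q up when truncation rounded it down, which is when r is
   non-zero and has the same sign as d. */
void cdiv_qr(long *q, long *r, long n, long d)
{
    tdiv_qr(q, r, n, d);
    if (*r != 0 && ((*r < 0) == (d < 0))) {
        *q += 1;
        *r -= d;
    }
}
```

Dividing @minus{}7 by 2, for example, gives @var{q}=@minus{}3,
@var{r}=@minus{}1 for both the truncating and ceiling styles, but
@var{q}=@minus{}4, @var{r}=1 for the floor style. In every case
@var{n}=@var{q}*@var{d}+@var{r} holds.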
2654
The @code{q} functions calculate only the quotient, the @code{r} functions
2655
only the remainder, and the @code{qr} functions calculate both. Note that for
2656
@code{qr} the same variable cannot be passed for both @var{q} and @var{r}, or
2657
results will be unpredictable.
2659
For the @code{ui} variants the return value is the remainder, and in fact
2660
returning the remainder is all the @code{div_ui} functions do. For
2661
@code{tdiv} and @code{cdiv} the remainder can be negative, so for those the
2662
return value is the absolute value of the remainder.
2664
The @code{2exp} functions are right shifts and bit masks, but of course
2665
rounding the same as the other functions. For positive @var{n} both
2666
@code{mpz_fdiv_q_2exp} and @code{mpz_tdiv_q_2exp} are simple bitwise right
2667
shifts. For negative @var{n}, @code{mpz_fdiv_q_2exp} is effectively an
2668
arithmetic right shift treating @var{n} as two's complement, the same as the
2669
bitwise logical functions do, whereas @code{mpz_tdiv_q_2exp} effectively
2670
treats @var{n} as sign and magnitude.
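The difference between the two quotient shifts for negative @var{n} can be
seen in a small model on C @code{long} values. This is an illustrative sketch
only, with hypothetical function names. It uses division rather than
@code{>>} for the floor case because right shifting a negative value is
implementation-defined in C.

```c
#include <assert.h>
#include <limits.h>

/* Model of the floor style: floor(n / 2^b), which matches an arithmetic
   right shift of a two's complement value. */
long model_fdiv_q_2exp(long n, unsigned b)
{
    long d, q;
    assert(b < sizeof(long) * CHAR_BIT - 1);
    d = 1L << b;
    q = n / d;              /* truncates towards zero */
    if (n % d < 0)
        q -= 1;             /* negative remainder: round down instead */
    return q;
}

/* Model of the truncating style: shift the magnitude, then restore the
   sign, i.e. treat n as sign and magnitude. */
long model_tdiv_q_2exp(long n, unsigned b)
{
    unsigned long mag;
    assert(b < sizeof(long) * CHAR_BIT);
    assert(n != LONG_MIN);  /* keeps the sketch free of overflow */
    mag = n < 0 ? 0UL - (unsigned long) n : (unsigned long) n;
    mag >>= b;
    return n < 0 ? -(long) mag : (long) mag;
}
```

Shifting @minus{}5 right by one bit gives @minus{}3 in the floor style and
@minus{}2 in the truncating style, whereas for 5 both give 2.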
2673
@deftypefun void mpz_mod (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
2674
@deftypefunx {unsigned long int} mpz_mod_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}})
2675
Set @var{r} to @var{n} @code{mod} @var{d}. The sign of the divisor is
2676
ignored; the result is always non-negative.
2678
@code{mpz_mod_ui} is identical to @code{mpz_fdiv_r_ui} above, returning the
2679
remainder as well as setting @var{r}. See @code{mpz_fdiv_ui} above if only
2680
the return value is wanted.
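The sign convention of @code{mpz_mod} differs from that of the C @code{%}
operator, whose result follows the sign of the dividend. A minimal model on
@code{long} values, using the hypothetical name @code{mod_nonneg}, might look
like this.

```c
#include <assert.h>

/* Model of mpz_mod on long values: n mod |d|, always in 0 .. |d|-1. */
long mod_nonneg(long n, long d)
{
    long r;
    assert(d != 0);
    r = n % d;                  /* sign of r follows n (C99 semantics) */
    if (r < 0)
        r += d < 0 ? -d : d;    /* move into the non-negative range */
    return r;
}
```

So @minus{}7 mod 3 and @minus{}7 mod @minus{}3 are both 2 under this
convention, while C's @code{-7 % 3} is @minus{}1.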
2683
@deftypefun void mpz_divexact (mpz_t @var{q}, mpz_t @var{n}, mpz_t @var{d})
2684
@deftypefunx void mpz_divexact_ui (mpz_t @var{q}, mpz_t @var{n}, unsigned long @var{d})
2685
@cindex Exact division functions
2686
Set @var{q} to @var{n}/@var{d}. These functions produce correct results only
2687
when it is known in advance that @var{d} divides @var{n}.
2689
These routines are much faster than the other division functions, and are the
2690
best choice when exact division is known to occur, for example reducing a
2691
rational to lowest terms.
2694
@deftypefun int mpz_divisible_p (mpz_t @var{n}, mpz_t @var{d})
2695
@deftypefunx int mpz_divisible_ui_p (mpz_t @var{n}, unsigned long int @var{d})
2696
@deftypefunx int mpz_divisible_2exp_p (mpz_t @var{n}, unsigned long int @var{b})
2697
Return non-zero if @var{n} is exactly divisible by @var{d}, or in the case of
2698
@code{mpz_divisible_2exp_p} by @m{2^b,2^@var{b}}.
2701
@deftypefun int mpz_congruent_p (mpz_t @var{n}, mpz_t @var{c}, mpz_t @var{d})
2702
@deftypefunx int mpz_congruent_ui_p (mpz_t @var{n}, unsigned long int @var{c}, unsigned long int @var{d})
2703
@deftypefunx int mpz_congruent_2exp_p (mpz_t @var{n}, mpz_t @var{c}, unsigned long int @var{b})
2704
Return non-zero if @var{n} is congruent to @var{c} modulo @var{d}, or in the
2705
case of @code{mpz_congruent_2exp_p} modulo @m{2^b,2^@var{b}}.
2710
@node Integer Exponentiation, Integer Roots, Integer Division, Integer Functions
2711
@section Exponentiation Functions
2712
@cindex Integer exponentiation functions
2713
@cindex Exponentiation functions
2714
@cindex Powering functions
2716
@deftypefun void mpz_powm (mpz_t @var{rop}, mpz_t @var{base}, mpz_t @var{exp}, mpz_t @var{mod})
2717
@deftypefunx void mpz_powm_ui (mpz_t @var{rop}, mpz_t @var{base}, unsigned long int @var{exp}, mpz_t @var{mod})
2718
Set @var{rop} to @m{base^{exp} \bmod mod, (@var{base} raised to @var{exp})
modulo @var{mod}}.
2721
Negative @var{exp} is supported if an inverse @ma{@var{base}^@W{-1} @bmod
2722
@var{mod}} exists (see @code{mpz_invert} in @ref{Number Theoretic Functions}).
2723
If an inverse doesn't exist then a divide by zero is raised.
2726
@deftypefun void mpz_pow_ui (mpz_t @var{rop}, mpz_t @var{base}, unsigned long int @var{exp})
2727
@deftypefunx void mpz_ui_pow_ui (mpz_t @var{rop}, unsigned long int @var{base}, unsigned long int @var{exp})
2728
Set @var{rop} to @m{base^{exp}, @var{base} raised to @var{exp}}. The case
@ma{0^0} yields 1.
2734
@node Integer Roots, Number Theoretic Functions, Integer Exponentiation, Integer Functions
2735
@section Root Extraction Functions
2736
@cindex Integer root functions
2737
@cindex Root extraction functions
2739
@deftypefun int mpz_root (mpz_t @var{rop}, mpz_t @var{op}, unsigned long int @var{n})
2740
Set @var{rop} to @m{\lfloor\root n \of {op}\rfloor@C{},} the truncated integer
2741
part of the @var{n}th root of @var{op}. Return non-zero if the computation
2742
was exact, i.e., if @var{op} is @var{rop} to the @var{n}th power.
2745
@deftypefun void mpz_sqrt (mpz_t @var{rop}, mpz_t @var{op})
2746
Set @var{rop} to @m{\lfloor\sqrt{@var{op}}\rfloor@C{},} the truncated
2747
integer part of the square root of @var{op}.
2750
@deftypefun void mpz_sqrtrem (mpz_t @var{rop1}, mpz_t @var{rop2}, mpz_t @var{op})
2751
Set @var{rop1} to @m{\lfloor\sqrt{@var{op}}\rfloor, the truncated integer part
2752
of the square root of @var{op}}, like @code{mpz_sqrt}. Set @var{rop2} to the
2753
remainder @m{(@var{op} - @var{rop1}^2),
2754
@var{op}@minus{}@var{rop1}*@var{rop1}}, which will be zero if @var{op} is a
perfect square.
2757
If @var{rop1} and @var{rop2} are the same variable, the results are
undefined.
2761
@deftypefun int mpz_perfect_power_p (mpz_t @var{op})
2762
Return non-zero if @var{op} is a perfect power, i.e., if there exist integers
2763
@m{a,@var{a}} and @m{b,@var{b}}, with @m{b>1, @var{b}>1}, such that
2764
@m{@var{op}=a^b, @var{op} equals @var{a} raised to the power @var{b}}.
2766
Under this definition both 0 and 1 are considered to be perfect powers.
2767
Negative values of @var{op} are accepted, but of course can only be odd
perfect powers.
2771
@deftypefun int mpz_perfect_square_p (mpz_t @var{op})
2772
Return non-zero if @var{op} is a perfect square, i.e., if the square root of
2773
@var{op} is an integer. Under this definition both 0 and 1 are considered to
be perfect squares.
2779
@node Number Theoretic Functions, Integer Comparisons, Integer Roots, Integer Functions
2780
@section Number Theoretic Functions
2781
@cindex Number theoretic functions
2783
@deftypefun int mpz_probab_prime_p (mpz_t @var{n}, int @var{reps})
2784
@cindex Prime testing functions
2785
Determine whether @var{n} is prime. Return 2 if @var{n} is definitely prime,
2786
return 1 if @var{n} is probably prime (without being certain), or return 0 if
2787
@var{n} is definitely composite.
2789
This function does some trial divisions, then some Miller-Rabin probabilistic
2790
primality tests. @var{reps} controls how many such tests are done; 5 to 10 is
a reasonable number, and more will reduce the chance of a composite being
returned as ``probably prime''.
2794
Miller-Rabin and similar tests can be more properly called compositeness
2795
tests. Numbers which fail are known to be composite but those which pass
2796
might be prime or might be composite. Only a few composites pass, hence those
2797
which pass are considered probably prime.
2800
@deftypefun void mpz_nextprime (mpz_t @var{rop}, mpz_t @var{op})
2801
Set @var{rop} to the next prime greater than @var{op}.
2803
This function uses a probabilistic algorithm to identify primes. For
2804
practical purposes it's adequate; the chance of a composite passing will be
extremely small.
2808
@c mpz_prime_p not implemented as of gmp 3.0.
2810
@c @deftypefun int mpz_prime_p (mpz_t @var{n})
2811
@c Return non-zero if @var{n} is prime and zero if @var{n} is a non-prime.
2812
@c This function is far slower than @code{mpz_probab_prime_p}, but then it
2813
@c never returns non-zero for composite numbers.
2815
@c (For practical purposes, using @code{mpz_probab_prime_p} is adequate.
2816
@c The likelihood of a programming error or hardware malfunction is orders
2817
@c of magnitudes greater than the likelihood for a composite to pass as a
2818
@c prime, if the @var{reps} argument is in the suggested range.)
2821
@deftypefun void mpz_gcd (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
2822
@cindex Greatest common divisor functions
2823
Set @var{rop} to the greatest common divisor of @var{op1} and @var{op2}.
2824
The result is always positive even if one or both input operands
are negative.
2828
@deftypefun {unsigned long int} mpz_gcd_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
2829
Compute the greatest common divisor of @var{op1} and @var{op2}. If
2830
@var{rop} is not @code{NULL}, store the result there.
2832
If the result is small enough to fit in an @code{unsigned long int}, it is
2833
returned. If the result does not fit, 0 is returned, and the result is equal
2834
to the argument @var{op1}. Note that the result will always fit if @var{op2}
is non-zero.
2838
@deftypefun void mpz_gcdext (mpz_t @var{g}, mpz_t @var{s}, mpz_t @var{t}, mpz_t @var{a}, mpz_t @var{b})
2839
@cindex Extended GCD
2840
Compute @var{g}, @var{s}, and @var{t}, such that
2841
@ma{@var{a}@GMPmultiply{}@var{s} + @var{b}@GMPmultiply{}@var{t} = @var{g} =
2842
@gcd{}(@var{a}, @var{b})}. If @var{t} is @code{NULL}, that argument is
not computed.
2846
@deftypefun void mpz_lcm (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
2847
@deftypefunx void mpz_lcm_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long @var{op2})
2848
@cindex Least common multiple functions
2849
Set @var{rop} to the least common multiple of @var{op1} and @var{op2}.
2850
@var{rop} is always positive, irrespective of the signs of @var{op1} and
2851
@var{op2}. @var{rop} will be zero if either @var{op1} or @var{op2} is zero.
2854
@deftypefun int mpz_invert (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
2855
@cindex Modular inverse functions
2856
Compute the inverse of @var{op1} modulo @var{op2} and put the result in
2857
@var{rop}. If the inverse exists, the return value is non-zero and @var{rop}
2858
will satisfy @ma{0 @le{} @var{rop} < @var{op2}}. If an inverse doesn't exist
2859
the return value is zero and @var{rop} is undefined.
2862
@deftypefun int mpz_jacobi (mpz_t @var{a}, mpz_t @var{b})
2863
@cindex Jacobi symbol functions
2864
Calculate the Jacobi symbol @m{\left(a \over b\right),
2865
(@var{a}/@var{b})}. This is defined only for @var{b} odd.
2868
@deftypefun int mpz_legendre (mpz_t @var{a}, mpz_t @var{p})
2869
Calculate the Legendre symbol @m{\left(a \over p\right),
2870
(@var{a}/@var{p})}. This is defined only for @var{p} an odd positive
2871
prime, and for such @var{p} it's identical to the Jacobi symbol.
2874
@deftypefun int mpz_kronecker (mpz_t @var{a}, mpz_t @var{b})
2875
@deftypefunx int mpz_kronecker_si (mpz_t @var{a}, long @var{b})
2876
@deftypefunx int mpz_kronecker_ui (mpz_t @var{a}, unsigned long @var{b})
2877
@deftypefunx int mpz_si_kronecker (long @var{a}, mpz_t @var{b})
2878
@deftypefunx int mpz_ui_kronecker (unsigned long @var{a}, mpz_t @var{b})
2879
@cindex Kronecker symbol functions
2880
Calculate the Jacobi symbol @m{\left(a \over b\right),
2881
(@var{a}/@var{b})} with the Kronecker extension @m{\left(a \over
2882
2\right) = \left(2 \over a\right), (a/2)=(2/a)} when @ma{a} odd, or
2883
@m{\left(a \over 2\right) = 0, (a/2)=0} when @ma{a} even.
2885
When @var{b} is odd the Jacobi symbol and Kronecker symbol are
2886
identical, so @code{mpz_kronecker_ui} etc.@: can be used for mixed
2887
precision Jacobi symbols too.
2889
For more information see Henri Cohen section 1.4.2 (@pxref{References}),
2890
or any number theory textbook. See also the example program
2891
@file{demos/qcn.c} which uses @code{mpz_kronecker_ui}.
2894
@deftypefun {unsigned long int} mpz_remove (mpz_t @var{rop}, mpz_t @var{op}, mpz_t @var{f})
2895
Remove all occurrences of the factor @var{f} from @var{op} and store the
2896
result in @var{rop}. Return the multiplicity of @var{f} in @var{op}.
2899
@deftypefun void mpz_fac_ui (mpz_t @var{rop}, unsigned long int @var{op})
2900
@cindex Factorial functions
2901
Set @var{rop} to @var{op}!, the factorial of @var{op}.
2904
@deftypefun void mpz_bin_ui (mpz_t @var{rop}, mpz_t @var{n}, unsigned long int @var{k})
2905
@deftypefunx void mpz_bin_uiui (mpz_t @var{rop}, unsigned long int @var{n}, @w{unsigned long int @var{k}})
2906
@cindex Binomial coefficient functions
2907
Compute the binomial coefficient @m{\left({n}\atop{k}\right), @var{n} over
2908
@var{k}} and store the result in @var{rop}. Negative values of @var{n} are
2909
supported by @code{mpz_bin_ui}, using the identity
2910
@m{\left({-n}\atop{k}\right) = (-1)^k \left({n+k-1}\atop{k}\right),
2911
bin(-n@C{}k) = (-1)^k * bin(n+k-1@C{}k)}, see Knuth volume 1 section 1.2.6.
2915
@deftypefun void mpz_fib_ui (mpz_t @var{fn}, unsigned long int @var{n})
2916
@deftypefunx void mpz_fib2_ui (mpz_t @var{fn}, mpz_t @var{fnsub1}, unsigned long int @var{n})
2917
@cindex Fibonacci sequence functions
2918
@code{mpz_fib_ui} sets @var{fn} to @m{F_n,F[n]}, the @var{n}'th Fibonacci
number. @code{mpz_fib2_ui} sets @var{fn} to @m{F_n,F[n]}, and @var{fnsub1} to
@m{F_{n-1},F[n-1]}.
2922
These functions are designed for calculating isolated Fibonacci numbers. When
2923
a sequence of values is wanted it's best to start with @code{mpz_fib2_ui} and
2924
iterate the defining @m{F_{n+1} = F_n + F_{n-1}, F[n+1]=F[n]+F[n-1]} or
similar.
2928
@deftypefun void mpz_lucnum_ui (mpz_t @var{ln}, unsigned long int @var{n})
2929
@deftypefunx void mpz_lucnum2_ui (mpz_t @var{ln}, mpz_t @var{lnsub1}, unsigned long int @var{n})
2930
@cindex Lucas number functions
2931
@code{mpz_lucnum_ui} sets @var{ln} to @m{L_n,L[n]}, the @var{n}'th Lucas
2932
number. @code{mpz_lucnum2_ui} sets @var{ln} to @m{L_n,L[n]}, and @var{lnsub1}
2933
to @m{L_{n-1},L[n-1]}.
2935
These functions are designed for calculating isolated Lucas numbers. When a
2936
sequence of values is wanted it's best to start with @code{mpz_lucnum2_ui} and
2937
iterate the defining @m{L_{n+1} = L_n + L_{n-1}, L[n+1]=L[n]+L[n-1]} or
similar.
2940
The Fibonacci numbers and Lucas numbers are related sequences, so it's never
2941
necessary to call both @code{mpz_fib2_ui} and @code{mpz_lucnum2_ui}. The
2942
formulas for going from Fibonacci to Lucas can be found in @ref{Lucas Numbers
2943
Algorithm}; the reverse is straightforward too.
2947
@node Integer Comparisons, Integer Logic and Bit Fiddling, Number Theoretic Functions, Integer Functions
2948
@comment node-name, next, previous, up
2949
@section Comparison Functions
2950
@cindex Integer comparison functions
2951
@cindex Comparison functions
2953
@deftypefn Function int mpz_cmp (mpz_t @var{op1}, mpz_t @var{op2})
2954
@deftypefnx Function int mpz_cmp_d (mpz_t @var{op1}, double @var{op2})
2955
@deftypefnx Macro int mpz_cmp_si (mpz_t @var{op1}, signed long int @var{op2})
2956
@deftypefnx Macro int mpz_cmp_ui (mpz_t @var{op1}, unsigned long int @var{op2})
2957
Compare @var{op1} and @var{op2}. Return a positive value if @ma{@var{op1} >
2958
@var{op2}}, zero if @ma{@var{op1} = @var{op2}}, or a negative value if
2959
@ma{@var{op1} < @var{op2}}.
2961
Note that @code{mpz_cmp_ui} and @code{mpz_cmp_si} are macros and will evaluate
2962
their arguments more than once.
2965
@deftypefn Function int mpz_cmpabs (mpz_t @var{op1}, mpz_t @var{op2})
2966
@deftypefnx Function int mpz_cmpabs_d (mpz_t @var{op1}, double @var{op2})
2967
@deftypefnx Function int mpz_cmpabs_ui (mpz_t @var{op1}, unsigned long int @var{op2})
2968
Compare the absolute values of @var{op1} and @var{op2}. Return a positive
2969
value if @ma{@GMPabs{@var{op1}} > @GMPabs{@var{op2}}}, zero if
2970
@ma{@GMPabs{@var{op1}} = @GMPabs{@var{op2}}}, or a negative value if
2971
@ma{@GMPabs{@var{op1}} < @GMPabs{@var{op2}}}.
2973
Note that @code{mpz_cmpabs_si} is a macro and will evaluate its arguments more
than once.
2977
@deftypefn Macro int mpz_sgn (mpz_t @var{op})
2979
@cindex Integer sign tests
2980
Return @ma{+1} if @ma{@var{op} > 0}, 0 if @ma{@var{op} = 0}, and @ma{-1} if
@ma{@var{op} < 0}.
2983
This function is actually implemented as a macro. It evaluates its argument
multiple times.
2988
@node Integer Logic and Bit Fiddling, I/O of Integers, Integer Comparisons, Integer Functions
2989
@comment node-name, next, previous, up
2990
@section Logical and Bit Manipulation Functions
2991
@cindex Logical functions
2992
@cindex Bit manipulation functions
2993
@cindex Integer bit manipulation functions
2995
These functions behave as if two's complement arithmetic were used (although
sign-magnitude is the actual implementation). The least significant bit is
number 0.
2999
@deftypefun void mpz_and (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
3000
Set @var{rop} to @var{op1} bitwise-and @var{op2}.
3003
@deftypefun void mpz_ior (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
3004
Set @var{rop} to @var{op1} bitwise inclusive-or @var{op2}.
3007
@deftypefun void mpz_xor (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
3008
Set @var{rop} to @var{op1} bitwise exclusive-or @var{op2}.
3011
@deftypefun void mpz_com (mpz_t @var{rop}, mpz_t @var{op})
3012
Set @var{rop} to the one's complement of @var{op}.
3015
@deftypefun {unsigned long int} mpz_popcount (mpz_t @var{op})
3016
If @ma{@var{op}@ge{}0}, return the population count of @var{op}, which is the
3017
number of 1 bits in the binary representation. If @ma{@var{op}<0}, the number
3018
of 1s is infinite, and the return value is @code{ULONG_MAX}, the largest
3019
possible @code{unsigned long}.
3022
@deftypefun {unsigned long int} mpz_hamdist (mpz_t @var{op1}, mpz_t @var{op2})
3023
If @var{op1} and @var{op2} are both @ma{@ge{}0} or both @ma{<0}, return the
3024
Hamming distance between the two operands, which is the number of bit
3025
positions where @var{op1} and @var{op2} have different bit values. If one
3026
operand is @ma{@ge{}0} and the other @ma{<0} then the number of bits different
3027
is infinite, and the return value is @code{ULONG_MAX}, the largest possible
3028
@code{unsigned long}.
3031
@deftypefun {unsigned long int} mpz_scan0 (mpz_t @var{op}, unsigned long int @var{starting_bit})
3032
@deftypefunx {unsigned long int} mpz_scan1 (mpz_t @var{op}, unsigned long int @var{starting_bit})
3033
Scan @var{op}, starting from bit @var{starting_bit}, towards more significant
3034
bits, until the first 0 or 1 bit (respectively) is found. Return the index of
the found bit.
3037
If the bit at @var{starting_bit} is already what's sought, then
3038
@var{starting_bit} is returned.
3040
If there's no bit found, then @code{ULONG_MAX} is returned. This will happen
in @code{mpz_scan0} past the end of a negative number, or @code{mpz_scan1}
past the end of a non-negative number.
3045
@deftypefun void mpz_setbit (mpz_t @var{rop}, unsigned long int @var{bit_index})
3046
Set bit @var{bit_index} in @var{rop}.
3049
@deftypefun void mpz_clrbit (mpz_t @var{rop}, unsigned long int @var{bit_index})
3050
Clear bit @var{bit_index} in @var{rop}.
3053
@deftypefun int mpz_tstbit (mpz_t @var{op}, unsigned long int @var{bit_index})
3054
Test bit @var{bit_index} in @var{op} and return 0 or 1 accordingly.
3057
@node I/O of Integers, Integer Random Numbers, Integer Logic and Bit Fiddling, Integer Functions
3058
@comment node-name, next, previous, up
3059
@section Input and Output Functions
3060
@cindex Integer input and output functions
3061
@cindex Input functions
3062
@cindex Output functions
3063
@cindex I/O functions
3065
The functions in this section read from or write to a stdio stream. Passing a
@code{NULL} pointer for the @var{stream} argument makes an input function read
from @code{stdin} and an output function write to @code{stdout}.
3070
When using any of these functions, it is a good idea to include @file{stdio.h}
3071
before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
3072
for these functions.
3074
@deftypefun size_t mpz_out_str (FILE *@var{stream}, int @var{base}, mpz_t @var{op})
3075
Output @var{op} on stdio stream @var{stream}, as a string of digits in base
3076
@var{base}. The base may vary from 2 to 36.
3078
Return the number of bytes written, or if an error occurred, return 0.
3081
@deftypefun size_t mpz_inp_str (mpz_t @var{rop}, FILE *@var{stream}, int @var{base})
3082
Input a string in base @var{base}, possibly preceded by white space, from stdio
stream @var{stream}, and put the integer read in @var{rop}. The base may vary
from 2 to 36. If @var{base} is 0, the actual base is determined from the
leading characters: if the first two characters are @samp{0x} or @samp{0X},
hexadecimal is assumed, otherwise if the first character is @samp{0}, octal is
assumed, otherwise decimal is assumed.
3089
Return the number of bytes read, or if an error occurred, return 0.
3092
@deftypefun size_t mpz_out_raw (FILE *@var{stream}, mpz_t @var{op})
3093
Output @var{op} on stdio stream @var{stream}, in raw binary format. The
3094
integer is written in a portable format, with 4 bytes of size information, and
3095
that many bytes of limbs. Both the size and the limbs are written in
3096
decreasing significance order (i.e., in big-endian).
3098
The output can be read with @code{mpz_inp_raw}.
3100
Return the number of bytes written, or if an error occurred, return 0.
3102
The output of this function cannot be read by @code{mpz_inp_raw} from GMP 1, because
3103
of changes necessary for compatibility between 32-bit and 64-bit machines.
3106
@deftypefun size_t mpz_inp_raw (mpz_t @var{rop}, FILE *@var{stream})
3107
Input from stdio stream @var{stream} in the format written by
3108
@code{mpz_out_raw}, and put the result in @var{rop}. Return the number of
3109
bytes read, or if an error occurred, return 0.
3111
This routine can also read the output of @code{mpz_out_raw} from GMP 1, in
spite of changes necessary for compatibility between 32-bit and 64-bit
machines.
3118
@node Integer Random Numbers, Miscellaneous Integer Functions, I/O of Integers, Integer Functions
3119
@comment node-name, next, previous, up
3120
@section Random Number Functions
3121
@cindex Integer random number functions
3122
@cindex Random number functions
3124
The random number functions of GMP come in two groups: older functions
that rely on a global state, and newer functions that accept a state
parameter that is read and modified. Please see @ref{Random Number
Functions} for more information on how to use and not to use random
number functions.
3130
@deftypefun void mpz_urandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, unsigned long int @var{n})
3131
Generate a uniformly distributed random integer in the range 0 to @m{2^n-1,
3132
2^@var{n}@minus{}1}, inclusive.
3134
The variable @var{state} must be initialized by calling one of the
3135
@code{gmp_randinit} functions (@ref{Random State Initialization}) before
3136
invoking this function.
3139
@deftypefun void mpz_urandomm (mpz_t @var{rop}, gmp_randstate_t @var{state}, mpz_t @var{n})
3140
Generate a uniform random integer in the range 0 to @ma{@var{n}-1}, inclusive.
3142
The variable @var{state} must be initialized by calling one of the
3143
@code{gmp_randinit} functions (@ref{Random State Initialization})
3144
before invoking this function.
3147
@deftypefun void mpz_rrandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, unsigned long int @var{n})
3148
Generate a random integer with long strings of zeros and ones in the
3149
binary representation. Useful for testing functions and algorithms,
3150
since random numbers of this kind have proven to be more likely to
3151
trigger corner-case bugs. The random number will be in the range
3152
0 to @m{2^n-1, 2^@var{n}@minus{}1}, inclusive.
3154
The variable @var{state} must be initialized by calling one of the
3155
@code{gmp_randinit} functions (@ref{Random State Initialization})
3156
before invoking this function.
3159
@deftypefun void mpz_random (mpz_t @var{rop}, mp_size_t @var{max_size})
3160
Generate a random integer of at most @var{max_size} limbs. The generated
3161
random number doesn't satisfy any particular requirements of randomness.
3162
Negative random numbers are generated when @var{max_size} is negative.
3164
This function is obsolete. Use @code{mpz_urandomb} or
3165
@code{mpz_urandomm} instead.
3168
@deftypefun void mpz_random2 (mpz_t @var{rop}, mp_size_t @var{max_size})
3169
Generate a random integer of at most @var{max_size} limbs, with long strings
3170
of zeros and ones in the binary representation. Useful for testing functions
3171
and algorithms, since random numbers of this kind have proven to be more
3172
likely to trigger corner-case bugs. Negative random numbers are generated
3173
when @var{max_size} is negative.
3175
This function is obsolete. Use @code{mpz_rrandomb} instead.
3180
@node Miscellaneous Integer Functions, , Integer Random Numbers, Integer Functions
3181
@comment node-name, next, previous, up
3182
@section Miscellaneous Functions
3183
@cindex Miscellaneous integer functions
3184
@cindex Integer miscellaneous functions
3186
@deftypefun int mpz_fits_ulong_p (mpz_t @var{op})
3187
@deftypefunx int mpz_fits_slong_p (mpz_t @var{op})
3188
@deftypefunx int mpz_fits_uint_p (mpz_t @var{op})
3189
@deftypefunx int mpz_fits_sint_p (mpz_t @var{op})
3190
@deftypefunx int mpz_fits_ushort_p (mpz_t @var{op})
3191
@deftypefunx int mpz_fits_sshort_p (mpz_t @var{op})
3192
Return non-zero iff the value of @var{op} fits in an @code{unsigned long int},
3193
@code{signed long int}, @code{unsigned int}, @code{signed int}, @code{unsigned
3194
short int}, or @code{signed short int}, respectively. Otherwise, return zero.
3197
@deftypefn Macro int mpz_odd_p (mpz_t @var{op})
3198
@deftypefnx Macro int mpz_even_p (mpz_t @var{op})
3199
Determine whether @var{op} is odd or even, respectively. Return non-zero if
3200
yes, zero if no. These macros evaluate their argument more than once.
3203
@deftypefun size_t mpz_size (mpz_t @var{op})
3204
Return the size of @var{op} measured in number of limbs. If @var{op} is zero,
3205
the returned value will be zero.
3206
@c (@xref{Nomenclature}, for an explanation of the concept @dfn{limb}.)
3209
@deftypefun size_t mpz_sizeinbase (mpz_t @var{op}, int @var{base})
3210
Return the size of @var{op} measured in number of digits in base @var{base}.
3211
The base may vary from 2 to 36. The sign of @var{op} is ignored, just the
3212
absolute value is used. The returned value will be exact or 1 too big. If
3213
@var{base} is a power of 2, the returned value will always be exact.
3215
This function is useful in order to allocate the right amount of space before
3216
converting @var{op} to a string. The right amount of allocation is normally
3217
two more than the value returned by @code{mpz_sizeinbase} (one extra for a
3218
minus sign and one for the null-terminator).
3222
@node Rational Number Functions, Floating-point Functions, Integer Functions, Top
3223
@comment node-name, next, previous, up
3224
@chapter Rational Number Functions
3225
@cindex Rational number functions
3227
This chapter describes the GMP functions for performing arithmetic on rational
3228
numbers. These functions start with the prefix @code{mpq_}.
3230
Rational numbers are stored in objects of type @code{mpq_t}.
3232
All rational arithmetic functions assume operands have a canonical form, and
3233
canonicalize their result. The canonical form means that the denominator and
3234
the numerator have no common factors, and that the denominator is positive.
3235
Zero has the unique representation 0/1.
3237
Pure assignment functions do not canonicalize the assigned variable. It is
3238
the responsibility of the user to canonicalize the assigned variable before
3239
any arithmetic operations are performed on that variable.
3241
@deftypefun void mpq_canonicalize (mpq_t @var{op})
3242
Remove any factors that are common to the numerator and denominator of
3243
@var{op}, and make the denominator positive.
3247
@menu
* Initializing Rationals::
* Rational Conversions::
* Rational Arithmetic::
* Comparing Rationals::
* Applying Integer Functions::
* I/O of Rationals::
@end menu
3255
@node Initializing Rationals, Rational Conversions, Rational Number Functions, Rational Number Functions
3256
@comment node-name, next, previous, up
3257
@section Initialization and Assignment Functions
3258
@cindex Initialization and assignment functions
3259
@cindex Rational init and assign
3261
@deftypefun void mpq_init (mpq_t @var{dest_rational})
3262
Initialize @var{dest_rational} and set it to 0/1. Each variable should
3263
normally only be initialized once, or at least cleared out (using the function
3264
@code{mpq_clear}) between each initialization.
3267
@deftypefun void mpq_clear (mpq_t @var{rational_number})
3268
Free the space occupied by @var{rational_number}. Make sure to call this
3269
function for all @code{mpq_t} variables when you are done with them.
3272
@deftypefun void mpq_set (mpq_t @var{rop}, mpq_t @var{op})
3273
@deftypefunx void mpq_set_z (mpq_t @var{rop}, mpz_t @var{op})
3274
Assign @var{rop} from @var{op}.
3277
@deftypefun void mpq_set_ui (mpq_t @var{rop}, unsigned long int @var{op1}, unsigned long int @var{op2})
3278
@deftypefunx void mpq_set_si (mpq_t @var{rop}, signed long int @var{op1}, unsigned long int @var{op2})
3279
Set the value of @var{rop} to @var{op1}/@var{op2}. Note that if @var{op1} and
3280
@var{op2} have common factors, @var{rop} has to be passed to
3281
@code{mpq_canonicalize} before any operations are performed on @var{rop}.
3284
@deftypefun int mpq_set_str (mpq_t @var{rop}, char *@var{str}, int @var{base})
3285
Set @var{rop} from a null-terminated string @var{str} in the given @var{base}.
3287
The string can be an integer like "41" or a fraction like "41/152". The
3288
fraction must be in canonical form (@pxref{Rational Number Functions}), or if
3289
not then @code{mpq_canonicalize} must be called.
3291
The numerator and optional denominator are parsed the same as in
3292
@code{mpz_set_str} (@pxref{Assigning Integers}). White space is allowed in
3293
the string, and is simply ignored. The @var{base} can vary from 2 to 36, or
3294
if @var{base} is 0 then the leading characters are used: @code{0x} for hex,
3295
@code{0} for octal, or decimal otherwise. Note that this is done separately
3296
for the numerator and denominator, so for instance @code{0xEF/100} is 239/100,
3297
whereas @code{0xEF/0x100} is 239/256.
3299
The return value is 0 if the entire string is a valid number, or @minus{}1 if
not.
3303
@deftypefun void mpq_swap (mpq_t @var{rop1}, mpq_t @var{rop2})
3304
Swap the values @var{rop1} and @var{rop2} efficiently.
3309
@node Rational Conversions, Rational Arithmetic, Initializing Rationals, Rational Number Functions
3310
@comment node-name, next, previous, up
3311
@section Conversion Functions
3312
@cindex Rational conversion functions
3313
@cindex Conversion functions
3315
@deftypefun double mpq_get_d (mpq_t @var{op})
3316
Convert @var{op} to a @code{double}.
3319
@deftypefun void mpq_set_d (mpq_t @var{rop}, double @var{op})
3320
@deftypefunx void mpq_set_f (mpq_t @var{rop}, mpf_t @var{op})
3321
Set @var{rop} to the value of @var{op}, without rounding.
3324
@deftypefun {char *} mpq_get_str (char *@var{str}, int @var{base}, mpq_t @var{op})
3325
Convert @var{op} to a string of digits in base @var{base}. The base may vary
3326
from 2 to 36. The string will be of the form @samp{num/den}, or if the
3327
denominator is 1 then just @samp{num}.
3329
If @var{str} is @code{NULL}, the result string is allocated using the current
3330
allocation function (@pxref{Custom Allocation}). The block will be
3331
@code{strlen(str)+1} bytes, that being exactly enough for the string and
null-terminator.
3334
If @var{str} is not @code{NULL}, it should point to a block of storage large
3335
enough for the result, that being
3338
@example
mpz_sizeinbase (mpq_numref(@var{op}), @var{base})
+ mpz_sizeinbase (mpq_denref(@var{op}), @var{base}) + 3
@end example
3342
The three extra bytes are for a possible minus sign, possible slash, and the
null-terminator.
3345
A pointer to the result string is returned, being either the allocated block,
3346
or the given @var{str}.
3350
@node Rational Arithmetic, Comparing Rationals, Rational Conversions, Rational Number Functions
3351
@comment node-name, next, previous, up
3352
@section Arithmetic Functions
3353
@cindex Rational arithmetic functions
3354
@cindex Arithmetic functions
3356
@deftypefun void mpq_add (mpq_t @var{sum}, mpq_t @var{addend1}, mpq_t @var{addend2})
3357
Set @var{sum} to @var{addend1} + @var{addend2}.
3360
@deftypefun void mpq_sub (mpq_t @var{difference}, mpq_t @var{minuend}, mpq_t @var{subtrahend})
3361
Set @var{difference} to @var{minuend} @minus{} @var{subtrahend}.
3364
@deftypefun void mpq_mul (mpq_t @var{product}, mpq_t @var{multiplier}, mpq_t @var{multiplicand})
3365
Set @var{product} to @ma{@var{multiplier} @GMPtimes{} @var{multiplicand}}.
3368
@deftypefun void mpq_mul_2exp (mpq_t @var{rop}, mpq_t @var{op1}, unsigned long int @var{op2})
3369
Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
@var{op2}}.
3373
@deftypefun void mpq_div (mpq_t @var{quotient}, mpq_t @var{dividend}, mpq_t @var{divisor})
3374
@cindex Division functions
3375
Set @var{quotient} to @var{dividend}/@var{divisor}.
3378
@deftypefun void mpq_div_2exp (mpq_t @var{rop}, mpq_t @var{op1}, unsigned long int @var{op2})
3379
Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to
@var{op2}}.
3383
@deftypefun void mpq_neg (mpq_t @var{negated_operand}, mpq_t @var{operand})
3384
Set @var{negated_operand} to @minus{}@var{operand}.
3387
@deftypefun void mpq_abs (mpq_t @var{rop}, mpq_t @var{op})
3388
Set @var{rop} to the absolute value of @var{op}.
3391
@deftypefun void mpq_inv (mpq_t @var{inverted_number}, mpq_t @var{number})
3392
Set @var{inverted_number} to 1/@var{number}. If the new denominator is
3393
zero, this routine will divide by zero.
3396
@node Comparing Rationals, Applying Integer Functions, Rational Arithmetic, Rational Number Functions
3397
@comment node-name, next, previous, up
3398
@section Comparison Functions
3399
@cindex Rational comparison functions
3400
@cindex Comparison functions
3402
@deftypefun int mpq_cmp (mpq_t @var{op1}, mpq_t @var{op2})
3403
Compare @var{op1} and @var{op2}. Return a positive value if @ma{@var{op1} >
3404
@var{op2}}, zero if @ma{@var{op1} = @var{op2}}, and a negative value if
3405
@ma{@var{op1} < @var{op2}}.
3407
To determine if two rationals are equal, @code{mpq_equal} is faster than
@code{mpq_cmp}.
3411
@deftypefn Macro int mpq_cmp_ui (mpq_t @var{op1}, unsigned long int @var{num2}, unsigned long int @var{den2})
3412
@deftypefnx Macro int mpq_cmp_si (mpq_t @var{op1}, long int @var{num2}, unsigned long int @var{den2})
3413
Compare @var{op1} and @var{num2}/@var{den2}. Return a positive value if
3414
@ma{@var{op1} > @var{num2}/@var{den2}}, zero if @ma{@var{op1} =
3415
@var{num2}/@var{den2}}, and a negative value if @ma{@var{op1} <
3416
@var{num2}/@var{den2}}.
3418
@var{num2} and @var{den2} are allowed to have common factors.
3420
These functions are implemented as macros and evaluate their arguments
multiple times.
3424
@deftypefn Macro int mpq_sgn (mpq_t @var{op})
3426
@cindex Rational sign tests
3427
Return @ma{+1} if @ma{@var{op} > 0}, 0 if @ma{@var{op} = 0}, and @ma{-1} if
@ma{@var{op} < 0}.
3430
This function is actually implemented as a macro. It evaluates its
3431
argument multiple times.
3434
@deftypefun int mpq_equal (mpq_t @var{op1}, mpq_t @var{op2})
3435
Return non-zero if @var{op1} and @var{op2} are equal, zero if they are
3436
non-equal. Although @code{mpq_cmp} can be used for the same purpose, this
3437
function is much faster.
3440
@node Applying Integer Functions, I/O of Rationals, Comparing Rationals, Rational Number Functions
3441
@comment node-name, next, previous, up
3442
@section Applying Integer Functions to Rationals
3443
@cindex Rational numerator and denominator
3444
@cindex Numerator and denominator
3446
The set of @code{mpq} functions is quite small. In particular, there are few
3447
functions for either input or output. The following functions give direct
3448
access to the numerator and denominator of an @code{mpq_t}.
3450
Note that if an assignment to the numerator and/or denominator could take an
3451
@code{mpq_t} out of the canonical form described at the start of this chapter
3452
(@pxref{Rational Number Functions}) then @code{mpq_canonicalize} must be
3453
called before any other @code{mpq} functions are applied to that @code{mpq_t}.
3455
@deftypefn Macro mpz_t mpq_numref (mpq_t @var{op})
3456
@deftypefnx Macro mpz_t mpq_denref (mpq_t @var{op})
3457
Return a reference to the numerator and denominator of @var{op}, respectively.
3458
The @code{mpz} functions can be used on the result of these macros.
3461
@deftypefun void mpq_get_num (mpz_t @var{numerator}, mpq_t @var{rational})
3462
@deftypefunx void mpq_get_den (mpz_t @var{denominator}, mpq_t @var{rational})
3463
@deftypefunx void mpq_set_num (mpq_t @var{rational}, mpz_t @var{numerator})
3464
@deftypefunx void mpq_set_den (mpq_t @var{rational}, mpz_t @var{denominator})
3465
Get or set the numerator or denominator of a rational. These functions are
3466
equivalent to calling @code{mpz_set} with an appropriate @code{mpq_numref} or
3467
@code{mpq_denref}. Direct use of @code{mpq_numref} or @code{mpq_denref} is
3468
recommended instead of these functions.
3473
@node I/O of Rationals, , Applying Integer Functions, Rational Number Functions
3474
@comment node-name, next, previous, up
3475
@section Input and Output Functions
3476
@cindex Rational input and output functions
3477
@cindex Input functions
3478
@cindex Output functions
3479
@cindex I/O functions
3481
When using any of these functions, it's a good idea to include @file{stdio.h}
3482
before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
3483
for these functions.
3485
Passing a @code{NULL} pointer for a @var{stream} argument to any of these
3486
functions will make them read from @code{stdin} and write to @code{stdout},
respectively.
3489
@deftypefun size_t mpq_out_str (FILE *@var{stream}, int @var{base}, mpq_t @var{op})
3490
Output @var{op} on stdio stream @var{stream}, as a string of digits in base
3491
@var{base}. The base may vary from 2 to 36. Output is in the form
3492
@samp{num/den} or if the denominator is 1 then just @samp{num}.
3494
Return the number of bytes written, or if an error occurred, return 0.
3497
@deftypefun size_t mpq_inp_str (mpq_t @var{rop}, FILE *@var{stream}, int @var{base})
3498
Read a string of digits from @var{stream} and convert them to a rational in
3499
@var{rop}. Any initial white-space characters are read and discarded. Return
3500
the number of characters read (including white space), or 0 if a rational
could not be read.
3503
The input can be a fraction like @samp{17/63} or just an integer like
3504
@samp{123}. Reading stops at the first character not in this form, and white
3505
space is not permitted within the string. If the input might not be in
3506
canonical form, then @code{mpq_canonicalize} must be called (@pxref{Rational
Number Functions}).
3509
The @var{base} can be between 2 and 36, or can be 0 in which case the leading
3510
characters of the string determine the base, @samp{0x} or @samp{0X} for
3511
hexadecimal, @samp{0} for octal, or decimal otherwise. The leading characters
3512
are examined separately for the numerator and denominator of a fraction, so
3513
for instance @samp{0x10/11} is 16/11, whereas @samp{0x10/0x11} is 16/17.
3517
@node Floating-point Functions, Low-level Functions, Rational Number Functions, Top
3518
@comment node-name, next, previous, up
3519
@chapter Floating-point Functions
3520
@cindex Floating-point functions
3521
@cindex Float functions
3522
@cindex User-defined precision
3523
@cindex Precision of floats
3525
GMP floating point numbers are stored in objects of type @code{mpf_t} and
3526
functions operating on them have an @code{mpf_} prefix.
3528
The mantissa of each float has a user-selectable precision, limited only by
3529
available memory. Each variable has its own precision, and that can be
3530
increased or decreased at any time.
3532
The exponent of each float is a fixed precision, one machine word on most
3533
systems. In the current implementation the exponent is a count of limbs, so
3534
for example on a 32-bit system this means a range of roughly
3535
@ma{2^@W{-68719476768}} to @ma{2^@W{68719476736}}, or on a 64-bit system this
3536
will be greater. Note however @code{mpf_get_str} can only return an exponent
3537
which fits an @code{mp_exp_t} and currently @code{mpf_set_str} doesn't accept
3538
exponents bigger than a @code{long}.
3540
Each variable keeps a size for the mantissa data actually in use. This means
3541
that if a float is exactly represented in only a few bits then only those bits
3542
will be used in a calculation, even if the selected precision is high.
3544
All calculations are performed to the precision of the destination variable.
3545
Each function is defined to calculate with ``infinite precision'' followed by
3546
a truncation to the destination precision, but of course the work done is only
3547
what's needed to determine a result under that definition.
3549
The precision selected for a variable is a minimum value; GMP may increase it
3550
a little to facilitate efficient calculation. Currently this means rounding
3551
up to a whole limb, and then sometimes having a further partial limb,
3552
depending on the high limb of the mantissa. But applications shouldn't be
3553
concerned by such details.
3555
@code{mpf} functions and variables have no special notion of infinity or
3556
not-a-number, and applications must take care not to overflow the exponent or
3557
results will be unpredictable. This might change in a future release.
3559
Note that the @code{mpf} functions are @emph{not} intended as a smooth
3560
extension to IEEE P754 arithmetic. In particular results obtained on one
3561
computer often differ from the results on a computer with a different word
size.

@menu
* Initializing Floats::
* Assigning Floats::
* Simultaneous Float Init & Assign::
* Converting Floats::
* Float Arithmetic::
* Float Comparison::
* I/O of Floats::
* Miscellaneous Float Functions::
@end menu
3575
@node Initializing Floats, Assigning Floats, Floating-point Functions, Floating-point Functions
3576
@comment node-name, next, previous, up
3577
@section Initialization Functions
3578
@cindex Float initialization functions
3579
@cindex Initialization functions
3581
@deftypefun void mpf_set_default_prec (unsigned long int @var{prec})
3582
Set the default precision to be @strong{at least} @var{prec} bits. All
3583
subsequent calls to @code{mpf_init} will use this precision, but previously
3584
initialized variables are unaffected.
3587
@deftypefun {unsigned long int} mpf_get_default_prec (void)
3588
Return the default precision actually used.
3591
An @code{mpf_t} object must be initialized before storing the first value in
3592
it. The functions @code{mpf_init} and @code{mpf_init2} are used for that
purpose.
3595
@deftypefun void mpf_init (mpf_t @var{x})
3596
Initialize @var{x} to 0. Normally, a variable should be initialized once only
3597
or at least be cleared, using @code{mpf_clear}, between initializations. The
3598
precision of @var{x} is undefined unless a default precision has already been
3599
established by a call to @code{mpf_set_default_prec}.
3602
@deftypefun void mpf_init2 (mpf_t @var{x}, unsigned long int @var{prec})
3603
Initialize @var{x} to 0 and set its precision to be @strong{at least}
3604
@var{prec} bits. Normally, a variable should be initialized once only or at
3605
least be cleared, using @code{mpf_clear}, between initializations.
3608
@deftypefun void mpf_clear (mpf_t @var{x})
3609
Free the space occupied by @var{x}. Make sure to call this function for all
3610
@code{mpf_t} variables when you are done with them.
3614
Here is an example of how to initialize floating-point variables:

@example
mpf_t x, y;
mpf_init (x); /* use default precision */
mpf_init2 (y, 256); /* precision @emph{at least} 256 bits */
/* Unless the program is about to exit, do ... */
mpf_clear (x);
mpf_clear (y);
@end example
3627
The following three functions are useful for changing the precision during a
3628
calculation. A typical use would be for adjusting the precision gradually in
3629
iterative algorithms like Newton-Raphson, making the computation precision
3630
closely match the actual accurate part of the numbers.
3632
@deftypefun {unsigned long int} mpf_get_prec (mpf_t @var{op})
3633
Return the current precision of @var{op}, in bits.
3636
@deftypefun void mpf_set_prec (mpf_t @var{rop}, unsigned long int @var{prec})
3637
Set the precision of @var{rop} to be @strong{at least} @var{prec} bits. The
3638
value in @var{rop} will be truncated to the new precision.
3640
This function requires a call to @code{realloc}, and so should not be used in
a tight loop.
3644
@deftypefun void mpf_set_prec_raw (mpf_t @var{rop}, unsigned long int @var{prec})
3645
Set the precision of @var{rop} to be @strong{at least} @var{prec} bits,
3646
without changing the memory allocated.
3648
@var{prec} must be no more than the allocated precision for @var{rop}, that
3649
being the precision when @var{rop} was initialized, or in the most recent
3650
@code{mpf_set_prec}.
3652
The value in @var{rop} is unchanged, and in particular if it had a higher
3653
precision than @var{prec} it will retain that higher precision. New values
3654
written to @var{rop} will use the new @var{prec}.
3656
Before calling @code{mpf_clear} or the full @code{mpf_set_prec}, another
3657
@code{mpf_set_prec_raw} call must be made to restore @var{rop} to its original
3658
allocated precision. Failing to do so will have unpredictable results.
3660
@code{mpf_get_prec} can be used before @code{mpf_set_prec_raw} to get the
3661
original allocated precision. After @code{mpf_set_prec_raw} it reflects the
3662
@var{prec} value set.
3664
@code{mpf_set_prec_raw} is an efficient way to use an @code{mpf_t} variable at
3665
different precisions during a calculation, perhaps to gradually increase
3666
precision in an iteration, or just to use various different precisions for
3667
different purposes during a calculation.
3672
@node Assigning Floats, Simultaneous Float Init & Assign, Initializing Floats, Floating-point Functions
3673
@comment node-name, next, previous, up
3674
@section Assignment Functions
3675
@cindex Float assignment functions
3676
@cindex Assignment functions
3678
These functions assign new values to already initialized floats
3679
(@pxref{Initializing Floats}).
3681
@deftypefun void mpf_set (mpf_t @var{rop}, mpf_t @var{op})
3682
@deftypefunx void mpf_set_ui (mpf_t @var{rop}, unsigned long int @var{op})
3683
@deftypefunx void mpf_set_si (mpf_t @var{rop}, signed long int @var{op})
3684
@deftypefunx void mpf_set_d (mpf_t @var{rop}, double @var{op})
3685
@deftypefunx void mpf_set_z (mpf_t @var{rop}, mpz_t @var{op})
3686
@deftypefunx void mpf_set_q (mpf_t @var{rop}, mpq_t @var{op})
3687
Set the value of @var{rop} from @var{op}.
3690
@deftypefun int mpf_set_str (mpf_t @var{rop}, char *@var{str}, int @var{base})
3691
Set the value of @var{rop} from the string in @var{str}. The string is of the
3692
form @samp{M@@N} or, if the base is 10 or less, alternatively @samp{MeN}.
3693
@samp{M} is the mantissa and @samp{N} is the exponent. The mantissa is always
3694
in the specified base. The exponent is either in the specified base or, if
3695
@var{base} is negative, in decimal. The decimal point expected is taken from
3696
the current locale, on systems providing @code{localeconv}.
3698
The argument @var{base} may be in the ranges 2 to 36, or @minus{}36 to
3699
@minus{}2. Negative values are used to specify that the exponent is in
decimal.
3702
Unlike the corresponding @code{mpz} function, the base will not be determined
3703
from the leading characters of the string if @var{base} is 0. This is so that
3704
numbers like @samp{0.23} are not interpreted as octal.
3706
White space is allowed in the string, and is simply ignored. [This is not
3707
really true; white-space is ignored in the beginning of the string and within
3708
the mantissa, but not in other places, such as after a minus sign or in the
3709
exponent. We are considering changing the definition of this function, making
3710
it fail when there is any white-space in the input, since that makes a lot of
3711
sense. Please tell us your opinion about this change. Do you really want it
3712
to accept @nicode{"3 14"} as meaning 314 as it does now?]
3714
This function returns 0 if the entire string is a valid number in base
3715
@var{base}. Otherwise it returns @minus{}1.
3718
@deftypefun void mpf_swap (mpf_t @var{rop1}, mpf_t @var{rop2})
3719
Swap @var{rop1} and @var{rop2} efficiently. Both the values and the
3720
precisions of the two variables are swapped.
3724
@node Simultaneous Float Init & Assign, Converting Floats, Assigning Floats, Floating-point Functions
3725
@comment node-name, next, previous, up
3726
@section Combined Initialization and Assignment Functions
3727
@cindex Initialization and assignment functions
3728
@cindex Float init and assign functions
3730
For convenience, GMP provides a parallel series of initialize-and-set functions
3731
which initialize the output and then store the value there. These functions'
3732
names have the form @code{mpf_init_set@dots{}}
3734
Once the float has been initialized by any of the @code{mpf_init_set@dots{}}
3735
functions, it can be used as the source or destination operand for the ordinary
3736
float functions. Don't use an initialize-and-set function on a variable
3737
already initialized!
3739
@deftypefun void mpf_init_set (mpf_t @var{rop}, mpf_t @var{op})
3740
@deftypefunx void mpf_init_set_ui (mpf_t @var{rop}, unsigned long int @var{op})
3741
@deftypefunx void mpf_init_set_si (mpf_t @var{rop}, signed long int @var{op})
3742
@deftypefunx void mpf_init_set_d (mpf_t @var{rop}, double @var{op})
3743
Initialize @var{rop} and set its value from @var{op}.
3745
The precision of @var{rop} will be taken from the active default precision, as
3746
set by @code{mpf_set_default_prec}.
3749
@deftypefun int mpf_init_set_str (mpf_t @var{rop}, char *@var{str}, int @var{base})
3750
Initialize @var{rop} and set its value from the string in @var{str}. See
3751
@code{mpf_set_str} above for details on the assignment operation.
3753
Note that @var{rop} is initialized even if an error occurs. (I.e., you have to
3754
call @code{mpf_clear} for it.)
3756
The precision of @var{rop} will be taken from the active default precision, as
3757
set by @code{mpf_set_default_prec}.
3761
@node Converting Floats, Float Arithmetic, Simultaneous Float Init & Assign, Floating-point Functions
3762
@comment node-name, next, previous, up
3763
@section Conversion Functions
3764
@cindex Float conversion functions
3765
@cindex Conversion functions
3767
@deftypefun double mpf_get_d (mpf_t @var{op})
3768
Convert @var{op} to a @code{double}.
3771
@deftypefun double mpf_get_d_2exp (signed long int *@var{exp}, mpf_t @var{op})
Find @var{d} and @var{exp} such that @m{@var{d}\times 2^{exp}, @var{d} times 2
raised to @var{exp}}, with @ma{0.5@le{}@GMPabs{@var{d}}<1}, is a good
approximation to @var{op}. This is similar to the standard C function
@code{frexp}.
3778
@deftypefun long mpf_get_si (mpf_t @var{op})
3779
@deftypefunx {unsigned long} mpf_get_ui (mpf_t @var{op})
3780
Convert @var{op} to a @code{long} or @code{unsigned long}, truncating any
3781
fraction part. If @var{op} is too big for the return type, the result is
undefined.
3784
See also @code{mpf_fits_slong_p} and @code{mpf_fits_ulong_p}
3785
(@pxref{Miscellaneous Float Functions}).
3788
@deftypefun {char *} mpf_get_str (char *@var{str}, mp_exp_t *@var{expptr}, int @var{base}, size_t @var{n_digits}, mpf_t @var{op})
3789
Convert @var{op} to a string of digits in base @var{base}. @var{base} can be
3790
2 to 36. Up to @var{n_digits} digits will be generated. Trailing zeros are
3791
not returned. No more digits than can be accurately represented by @var{op}
3792
are ever generated. If @var{n_digits} is 0 then that accurate maximum number
3793
of digits are generated.
3795
If @var{str} is @code{NULL}, the result string is allocated using the current
3796
allocation function (@pxref{Custom Allocation}). The block will be
3797
@code{strlen(str)+1} bytes, that being exactly enough for the string and
null-terminator.
3800
If @var{str} is not @code{NULL}, it should point to a block of
3801
@ma{@var{n\_digits} + 2} bytes, that being enough for the mantissa, a possible
3802
minus sign, and a null-terminator. When @var{n_digits} is 0 to get all
3803
significant digits, an application won't be able to know the space required,
3804
and @var{str} should be @code{NULL} in that case.
3806
The generated string is a fraction, with an implicit radix point immediately
3807
to the left of the first digit. The applicable exponent is written through
3808
the @var{expptr} pointer. For example, the number 3.1416 would be returned as
3809
string @nicode{"31416"} and exponent 1.
3811
When @var{op} is zero, an empty string is produced and the exponent returned
is 0.
3814
A pointer to the result string is returned, being either the allocated block
3815
or the given @var{str}.
3819
@node Float Arithmetic, Float Comparison, Converting Floats, Floating-point Functions
3820
@comment node-name, next, previous, up
3821
@section Arithmetic Functions
3822
@cindex Float arithmetic functions
3823
@cindex Arithmetic functions
3825
@deftypefun void mpf_add (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
3826
@deftypefunx void mpf_add_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
3827
Set @var{rop} to @ma{@var{op1} + @var{op2}}.
3830
@deftypefun void mpf_sub (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
3831
@deftypefunx void mpf_ui_sub (mpf_t @var{rop}, unsigned long int @var{op1}, mpf_t @var{op2})
3832
@deftypefunx void mpf_sub_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
3833
Set @var{rop} to @var{op1} @minus{} @var{op2}.
3836
@deftypefun void mpf_mul (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
3837
@deftypefunx void mpf_mul_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
3838
Set @var{rop} to @ma{@var{op1} @GMPtimes{} @var{op2}}.
3841
Division is undefined if the divisor is zero, and passing a zero divisor to the
3842
divide functions will make these functions intentionally divide by zero. This
3843
lets the user handle arithmetic exceptions in these functions in the same
3844
manner as other arithmetic exceptions.
3846
@deftypefun void mpf_div (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
3847
@deftypefunx void mpf_ui_div (mpf_t @var{rop}, unsigned long int @var{op1}, mpf_t @var{op2})
3848
@deftypefunx void mpf_div_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
3849
@cindex Division functions
3850
Set @var{rop} to @var{op1}/@var{op2}.
3853
@deftypefun void mpf_sqrt (mpf_t @var{rop}, mpf_t @var{op})
3854
@deftypefunx void mpf_sqrt_ui (mpf_t @var{rop}, unsigned long int @var{op})
3855
@cindex Root extraction functions
3856
Set @var{rop} to @m{\sqrt{@var{op}}, the square root of @var{op}}.
3859
@deftypefun void mpf_pow_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
3860
@cindex Exponentiation functions
3861
@cindex Powering functions
3862
Set @var{rop} to @m{@var{op1}^{op2}, @var{op1} raised to the power @var{op2}}.
3865
@deftypefun void mpf_neg (mpf_t @var{rop}, mpf_t @var{op})
3866
Set @var{rop} to @minus{}@var{op}.
3869
@deftypefun void mpf_abs (mpf_t @var{rop}, mpf_t @var{op})
3870
Set @var{rop} to the absolute value of @var{op}.
3873
@deftypefun void mpf_mul_2exp (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
3874
Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
@var{op2}}.
3878
@deftypefun void mpf_div_2exp (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
3879
Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to
@var{op2}}.
3883
@node Float Comparison, I/O of Floats, Float Arithmetic, Floating-point Functions
3884
@comment node-name, next, previous, up
3885
@section Comparison Functions
3886
@cindex Float comparison functions
3887
@cindex Comparison functions
3889
@deftypefun int mpf_cmp (mpf_t @var{op1}, mpf_t @var{op2})
3890
@deftypefunx int mpf_cmp_d (mpf_t @var{op1}, double @var{op2})
3891
@deftypefunx int mpf_cmp_ui (mpf_t @var{op1}, unsigned long int @var{op2})
3892
@deftypefunx int mpf_cmp_si (mpf_t @var{op1}, signed long int @var{op2})
3893
Compare @var{op1} and @var{op2}. Return a positive value if @ma{@var{op1} >
3894
@var{op2}}, zero if @ma{@var{op1} = @var{op2}}, and a negative value if
3895
@ma{@var{op1} < @var{op2}}.
3898
@deftypefun int mpf_eq (mpf_t @var{op1}, mpf_t @var{op2}, unsigned long int @var{op3})
Return non-zero if the first @var{op3} bits of @var{op1} and @var{op2} are
equal, zero otherwise. I.e., test if @var{op1} and @var{op2} are approximately
equal.
3903
Caution: Currently only whole limbs are compared, and only in an exact
3904
fashion. In the future values like 1000 and 0111 may be considered the same
3905
to 3 bits (on the basis that their difference is that small).
3908
@deftypefun void mpf_reldiff (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
3909
Compute the relative difference between @var{op1} and @var{op2} and store the
3910
result in @var{rop}. This is @ma{@GMPabs{@var{op1}-@var{op2}}/@var{op1}}.
3913
@deftypefn Macro int mpf_sgn (mpf_t @var{op})
3915
@cindex Float sign tests
3916
Return @ma{+1} if @ma{@var{op} > 0}, 0 if @ma{@var{op} = 0}, and @ma{-1} if
@ma{@var{op} < 0}.

This function is actually implemented as a macro. It evaluates its arguments
multiple times.
3923
@node I/O of Floats, Miscellaneous Float Functions, Float Comparison, Floating-point Functions
3924
@comment node-name, next, previous, up
3925
@section Input and Output Functions
3926
@cindex Float input and output functions
3927
@cindex Input functions
3928
@cindex Output functions
3929
@cindex I/O functions
3931
Functions that perform input from a stdio stream, and functions that output to
3932
a stdio stream. Passing a @code{NULL} pointer for a @var{stream} argument to
3933
any of these functions will make them read from @code{stdin} and write to
3934
@code{stdout}, respectively.
3936
When using any of these functions, it is a good idea to include @file{stdio.h}
3937
before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
3938
for these functions.
3940
@deftypefun size_t mpf_out_str (FILE *@var{stream}, int @var{base}, size_t @var{n_digits}, mpf_t @var{op})
3941
Print @var{op} to @var{stream}, as a string of digits. Return the number of
3942
bytes written, or if an error occurred, return 0.
3944
The mantissa is prefixed with a @samp{0.} and is in the given @var{base},
which may vary from 2 to 36. An exponent is then printed, separated by an
3946
@samp{e}, or if @var{base} is greater than 10 then by an @samp{@@}. The
3947
exponent is always in decimal. The decimal point follows the current locale,
3948
on systems providing @code{localeconv}.
3950
Up to @var{n_digits} will be printed from the mantissa, except that no more
3951
digits than are accurately representable by @var{op} will be printed.
3952
@var{n_digits} can be 0 to select that accurate maximum.
3955
@deftypefun size_t mpf_inp_str (mpf_t @var{rop}, FILE *@var{stream}, int @var{base})
3956
Read a string in base @var{base} from @var{stream}, and put the read float in
3957
@var{rop}. The string is of the form @samp{M@@N} or, if the base is 10 or
3958
less, alternatively @samp{MeN}. @samp{M} is the mantissa and @samp{N} is the
3959
exponent. The mantissa is always in the specified base. The exponent is
3960
either in the specified base or, if @var{base} is negative, in decimal. The
3961
decimal point expected is taken from the current locale, on systems providing
@code{localeconv}.
3964
The argument @var{base} may be in the ranges 2 to 36, or @minus{}36 to
3965
@minus{}2. Negative values are used to specify that the exponent is in
decimal.
3968
Unlike the corresponding @code{mpz} function, the base will not be determined
3969
from the leading characters of the string if @var{base} is 0. This is so that
3970
numbers like @samp{0.23} are not interpreted as octal.
3972
Return the number of bytes read, or if an error occurred, return 0.
3975
@c @deftypefun void mpf_out_raw (FILE *@var{stream}, mpf_t @var{float})
3976
@c Output @var{float} on stdio stream @var{stream}, in raw binary
3977
@c format. The float is written in a portable format, with 4 bytes of
3978
@c size information, and that many bytes of limbs. Both the size and the
3979
@c limbs are written in decreasing significance order.
3982
@c @deftypefun void mpf_inp_raw (mpf_t @var{float}, FILE *@var{stream})
3983
@c Input from stdio stream @var{stream} in the format written by
3984
@c @code{mpf_out_raw}, and put the result in @var{float}.
3988
@node Miscellaneous Float Functions, , I/O of Floats, Floating-point Functions
3989
@comment node-name, next, previous, up
3990
@section Miscellaneous Functions
3991
@cindex Miscellaneous float functions
3992
@cindex Float miscellaneous functions
3994
@deftypefun void mpf_ceil (mpf_t @var{rop}, mpf_t @var{op})
3995
@deftypefunx void mpf_floor (mpf_t @var{rop}, mpf_t @var{op})
3996
@deftypefunx void mpf_trunc (mpf_t @var{rop}, mpf_t @var{op})
3997
Set @var{rop} to @var{op} rounded to an integer. @code{mpf_ceil} rounds to the
3998
next higher integer, @code{mpf_floor} to the next lower, and @code{mpf_trunc}
3999
to the integer towards zero.
4002
@deftypefun int mpf_integer_p (mpf_t @var{op})
4003
Return non-zero if @var{op} is an integer.
4006
@deftypefun int mpf_fits_ulong_p (mpf_t @var{op})
4007
@deftypefunx int mpf_fits_slong_p (mpf_t @var{op})
4008
@deftypefunx int mpf_fits_uint_p (mpf_t @var{op})
4009
@deftypefunx int mpf_fits_sint_p (mpf_t @var{op})
4010
@deftypefunx int mpf_fits_ushort_p (mpf_t @var{op})
4011
@deftypefunx int mpf_fits_sshort_p (mpf_t @var{op})
4012
Return non-zero if @var{op} would fit in the respective C data type, when
4013
truncated to an integer.
4016
@deftypefun void mpf_urandomb (mpf_t @var{rop}, gmp_randstate_t @var{state}, unsigned long int @var{nbits})
4017
Generate a uniformly distributed random float in @var{rop}, such that @ma{0
4018
@le{} @var{rop} < 1}, with @var{nbits} significant bits in the mantissa.
4020
The variable @var{state} must be initialized by calling one of the
4021
@code{gmp_randinit} functions (@ref{Random State Initialization}) before
4022
invoking this function.
4025
@deftypefun void mpf_random2 (mpf_t @var{rop}, mp_size_t @var{max_size}, mp_exp_t @var{exp})
4026
Generate a random float of at most @var{max_size} limbs, with long strings of
4027
zeros and ones in the binary representation. The exponent of the number is in
4028
the interval @minus{}@var{exp} to @var{exp}. This function is useful for
4029
testing functions and algorithms, since random numbers of this kind have proven
4030
to be more likely to trigger corner-case bugs. Negative random numbers are
4031
generated when @var{max_size} is negative.
4034
@c @deftypefun size_t mpf_size (mpf_t @var{op})
4035
@c Return the size of @var{op} measured in number of limbs. If @var{op} is
4036
@c zero, the returned value will be zero. (@xref{Nomenclature}, for an
4037
@c explanation of the concept @dfn{limb}.)
4039
@c @strong{This function is obsolete. It will disappear from future GMP
@c releases.}
4044
@node Low-level Functions, Random Number Functions, Floating-point Functions, Top
4045
@comment node-name, next, previous, up
4046
@chapter Low-level Functions
4047
@cindex Low-level functions
4049
This chapter describes low-level GMP functions, used to implement the
4050
high-level GMP functions, but also intended for time-critical user code.
4052
These functions start with the prefix @code{mpn_}.
4054
@c 1. Some of these function clobber input operands.
4057
The @code{mpn} functions are designed to be as fast as possible, @strong{not}
4058
to provide a coherent calling interface. The different functions have somewhat
4059
similar interfaces, but there are variations that make them hard to use. These
4060
functions do as little as possible apart from the real multiple precision
4061
computation, so that no time is spent on things that not all callers need.
4063
A source operand is specified by a pointer to the least significant limb and a
4064
limb count. A destination operand is specified by just a pointer. It is the
4065
responsibility of the caller to ensure that the destination has enough space
4066
for storing the result.
4068
With this way of specifying operands, it is possible to perform computations on
4069
subranges of an argument, and store the result into a subrange of a
destination.
4072
A common requirement for all functions is that each source area needs at least
4073
one limb. No size argument may be zero. Unless otherwise stated, in-place
4074
operations are allowed where source and destination are the same, but not where
4075
they only partly overlap.
4077
The @code{mpn} functions are the base for the implementation of the
4078
@code{mpz_}, @code{mpf_}, and @code{mpq_} functions.
4080
This example adds the number beginning at @var{s1p} and the number beginning at
4081
@var{s2p} and writes the sum at @var{destp}. All areas have @var{n} limbs.
4084
@example
cy = mpn_add_n (destp, s1p, s2p, n)
@end example
4088
In the notation used here, a source operand is identified by the pointer to
4089
the least significant limb, and the limb count in braces. For example,
4090
@{@var{s1p}, @var{s1n}@}.
4092
@deftypefun mp_limb_t mpn_add_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
4093
Add @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the @var{n}
4094
least significant limbs of the result to @var{rp}. Return carry, either 0 or
1.
4097
This is the lowest-level function for addition. It is the preferred function
4098
for addition, since it is written in assembly for most CPUs. For addition of
4099
a variable to itself (i.e., @var{s1p} equals @var{s2p}), use @code{mpn_lshift}
4100
with a count of 1 for optimal speed.
4103
@deftypefun mp_limb_t mpn_add_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
4104
Add @{@var{s1p}, @var{n}@} and @var{s2limb}, and write the @var{n} least
4105
significant limbs of the result to @var{rp}. Return carry, either 0 or 1.
4108
@deftypefun mp_limb_t mpn_add (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
4109
Add @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
4110
@var{s1n} least significant limbs of the result to @var{rp}. Return carry,
either 0 or 1.
4113
This function requires that @var{s1n} is greater than or equal to @var{s2n}.
4116
@deftypefun mp_limb_t mpn_sub_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
4117
Subtract @{@var{s2p}, @var{n}@} from @{@var{s1p}, @var{n}@}, and write the
4118
@var{n} least significant limbs of the result to @var{rp}. Return borrow,
either 0 or 1.
4121
This is the lowest-level function for subtraction. It is the preferred
4122
function for subtraction, since it is written in assembly for most CPUs.
4125
@deftypefun mp_limb_t mpn_sub_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
4126
Subtract @var{s2limb} from @{@var{s1p}, @var{n}@}, and write the @var{n} least
4127
significant limbs of the result to @var{rp}. Return borrow, either 0 or 1.
4130
@deftypefun mp_limb_t mpn_sub (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
Subtract @{@var{s2p}, @var{s2n}@} from @{@var{s1p}, @var{s1n}@}, and write the
@var{s1n} least significant limbs of the result to @var{rp}. Return borrow, either 0 or 1.
This function requires that @var{s1n} is greater than or equal to @var{s2n}.
@deftypefun void mpn_mul_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Multiply @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the
2*@var{n}-limb result to @var{rp}.
The destination has to have space for 2*@var{n} limbs, even if the product's
most significant limb is zero.
@deftypefun mp_limb_t mpn_mul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
Multiply @{@var{s1p}, @var{n}@} by @var{s2limb}, and write the @var{n} least
significant limbs of the product to @var{rp}. Return the most significant
limb of the product. @{@var{s1p}, @var{n}@} and @{@var{rp}, @var{n}@} are
allowed to overlap provided @ma{@var{rp} @le{} @var{s1p}}.
This is a low-level function that is a building block for general
multiplication as well as other operations in GMP. It is written in assembly for most CPUs.
Don't call this function if @var{s2limb} is a power of 2; use @code{mpn_lshift}
with a count equal to the base-2 logarithm of @var{s2limb} instead, for optimal speed.
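The way the high limb propagates through @code{mpn_mul_1} can be modelled in
a few lines of portable C. This is only a sketch of the arithmetic, not GMP's
code: @code{ref_mul_1} and @code{limb_t} are hypothetical names, and 32-bit
limbs are assumed so that a 64-bit type can hold a full limb product.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* assumption: 32-bit limbs, for illustration only */

/* Multiply {s1p, n} by s2limb, store the n low limbs at rp and
   return the most significant limb of the product.  */
limb_t
ref_mul_1 (limb_t *rp, const limb_t *s1p, size_t n, limb_t s2limb)
{
  limb_t high = 0;                       /* carried into the next limb */
  for (size_t i = 0; i < n; i++)         /* least significant limb first */
    {
      /* Cannot overflow: (2^32-1)^2 + (2^32-1) < 2^64.  */
      uint64_t prod = (uint64_t) s1p[i] * s2limb + high;
      rp[i] = (limb_t) prod;
      high = (limb_t) (prod >> 32);
    }
  return high;
}
```

Multiplying by 2 in this model gives the same limbs as a left shift by one
bit, which is why a shift is the cheaper choice for power-of-2 multipliers.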
@deftypefun mp_limb_t mpn_addmul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and add the @var{n} least
significant limbs of the product to @{@var{rp}, @var{n}@} and write the result
to @var{rp}. Return the most significant limb of the product, plus carry-out from the addition.
This is a low-level function that is a building block for general
multiplication as well as other operations in GMP. It is written in assembly for most CPUs.
@deftypefun mp_limb_t mpn_submul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and subtract the @var{n}
least significant limbs of the product from @{@var{rp}, @var{n}@} and write the
result to @var{rp}. Return the most significant limb of the product, minus
borrow-out from the subtraction.
This is a low-level function that is a building block for general
multiplication and division as well as other operations in GMP. It is written
in assembly for most CPUs.
@deftypefun mp_limb_t mpn_mul (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
Multiply @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
result to @var{rp}. Return the most significant limb of the result.
The destination has to have space for @var{s1n} + @var{s2n} limbs, even if the
result might be one limb smaller.
This function requires that @var{s1n} is greater than or equal to
@var{s2n}. The destination must be distinct from both input operands.
@deftypefun void mpn_tdiv_qr (mp_limb_t *@var{qp}, mp_limb_t *@var{rp}, mp_size_t @var{qxn}, const mp_limb_t *@var{np}, mp_size_t @var{nn}, const mp_limb_t *@var{dp}, mp_size_t @var{dn})
Divide @{@var{np}, @var{nn}@} by @{@var{dp}, @var{dn}@} and put the quotient
at @{@var{qp}, @var{nn}@minus{}@var{dn}+1@} and the remainder at @{@var{rp},
@var{dn}@}. The quotient is rounded towards 0.
No overlap is permitted between arguments. @var{nn} must be greater than or
equal to @var{dn}. The most significant limb of @var{dp} must be non-zero.
The @var{qxn} operand must be zero.
@comment FIXME: Relax overlap requirements!
@deftypefun mp_limb_t mpn_divrem (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n})
[This function is obsolete. Please call @code{mpn_tdiv_qr} instead for best performance.]
Divide @{@var{rs2p}, @var{rs2n}@} by @{@var{s3p}, @var{s3n}@}, and write the
quotient at @var{r1p}, with the exception of the most significant limb, which
is returned. The remainder replaces the dividend at @var{rs2p}; it will be
@var{s3n} limbs long (i.e., as many limbs as the divisor).
In addition to an integer quotient, @var{qxn} fraction limbs are developed, and
stored after the integral limbs. For most usages, @var{qxn} will be zero.
It is required that @var{rs2n} is greater than or equal to @var{s3n}. It is
required that the most significant bit of the divisor is set.
If the quotient is not needed, pass @var{rs2p} + @var{s3n} as @var{r1p}. Aside
from that special case, no overlap between arguments is permitted.
Return the most significant limb of the quotient, either 0 or 1.
The area at @var{r1p} needs to be @var{rs2n} @minus{} @var{s3n} + @var{qxn}
@deftypefn Function mp_limb_t mpn_divrem_1 (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, @w{mp_limb_t *@var{s2p}}, mp_size_t @var{s2n}, mp_limb_t @var{s3limb})
@deftypefnx Macro mp_limb_t mpn_divmod_1 (mp_limb_t *@var{r1p}, mp_limb_t *@var{s2p}, @w{mp_size_t @var{s2n}}, @w{mp_limb_t @var{s3limb}})
Divide @{@var{s2p}, @var{s2n}@} by @var{s3limb}, and write the quotient at
@var{r1p}. Return the remainder.
The integer quotient is written to @{@var{r1p}+@var{qxn}, @var{s2n}@} and in
addition @var{qxn} fraction limbs are developed and written to @{@var{r1p},
@var{qxn}@}. Either or both @var{s2n} and @var{qxn} can be zero. For most
usages, @var{qxn} will be zero.
@code{mpn_divmod_1} exists for upward source compatibility and is simply a
macro calling @code{mpn_divrem_1} with a @var{qxn} of 0.
The areas at @var{r1p} and @var{s2p} have to be identical or completely
separate, not partially overlapping.
@deftypefun mp_limb_t mpn_divmod (mp_limb_t *@var{r1p}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n})
[This function is obsolete. Please call @code{mpn_tdiv_qr} instead for best performance.]
@deftypefn Macro mp_limb_t mpn_divexact_by3 (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}})
@deftypefnx Function mp_limb_t mpn_divexact_by3c (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}}, mp_limb_t @var{carry})
Divide @{@var{sp}, @var{n}@} by 3, expecting it to divide exactly, and writing
the result to @{@var{rp}, @var{n}@}. If 3 divides exactly, the return value is
zero and the result is the quotient. If not, the return value is non-zero and
the result won't be anything useful.
@code{mpn_divexact_by3c} takes an initial carry parameter, which can be the
return value from a previous call, so a large calculation can be done piece by
piece from low to high. @code{mpn_divexact_by3} is simply a macro calling
@code{mpn_divexact_by3c} with a 0 carry parameter.
These routines use a multiply-by-inverse and will be faster than
@code{mpn_divrem_1} on CPUs with fast multiplication but slow division.
The source @ma{a}, result @ma{q}, size @ma{n}, initial carry @ma{i}, and
return value @ma{c} satisfy @m{cb^n+a-i=3q, c*b^n + a-i = 3*q}, where
@m{b=2\GMPraise{@code{mp\_bits\_per\_limb}}, b=2^mp_bits_per_limb}. The
return @ma{c} is always 0, 1 or 2, and the initial carry @ma{i} must also be
0, 1 or 2 (these are both borrows really). When @ma{c=0} clearly
@ma{q=(a-i)/3}. When @m{c \neq 0, c!=0}, the remainder @ma{(a-i) @bmod{} 3}
is given by @ma{3-c}, because @ma{b @equiv{} 1 @bmod{} 3} (when
@code{mp_bits_per_limb} is even, which is always so currently).
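The relation between source, quotient and carries can be checked with a small
model of the multiply-by-inverse technique. This is a sketch assuming 32-bit
limbs (so b = 2^32); @code{ref_divexact_by3c}, @code{INVERSE_3} and
@code{ONE_THIRD} are names invented for this example, and GMP's own code may
differ in detail.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;             /* assumption: 32-bit limbs, b = 2^32 */

#define INVERSE_3  0xAAAAAAABu       /* 3 * INVERSE_3 == 1 (mod 2^32) */
#define ONE_THIRD  (UINT32_MAX / 3)  /* floor((b-1)/3) */

/* Divide {sp, n} by 3 given an initial borrow CARRY (0, 1 or 2).  Writes
   the quotient limbs to rp and returns the final borrow c, so that
   c*b^n + a - carry == 3*q.  */
limb_t
ref_divexact_by3c (limb_t *rp, const limb_t *sp, size_t n, limb_t carry)
{
  for (size_t k = 0; k < n; k++)                /* low to high */
    {
      limb_t s = sp[k];
      limb_t l = s - carry;                     /* apply incoming borrow */
      limb_t c = (l > s);                       /* 1 if that wrapped around */
      limb_t q = (limb_t) ((uint64_t) l * INVERSE_3);  /* 3*q == l (mod b) */
      rp[k] = q;
      c += (q > ONE_THIRD);                     /* 3*q >= b   */
      c += (q > 2 * ONE_THIRD);                 /* 3*q >= 2*b */
      carry = c;
    }
  return carry;                                  /* 0, 1 or 2 */
}
```

For the single limb 10 the model returns c = 2, and 3 - c = 1 is indeed
10 mod 3, matching the remainder rule stated above.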
@deftypefun mp_limb_t mpn_mod_1 (mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t @var{s2limb})
Divide @{@var{s1p}, @var{s1n}@} by @var{s2limb}, and return the remainder.
@var{s1n} can be zero.
@deftypefun mp_limb_t mpn_bdivmod (mp_limb_t *@var{rp}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n}, unsigned long int @var{d})
This function puts the low
@ma{@GMPfloor{@var{d}/@nicode{mp\_bits\_per\_limb}}} limbs of @var{q} =
@{@var{s1p}, @var{s1n}@}/@{@var{s2p}, @var{s2n}@} mod @m{2^d,2^@var{d}} at
@var{rp}, and returns the high @var{d} mod @code{mp_bits_per_limb} bits of
@{@var{s1p}, @var{s1n}@} - @var{q} * @{@var{s2p}, @var{s2n}@} mod @m{2
\GMPraise{@var{s1n}*@code{mp\_bits\_per\_limb}},
2^(@var{s1n}*@nicode{mp\_bits\_per\_limb})} is placed at @var{s1p}. Since the
low @ma{@GMPfloor{@var{d}/@nicode{mp\_bits\_per\_limb}}} limbs of this
difference are zero, it is possible to overwrite the low limbs at @var{s1p}
with this difference, provided @ma{@var{rp} @le{} @var{s1p}}.
This function requires that @ma{@var{s1n} * @nicode{mp\_bits\_per\_limb}
@ge{} @var{d}}, and that @{@var{s2p}, @var{s2n}@} is odd.
@strong{This interface is preliminary. It might change incompatibly in future
@deftypefun mp_limb_t mpn_lshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count})
Shift @{@var{sp}, @var{n}@} left by @var{count} bits, and write the result to
@{@var{rp}, @var{n}@}. The bits shifted out at the left are returned in the
least significant @var{count} bits of the return value (the rest of the return value is zero).
@var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1. The
regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided
@ma{@var{rp} @ge{} @var{sp}}.
This function is written in assembly for most CPUs.
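The overlap rule for a left shift follows from the order in which limbs are
processed. The model below is a sketch only, assuming 32-bit limbs;
@code{ref_lshift} and @code{limb_t} are hypothetical names. It works from the
most significant limb downwards, so every source limb is read before a
destination at the same or a higher address can overwrite it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* assumption: 32-bit limbs, for illustration only */

/* Shift {sp, n} left by count bits (1..31), store the result at {rp, n}
   and return the bits shifted out, in the low count bits of the result.  */
limb_t
ref_lshift (limb_t *rp, const limb_t *sp, size_t n, unsigned count)
{
  assert (n >= 1 && count >= 1 && count <= 31);
  limb_t out = sp[n - 1] >> (32 - count);      /* bits leaving at the top */
  for (size_t i = n - 1; i > 0; i--)           /* high to low */
    rp[i] = (sp[i] << count) | (sp[i - 1] >> (32 - count));
  rp[0] = sp[0] << count;
  return out;
}
```

A right shift is the mirror image: it would run from the least significant
limb upwards, which is why the permitted overlap direction is reversed there.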
@deftypefun mp_limb_t mpn_rshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count})
Shift @{@var{sp}, @var{n}@} right by @var{count} bits, and write the result to
@{@var{rp}, @var{n}@}. The bits shifted out at the right are returned in the
most significant @var{count} bits of the return value (the rest of the return value is zero).
@var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1. The
regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided
@ma{@var{rp} @le{} @var{sp}}.
This function is written in assembly for most CPUs.
@deftypefun int mpn_cmp (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Compare @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@} and return a
positive value if @ma{@var{s1} > @var{s2}}, 0 if they are equal, or a negative
value if @ma{@var{s1} < @var{s2}}.
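Since limbs are stored least significant first, a comparison has to start at
the top. A minimal sketch of that logic follows, with the hypothetical names
@code{ref_cmp} and @code{limb_t} and an assumed 32-bit limb; it is a model of
the result described above, not GMP's code.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* assumption: 32-bit limbs, for illustration only */

/* Compare {s1p, n} with {s2p, n}: positive, zero or negative result.  */
int
ref_cmp (const limb_t *s1p, const limb_t *s2p, size_t n)
{
  while (n-- > 0)                        /* most significant limb first */
    if (s1p[n] != s2p[n])
      return s1p[n] > s2p[n] ? 1 : -1;   /* first difference decides */
  return 0;                              /* all limbs equal */
}
```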
@deftypefun mp_size_t mpn_gcd (mp_limb_t *@var{rp}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
Set @{@var{rp}, @var{retval}@} to the greatest common divisor of @{@var{s1p},
@var{s1n}@} and @{@var{s2p}, @var{s2n}@}. The result can be up to @var{s2n}
limbs; the return value is the actual number produced. Both source operands are destroyed.
@{@var{s1p}, @var{s1n}@} must have at least as many bits as @{@var{s2p},
@var{s2n}@}. @{@var{s2p}, @var{s2n}@} must be odd. Both operands must have
non-zero most significant limbs.
@deftypefun mp_limb_t mpn_gcd_1 (const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t @var{s2limb})
Return the greatest common divisor of @{@var{s1p}, @var{s1n}@} and
@var{s2limb}. Both operands must be non-zero.
@deftypefun mp_size_t mpn_gcdext (mp_limb_t *@var{r1p}, mp_limb_t *@var{r2p}, mp_size_t *@var{r2n}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
Calculate the greatest common divisor of @{@var{s1p}, @var{s1n}@} and
@{@var{s2p}, @var{s2n}@}. Store the gcd at @{@var{r1p}, @var{retval}@} and
the first cofactor at @{@var{r2p}, *@var{r2n}@}, with *@var{r2n} negative if
the cofactor is negative. @var{r1p} and @var{r2p} should each have room for
@ma{@var{s1n}+1} limbs, but the return value and value stored through
@var{r2n} indicate the actual number produced.
@ma{@{@var{s1p}, @var{s1n}@} @ge{} @{@var{s2p}, @var{s2n}@}} is required, and
both must be non-zero. The regions @{@var{s1p}, @ma{@var{s1n}+1}@} and
@{@var{s2p}, @ma{@var{s2n}+1}@} are destroyed (i.e. the operands plus an extra
limb past the end of each).
The cofactor @var{r2} will satisfy @m{r_2 s_1 + k s_2 = r_1, @var{r2}*@var{s1}
+ @var{k}*@var{s2} = @var{r1}}. The second cofactor @var{k} is not calculated
but can easily be obtained from @m{(r_1 - r_2 s_1) / s_2, (@var{r1} -
@var{r2}*@var{s1}) / @var{s2}}.
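The cofactor identity can be tried out on single-word integers. The following
is a sketch using the classical extended Euclidean algorithm on
@code{int64_t} values; @code{ref_gcdext} and @code{ref_second_cofactor} are
hypothetical names, and the particular cofactor GMP returns for the same
inputs is not guaranteed to be the one this model produces.

```c
#include <stdint.h>

/* Return r1 = gcd(s1, s2) for s1 >= s2 > 0 and store in *r2 a first
   cofactor, so that r2*s1 + k*s2 == r1 for some integer k.  */
int64_t
ref_gcdext (int64_t *r2, int64_t s1, int64_t s2)
{
  int64_t a = s1, b = s2;      /* remainders of the Euclidean sequence */
  int64_t u0 = 1, u1 = 0;      /* a == u0*s1 and b == u1*s1 (mod s2) */
  while (b != 0)
    {
      int64_t q = a / b;
      int64_t t = a - q * b;   a = b;   b = t;
      t = u0 - q * u1;         u0 = u1; u1 = t;
    }
  *r2 = u0;
  return a;
}

/* Recover the second cofactor as described: k = (r1 - r2*s1) / s2.  */
int64_t
ref_second_cofactor (int64_t r1, int64_t r2, int64_t s1, int64_t s2)
{
  return (r1 - r2 * s1) / s2;  /* the division is exact */
}
```

For @code{s1 = 240} and @code{s2 = 46} this model gives a gcd of 2 with
first cofactor -9, and the recovered second cofactor is 47.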
@deftypefun mp_size_t mpn_sqrtrem (mp_limb_t *@var{r1p}, mp_limb_t *@var{r2p}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
Compute the square root of @{@var{sp}, @var{n}@} and put the result at
@{@var{r1p}, @ma{@GMPceil{@var{n}/2}}@} and the remainder at @{@var{r2p},
@var{retval}@}. @var{r2p} needs space for @var{n} limbs, but the return value
indicates how many are produced.
The most significant limb of @{@var{sp}, @var{n}@} must be non-zero. The
areas @{@var{r1p}, @ma{@GMPceil{@var{n}/2}}@} and @{@var{sp}, @var{n}@} must
be completely separate. The areas @{@var{r2p}, @var{n}@} and @{@var{sp},
@var{n}@} must be either identical or completely separate.
If the remainder is not wanted then @var{r2p} can be @code{NULL}, and in this
case the return value is zero or non-zero according to whether the remainder
would have been zero or non-zero.
A return value of zero indicates a perfect square. See also
@code{mpz_perfect_square_p}.
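The root-and-remainder contract can be seen on a single 64-bit value. This is
a sketch with a simple bit-by-bit method and the hypothetical name
@code{ref_sqrtrem}; it says nothing about the algorithm GMP actually uses for
multi-limb operands.

```c
#include <stdint.h>

/* Store floor(sqrt(s)) in *root and return the remainder s - root*root.
   A zero return therefore means s is a perfect square.  */
uint64_t
ref_sqrtrem (uint32_t *root, uint64_t s)
{
  uint64_t r = 0;
  for (int bit = 31; bit >= 0; bit--)     /* build the root one bit at a time */
    {
      uint64_t trial = r | ((uint64_t) 1 << bit);
      if (trial * trial <= s)             /* trial < 2^32, so no overflow */
        r = trial;
    }
  *root = (uint32_t) r;
  return s - r * r;
}
```

Note that the root needs only half as many bits as the operand, which mirrors
the limb counts given for @var{r1p} above.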
@deftypefun mp_size_t mpn_get_str (unsigned char *@var{str}, int @var{base}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n})
Convert @{@var{s1p}, @var{s1n}@} to a raw unsigned char array at @var{str} in
base @var{base}, and return the number of characters produced. There may be
leading zeros in the string. The string is not in ASCII; to convert it to
printable format, add the ASCII codes for @samp{0} or @samp{A}, depending on
The most significant limb of the input @{@var{s1p}, @var{s1n}@} must be
non-zero. The area @{@var{s1p}, @var{s1n}+1@} is clobbered.
The area at @var{str} has to have space for the largest possible number
represented by a @var{s1n} long limb array, plus one extra character.
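Turning the raw digit values into printable text is a small post-processing
step. The helper below is a sketch with the hypothetical name
@code{raw_digits_to_ascii}; it assumes an ASCII character set, digit values in
the range 0 to 35, and a buffer with room for one byte more than the digit
count.

```c
#include <stddef.h>

/* Convert len raw digit values (each 0..35) in place to the characters
   '0'..'9' and 'A'..'Z', then append a terminating null.  The buffer
   must hold at least len + 1 bytes.  */
void
raw_digits_to_ascii (unsigned char *buf, size_t len)
{
  for (size_t i = 0; i < len; i++)
    buf[i] = (unsigned char) (buf[i] < 10 ? '0' + buf[i]
                                          : 'A' + (buf[i] - 10));
  buf[len] = '\0';
}
```

It would be applied to the array filled in by @code{mpn_get_str}, using the
returned character count as @code{len}.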
@deftypefun mp_size_t mpn_set_str (mp_limb_t *@var{r1p}, const char *@var{str}, size_t @var{strsize}, int @var{base})
Convert the raw unsigned char array at @var{str} of length @var{strsize} to a
limb array. The base of @var{str} is @var{base}. @var{strsize} must be at
Return the number of limbs stored in @var{r1p}.
@deftypefun {unsigned long int} mpn_scan0 (const mp_limb_t *@var{s1p}, unsigned long int @var{bit})
Scan @var{s1p} from bit position @var{bit} for the next clear bit.
It is required that there be a clear bit within the area at @var{s1p} at or
beyond bit position @var{bit}, so that the function has something to return.
@deftypefun {unsigned long int} mpn_scan1 (const mp_limb_t *@var{s1p}, unsigned long int @var{bit})
Scan @var{s1p} from bit position @var{bit} for the next set bit.
It is required that there be a set bit within the area at @var{s1p} at or
beyond bit position @var{bit}, so that the function has something to return.
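The reason for that requirement is visible in a simple model: the scan
functions take no length, so nothing stops the search except finding a
matching bit. The sketch below assumes 32-bit limbs and uses the hypothetical
names @code{ref_scan1} and @code{limb_t}.

```c
#include <stdint.h>

typedef uint32_t limb_t;   /* assumption: 32-bit limbs, for illustration only */

/* Return the position of the first set bit at or after BIT.  The caller
   must guarantee such a bit exists; otherwise the loop reads past the
   end of the operand.  */
unsigned long
ref_scan1 (const limb_t *s1p, unsigned long bit)
{
  while (((s1p[bit / 32] >> (bit % 32)) & 1) == 0)
    bit++;
  return bit;
}
```

A model of @code{mpn_scan0} would be identical apart from testing for a clear
bit instead of a set one.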
@deftypefun void mpn_random (mp_limb_t *@var{r1p}, mp_size_t @var{r1n})
@deftypefunx void mpn_random2 (mp_limb_t *@var{r1p}, mp_size_t @var{r1n})
Generate a random number of length @var{r1n} and store it at @var{r1p}. The
most significant limb is always non-zero. @code{mpn_random} generates
uniformly distributed limb data, @code{mpn_random2} generates long strings of
zeros and ones in the binary representation.
@code{mpn_random2} is intended for testing the correctness of the @code{mpn}
@deftypefun {unsigned long int} mpn_popcount (const mp_limb_t *@var{s1p}, mp_size_t @var{n})
Count the number of set bits in @{@var{s1p}, @var{n}@}.
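A portable model of the bit count is short, and applying the same loop to the
XOR of two operands gives the Hamming distance that @code{mpn_hamdist}
computes. This is a sketch assuming 32-bit limbs; @code{ref_popcount},
@code{ref_hamdist} and @code{limb_t} are hypothetical names, and GMP may use
faster techniques.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* assumption: 32-bit limbs, for illustration only */

/* Count the set bits in {s1p, n}.  */
unsigned long
ref_popcount (const limb_t *s1p, size_t n)
{
  unsigned long count = 0;
  for (size_t i = 0; i < n; i++)
    for (limb_t x = s1p[i]; x != 0; x &= x - 1)   /* clear lowest set bit */
      count++;
  return count;
}

/* Count the bit positions in which {s1p, n} and {s2p, n} differ.  */
unsigned long
ref_hamdist (const limb_t *s1p, const limb_t *s2p, size_t n)
{
  unsigned long count = 0;
  for (size_t i = 0; i < n; i++)
    for (limb_t x = s1p[i] ^ s2p[i]; x != 0; x &= x - 1)
      count++;
  return count;
}
```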
@deftypefun {unsigned long int} mpn_hamdist (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Compute the hamming distance between @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}.
@deftypefun int mpn_perfect_square_p (const mp_limb_t *@var{s1p}, mp_size_t @var{n})
Return non-zero iff @{@var{s1p}, @var{n}@} is a perfect square.
@node Random Number Functions, Formatted Output, Low-level Functions, Top
@chapter Random Number Functions
@cindex Random number functions
Sequences of pseudo-random numbers in GMP are generated using a variable of
type @code{gmp_randstate_t}, which holds an algorithm selection and a current
state. Such a variable must be initialized by a call to one of the
@code{gmp_randinit} functions, and can be seeded with one of the
@code{gmp_randseed} functions.
The functions actually generating random numbers are described in @ref{Integer
Random Numbers}, and @ref{Miscellaneous Float Functions}.
The older style random number functions don't accept a @code{gmp_randstate_t}
parameter but instead share a global variable of that type. They use a
default algorithm and are currently not seeded (though perhaps that will
change in the future). The new functions accepting a @code{gmp_randstate_t}
are recommended for applications that care about randomness.
* Random State Initialization::
* Random State Seeding::
@node Random State Initialization, Random State Seeding, Random Number Functions, Random Number Functions
@section Random State Initialization
@cindex Random number state
@deftypefun void gmp_randinit_default (gmp_randstate_t @var{state})
Initialize @var{state} with a default algorithm. This will be a compromise
between speed and randomness, and is recommended for applications with no
special requirements.
@deftypefun void gmp_randinit_lc_2exp (gmp_randstate_t @var{state}, mpz_t @var{a}, @w{unsigned long @var{c}}, @w{unsigned long @var{m2exp}})
Initialize @var{state} with a linear congruential algorithm @m{X = (@var{a}X +
@var{c}) @bmod 2^{m2exp}, X = (@var{a}*X + @var{c}) mod 2^@var{m2exp}}.
The low bits of @ma{X} in this algorithm are not very random. The least
significant bit will have a period no more than 2, and the second bit no more
than 4, etc. For this reason only the high half of each @ma{X} is actually
When a random number of more than @ma{@var{m2exp}/2} bits is to be generated,
multiple iterations of the recurrence are used and the results concatenated.
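The recurrence and the use of only the high half can be modelled with fixed
width integers. This is a sketch for the case where @var{m2exp} is 32: the
multiplier and increment below are arbitrary illustrative constants (not
values from GMP's tables), the function names are hypothetical, and the order
in which successive halves are concatenated is an assumption of this example.

```c
#include <stdint.h>

/* Hypothetical parameters for m2exp = 32; both are odd.  */
#define LC_A  1664525u
#define LC_C  1013904223u

/* One step X = (a*X + c) mod 2^32, returning only the high 16 bits.  */
uint16_t
lc_next16 (uint32_t *x)
{
  /* Truncating to 32 bits performs the reduction mod 2^32.  */
  *x = (uint32_t) ((uint64_t) LC_A * *x + LC_C);
  return (uint16_t) (*x >> 16);
}

/* More than m2exp/2 bits wanted: iterate twice and concatenate.  */
uint32_t
lc_next32 (uint32_t *x)
{
  uint32_t lo = lc_next16 (x);
  uint32_t hi = lc_next16 (x);
  return (hi << 16) | lo;
}
```

With an odd multiplier and increment the least significant bit of the state
simply alternates, which illustrates why the low half is discarded.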
@deftypefun int gmp_randinit_lc_2exp_size (gmp_randstate_t @var{state}, unsigned long @var{size})
Initialize @var{state} for a linear congruential algorithm as per
@code{gmp_randinit_lc_2exp}. @var{a}, @var{c} and @var{m2exp} are selected
from a table, chosen so that @var{size} bits (or more) of each @ma{X} will be
used, i.e.@: @ma{@var{m2exp}/2 @ge{} @var{size}}.
If successful the return value is non-zero. If @var{size} is bigger than the
table data provides then the return value is zero. The maximum @var{size}
currently supported is 128.
@deftypefun void gmp_randinit (gmp_randstate_t @var{state}, @w{gmp_randalg_t @var{alg}}, ...)
@strong{This function is obsolete.}
Initialize @var{state} with an algorithm selected by @var{alg}. The only
choice is @code{GMP_RAND_ALG_LC}, which is @code{gmp_randinit_lc_2exp_size}.
A third parameter of type @code{unsigned long} is required; this is the
@var{size} for that function. @code{GMP_RAND_ALG_DEFAULT} or 0 are the same
as @code{GMP_RAND_ALG_LC}.
@code{gmp_randinit} sets bits in @code{gmp_errno} to indicate an error.
@code{GMP_ERROR_UNSUPPORTED_ARGUMENT} if @var{alg} is unsupported, or
@code{GMP_ERROR_INVALID_ARGUMENT} if the @var{size} parameter is too big.
@c Not yet in the library.
@deftypefun void gmp_randinit_lc (gmp_randstate_t @var{state}, mpz_t @var{a}, unsigned long int @var{c}, mpz_t @var{m})
Initialize @var{state} for a linear congruential scheme @m{X = (@var{a}X +
@var{c}) @bmod @var{m}, X = (@var{a}*X + @var{c}) mod @var{m}}.
@deftypefun void gmp_randclear (gmp_randstate_t @var{state})
Free all memory occupied by @var{state}.
@node Random State Seeding, , Random State Initialization, Random Number Functions
@section Random State Seeding
@cindex Random number seeding
@deftypefun void gmp_randseed (gmp_randstate_t @var{state}, mpz_t @var{seed})
@deftypefunx void gmp_randseed_ui (gmp_randstate_t @var{state}, @w{unsigned long int @var{seed}})
Set an initial seed value into @var{state}.
The size of a seed determines how many different sequences of random numbers
it's possible to generate. The ``quality'' of the seed is the randomness
of a given seed compared to the previous seed used, and this affects the
randomness of separate number sequences. The method for choosing a seed is
critical if the generated numbers are to be used for important applications,
such as generating cryptographic keys.
Traditionally the system time has been used to seed, but care needs to be
taken with this. If an application seeds often and the resolution of the
system clock is low, then the same sequence of numbers might be repeated.
Also, the system time is quite easy to guess, so if unpredictability is
required then it should definitely not be the only source for the seed value.
On some systems there's a special device @file{/dev/random} which provides
random data better suited for use as a seed.
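One way to obtain such a seed is to read a few bytes from the device with
ordinary standard I/O. The helper below is a sketch only: @code{read_seed} is
a hypothetical name, the device path is supplied by the caller because not
every system provides one, and on failure it is left to the application to
pick another source.

```c
#include <stdio.h>

/* Fill *seed with bytes read from the random device at PATH (for
   instance "/dev/random" where available).  Return 1 on success and
   0 on failure, in which case *seed is not meaningful.  */
int
read_seed (const char *path, unsigned long *seed)
{
  FILE *fp = fopen (path, "rb");
  if (fp == NULL)
    return 0;                            /* no such device on this system */
  size_t got = fread (seed, sizeof *seed, 1, fp);
  fclose (fp);
  return got == 1;
}
```

A value obtained this way could then be handed to @code{gmp_randseed_ui}.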
@node Formatted Output, Formatted Input, Random Number Functions, Top
@chapter Formatted Output
@cindex Formatted output
@cindex @code{printf} formatted output
* Formatted Output Strings::
* Formatted Output Functions::
* C++ Formatted Output::
@node Formatted Output Strings, Formatted Output Functions, Formatted Output, Formatted Output
@section Format Strings
@code{gmp_printf} and friends accept format strings similar to the standard C
@code{printf} (@pxref{Formatted Output,,,libc,The GNU C Library Reference
Manual}). A format specification is of the form
% [flags] [width] [.[precision]] [type] conv
GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t}
and @code{mpf_t} respectively. @samp{Z} and @samp{Q} behave like integers.
@samp{Q} will print a @samp{/} and a denominator, if needed. @samp{F} behaves
like a float. For example,
gmp_printf ("%s is an mpz %Zd\n", "here", z);
gmp_printf ("a hex rational: %#40Qx\n", q);
gmp_printf ("fixed point mpf %.*Ff with %d digits\n", n, f, n);
All the standard C @code{printf} types behave the same as the C library
@code{printf}, and can be freely intermixed with the GMP extensions. In the
current implementation the standard parts of the format string are simply
handed to @code{printf} and only the GMP extensions handled directly.
The flags accepted are as follows. GLIBC style @nisamp{'}
(@pxref{Locales,,Locales and Internationalization,libc,The GNU C Library
Reference Manual}) is only for the standard C types (not the GMP types), and
only if the C library supports it.
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{0} @tab pad with zeros (rather than spaces)
@item @nicode{#} @tab show the base with @samp{0x}, @samp{0X} or @samp{0}
@item @nicode{+} @tab always show a sign
@item (space) @tab show a space or a @samp{-} sign
@item @nicode{'} @tab group digits, GLIBC style (not GMP types)
The standard types accepted are as follows. @samp{h} and @samp{l} are
portable, the rest will depend on the compiler (or include files) for the type
and the C library for the output.
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{h} @tab @nicode{short}
@item @nicode{hh} @tab @nicode{char}
@item @nicode{j} @tab @nicode{intmax_t} or @nicode{uintmax_t}
@item @nicode{l} @tab @nicode{long} or @nicode{wchar_t}
@item @nicode{ll} @tab same as @nicode{L}
@item @nicode{L} @tab @nicode{long long} or @nicode{long double}
@item @nicode{q} @tab @nicode{quad_t} or @nicode{u_quad_t}
@item @nicode{t} @tab @nicode{ptrdiff_t}
@item @nicode{z} @tab @nicode{size_t}
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{F} @tab @nicode{mpf_t}, float conversions
@item @nicode{Q} @tab @nicode{mpq_t}, integer conversions
@item @nicode{Z} @tab @nicode{mpz_t}, integer conversions
The conversions accepted are as follows. @samp{a} and @samp{A} are always
supported for @code{mpf_t} but depend on the C library for standard C float
types. @samp{m} and @samp{p} depend on the C library.
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{a} @nicode{A} @tab hex floats, GLIBC style
@item @nicode{c} @tab character
@item @nicode{d} @tab decimal integer
@item @nicode{e} @nicode{E} @tab scientific format float
@item @nicode{f} @tab fixed point float
@item @nicode{i} @tab same as @nicode{d}
@item @nicode{g} @nicode{G} @tab fixed or scientific float
@item @nicode{m} @tab @code{strerror} string, GLIBC style
@item @nicode{n} @tab characters written so far
@item @nicode{o} @tab octal integer
@item @nicode{p} @tab pointer
@item @nicode{s} @tab string
@item @nicode{u} @tab unsigned integer
@item @nicode{x} @nicode{X} @tab hex integer
@samp{o}, @samp{x} and @samp{X} are unsigned for the standard C types, but for
@samp{Z} and @samp{Q} a sign is included. @samp{u} is not meaningful for
@code{Z} and @code{Q}.
@samp{n} can be used with any of the types, even the GMP types.
Other types or conversions that might be accepted by the C library
@code{printf} cannot be used through @code{gmp_printf}; this includes for
instance extensions registered with GLIBC @code{register_printf_function}.
Also currently there's no support for POSIX @samp{$} style numbered arguments
(perhaps this will be added in the future).
The precision field has its usual meaning for integer @samp{Z} and float
@samp{F} types, but is currently undefined for @samp{Q} and should not be used
@code{mpf_t} conversions only ever generate as many digits as can be
accurately represented by the operand, the same as @code{mpf_get_str} does.
Zeros will be used if necessary to pad to the requested precision. This
happens even for an @samp{f} conversion of an @code{mpf_t} which is an
integer, for instance @ma{2^@W{1024}} in an @code{mpf_t} of 128 bits precision
will only produce about 40 digits, then pad with zeros to the decimal point.
An empty precision field like @samp{%.Fe} or @samp{%.Ff} can be used to
specifically request all significant digits.
The decimal point character (or string) is taken from the current locale
settings on systems which provide @code{localeconv} (@pxref{Locales,,Locales
and Internationalization,libc,The GNU C Library Reference Manual}). The C
library will normally do the same for standard float output.
@node Formatted Output Functions, C++ Formatted Output, Formatted Output Strings, Formatted Output
Each of the following functions is similar to the corresponding C library
function. The basic @code{printf} forms take a variable argument list. The
@code{vprintf} forms take an argument pointer, see @ref{Variadic
Functions,,,libc,The GNU C Library Reference Manual}, or @samp{man 3
It should be emphasised that if a format string is invalid, or the arguments
don't match what the format specifies, then the behaviour of any of these
functions will be unpredictable. GCC format string checking is not available,
since it doesn't recognise the GMP extensions.
The file based functions @code{gmp_printf} and @code{gmp_fprintf} will return
@ma{-1} to indicate a write error. All the functions can return @ma{-1} if
the C library @code{printf} variant in use returns @ma{-1}, but this shouldn't
@deftypefun int gmp_printf (const char *@var{fmt}, ...)
@deftypefunx int gmp_vprintf (const char *@var{fmt}, va_list @var{ap})
Print to the standard output @code{stdout}. Return the number of characters
written, or @ma{-1} if an error occurred.
@deftypefun int gmp_fprintf (FILE *@var{fp}, const char *@var{fmt}, ...)
@deftypefunx int gmp_vfprintf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap})
Print to the stream @var{fp}. Return the number of characters written, or
@ma{-1} if an error occurred.
@deftypefun int gmp_sprintf (char *@var{buf}, const char *@var{fmt}, ...)
@deftypefunx int gmp_vsprintf (char *@var{buf}, const char *@var{fmt}, va_list @var{ap})
Form a null-terminated string in @var{buf}. Return the number of characters
written, excluding the terminating null.
No overlap is permitted between the space at @var{buf} and the string
These functions are not recommended, since there's no protection against
exceeding the space available at @var{buf}.
@deftypefun int gmp_snprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, ...)
@deftypefunx int gmp_vsnprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, va_list @var{ap})
Form a null-terminated string in @var{buf}. No more than @var{size} bytes
will be written. To get the full output, @var{size} must be enough for the
string and null-terminator.
The return value is the total number of characters which ought to have been
produced, excluding the terminating null. If @ma{@var{retval} >= @var{size}}
then the actual output has been truncated to the first @ma{@var{size}-1}
characters, and a null appended.
No overlap is permitted between the region @{@var{buf},@var{size}@} and the
Notice the return value is in ISO C99 @code{snprintf} style. This is so even
if the C library @code{vsnprintf} is the older GLIBC 2.0.x style.
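The ISO C99 return convention makes a two-pass ``measure, then format''
pattern possible. The sketch below demonstrates it with the standard
@code{vsnprintf} so that it compiles without GMP; @code{format_alloc} is a
hypothetical helper. Since @code{gmp_vsnprintf} is documented above as using
the same return style, the same pattern should carry over to it, though that
substitution is an assumption not exercised here.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Format into a freshly allocated buffer sized from the C99-style return
   value.  Returns NULL on error; the caller frees the result.  */
char *
format_alloc (const char *fmt, ...)
{
  va_list ap, ap2;
  va_start (ap, fmt);
  va_copy (ap2, ap);                          /* arguments are walked twice */
  int need = vsnprintf (NULL, 0, fmt, ap);    /* length, excluding the null */
  va_end (ap);

  char *buf = NULL;
  if (need >= 0 && (buf = malloc ((size_t) need + 1)) != NULL)
    vsnprintf (buf, (size_t) need + 1, fmt, ap2);
  va_end (ap2);
  return buf;
}
```

When only a fixed buffer is available, comparing the return value against the
buffer size, as described above, is how truncation is detected.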
@deftypefun int gmp_asprintf (char **@var{pp}, const char *@var{fmt}, ...)
@deftypefunx int gmp_vasprintf (char **@var{pp}, const char *@var{fmt}, va_list @var{ap})
Form a null-terminated string in a block of memory obtained from the current
memory allocation function (@pxref{Custom Allocation}). The block will be the
size of the string and null-terminator. Put the address of the block in
*@var{pp}. Return the number of characters produced, excluding the
Unlike the C library @code{asprintf}, @code{gmp_asprintf} doesn't return
@ma{-1} if there's no more memory available; instead it lets the current allocation
function handle that.
@deftypefun int gmp_obstack_printf (struct obstack *@var{ob}, const char *@var{fmt}, ...)
@deftypefunx int gmp_obstack_vprintf (struct obstack *@var{ob}, const char *@var{fmt}, va_list @var{ap})
Append to the current obstack object, in the same style as
@code{obstack_printf}. Return the number of characters written. A
null-terminator is not written.
@var{fmt} cannot be within the current obstack object, since the object might
These functions are available only when the C library provides the obstack
feature, which probably means only on GNU systems, see
@ref{Obstacks,,,libc,The GNU C Library Reference Manual}.
@node C++ Formatted Output, , Formatted Output Functions, Formatted Output
@section C++ Formatted Output
@cindex C++ @code{ostream} output
@cindex @code{ostream} output
The following functions are provided in @file{libgmpxx}, which is built if C++
4796
support is enabled (@pxref{Build Options}). Prototypes are available from
4799
@deftypefun ostream& operator<< (ostream& @var{stream}, mpz_t @var{op})
4800
Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
4801
@code{ios::width} is reset to 0 after output, the same as the standard
4802
@code{ostream operator<<} routines do.
4804
In hex or octal, @var{op} is printed as a signed number, the same as for
4805
decimal. This is unlike the standard @code{operator<<} routines on @code{int}
4806
etc., which instead give two's complement.
4809
@deftypefun ostream& operator<< (ostream& @var{stream}, mpq_t @var{op})
4810
Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
4811
@code{ios::width} is reset to 0 after output, the same as the standard
4812
@code{ostream operator<<} routines do.
4814
Output will be a fraction like @samp{5/9}, or if the denominator is 1 then
4815
just a plain integer like @samp{123}.
4817
In hex or octal, @var{op} is printed as a signed value, the same as for
4818
decimal. If @code{ios::showbase} is set then a base indicator is shown on
4819
both the numerator and denominator (if the denominator is required).
4822
@deftypefun ostream& operator<< (ostream& @var{stream}, mpf_t @var{op})
4823
Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
4824
@code{ios::width} is reset to 0 after output, the same as the standard
4825
@code{ostream operator<<} routines do. The decimal point follows the current
4826
locale, on systems providing @code{localeconv}.
4828
Hex and octal are supported, unlike the standard @code{operator<<} routines on
4829
@code{double} etc. The mantissa will be in hex or octal, the exponent will be
4830
in decimal. For hex the exponent delimiter is an @samp{@@}. This is as per
4831
@code{mpf_out_str}. @code{ios::showbase} is supported, and will put a base on
4835
These operators mean that GMP types can be printed in the usual C++ way, for
4842
cout << "iteration " << n << " value " << z << "\n";
4845
But note that @code{ostream} output (and @code{istream} input, @pxref{C++
4846
Formatted Input}) is the only overloading available and using for instance
4847
@code{+} with an @code{mpz_t} will have unpredictable results.
4850
@node Formatted Input, C++ Class Interface, Formatted Output, Top
4851
@chapter Formatted Input
4852
@cindex Formatted input
4853
@cindex @code{scanf} formatted input
4856
* Formatted Input Strings::
4857
* Formatted Input Functions::
4858
* C++ Formatted Input::
4862
@node Formatted Input Strings, Formatted Input Functions, Formatted Input, Formatted Input
4863
@section Formatted Input Strings
4865
@code{gmp_scanf} and friends accept format strings similar to the standard C
4866
@code{scanf} (@pxref{Formatted Input,,,libc,The GNU C Library Reference
4867
Manual}). A format specification is of the form
4870
% [flags] [width] [type] conv
4873
GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t}
4874
and @code{mpf_t} respectively. @samp{Z} and @samp{Q} behave like integers.
4875
@samp{Q} will read a @samp{/} and a denominator, if present. @samp{F} behaves
4878
GMP variables don't require an @code{&} when passed to @code{gmp_scanf}, since
4879
they're already ``call-by-reference''. For example,
4882
/* to read say "a(5) = 1234" */
4885
gmp_scanf ("a(%d) = %Zd\n", &n, z);
4888
gmp_sscanf ("0377 + 0x10/0x11", "%Qi + %Qi", q1, q2);
4890
/* to read say "topleft (1.55,-2.66)" */
4893
gmp_scanf ("%31s (%Ff,%Ff)", buf, x, y);
4896
All the standard C @code{scanf} types behave the same as in the C library
4897
@code{scanf}, and can be freely intermixed with the GMP extensions. In the
4898
current implementation the standard parts of the format string are simply
4899
handed to @code{scanf} and only the GMP extensions handled directly.
4901
The flags accepted are as follows. @samp{a} and @samp{'} will depend on
4902
support from the C library, and @samp{'} cannot be used with GMP types.
4905
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
4906
@item @nicode{*} @tab read but don't store
4907
@item @nicode{a} @tab allocate a buffer (string conversions)
4908
@item @nicode{'} @tab group digits, GLIBC style (not GMP types)
4912
The standard types accepted are as follows. @samp{h} and @samp{l} are
4913
portable, the rest will depend on the compiler (or include files) for the type
4914
and the C library for the input.
4917
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
4918
@item @nicode{h} @tab @nicode{short}
4919
@item @nicode{hh} @tab @nicode{char}
4920
@item @nicode{j} @tab @nicode{intmax_t} or @nicode{uintmax_t}
4921
@item @nicode{l} @tab @nicode{long} or @nicode{wchar_t}
4922
@item @nicode{ll} @tab same as @nicode{L}
4923
@item @nicode{L} @tab @nicode{long long} or @nicode{long double}
4924
@item @nicode{q} @tab @nicode{quad_t} or @nicode{u_quad_t}
4925
@item @nicode{t} @tab @nicode{ptrdiff_t}
4926
@item @nicode{z} @tab @nicode{size_t}
4934
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
4935
@item @nicode{F} @tab @nicode{mpf_t}, float conversions
4936
@item @nicode{Q} @tab @nicode{mpq_t}, integer conversions
4937
@item @nicode{Z} @tab @nicode{mpz_t}, integer conversions
4941
The conversions accepted are as follows. @samp{p} and @samp{[} will depend on
4942
support from the C library, the rest are standard.
4945
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
4946
@item @nicode{c} @tab character or characters
4947
@item @nicode{d} @tab decimal integer
4948
@item @nicode{e} @nicode{E} @nicode{f} @nicode{g} @nicode{G} @tab float
4950
@item @nicode{i} @tab integer with base indicator
4951
@item @nicode{n} @tab characters read so far
4952
@item @nicode{o} @tab octal integer
4953
@item @nicode{p} @tab pointer
4954
@item @nicode{s} @tab string of non-whitespace characters
4955
@item @nicode{u} @tab decimal integer
4956
@item @nicode{x} @nicode{X} @tab hex integer
4957
@item @nicode{[} @tab string of characters in a set
4961
@samp{e}, @samp{E}, @samp{f}, @samp{g} and @samp{G} are identical, they all
4962
read either fixed point or scientific format, and either @samp{e} or @samp{E}
4963
for the exponent in scientific format.
4965
@samp{x} and @samp{X} are identical, both accept both upper and lower case
4968
@samp{o}, @samp{u}, @samp{x} and @samp{X} all read positive or negative
4969
values. For the standard C types these are described as ``unsigned''
4970
conversions, but that merely affects certain overflow handling; negatives are
4971
still allowed (see @code{strtoul}, @ref{Parsing of Integers,,,libc,The GNU C
4972
Library Reference Manual}). For GMP types there are no overflows, and
4973
@samp{d} and @samp{u} are identical.
4975
@samp{Q} type reads the numerator and (optional) denominator as given. If the
4976
value might not be in canonical form then @code{mpq_canonicalize} must be
4977
called before using it in any calculations (@pxref{Rational Number
4980
@samp{Qi} will read a base specification separately for the numerator and
4981
denominator. For example @samp{0x10/11} would be 16/11, whereas
4982
@samp{0x10/0x11} would be 16/17.
4984
@samp{n} can be used with any of the types above, even the GMP types.
4985
@samp{*} to suppress assignment is allowed, though the field would then do
4988
Other conversions or types that might be accepted by the C library
4989
@code{scanf} cannot be used through @code{gmp_scanf}.
4991
Whitespace is read and discarded before a field, except for @samp{c} and
4992
@samp{[} conversions.
4994
For float conversions, the decimal point character (or string) expected is
4995
taken from the current locale settings on systems which provide
4996
@code{localeconv} (@pxref{Locales,,Locales and Internationalization,libc,The
4997
GNU C Library Reference Manual}). The C library will normally do the same for
4998
standard float input.
5001
@node Formatted Input Functions, C++ Formatted Input, Formatted Input Strings, Formatted Input
5002
@section Formatted Input Functions
5004
Each of the following functions is similar to the corresponding C library
5005
function. The plain @code{scanf} forms take a variable argument list. The
5006
@code{vscanf} forms take an argument pointer, see @ref{Variadic
5007
Functions,,,libc,The GNU C Library Reference Manual}, or @samp{man 3
5010
It should be emphasised that if a format string is invalid, or the arguments
5011
don't match what the format specifies, then the behaviour of any of these
5012
functions will be unpredictable. GCC format string checking is not available,
5013
since it doesn't recognise the GMP extensions.
5015
No overlap is permitted between the @var{fmt} string and any of the results
5018
@deftypefun int gmp_scanf (const char *@var{fmt}, ...)
5019
@deftypefunx int gmp_vscanf (const char *@var{fmt}, va_list @var{ap})
5020
Read from the standard input @code{stdin}.
5023
@deftypefun int gmp_fscanf (FILE *@var{fp}, const char *@var{fmt}, ...)
5024
@deftypefunx int gmp_vfscanf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap})
5025
Read from the stream @var{fp}.
5028
@deftypefun int gmp_sscanf (const char *@var{s}, const char *@var{fmt}, ...)
5029
@deftypefunx int gmp_vsscanf (const char *@var{s}, const char *@var{fmt}, va_list @var{ap})
5030
Read from a null-terminated string @var{s}.
5033
The return value from each of these functions is the same as the standard C99
5034
@code{scanf}, namely the number of fields successfully parsed and stored.
5035
@samp{%n} fields and fields read but suppressed by @samp{*} don't count
5036
towards the return value.
5038
If end of file or file error, or end of string, is reached when a match is
5039
required, and when no previous non-suppressed fields have matched, then the
5040
return value is @code{EOF} instead of 0. A match is required for a literal character
5041
in the format string or a field other than @samp{%n}. Whitespace in the
5042
format string is only an optional match and won't induce an EOF in this
5043
fashion. Leading whitespace read and discarded for a field doesn't count as a
5047
@node C++ Formatted Input, , Formatted Input Functions, Formatted Input
5048
@section C++ Formatted Input
5049
@cindex C++ @code{istream} input
5050
@cindex @code{istream} input
5052
The following functions are provided in @file{libgmpxx}, which is built only
5053
if C++ support is enabled (@pxref{Build Options}). Prototypes are available
5054
from @code{<gmp.h>}.
5056
@deftypefun istream& operator>> (istream& @var{stream}, mpz_t @var{rop})
5057
Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.
5060
@deftypefun istream& operator>> (istream& @var{stream}, mpq_t @var{rop})
5061
Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.
5063
An integer like @samp{123} will be read, or a fraction like @samp{5/9}. If
5064
the fraction is not in canonical form then @code{mpq_canonicalize} must be
5065
called (@pxref{Rational Number Functions}).
5068
@deftypefun istream& operator>> (istream& @var{stream}, mpf_t @var{rop})
5069
Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.
5071
Hex or octal floats are not supported, but might be in the future.
5074
These operators mean that GMP types can be read in the usual C++ way, for
5083
But note that @code{istream} input (and @code{ostream} output, @pxref{C++
5084
Formatted Output}) is the only overloading available and using for instance
5085
@code{+} with an @code{mpz_t} will have unpredictable results.
5088
@node C++ Class Interface, BSD Compatible Functions, Formatted Input, Top
5089
@chapter C++ Class Interface
5090
@cindex C++ Interface
5092
This chapter describes the C++ class based interface to GMP.
5094
All GMP C language types and functions can be used in C++ programs, since
5095
@file{gmp.h} has @code{extern "C"} qualifiers, but the class interface offers
5096
overloaded functions and operators which may be more convenient.
5098
Due to the implementation of this interface, a reasonably recent C++ compiler
5099
is required, one supporting namespaces, partial specialization of templates
5100
and member templates. For GCC this means version 2.91 or later.
5102
@strong{Everything described in this chapter is to be considered preliminary
5103
and might be subject to incompatible changes if some unforeseen difficulty
5107
* C++ Interface General::
5108
* C++ Interface Integers::
5109
* C++ Interface Rationals::
5110
* C++ Interface Floats::
5111
* C++ Interface MPFR::
5112
* C++ Interface Random Numbers::
5113
* C++ Interface Limitations::
5117
@node C++ Interface General, C++ Interface Integers, C++ Class Interface, C++ Class Interface
5118
@section C++ Interface General
5121
All the C++ classes and functions are available with
5128
The classes defined are
5130
@deftp Class mpz_class
5131
@deftpx Class mpq_class
5132
@deftpx Class mpf_class
5135
The standard operators and various standard functions are overloaded to allow
5136
arithmetic with these classes. For example,
5147
cout << "sum is " << c << "\n";
5148
cout << "absolute value is " << abs(c) << "\n";
5154
An important feature of the implementation is that an expression like
5155
@code{a=b+c} results in a single call to the corresponding @code{mpz_add},
5156
without using a temporary for the @code{b+c} part. Expressions which by their
5157
nature imply intermediate values, like @code{a=b*c+d*e}, still use temporaries
5160
The classes can be freely intermixed in expressions, as can the classes and
5161
the standard C++ types.
5163
Conversions back from the classes to standard C++ types aren't done
5164
automatically; instead, member functions like @code{get_si} are provided (see
5165
the following sections for details).
5167
Also there are no automatic conversions from the classes to the corresponding
5168
GMP C types, instead a reference to the underlying C object can be obtained
5169
with the following functions,
5171
@deftypefun mpz_t mpz_class::get_mpz_t ()
5172
@deftypefunx mpq_t mpq_class::get_mpq_t ()
5173
@deftypefunx mpf_t mpf_class::get_mpf_t ()
5176
These can be used to call a C function which doesn't have a C++ class
5177
interface. For example to set @code{a} to the GCD of @code{b} and @code{c},
5182
mpz_gcd (a.get_mpz_t(), b.get_mpz_t(), c.get_mpz_t());
5185
In the other direction, a class can be initialized from the corresponding GMP
5186
C type, or assigned to if an explicit constructor is used. In both cases this
5187
makes a copy of the value, it doesn't create any sort of association. For
5192
// ... init and calculate z ...
5198
There are no namespace setups in @file{gmpxx.h}, all types and functions are
5199
simply put into the global namespace. This is what @file{gmp.h} has done in
5200
the past, and continues to do for compatibility. The extras provided by
5201
@file{gmpxx.h} follow GMP naming conventions and are unlikely to clash with
5205
@node C++ Interface Integers, C++ Interface Rationals, C++ Interface General, C++ Class Interface
5206
@section C++ Interface Integers
5208
@deftypefun {} mpz_class::mpz_class (type @var{n})
5209
Construct an @code{mpz_class}. All the standard C++ types may be used, except
5210
@code{long long} and @code{long double}, and all the GMP C++ classes can be
5211
used. Any necessary conversion follows the corresponding C function, for
5212
example @code{double} follows @code{mpz_set_d} (@pxref{Assigning Integers}).
5215
@deftypefun {} mpz_class::mpz_class (mpz_t @var{z})
5216
Construct an @code{mpz_class} from an @code{mpz_t}. The value in @var{z} is
5217
copied into the new @code{mpz_class}, there won't be any permanent association
5218
between it and @var{z}.
5221
@deftypefun {} mpz_class::mpz_class (const char *@var{s})
@deftypefunx {} mpz_class::mpz_class (const char *@var{s}, int @var{base})
@deftypefunx {} mpz_class::mpz_class (const string& @var{s})
@deftypefunx {} mpz_class::mpz_class (const string& @var{s}, int @var{base})
5225
Construct an @code{mpz_class} converted from a string using
5226
@code{mpz_set_str} (@pxref{Assigning Integers}). If the @var{base} is not
5227
given then 0 is used.
5230
@deftypefun mpz_class operator/ (mpz_class @var{a}, mpz_class @var{d})
5231
@deftypefunx mpz_class operator% (mpz_class @var{a}, mpz_class @var{d})
5232
Divisions involving @code{mpz_class} round towards zero, as per the
5233
@code{mpz_tdiv_q} and @code{mpz_tdiv_r} functions (@pxref{Integer Division}).
5234
This corresponds to the rounding used for plain @code{int} calculations on
5237
The @code{mpz_fdiv...} or @code{mpz_cdiv...} functions can always be called
5238
directly if desired. For example,
5243
mpz_fdiv_q (q.get_mpz_t(), a.get_mpz_t(), d.get_mpz_t());
5247
@deftypefun mpz_class abs (mpz_class @var{op1})
5248
@deftypefunx int cmp (mpz_class @var{op1}, type @var{op2})
5249
@deftypefunx int cmp (type @var{op1}, mpz_class @var{op2})
5250
@deftypefunx double mpz_class::get_d (void)
5251
@deftypefunx long mpz_class::get_si (void)
5252
@deftypefunx {unsigned long} mpz_class::get_ui (void)
5254
@deftypefunx bool mpz_class::fits_sint_p (void)
5255
@deftypefunx bool mpz_class::fits_slong_p (void)
5256
@deftypefunx bool mpz_class::fits_sshort_p (void)
5258
@deftypefunx bool mpz_class::fits_uint_p (void)
5259
@deftypefunx bool mpz_class::fits_ulong_p (void)
5260
@deftypefunx bool mpz_class::fits_ushort_p (void)
5262
@deftypefunx int sgn (mpz_class @var{op})
5263
@deftypefunx mpz_class sqrt (mpz_class @var{op})
5264
These functions provide a C++ class interface to the corresponding GMP C
5267
@code{cmp} can be used with any of the classes or the standard C++ types,
5268
except @code{long long} and @code{long double}.
5272
Overloaded operators for combinations of @code{mpz_class} and @code{double}
5273
are provided for completeness, but it should be noted that if the given
5274
@code{double} is not an integer then the way any rounding is done is currently
5275
unspecified. The rounding might take place at the start, in the middle, or at
5276
the end of the operation, and it might change in the future.
5278
Conversions between @code{mpz_class} and @code{double}, however, are defined
5279
to follow the corresponding C functions @code{mpz_get_d} and @code{mpz_set_d}.
5280
And comparisons are always made exactly, as per @code{mpz_cmp_d}.
5283
@node C++ Interface Rationals, C++ Interface Floats, C++ Interface Integers, C++ Class Interface
5284
@section C++ Interface Rationals
5286
In all the following constructors, if a fraction is given then it should be in
5287
canonical form, or if not then @code{mpq_class::canonicalize} must be called.
5289
@deftypefun {} mpq_class::mpq_class (type @var{op})
@deftypefunx {} mpq_class::mpq_class (integer @var{num}, integer @var{den})
5291
Construct an @code{mpq_class}. The initial value can be a single value of any
5292
type, or a pair of integers (@code{mpz_class} or standard C++ integer types)
5293
representing a fraction, except that @code{long long} and @code{long double}
5294
are not supported. For example,
5303
@deftypefun {} mpq_class::mpq_class (mpq_t @var{q})
5304
Construct an @code{mpq_class} from an @code{mpq_t}. The value in @var{q} is
5305
copied into the new @code{mpq_class}, there won't be any permanent association
5306
between it and @var{q}.
5309
@deftypefun {} mpq_class::mpq_class (const char *@var{s})
@deftypefunx {} mpq_class::mpq_class (const char *@var{s}, int @var{base})
@deftypefunx {} mpq_class::mpq_class (const string& @var{s})
@deftypefunx {} mpq_class::mpq_class (const string& @var{s}, int @var{base})
5313
Construct an @code{mpq_class} converted from a string using
5314
@code{mpq_set_str} (@pxref{Initializing Rationals}). If the @var{base} is
5315
not given then 0 is used.
5318
@deftypefun void mpq_class::canonicalize ()
5319
Put an @code{mpq_class} into canonical form, as per @ref{Rational Number
5320
Functions}. All arithmetic operators require their operands in canonical
5321
form, and will return results in canonical form.
5324
@deftypefun mpq_class abs (mpq_class @var{op})
5325
@deftypefunx int cmp (mpq_class @var{op1}, type @var{op2})
5326
@deftypefunx int cmp (type @var{op1}, mpq_class @var{op2})
5328
@deftypefunx double mpq_class::get_d (void)
5329
@deftypefunx int sgn (mpq_class @var{op})
5330
These functions provide a C++ class interface to the corresponding GMP C
5333
@code{cmp} can be used with any of the classes or the standard C++ types,
5334
except @code{long long} and @code{long double}.
5337
@deftypefun {mpz_class&} mpq_class::get_num ()
5338
@deftypefunx {mpz_class&} mpq_class::get_den ()
5339
Get a reference to an @code{mpz_class} which is the numerator or denominator
5340
of an @code{mpq_class}. This can be used both for read and write access. If
5341
the object returned is modified, it modifies the original @code{mpq_class}.
5343
If direct manipulation might produce a non-canonical value, then
5344
@code{mpq_class::canonicalize} must be called before further operations.
5347
@deftypefun mpz_t mpq_class::get_num_mpz_t ()
5348
@deftypefunx mpz_t mpq_class::get_den_mpz_t ()
5349
Get a reference to the underlying @code{mpz_t} numerator or denominator of an
5350
@code{mpq_class}. This can be passed to C functions expecting an
5351
@code{mpz_t}. Any modifications made to the @code{mpz_t} will modify the
5352
original @code{mpq_class}.
5354
If direct manipulation might produce a non-canonical value, then
5355
@code{mpq_class::canonicalize} must be called before further operations.
5358
@deftypefun istream& operator>> (istream& @var{stream}, mpq_class& @var{rop})
5359
Read @var{rop} from @var{stream}, using its @code{ios} formatting settings,
5360
the same as @code{mpq_t operator>>} (@pxref{C++ Formatted Input}).
5362
If the @var{rop} read might not be in canonical form then
5363
@code{mpq_class::canonicalize} must be called.
5367
@node C++ Interface Floats, C++ Interface MPFR, C++ Interface Rationals, C++ Class Interface
5368
@section C++ Interface Floats
5370
When an expression requires the use of temporary intermediate @code{mpf_class}
5371
values, like @code{f=g*h+x*y}, those temporaries will have the same precision
5372
as the destination @code{f}. Explicit constructors can be used if this
5375
@deftypefun {} mpf_class::mpf_class (type @var{op})
5376
@deftypefunx {} mpf_class::mpf_class (type @var{op}, unsigned long @var{prec})
5377
Construct an @code{mpf_class}. Any standard C++ type can be used, except
5378
@code{long long} and @code{long double}, and any of the GMP C++ classes can be
5381
If @var{prec} is given, the initial precision is that value, in bits. If
5382
@var{prec} is not given, then the initial precision is determined by the type
5383
of @var{op} given. An @code{mpz_class}, @code{mpq_class}, string, or C++
5384
builtin type will give the default @code{mpf} precision (@pxref{Initializing
5385
Floats}). An @code{mpf_class} or expression will give the precision of that
5386
value. The precision of a binary expression is the higher of the two
5390
mpf_class f(1.5); // default precision
5391
mpf_class f(1.5, 500); // 500 bits (at least)
5392
mpf_class f(x); // precision of x
5393
mpf_class f(abs(x)); // precision of x
5394
mpf_class f(-g, 1000); // 1000 bits (at least)
5395
mpf_class f(x+y); // greater of precisions of x and y
5399
@deftypefun mpf_class abs (mpf_class @var{op})
5400
@deftypefunx mpf_class ceil (mpf_class @var{op})
5401
@deftypefunx int cmp (mpf_class @var{op1}, type @var{op2})
5402
@deftypefunx int cmp (type @var{op1}, mpf_class @var{op2})
5404
@deftypefunx mpf_class floor (mpf_class @var{op})
5405
@deftypefunx mpf_class hypot (mpf_class @var{op1}, mpf_class @var{op2})
5406
@deftypefunx double mpf_class::get_d (void)
5407
@deftypefunx long mpf_class::get_si (void)
5408
@deftypefunx {unsigned long} mpf_class::get_ui (void)
5410
@deftypefunx bool mpf_class::fits_sint_p (void)
5411
@deftypefunx bool mpf_class::fits_slong_p (void)
5412
@deftypefunx bool mpf_class::fits_sshort_p (void)
5414
@deftypefunx bool mpf_class::fits_uint_p (void)
5415
@deftypefunx bool mpf_class::fits_ulong_p (void)
5416
@deftypefunx bool mpf_class::fits_ushort_p (void)
5418
@deftypefunx int sgn (mpf_class @var{op})
5419
@deftypefunx mpf_class sqrt (mpf_class @var{op})
5420
@deftypefunx mpf_class trunc (mpf_class @var{op})
5421
These functions provide a C++ class interface to the corresponding GMP C
5424
@code{cmp} can be used with any of the classes or the standard C++ types,
5425
except @code{long long} and @code{long double}.
5427
The accuracy provided by @code{hypot} is not currently guaranteed.
5430
@deftypefun {unsigned long int} mpf_class::get_prec ()
5431
@deftypefunx void mpf_class::set_prec (unsigned long @var{prec})
5432
@deftypefunx void mpf_class::set_prec_raw (unsigned long @var{prec})
5433
Get or set the current precision of an @code{mpf_class}.
5435
The restrictions described for @code{mpf_set_prec_raw} (@pxref{Initializing
5436
Floats}) apply to @code{mpf_class::set_prec_raw}. Note in particular that the
5437
@code{mpf_class} must be restored to its allocated precision before being
5438
destroyed. This must be done by application code, there's no automatic
5443
@node C++ Interface MPFR, C++ Interface Random Numbers, C++ Interface Floats, C++ Class Interface
5444
@section C++ Interface MPFR
5446
The C++ class interface to MPFR is provided if MPFR is enabled (@pxref{Build
5447
Options}). This interface must be regarded as preliminary and possibly
5448
subject to incompatible changes in the future, since MPFR itself is
5449
preliminary. All definitions can be obtained with
5458
@deftp Class mpfr_class
5462
which behaves similarly to @code{mpf_class} (@pxref{C++ Interface Floats}).
5465
@node C++ Interface Random Numbers, C++ Interface Limitations, C++ Interface MPFR, C++ Class Interface
5466
@section C++ Interface Random Numbers
5468
@deftp Class gmp_randclass
5469
The C++ class interface to the GMP random number functions uses
5470
@code{gmp_randclass} to hold an algorithm selection and current state, as per
5471
@code{gmp_randstate_t}.
5474
@deftypefun {} gmp_randclass::gmp_randclass (void (*@var{randinit}) (gmp_randstate_t, ...), ...)
5475
Construct a @code{gmp_randclass}, using a call to the given @var{randinit}
5476
function (@pxref{Random State Initialization}). The arguments expected are
5477
the same as @var{randinit}, but with @code{mpz_class} instead of @code{mpz_t}.
5481
gmp_randclass r1 (gmp_randinit_default);
5482
gmp_randclass r2 (gmp_randinit_lc_2exp_size, 32);
5483
gmp_randclass r3 (gmp_randinit_lc_2exp, a, c, m2exp);
5486
@code{gmp_randinit_lc_2exp_size} can fail if the size requested is too big;
5487
the behaviour of @code{gmp_randclass::gmp_randclass} is undefined in this case
5488
(perhaps this will change in the future).
5491
@deftypefun {} gmp_randclass::gmp_randclass (gmp_randalg_t @var{alg}, ...)
5492
Construct a @code{gmp_randclass} using the same parameters as
5493
@code{gmp_randinit} (@pxref{Random State Initialization}). This function is
5494
obsolete and the above @var{randinit} style should be preferred.
5497
@deftypefun void gmp_randclass::seed (unsigned long int @var{s})
5498
@deftypefunx void gmp_randclass::seed (mpz_class @var{s})
5499
Seed a random number generator. @xref{Random Number Functions}, for how
5500
to choose a good seed.
5503
@deftypefun mpz_class gmp_randclass::get_z_bits (unsigned long @var{bits})
5504
@deftypefunx mpz_class gmp_randclass::get_z_bits (mpz_class @var{bits})
5505
Generate a random integer with a specified number of bits.
5508
@deftypefun mpz_class gmp_randclass::get_z_range (mpz_class @var{n})
5509
Generate a random integer in the range 0 to @ma{@var{n}-1} inclusive.
5512
@deftypefun mpf_class gmp_randclass::get_f ()
5513
@deftypefunx mpf_class gmp_randclass::get_f (unsigned long @var{prec})
5514
Generate a random float @var{f} in the range @ma{0 <= @var{f} < 1}. @var{f}
5515
will be to @var{prec} bits precision, or if @var{prec} is not given then to
5516
the precision of the destination. For example,
5521
mpf_class f (0, 512); // 512 bits precision
5522
f = r.get_f(); // random number, 512 bits
5528
@node C++ Interface Limitations, , C++ Interface Random Numbers, C++ Class Interface
5529
@section C++ Interface Limitations
5532
@item @code{mpq_class} and Templated Reading
5533
A generic piece of template code probably won't know that @code{mpq_class}
5534
requires a @code{canonicalize} call if inputs read with @code{operator>>}
5535
might be non-canonical. This can lead to incorrect results.
5537
@code{operator>>} behaves as it does for reasons of efficiency. A
5538
canonicalize can be quite time consuming on large operands, and is best
5539
avoided if it's not necessary.
5541
But this potential difficulty reduces the usefulness of @code{mpq_class}.
5542
Perhaps a mechanism to tell @code{operator>>} what to do will be adopted in
5543
the future, maybe a preprocessor define, a global flag, or an @code{ios} flag
5544
pressed into service. Or maybe, at the risk of inconsistency, the
5545
@code{mpq_class} @code{operator>>} could canonicalize and leave @code{mpq_t}
5546
@code{operator>>} not doing so, for use on those occasions when that's
5547
acceptable. Send feedback or alternate ideas to @email{bug-gmp@@gnu.org}.
5550
Subclassing the GMP C++ classes works, but is not currently recommended.
5552
Expressions involving subclasses resolve correctly (or seem to), but in normal
5553
C++ fashion the subclass doesn't inherit constructors and assignments.
5554
There are many of those in the GMP classes, and a good way to reestablish them
5555
in a subclass is not yet provided.
5557
@item Templated Expressions
5559
A subtle difficulty exists when using expressions together with
5560
application-defined template functions. Consider the following, with @code{T}
5561
intended to be some numeric type,
5565
T fun (const T &, const T &);
5569
When used with, say, plain @code{mpz_class} variables, it works fine: @code{T}
5570
is resolved as @code{mpz_class}.
5573
mpz_class f(1), g(2);
5578
But when one of the arguments is an expression, it doesn't work.
5581
mpz_class f(1), g(2), h(3);
5582
fun (f, g+h); // Bad
5585
This is because @code{g+h} ends up being a certain expression template type
5586
internal to @code{gmpxx.h}, which the C++ template resolution rules are unable
5587
to automatically convert to @code{mpz_class}. The workaround is simply to add
5591
mpz_class f(1), g(2), h(3);
5592
fun (f, mpz_class(g+h)); // Good
5595
Similarly, within @code{fun} it may be necessary to cast an expression to type
5596
@code{T} when calling a templated @code{fun2}.
5602
fun2 (f, f+g); // Bad
5608
fun2 (f, T(f+g)); // Good
5614
@node BSD Compatible Functions, Custom Allocation, C++ Class Interface, Top
5615
@comment node-name, next, previous, up
5616
@chapter Berkeley MP Compatible Functions
5617
@cindex Berkeley MP compatible functions
5618
@cindex BSD MP compatible functions
5620
These functions are intended to be fully compatible with the Berkeley MP
5621
library which is available on many BSD derived U*ix systems. The
5622
@samp{--enable-mpbsd} option must be used when building GNU MP to make these
5623
available (@pxref{Installing GMP}).
5625
The original Berkeley MP library has a usage restriction: you cannot use the
5626
same variable as both source and destination in a single function call. The
5627
compatible functions in GNU MP do not share this restriction---inputs and
5628
outputs may overlap.
5630
It is not recommended that new programs be written using these functions.
5631
Apart from the incomplete set of functions, the interface for initializing
5632
@code{MINT} objects is more error prone, and the @code{pow} function collides
5633
with @code{pow} in @file{libm.a}.
5636
Include the header @file{mp.h} to get the definitions of the necessary types
and functions. If you are on a BSD derived system, make sure to include GNU
@file{mp.h} if you are going to link the GNU @file{libmp.a} to your program.
This probably means giving the @samp{-I<dir>} option to the compiler, where
@samp{<dir>} is the directory containing GNU @file{mp.h}.
5642
@deftypefun {MINT *} itom (signed short int @var{initial_value})
5643
Allocate an integer consisting of a @code{MINT} object and dynamic limb space.
5644
Initialize the integer to @var{initial_value}. Return a pointer to the
5648
@deftypefun {MINT *} xtom (char *@var{initial_value})
5649
Allocate an integer consisting of a @code{MINT} object and dynamic limb space.
5650
Initialize the integer from @var{initial_value}, a hexadecimal,
5651
null-terminated C string. Return a pointer to the @code{MINT} object.
5654
@deftypefun void move (MINT *@var{src}, MINT *@var{dest})
5655
Set @var{dest} to @var{src} by copying. Both variables must be previously
5659
@deftypefun void madd (MINT *@var{src_1}, MINT *@var{src_2}, MINT *@var{destination})
5660
Add @var{src_1} and @var{src_2} and put the sum in @var{destination}.
5663
@deftypefun void msub (MINT *@var{src_1}, MINT *@var{src_2}, MINT *@var{destination})
5664
Subtract @var{src_2} from @var{src_1} and put the difference in
5668
@deftypefun void mult (MINT *@var{src_1}, MINT *@var{src_2}, MINT *@var{destination})
5669
Multiply @var{src_1} and @var{src_2} and put the product in @var{destination}.
5672
@deftypefun void mdiv (MINT *@var{dividend}, MINT *@var{divisor}, MINT *@var{quotient}, MINT *@var{remainder})
5673
@deftypefunx void sdiv (MINT *@var{dividend}, signed short int @var{divisor}, MINT *@var{quotient}, signed short int *@var{remainder})
5674
Set @var{quotient} to @var{dividend}/@var{divisor}, and @var{remainder} to
5675
@var{dividend} mod @var{divisor}. The quotient is rounded towards zero; the
5676
remainder has the same sign as the dividend unless it is zero.
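
The rounding rule can be illustrated with a short sketch. Python is used here only as a notation for arbitrary-precision integers, and @code{mdiv_like} is a hypothetical helper name, not part of any MP library. It shows the semantics described above, not the library's implementation.

```python
def mdiv_like(dividend, divisor):
    """Return (quotient, remainder) with the quotient rounded towards zero."""
    q = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        q = -q                          # signs differ, so the quotient is negative
    r = dividend - q * divisor          # remainder takes the sign of the dividend
    return q, r

print(mdiv_like(7, 2))     # (3, 1)
print(mdiv_like(-7, 2))    # (-3, -1)
print(mdiv_like(7, -2))    # (-3, 1)
```

In each case @code{dividend == quotient*divisor + remainder} holds.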
5678
Some implementations of these functions work differently---or not at all---for
5682
@deftypefun void msqrt (MINT *@var{op}, MINT *@var{root}, MINT *@var{remainder})
5683
Set @var{root} to @m{\lfloor\sqrt{@var{op}}\rfloor, the truncated integer part
5684
of the square root of @var{op}}, like @code{mpz_sqrt}. Set @var{remainder} to
5685
@m{(@var{op} - @var{root}^2), @var{op}@minus{}@var{root}*@var{root}}, i.e.
5686
zero if @var{op} is a perfect square.
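
As a small illustration of the relation between @var{op}, @var{root} and @var{remainder}, the following sketch uses Python's @code{math.isqrt}. The name @code{msqrt_like} is hypothetical and only mirrors the description above.

```python
import math

def msqrt_like(op):
    """Return (root, remainder) with root = floor(sqrt(op))."""
    root = math.isqrt(op)
    return root, op - root * root

print(msqrt_like(17))   # (4, 1)
print(msqrt_like(16))   # (4, 0), a perfect square
```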
5688
If @var{root} and @var{remainder} are the same variable, the results are
5692
@deftypefun void pow (MINT *@var{base}, MINT *@var{exp}, MINT *@var{mod}, MINT *@var{dest})
5693
Set @var{dest} to (@var{base} raised to @var{exp}) modulo @var{mod}.
5696
@deftypefun void rpow (MINT *@var{base}, signed short int @var{exp}, MINT *@var{dest})
5697
Set @var{dest} to @var{base} raised to @var{exp}.
5700
@deftypefun void gcd (MINT *@var{op1}, MINT *@var{op2}, MINT *@var{res})
5701
Set @var{res} to the greatest common divisor of @var{op1} and @var{op2}.
5704
@deftypefun int mcmp (MINT *@var{op1}, MINT *@var{op2})
5705
Compare @var{op1} and @var{op2}. Return a positive value if @var{op1} >
5706
@var{op2}, zero if @var{op1} = @var{op2}, and a negative value if @var{op1} <
5710
@deftypefun void min (MINT *@var{dest})
5711
Input a decimal string from @code{stdin}, and put the read integer in
5712
@var{dest}. SPC and TAB are allowed in the number string, and are ignored.
5715
@deftypefun void mout (MINT *@var{src})
5716
Output @var{src} to @code{stdout}, as a decimal string. Also output a newline.
5719
@deftypefun {char *} mtox (MINT *@var{op})
5720
Convert @var{op} to a hexadecimal string, and return a pointer to the string.
5721
The returned string is allocated using the current memory allocation function
(@pxref{Custom Allocation}), @code{malloc} by default.
5725
@deftypefun void mfree (MINT *@var{op})
5726
De-allocate the space used by @var{op}. @strong{This function should only be
passed a value returned by @code{itom} or @code{xtom}.}
5731
@node Custom Allocation, Language Bindings, BSD Compatible Functions, Top
5732
@comment node-name, next, previous, up
5733
@chapter Custom Allocation
5734
@cindex Custom allocation
5735
@cindex Memory allocation
5736
@cindex Allocation of memory
5738
By default GMP uses @code{malloc}, @code{realloc} and @code{free} for memory
5739
allocation, and if they fail GMP prints a message to the standard error output
5740
and terminates the program.
5742
Alternate functions can be specified to allocate memory in a different way or
5743
to have a different error action on running out of memory.
5745
This feature is available in the Berkeley compatibility library (@pxref{BSD
5746
Compatible Functions}) as well as the main GMP library.
5748
@deftypefun void mp_set_memory_functions (@* void *(*@var{alloc_func_ptr}) (size_t), @* void *(*@var{realloc_func_ptr}) (void *, size_t, size_t), @* void (*@var{free_func_ptr}) (void *, size_t))
5749
Replace the current allocation functions with those given in the arguments. If
an argument is @code{NULL}, the corresponding default function is used.
5752
These functions will be used for all memory allocation done by GMP, apart from
5753
temporary space from @code{alloca} if that function is available and GMP is
5754
configured to use it (@pxref{Build Options}).
5756
@strong{Be sure to call @code{mp_set_memory_functions} only when there are no
5757
active GMP objects allocated using the previous memory functions! Usually
5758
that means calling it before any other GMP function.}
5761
The functions supplied should fit the following declarations:
5763
@deftypefun {void *} allocate_function (size_t @var{alloc_size})
5764
Return a pointer to newly allocated space with at least @var{alloc_size}
5768
@deftypefun {void *} reallocate_function (void *@var{ptr}, size_t @var{old_size}, size_t @var{new_size})
5769
Resize a previously allocated block @var{ptr} of @var{old_size} bytes to be
5770
@var{new_size} bytes.
5772
The block may be moved if necessary or if desired, and in that case the
5773
smaller of @var{old_size} and @var{new_size} bytes must be copied to the new
5774
location. The return value is a pointer to the resized block, that being the
5775
new location if moved or just @var{ptr} if not.
5777
@var{ptr} is never @code{NULL}; it's always a previously allocated block.
5778
@var{new_size} may be bigger or smaller than @var{old_size}.
5781
@deftypefun void deallocate_function (void *@var{ptr}, size_t @var{size})
5782
De-allocate the space pointed to by @var{ptr}.
5784
@var{ptr} is never @code{NULL}; it's always a previously allocated block of
5788
A @dfn{byte} here means the unit used by the @code{sizeof} operator.
5790
The @var{old_size} parameter to @var{reallocate_function} and the @var{size}
parameter to @var{deallocate_function} are passed for convenience, but of
course can be ignored if not needed. The default functions using
@code{malloc} and friends, for instance, don't use them.
5795
No error return is allowed from any of these functions; if they return then
they must have performed the specified operation. In particular note that
5797
@var{allocate_function} or @var{reallocate_function} mustn't return
5800
Getting a different fatal error action is a good use for custom allocation
5801
functions, for example giving a graphical dialog rather than the default print
5802
to @code{stderr}. How much is possible when genuinely out of memory is
5803
another question though.
5805
There's currently no defined way for the allocation functions to recover from
an error such as out of memory; they must terminate program execution. A
@code{longjmp} or throwing a C++ exception will have undefined results. This
may change in the future.
5810
GMP may use allocated blocks to hold pointers to other allocated blocks. This
5811
will limit the assumptions a conservative garbage collection scheme can make.
5813
Since the default GMP allocation uses @code{malloc} and friends, those
5814
functions will be linked in even if the first thing a program does is an
5815
@code{mp_set_memory_functions}. It's necessary to change the GMP sources if
5819
@node Language Bindings, Algorithms, Custom Allocation, Top
5820
@chapter Language Bindings
5822
The following packages and projects offer access to GMP from languages other
5823
than C, though perhaps with varying levels of functionality and efficiency.
5825
@c GNUstep Base Library @uref{http://www.gnustep.org} (version 0.9.1) is
5826
@c intending to use GMP for its NSDecimal class, which would be an Objective
5827
@c C binding for GMP. Has some configure stuff ready, but no code.
5829
@c @spaceuref{U} is the same as @uref{U}, but with a couple of extra spaces
5830
@c in tex, just to separate the URL from the preceding text a bit.
5832
@macro spaceuref {U}
5837
@macro spaceuref {U}
5847
GMP C++ class interface, @pxref{C++ Class Interface} @* Straightforward
5848
interface, expression templates to eliminate temporaries.
5850
ALP @spaceuref{http://www.inria.fr/saga/logiciels/ALP} @* Linear algebra and
5851
polynomials using templates.
5853
CLN @spaceuref{http://clisp.cons.org/~haible/packages-cln.html} @* High level
5854
classes for arithmetic.
5856
LiDIA @spaceuref{http://www.informatik.tu-darmstadt.de/TI/LiDIA} @* A C++
5857
library for computational number theory.
5859
NTL @spaceuref{http://www.shoup.net/ntl} @* A C++ number theory library.
5865
Omni F77 @spaceuref{http://pdplab.trc.rwcp.or.jp/pdperf/Omni/home.html} @*
5866
Arbitrary precision floats.
5872
Glasgow Haskell Compiler @spaceuref{http://www.haskell.org/ghc}
5878
Kaffe @spaceuref{http://www.kaffe.org}
5880
Kissme @spaceuref{http://kissme.sourceforge.net}
5886
GNU Common Lisp @spaceuref{http://www.gnu.org/software/gcl/gcl.html} @* In the
5887
process of switching to GMP for bignums.
5889
Librep @spaceuref{http://librep.sourceforge.net}
5895
GNU m4 betas @spaceuref{http://www.seindal.dk/rene/gnu} @* Optionally provides
5896
an arbitrary precision @code{mpeval}.
5902
MLton compiler @spaceuref{http://www.sourcelight.com/MLton}
5908
Mozart @spaceuref{http://www.mozart-oz.org}
5914
GMP module, see @file{demos/perl} in the GMP sources.
5916
Math::GMP @spaceuref{http://www.cpan.org} @* Compatible with Math::BigInt, but
5917
not as many functions as the GMP module above.
5924
mpz module in the standard distribution, @uref{http://pike.idonex.com}
5931
SWI Prolog @spaceuref{http://www.swi.psy.uva.nl/projects/SWI-Prolog} @*
5932
Arbitrary precision floats.
5938
mpz module in the standard distribution, @uref{http://www.python.org}
5944
RScheme @spaceuref{http://www.rscheme.org}
5950
DrGenius @spaceuref{http://drgenius.seul.org} @* Geometry system and
5951
mathematical programming language.
5953
GiNaC @spaceuref{http://www.ginac.de} @* C++ computer algebra using CLN.
5955
Maxima @uref{http://www.ma.utexas.edu/users/wfs/maxima.html} @* Macsyma
5956
computer algebra using GCL.
5958
Q @spaceuref{http://www.musikwissenschaft.uni-mainz.de/~ag/q} @* Equational
5961
Yacas @spaceuref{http://www.xs4all.nl/~apinkus/yacas.html} @* Computer algebra
5968
@node Algorithms, Internals, Language Bindings, Top
5972
This chapter is an introduction to some of the algorithms used for various GMP
5973
operations. The code is likely to be hard to understand without knowing
5974
something about the algorithms.
5976
Some GMP internals are mentioned, but applications that expect to be
5977
compatible with future GMP releases should take care to use only the
5978
documented functions.
5981
* Multiplication Algorithms::
5982
* Division Algorithms::
5983
* Greatest Common Divisor Algorithms::
5984
* Powering Algorithms::
5985
* Root Extraction Algorithms::
5986
* Radix Conversion Algorithms::
5987
* Other Algorithms::
5988
* Assembler Coding::
5992
@node Multiplication Algorithms, Division Algorithms, Algorithms, Algorithms
5993
@section Multiplication
5994
@cindex Multiplication algorithms
5996
N@cross{}N limb multiplications and squares are done using one of four
5997
algorithms, as the size N increases.
6000
@multitable {KaratsubaMMM} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
6001
@item Algorithm @tab Threshold
6002
@item Basecase @tab (none)
6003
@item Karatsuba @tab @code{KARATSUBA_MUL_THRESHOLD}
6004
@item Toom-3 @tab @code{TOOM3_MUL_THRESHOLD}
6005
@item FFT @tab @code{FFT_MUL_THRESHOLD}
6009
Similarly for squaring, with the @code{SQR} thresholds. Note though that the
6010
FFT is only used if GMP is configured with @samp{--enable-fft}, @pxref{Build
6013
N@cross{}M multiplications of operands with different sizes above
6014
@code{KARATSUBA_MUL_THRESHOLD} are currently done by splitting into M@cross{}M
6015
pieces. The Karatsuba and Toom-3 routines then operate only on equal size
6016
operands. This is not very efficient, and is slated for improvement in the
6020
* Basecase Multiplication::
6021
* Karatsuba Multiplication::
6022
* Toom-Cook 3-Way Multiplication::
6023
* FFT Multiplication::
6024
* Other Multiplication::
6028
@node Basecase Multiplication, Karatsuba Multiplication, Multiplication Algorithms, Multiplication Algorithms
6029
@subsection Basecase Multiplication
6031
Basecase N@cross{}M multiplication is a straightforward rectangular set of
6032
cross-products, the same as long multiplication done by hand and for that
6033
reason sometimes known as the schoolbook or grammar school method. This is an
6034
@m{O(NM),O(N*M)} algorithm. See Knuth section 4.3.1 algorithm M
6035
(@pxref{References}), and the @file{mpn/generic/mul_basecase.c} code.
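
The rectangular set of cross-products can be sketched as follows. This is an illustration in Python on little-endian lists of limbs, not the code in @file{mpn/generic/mul_basecase.c}. The 32-bit limb size and the helper names @code{to_limbs}, @code{from_limbs} and @code{mul_basecase} are assumptions made for the example.

```python
LIMB_BITS = 32                          # assumed limb size, for illustration only
LIMB_MASK = (1 << LIMB_BITS) - 1

def to_limbs(n, count):
    """Split a non-negative integer into `count` little-endian limbs."""
    return [(n >> (LIMB_BITS * i)) & LIMB_MASK for i in range(count)]

def from_limbs(limbs):
    """Combine little-endian limbs back into an integer."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))

def mul_basecase(u, v):
    """Schoolbook product of limb lists u (N limbs) and v (M limbs)."""
    w = [0] * (len(u) + len(v))
    for j, vj in enumerate(v):          # one row of cross-products per limb of v
        carry = 0
        for i, ui in enumerate(u):
            t = ui * vj + w[i + j] + carry
            w[i + j] = t & LIMB_MASK
            carry = t >> LIMB_BITS
        w[j + len(u)] = carry           # top limb of this row
    return w
```

The two nested loops perform exactly N*M limb products, which is where the @m{O(NM),O(N*M)} cost comes from.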
6037
Assembler implementations of @code{mpn_mul_basecase} are essentially the same
6038
as the generic C code, but have all the usual assembler tricks and
6039
obscurities introduced for speed.
6041
A square can be done in roughly half the time of a multiply, by using the fact
6042
that the cross products above and below the diagonal are the same. A triangle
6043
of products below the diagonal is formed, doubled (left shift by one bit), and
6044
then the products on the diagonal added. This can be seen in
6045
@file{mpn/generic/sqr_basecase.c}. Again the assembler implementations take
6046
essentially the same approach.
6049
@example
@group
     u0  u1  u2  u3  u4
   +---+---+---+---+---+
u0 | d |   |   |   |   |
   +---+---+---+---+---+
u1 |   | d |   |   |   |
   +---+---+---+---+---+
u2 |   |   | d |   |   |
   +---+---+---+---+---+
u3 |   |   |   | d |   |
   +---+---+---+---+---+
u4 |   |   |   |   | d |
   +---+---+---+---+---+
@end group
@end example
6107
In practice squaring isn't a full 2@cross{} faster than multiplying; it's
usually around 1.5@cross{}. Less than 1.5@cross{} probably indicates that
@code{mpn_sqr_basecase} needs improving on that CPU.
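
The triangle-plus-diagonal scheme can be sketched as below. This is a Python illustration only, accumulating into a single big integer rather than working limb by limb as @file{mpn/generic/sqr_basecase.c} does. The 32-bit limb size and the names @code{to_limbs} and @code{sqr_basecase} are assumptions for the example.

```python
LIMB_BITS = 32                          # assumed limb size, for illustration only

def to_limbs(n, count):
    """Split a non-negative integer into `count` little-endian limbs."""
    mask = (1 << LIMB_BITS) - 1
    return [(n >> (LIMB_BITS * i)) & mask for i in range(count)]

def sqr_basecase(u):
    """Square the limb list u; return (value, number of limb products used)."""
    n = len(u)
    products = 0
    triangle = 0
    for i in range(n):                  # cross products on one side of the diagonal
        for j in range(i + 1, n):
            triangle += (u[i] * u[j]) << (LIMB_BITS * (i + j))
            products += 1
    diagonal = 0
    for i in range(n):                  # the products on the diagonal
        diagonal += (u[i] * u[i]) << (LIMB_BITS * 2 * i)
        products += 1
    # Double the triangle (left shift by one bit), then add the diagonal.
    return (triangle << 1) + diagonal, products
```

For N limbs this uses N*(N+1)/2 limb products instead of N*N, for example 15 rather than 25 when N is 5. The doubling and the diagonal additions are extra linear work, which is consistent with the observed speedup being below a full 2@cross{}.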
6111
On some CPUs @code{mpn_mul_basecase} can be faster than the generic C
6112
@code{mpn_sqr_basecase}. @code{BASECASE_SQR_THRESHOLD} is the size at which
6113
to use @code{mpn_sqr_basecase}, this will be zero if that routine should be
6117
@node Karatsuba Multiplication, Toom-Cook 3-Way Multiplication, Basecase Multiplication, Multiplication Algorithms
6118
@subsection Karatsuba Multiplication
6120
The Karatsuba multiplication algorithm is described in Knuth section 4.3.3
6121
part A, and various other textbooks. A brief description is given here.
6123
The inputs @ma{x} and @ma{y} are treated as each split into two parts of equal
6124
length (or the most significant part one limb shorter if N is odd).
6127
@example
@group
 high              low
+----------+----------+
|    x1    |    x0    |
+----------+----------+

+----------+----------+
|    y1    |    y0    |
+----------+----------+
@end group
@end example
6168
Let @ma{b} be the power of 2 where the split occurs, i.e.@: if @ms{x,0} is
6169
@ma{k} limbs (@ms{y,0} the same) then
6170
@m{b=2\GMPraise{$k*$@code{mp\_bits\_per\_limb}}, b=2^(k*mp_bits_per_limb)}.
6171
With that @m{x=x_1b+x_0,x=x1*b+x0} and @m{y=y_1b+y_0,y=y1*b+y0}, and the
6175
@m{xy = (b^2+b)x_1y_1 - b(x_1-x_0)(y_1-y_0) + (b+1)x_0y_0,
6176
x*y = (b^2+b)*x1*y1 - b*(x1-x0)*(y1-y0) + (b+1)*x0*y0}
6179
This formula means doing only three multiplies of (N/2)@cross{}(N/2) limbs,
6180
whereas a basecase multiply of N@cross{}N limbs is equivalent to four
6181
multiplies of (N/2)@cross{}(N/2). The factors @ma{(b^2+b)} etc represent the
6182
positions where the three products must be added.
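
The formula can be checked with a short recursive sketch. Python integers stand in for limb vectors, the split is made on a bit boundary rather than a limb boundary, and the built-in multiply stands in for the basecase below a threshold. The name @code{karatsuba} and the @code{threshold_bits} parameter are hypothetical, chosen for this example.

```python
def karatsuba(x, y, threshold_bits=64):
    """Multiply non-negative integers using the subtractive Karatsuba formula."""
    if x.bit_length() <= threshold_bits or y.bit_length() <= threshold_bits:
        return x * y                    # stands in for the basecase multiply
    k = max(x.bit_length(), y.bit_length()) // 2    # split point, b = 2**k
    mask = (1 << k) - 1
    x1, x0 = x >> k, x & mask
    y1, y0 = y >> k, y & mask
    high = karatsuba(x1, y1, threshold_bits)        # x1*y1
    low = karatsuba(x0, y0, threshold_bits)         # x0*y0
    # (x1-x0)*(y1-y0) is formed as an absolute value with a separate sign.
    dx, dy = x1 - x0, y1 - y0
    mid = karatsuba(abs(dx), abs(dy), threshold_bits)
    if (dx < 0) != (dy < 0):
        mid = -mid
    # x*y = (b^2+b)*x1*y1 - b*(x1-x0)*(y1-y0) + (b+1)*x0*y0
    return (high << (2 * k)) + ((high - mid + low) << k) + low
```

Each level makes three recursive calls on operands of about half the size, matching the count of three (N/2)@cross{}(N/2) multiplies given above.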
6185
@example
@group
 high                              low
+--------+--------+ +--------+--------+
|      x1*y1      | |      x0*y0      |
+--------+--------+ +--------+--------+
          +--------+--------+
      add |      x1*y1      |
          +--------+--------+
          +--------+--------+
      add |      x0*y0      |
          +--------+--------+
          +--------+--------+
      sub | (x1-x0)*(y1-y0) |
          +--------+--------+
@end group
@end example
6241
The term @m{(x_1-x_0)(y_1-y_0),(x1-x0)*(y1-y0)} is best calculated as an
6242
absolute value, and the sign used to choose to add or subtract. Notice the
6243
sum @m{\mathop{\rm high}(x_0y_0)+\mathop{\rm low}(x_1y_1),
6244
high(x0*y0)+low(x1*y1)} occurs twice, so it's possible to do @m{5k,5*k} limb
6245
additions, rather than @m{6k,6*k}, but in GMP extra function call overheads
6246
outweigh the saving.
6248
Squaring is similar to multiplying, but with @ma{x=y} the formula reduces to
6249
an equivalent with three squares,
6252
@m{x^2 = (b^2+b)x_1^2 - b(x_1-x_0)^2 + (b+1)x_0^2,
6253
x^2 = (b^2+b)*x1^2 - b*(x1-x0)^2 + (b+1)*x0^2}
6256
The final result is accumulated from those three squares the same way as for
6257
the three multiplies above. The middle term @m{(x_1-x_0)^2,(x1-x0)^2} is now
6260
A similar formula for both multiplying and squaring can be constructed with a
6261
middle term @m{(x_1+x_0)(y_1+y_0),(x1+x0)*(y1+y0)}. But those sums can exceed
6262
@ma{k} limbs, leading to more carry handling and additions than the form
6265
Karatsuba multiplication is asymptotically an @ma{O(N^@W{1.585})} algorithm,
6266
the exponent being @m{\log3/\log2,log(3)/log(2)}, representing 3 multiplies
6267
each 1/2 the size of the inputs. This is a big improvement over the basecase
6268
multiply at @ma{O(N^2)} and the advantage soon overcomes the extra additions
6271
@code{KARATSUBA_MUL_THRESHOLD} can be as little as 10 limbs. The @code{SQR}
6272
threshold is usually about twice the @code{MUL}. The basecase algorithm will
6273
take a time of the form @m{M(N) = aN^2 + bN + c, M(N) = a*N^2 + b*N + c} and
6274
the Karatsuba algorithm @m{K(N) = 3M(N/2) + dN + e, K(N) = 3*M(N/2) + d*N +
6275
e}. Clearly per-crossproduct speedups in the basecase code reduce @ma{a} and
6276
decrease the threshold, but linear style speedups reducing @ma{b} will
6277
actually increase the threshold. The latter can be seen for instance when
6278
adding an optimized @code{mpn_sqr_diagonal} to @code{mpn_sqr_basecase}. Of
6279
course all speedups reduce total time, and in that sense the algorithm
6280
thresholds are merely of academic interest.
6283
@node Toom-Cook 3-Way Multiplication, FFT Multiplication, Karatsuba Multiplication, Multiplication Algorithms
6284
@subsection Toom-Cook 3-Way Multiplication
6286
The Karatsuba formula is the simplest case of a general approach to splitting
6287
inputs that leads to both Toom-Cook and FFT algorithms. A description of
6288
Toom-Cook can be found in Knuth section 4.3.3, with an example 3-way
6289
calculation after Theorem A. The 3-way form used in GMP is described here.
6291
The operands are each considered split into 3 pieces of equal length (or the
6292
most significant part 1 or 2 limbs shorter than the others).
6295
@example
@group
 high                         low
+----------+----------+----------+
|    x2    |    x1    |    x0    |
+----------+----------+----------+

+----------+----------+----------+
|    y2    |    y1    |    y0    |
+----------+----------+----------+
@end group
@end example
6338
These parts are treated as the coefficients of two polynomials
6342
@m{X(t) = x_2t^2 + x_1t + x_0,
6343
X(t) = x2*t^2 + x1*t + x0}
6344
@m{Y(t) = y_2t^2 + y_1t + y_0,
6345
Y(t) = y2*t^2 + y1*t + y0}
6349
Again let @ma{b} equal the power of 2 which is the size of the @ms{x,0},
6350
@ms{x,1}, @ms{y,0} and @ms{y,1} pieces, i.e.@: if they're @ma{k} limbs each
6351
then @m{b=2\GMPraise{$k*$@code{mp\_bits\_per\_limb}},
6352
b=2^(k*mp_bits_per_limb)}. With this @ma{x=X(b)} and @ma{y=Y(b)}.
6354
Let a polynomial @m{W(t)=X(t)Y(t),W(t)=X(t)*Y(t)} and suppose its coefficients
6358
@m{W(t) = w_4t^4 + w_3t^3 + w_2t^2 + w_1t + w_0,
6359
W(t) = w4*t^4 + w3*t^3 + w2*t^2 + w1*t + w0}
6363
The @m{w_i,w[i]} are going to be determined, and when they are they'll give
6364
the final result using @ma{w=W(b)}, since @m{xy=X(b)Y(b),x*y=X(b)*Y(b)=W(b)}.
6365
The coefficients will be roughly @ma{b^2} each, and the final @ma{W(b)} will
6366
be an addition like,
6370
\moveright #1\GMPboxwidth
6371
\vbox to \GMPboxheight{%
6375
\hbox to 2\GMPboxwidth {\hfil\hbox{$#2$}\hfil}%
6381
\hbox to 6\GMPboxwidth {high \hfil low}
6417
The @m{w_i,w[i]} coefficients could be formed by a simple set of cross
6418
products, like @m{w_4=x_2y_2,w4=x2*y2}, @m{w_3=x_2y_1+x_1y_2,w3=x2*y1+x1*y2},
6419
@m{w_2=x_2y_0+x_1y_1+x_0y_2,w2=x2*y0+x1*y1+x0*y2} etc, but this would need all
6420
nine @m{x_iy_j,x[i]*y[j]} for @ma{i,j=0,1,2}, and would be equivalent merely
6421
to a basecase multiply. Instead the following approach is used.
6423
@ma{X(t)} and @ma{Y(t)} are evaluated and multiplied at 5 points, giving
6424
values of @ma{W(t)} at those points. The points used can be chosen in
6425
various ways, but in GMP the following are used
6428
@multitable {@m{t=\infty,t=inf}M} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
6429
@item Point @tab Value
6430
@item @ma{t=0} @tab @m{x_0y_0,x0*y0}, which gives @ms{w,0} immediately
6431
@item @ma{t=2} @tab @m{(4x_2+2x_1+x_0)(4y_2+2y_1+y_0),(4*x2+2*x1+x0)*(4*y2+2*y1+y0)}
6432
@item @ma{t=1} @tab @m{(x_2+x_1+x_0)(y_2+y_1+y_0),(x2+x1+x0)*(y2+y1+y0)}
6433
@item @m{t={1\over2},t=1/2} @tab @m{(x_2+2x_1+4x_0)(y_2+2y_1+4y_0),(x2+2*x1+4*x0)*(y2+2*y1+4*y0)}
6434
@item @m{t=\infty,t=inf} @tab @m{x_2y_2,x2*y2}, which gives @ms{w,4} immediately
6438
At @m{t={1\over2},t=1/2} the value calculated is actually
6439
@m{16X({1\over2})Y({1\over2}), 16*X(1/2)*Y(1/2)}, giving a value for
6440
@m{16W({1\over2}),16*W(1/2)}, and this is always an integer. At
6441
@m{t=\infty,t=inf} the value is actually @m{\lim_{t\to\infty} {X(t)Y(t)\over
6442
t^4}, X(t)*Y(t)/t^4 in the limit as t approaches infinity}, but it's much
6443
easier to think of as simply @m{x_2y_2,x2*y2} giving @ms{w,4} immediately
6444
(much like @m{x_0y_0,x0*y0} at @ma{t=0} gives @ms{w,0} immediately).
6446
Now each of the points substituted into
6447
@m{W(t)=w_4t^4+\cdots+w_0,W(t)=w4*t^4+@dots{}+w0} gives a linear combination
6448
of the @m{w_i,w[i]} coefficients, and the value of those combinations has just
6454
W(0) & = & & & & & & & & & w_0 \cr
6455
16W({1\over2}) & = & w_4 & + & 2w_3 & + & 4w_2 & + & 8w_1 & + & 16w_0 \cr
6456
W(1) & = & w_4 & + & w_3 & + & w_2 & + & w_1 & + & w_0 \cr
6457
W(2) & = & 16w_4 & + & 8w_3 & + & 4w_2 & + & 2w_1 & + & w_0 \cr
6458
W(\infty) & = & w_4 \cr
6465
16*W(1/2) = w4 + 2*w3 + 4*w2 + 8*w1 + 16*w0
6466
W(1) = w4 + w3 + w2 + w1 + w0
6467
W(2) = 16*w4 + 8*w3 + 4*w2 + 2*w1 + w0
6473
This is a set of five equations in five unknowns, and some elementary linear
6474
algebra quickly isolates each @m{w_i,w[i]}, by subtracting multiples of one
6475
equation from another.
6477
In the code the set of five values @ma{W(0)},@dots{},@m{W(\infty),W(inf)} will
6478
represent those certain linear combinations. By adding or subtracting one
6479
from another as necessary, values which are each @m{w_i,w[i]} alone are
6480
arrived at. This involves only a few subtractions of small multiples (some of
6481
which are powers of 2), and so is fast. A couple of divisions remain by
6482
powers of 2 and one division by 3 (or by 6 rather), and that last uses the
6483
special @code{mpn_divexact_by3} (@pxref{Exact Division}).
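
The evaluation and interpolation steps can be sketched as follows. This is one level of Toom-3 in Python, with the built-in multiply standing in for the five recursive multiplies and the split made on bit rather than limb boundaries. The elimination order is a plain solution of the five equations chosen for readability. It is not claimed to be the sequence used by @code{interpolate3}, and the name @code{toom3} is hypothetical.

```python
def toom3(x, y):
    """One level of Toom-3 on non-negative integers, points 0, 1/2, 1, 2, inf."""
    bits = max(x.bit_length(), y.bit_length())
    k = max(1, (bits + 2) // 3)         # piece size in bits, so b = 2**k
    mask = (1 << k) - 1
    x0, x1, x2 = x & mask, (x >> k) & mask, x >> (2 * k)
    y0, y1, y2 = y & mask, (y >> k) & mask, y >> (2 * k)

    # Evaluate X(t) and Y(t) at the five points and multiply pointwise.
    w0 = x0 * y0                                                # W(0)
    w4 = x2 * y2                                                # W(inf)
    r_half = (x2 + 2 * x1 + 4 * x0) * (y2 + 2 * y1 + 4 * y0)    # 16*W(1/2)
    r_one = (x2 + x1 + x0) * (y2 + y1 + y0)                     # W(1)
    r_two = (4 * x2 + 2 * x1 + x0) * (4 * y2 + 2 * y1 + y0)     # W(2)

    # Interpolate: remove the known w0 and w4, then isolate w2, w3, w1.
    a = r_one - w0 - w4                 # w3 + w2 + w1
    c = r_two - w0 - 16 * w4            # 8*w3 + 4*w2 + 2*w1
    d = r_half - 16 * w0 - w4           # 2*w3 + 4*w2 + 8*w1
    w2 = (10 * a - c - d) // 2          # exact division by 2
    diff = (c - d) // 6                 # w3 - w1, the one exact division by 6
    total = a - w2                      # w3 + w1
    w3 = (total + diff) // 2
    w1 = (total - diff) // 2

    # W(b) = w4*b^4 + w3*b^3 + w2*b^2 + w1*b + w0
    return (w4 << (4 * k)) + (w3 << (3 * k)) + (w2 << (2 * k)) + (w1 << k) + w0
```

All the divisions here are exact, so integer division loses nothing. Only five products of roughly one-third-size operands are formed, against the nine cross products of the direct approach.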
6485
In the code the values @ms{w,4}, @ms{w,2} and @ms{w,0} are formed in the
6486
destination with pointers @code{E}, @code{C} and @code{A}, and @ms{w,3} and
6487
@ms{w,1} in temporary space @code{D} and @code{B} are added to them. There
6488
are extra limbs @code{tD}, @code{tC} and @code{tB} at the high end of
6489
@ms{w,3}, @ms{w,2} and @ms{w,1} which are handled separately. The final
6490
addition then is as follows.
6494
\vbox to \GMPboxheight{%
6496
\hbox {\strut \vrule{} #1 \vrule}%
6500
\advance\baselineskip by 1ex
6502
\hbox to 6\GMPboxwidth {high \hfil low}
6503
\vbox to \GMPboxheight{%
6507
\hbox to 2\GMPboxwidth {\hfil@code{E}\hfil}
6509
\hbox to 2\GMPboxwidth {\hfil@code{C}\hfil}
6511
\hbox to 2\GMPboxwidth {\hfil@code{A}\hfil}
6515
\moveright \GMPboxwidth
6516
\vbox to \GMPboxheight{%
6520
\hbox to 2\GMPboxwidth {\hfil@code{D}\hfil}
6522
\hbox to 2\GMPboxwidth {\hfil@code{B}\hfil}
6527
\hbox to \GMPboxwidth{\hfil \GMPboxT{\code{tD}}}%
6528
\hbox to \GMPboxwidth{\hfil \GMPboxT{\code{tC}}}%
6529
\hbox to \GMPboxwidth{\hfil \GMPboxT{\code{tB}}}}
6536
+-------+-------+-------+-------+-------+-------+
6538
+-------+-------+-------+-------+-------+-------+
6539
+------+-------++------+-------+
6541
+------+-------++------+-------+
6549
The conversion of @ma{W(t)} values to the coefficients is interpolation. A
6550
polynomial of degree 4 like @ma{W(t)} is uniquely determined by values known
6551
at 5 different points. The points can be chosen to make the linear equations
6552
come out with a convenient set of steps for isolating the @m{w_i,w[i]}.
6554
In @file{mpn/generic/mul_n.c} the @code{interpolate3} routine performs the
6555
interpolation. The open-coded one-pass version may be a bit hard to
understand; the steps performed can be better seen in the @code{USE_MORE_MPN}
6559
Squaring follows the same procedure as multiplication, but there's only one
6560
@ma{X(t)} and it's evaluated at 5 points, and those values squared to give
6561
values of @ma{W(t)}. The interpolation is then identical, and in fact the
6562
same @code{interpolate3} subroutine is used for both squaring and multiplying.
6564
Toom-3 is asymptotically @ma{O(N^@W{1.465})}, the exponent being
6565
@m{\log5/\log3,log(5)/log(3)}, representing 5 recursive multiplies of 1/3 the
6566
original size. This is an improvement over Karatsuba at @ma{O(N^@W{1.585})},
6567
though Toom-Cook does more work in the evaluation and interpolation and so it
6568
only realizes its advantage above a certain size.
6570
Near the crossover between Toom-3 and Karatsuba there's generally a range of
6571
sizes where the difference between the two is small.
6572
@code{TOOM3_MUL_THRESHOLD} is a somewhat arbitrary point in that range and
6573
successive runs of the tune program can give different values due to small
6574
variations in measuring. A graph of time versus size for the two shows the
6575
effect, see @file{tune/README}.
6577
At the fairly small sizes where the Toom-3 thresholds occur it's worth
6578
remembering that the asymptotic behaviour for Karatsuba and Toom-3 can't be
6579
expected to make accurate predictions, due of course to the big influence of
6580
all sorts of overheads, and the fact that only a few recursions of each are
6581
being performed. Even at large sizes there's a good chance machine dependent
6582
effects like cache architecture will mean actual performance deviates from
6583
what might be predicted.
6585
The formula given above for the Karatsuba algorithm has an equivalent for
6586
Toom-3 involving only five multiplies, but this would be complicated and
6589
An alternate view of Toom-3 can be found in Zuras (@pxref{References}), using
6590
a vector to represent the @ma{x} and @ma{y} splits and a matrix multiplication
6591
for the evaluation and interpolation stages. The matrix inverses are not
6592
meant to be actually used, and they have elements with values much greater
6593
than in fact arise in the interpolation steps. The diagram shown for the
3-way is attractive, but again doesn't have to be implemented that way; for
example, with a bit of rearrangement just one division by 6 can be done.
6598
@node FFT Multiplication, Other Multiplication, Toom-Cook 3-Way Multiplication, Multiplication Algorithms
6599
@subsection FFT Multiplication
6601
At large to very large sizes a Fermat style FFT multiplication is used,
6602
following Sch@"onhage and Strassen (@pxref{References}). Descriptions of FFTs
6603
in various forms can be found in many textbooks, for instance Knuth section
6604
4.3.3 part C or Lipson chapter IX. A brief description of the form used in
6607
The multiplication done is @m{xy \bmod 2^N+1, x*y mod 2^N+1}, for a given
6608
@ma{N}. A full product @m{xy,x*y} is obtained by choosing @m{N \ge
6609
\mathop{\rm bits}(x)+\mathop{\rm bits}(y), N>=bits(x)+bits(y)} and padding
6610
@ma{x} and @ma{y} with high zero limbs. The modular product is the native
6611
form for the algorithm, so padding to get a full product is unavoidable.
6613
The algorithm follows a split, evaluate, pointwise multiply, interpolate and
combine sequence similar to that described above for Karatsuba and Toom-3. A @ma{k}
6615
parameter controls the split, with an FFT-@ma{k} splitting into @ma{2^k}
6616
pieces of @ma{M=N/2^k} bits each. @ma{N} must be a multiple of
6617
@m{2^k\times@code{mp\_bits\_per\_limb}, (2^k)*@nicode{mp_bits_per_limb}} so
6618
the split falls on limb boundaries, avoiding bit shifts in the split and
6621
The evaluations, pointwise multiplications, and interpolation, are all done
6622
modulo @m{2^{N'}+1, 2^N'+1} where @ma{N'} is @ma{2M+k+3} rounded up to a
6623
multiple of @ma{2^k} and of @code{mp_bits_per_limb}. The results of
6624
interpolation will be the following negacyclic convolution of the input
6625
pieces, and the choice of @ma{N'} ensures these sums aren't truncated.
6627
$$ w_n = \sum_{{i+j = b2^k+n}\atop{b=0,1}} (-1)^b x_i y_j $$
6634
w[n] = / (-1) * x[i] * y[j]
6641
The points used for the evaluation are @ma{g^i} for @ma{i=0} to @ma{2^k-1}
6642
where @m{g=2^{2N'/2^k}, g=2^(2N'/2^k)}. @ma{g} is a @m{2^k,2^k'}th root of
6643
unity mod @m{2^{N'}+1,2^N'+1}, which produces necessary cancellations at the
6644
interpolation stage, and it's also a power of 2 so the fast Fourier transforms
6645
used for the evaluation and interpolation do only shifts, adds and negations.
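
The choice of parameters can be worked through numerically. The sketch below computes @ma{M}, @ma{N'} and @ma{g} from @ma{N} and @ma{k} as described above, assuming a 32-bit limb and that both rounding moduli are powers of 2, so the larger one is their common multiple. The function name @code{fft_parameters} is hypothetical.

```python
def fft_parameters(N, k, bits_per_limb=32):
    """Return (M, Nprime, g) for an FFT-k on a mod 2**N+1 product."""
    pieces = 1 << k
    if N % (pieces * bits_per_limb) != 0:
        raise ValueError("N must be a multiple of 2^k * bits_per_limb")
    M = N // pieces                     # bits in each piece
    step = max(pieces, bits_per_limb)   # both powers of 2: max is their lcm
    Nprime = ((2 * M + k + 3 + step - 1) // step) * step    # round up
    g = 1 << (2 * Nprime // pieces)     # evaluation points are powers of g
    return M, Nprime, g

M, Nprime, g = fft_parameters(4096, 4)
modulus = (1 << Nprime) + 1
print(M, Nprime)                               # 256 544
print(pow(g, 16, modulus) == 1)                # True: g is a 2^k'th root of unity
print(pow(g, 8, modulus) == modulus - 1)       # True: g^(2^k/2) is -1
```

Since @ma{g} raised to @ma{2^k} is the square of @m{2^{N'}, 2^N'}, which is @minus{}1 modulo @m{2^{N'}+1, 2^N'+1}, the result is 1, and multiplying by any power of @ma{g} is just a shift.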
6647
The pointwise multiplications are done modulo @m{2^{N'}+1, 2^N'+1} and either
6648
recurse into a further FFT or use a plain multiplication (Toom-3, Karatsuba or
6649
basecase), whichever is optimal at the size @ma{N'}. The interpolation is an
6650
inverse fast Fourier transform. The resulting set of sums of @m{x_iy_j,
6651
x[i]*y[j]} are added at appropriate offsets to give the final result.
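
The relationship between the negacyclic convolution and the modular product can be checked directly. The sketch below forms the sums by a plain double loop rather than by transforms, so it illustrates only what the FFT computes, not how it computes it. The name @code{negacyclic_product} is hypothetical, and the inputs are assumed to be below @ma{2^N} with @ma{N} divisible by @ma{2^k}.

```python
def negacyclic_product(x, y, N, k):
    """Compute x*y mod 2**N+1 from the negacyclic convolution of 2**k pieces."""
    pieces = 1 << k
    M = N // pieces                     # bits per piece
    mask = (1 << M) - 1
    xs = [(x >> (M * i)) & mask for i in range(pieces)]
    ys = [(y >> (M * j)) & mask for j in range(pieces)]
    w = [0] * pieces
    for i in range(pieces):
        for j in range(pieces):
            n = (i + j) % pieces
            # i+j = b*2^k + n with b = 0 or 1; terms that wrap get a minus sign
            # because 2**N is congruent to -1 modulo 2**N+1.
            sign = -1 if i + j >= pieces else 1
            w[n] += sign * xs[i] * ys[j]
    total = sum(wn << (M * n) for n, wn in enumerate(w))
    return total % ((1 << N) + 1)
```

Adding each sum at offset @ma{n} pieces and reducing gives the same value as multiplying in full and then reducing modulo @m{2^N+1, 2^N+1}.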

Squaring is the same, but @ma{x} is the only input so it's one transform at
the evaluate stage and the pointwise multiplies are squares. The
interpolation is the same.

For a mod @ma{2^N+1} product, an FFT-@ma{k} is an @m{O(N^{k/(k-1)}),
O(N^(k/(k-1)))} algorithm, the exponent representing @ma{2^k} recursed modular
multiplies each @m{1/2^{k-1},1/2^(k-1)} the size of the original. Each
successive @ma{k} is an asymptotic improvement, but overheads mean each is
only faster at bigger and bigger sizes. In the code, @code{FFT_MUL_TABLE} and
@code{FFT_SQR_TABLE} are the thresholds where each @ma{k} is used. Each new
@ma{k} effectively swaps some multiplying for some shifts, adds and overheads.

A mod @ma{2^N+1} product can be formed with a normal
@ma{N@cross{}N@rightarrow{}2N} bit multiply plus a subtraction, so an FFT and
Toom-3 etc can be compared directly. A @ma{k=4} FFT at @ma{O(N^@W{1.333})}
can be expected to be the first faster than Toom-3 at @ma{O(N^@W{1.465})}. In
practice this is what's found, with @code{FFT_MODF_MUL_THRESHOLD} and
@code{FFT_MODF_SQR_THRESHOLD} being between 300 and 1000 limbs, depending on
the CPU. So far it's been found that only very large FFTs recurse into
pointwise multiplies above these sizes.

When an FFT is to give a full product, the change of @ma{N} to @ma{2N} doesn't
alter the theoretical complexity for a given @ma{k}, but for the purposes of
considering where an FFT might be first used it can be assumed that the FFT is
recursing into a normal multiply and that on that basis it's doing @ma{2^k}
recursed multiplies each @m{1/2^{k-2},1/2^(k-2)} the size of the inputs,
making it @m{O(N^{k/(k-2)}), O(N^(k/(k-2)))}. This would mean @ma{k=7} at
@ma{O(N^@W{1.4})} would be the first FFT faster than Toom-3. In practice
@code{FFT_MUL_THRESHOLD} and @code{FFT_SQR_THRESHOLD} have been found to be in
the @ma{k=8} range, somewhere between 3000 and 10000 limbs.
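
The exponent comparisons in the last three paragraphs can be checked with a
few lines of Python. This is only arithmetic on the stated formulas
@ma{k/(k-1)} and @ma{k/(k-2)}; the function name and its @code{shift}
parameter are hypothetical.

```python
import math

# Toom-3 does 5 pointwise multiplies on pieces 1/3 the size: about 1.465.
TOOM3_EXPONENT = math.log(5) / math.log(3)


def first_k_below_toom3(shift):
    """Smallest k whose exponent k/(k-shift) is below the Toom-3 exponent.

    shift=1 models the mod 2^N+1 product, shift=2 the full product.
    """
    k = shift + 1
    while k / (k - shift) >= TOOM3_EXPONENT:
        k += 1
    return k
```

@code{first_k_below_toom3(1)} gives 4 and @code{first_k_below_toom3(2)} gives
7, matching the theoretical crossover points quoted above. The measured
thresholds are of course set by overheads, not by these exponents alone.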

The way @ma{N} is split into @ma{2^k} pieces and then @ma{2M+k+3} is rounded
up to a multiple of @ma{2^k} and @code{mp_bits_per_limb} means that when
@ma{2^k@ge{}@nicode{mp\_bits\_per\_limb}} the effective @ma{N} is a multiple
of @m{2^{2k-1},2^(2k-1)} bits. The @ma{+k+3} means some values of @ma{N} just
under such a multiple will be rounded to the next. The complexity
calculations above assume that a favourable size is used, meaning one which
isn't padded through rounding, and it's also assumed that the extra @ma{+k+3}
bits are negligible at typical FFT sizes.

The practical effect of the @m{2^{2k-1},2^(2k-1)} constraint is to introduce a
step-effect into measured speeds. For example @ma{k=8} will round @ma{N} up
to a multiple of 32768 bits, so for a 32-bit limb there'll be groups of 512
consecutive operand sizes (in limbs) for which @code{mpn_mul_n} runs at the
same speed. Or for @ma{k=9} groups of 2048 limbs, @ma{k=10} groups of 8192
limbs, etc. In practice it's been found each @ma{k} is used at quite small
multiples of its size constraint and so the step effect is quite noticeable in
a time versus size graph.

The threshold determinations currently measure at the mid-points of size
steps, but this is sub-optimal since at the start of a new step it can happen
that it's better to go back to the previous @ma{k} for a while. Something
more sophisticated for @code{FFT_MUL_TABLE} and @code{FFT_SQR_TABLE} will be
needed.


@node Other Multiplication, , FFT Multiplication, Multiplication Algorithms
@subsection Other Multiplication

The 3-way Toom-Cook algorithm described above (@pxref{Toom-Cook 3-Way
Multiplication}) generalizes to split into an arbitrary number of pieces, as
per Knuth section 4.3.3 algorithm C. This is not currently used, though it's
possible a Toom-4 might fit in between Toom-3 and the FFTs. The notes here
are merely for interest.

In general a split into @ma{r+1} pieces is made, and evaluations and pointwise
multiplications done at @m{2r+1,2*r+1} points. A 4-way split does 7 pointwise
multiplies, 5-way does 9, etc. Asymptotically an @ma{(r+1)}-way algorithm is
@m{O(N^{\log(2r+1)/\log(r+1)}), O(N^(log(2*r+1)/log(r+1)))}. Only the pointwise
multiplications count towards big-@ma{O} complexity, but the time spent in the
evaluate and interpolate stages grows with @ma{r} and has a significant
practical impact, with the asymptotic advantage of each @ma{r} realized only
at bigger and bigger sizes. The overheads grow as @m{O(Nr),O(N*r)}, whereas
in an @ma{r=2^k} FFT they grow only as @m{O(N \log r), O(N*log(r))}.

Knuth algorithm C evaluates at points 0,1,2,@dots{},@m{2r,2*r}, but exercise 4
uses @ma{-r},@dots{},0,@dots{},@ma{r} and the latter saves some small
multiplies in the evaluate stage (or rather trades them for additions), and
has a further saving of nearly half the interpolate steps. The idea is to
separate odd and even final coefficients and then perform algorithm C steps C7
and C8 on them separately. The divisors at step C7 become @ma{j^2} and the
multipliers at C8 become @m{2tj-j^2,2*t*j-j^2}.

Splitting odd and even parts through positive and negative points can be
thought of as using @ma{-1} as a square root of unity. If a 4th root of unity
were available then a further split and speedup would be possible, but no such
root exists for plain integers. Going to complex integers with
@m{i=\sqrt{-1}, i=sqrt(-1)} doesn't help, essentially because in Cartesian
form it takes three real multiplies to do a complex multiply. The existence
of @m{2^k,2^k'}th roots of unity in a suitable ring or field lets the fast
Fourier transform keep splitting and get to @m{O(N \log r), O(N*log(r))}.


@node Division Algorithms, Greatest Common Divisor Algorithms, Multiplication Algorithms, Algorithms
@section Division Algorithms
@cindex Division algorithms

@menu
* Single Limb Division::
* Basecase Division::
* Divide and Conquer Division::
* Exact Division::
* Exact Remainder::
* Small Quotient Division::
@end menu


@node Single Limb Division, Basecase Division, Division Algorithms, Division Algorithms
@subsection Single Limb Division

N@cross{}1 division is implemented using repeated 2@cross{}1 divisions from
high to low, either with a hardware divide instruction or a multiplication by
inverse, whichever is best on a given CPU.

The multiply by inverse follows section 8 of ``Division by Invariant Integers
using Multiplication'' by Granlund and Montgomery (@pxref{References}) and is
implemented as @code{udiv_qrnnd_preinv} in @file{gmp-impl.h}. The idea is to
have a fixed-point approximation to @ma{1/d} (see @code{invert_limb}) and then
multiply by the high limb (plus one bit) of the dividend to get a quotient
@ma{q}. With @ma{d} normalized (high bit set), @ma{q} is no more than 1 too
small. Subtracting @m{qd,q*d} from the dividend gives a remainder, and
reveals whether @ma{q} or @ma{q-1} is correct.
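
The following Python is a deliberately cruder sketch of the same idea, not a
transcription of @code{udiv_qrnnd_preinv}. It assumes a 32-bit limb, takes
the inverse to be the low limb of the floor of @ma{(b^2-1)/d}, and multiplies
by the high limb only, so its estimate can be a few too small and a short
loop fixes it up. All names are hypothetical apart from @code{invert_limb},
which is borrowed from the text for the corresponding step.

```python
LIMB_BITS = 32
B = 1 << LIMB_BITS


def invert_limb(d):
    """Fixed-point approximation to 1/d for a normalized d (high bit set)."""
    assert B // 2 <= d < B
    return (B * B - 1) // d - B


def div_2by1_preinv(n1, n0, d, dinv):
    """Divide the two-limb value (n1, n0) by d, with n1 < d."""
    assert n1 < d
    q = (n1 * (B + dinv)) >> LIMB_BITS   # estimate, never too big
    r = ((n1 << LIMB_BITS) | n0) - q * d
    while r >= d:                        # adjust the under-estimate
        q += 1
        r -= d
    return q, r


def divrem_1(limbs, d):
    """N x 1 division, limbs given most significant first."""
    dinv = invert_limb(d)
    quotient, r = [], 0
    for limb in limbs:
        q, r = div_2by1_preinv(r, limb, d, dinv)
        quotient.append(q)
    return quotient, r
```

The remainder from each step becomes the high limb of the next, which is what
keeps the high limb below @ma{d} as each 2@cross{}1 step requires.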

The result is a division done with two multiplications and four or five
arithmetic operations. On CPUs with low latency multipliers this can be much
faster than a hardware divide, though the cost of calculating the inverse at
the start may mean it's only better on inputs bigger than say 4 or 5 limbs.

When a divisor must be normalized, either for the generic C
@code{__udiv_qrnnd_c} or the multiply by inverse, the division performed is
actually @m{a2^k,a*2^k} by @m{d2^k,d*2^k} where @ma{a} is the dividend and
@ma{k} is the power necessary to have the high bit of @m{d2^k,d*2^k} set. The
bit shifts for the dividend are usually accomplished ``on the fly'' meaning by
extracting the appropriate bits at each step. Done this way the quotient
limbs come out aligned ready to store. When only the remainder is wanted, an
alternative is to take the dividend limbs unshifted and calculate @m{r = a
\bmod d2^k, r = a mod d*2^k} followed by an extra final step @m{r2^k \bmod
d2^k, r*2^k mod d*2^k}. This can help on CPUs with poor bit shifts or few
registers.

The multiply by inverse can be done two limbs at a time. The calculation is
basically the same, but the inverse is two limbs and the divisor treated as if
padded with a low zero limb. This means more work, since the inverse will
need a 2@cross{}2 multiply, but the four 1@cross{}1s to do that are
independent and can therefore be done partly or wholly in parallel. Likewise
for a 2@cross{}1 calculating @m{qd,q*d}. The net effect is to process two
limbs with roughly the same two multiplies worth of latency that one limb at a
time gives. This extends to 3 or 4 limbs at a time, though the extra work to
apply the inverse will almost certainly soon reach the limits of multiplier
throughput.

A similar approach in reverse can be taken to process just half a limb at a
time if the divisor is only a half limb. In this case the 1@cross{}1 multiply
for the inverse effectively becomes two @m{1\over2@cross{}1, (1/2)x1} for each
limb, which can be a saving on CPUs with a fast half limb multiply, or in fact
if the only multiply is a half limb, and especially if it's not pipelined.


@node Basecase Division, Divide and Conquer Division, Single Limb Division, Division Algorithms
@subsection Basecase Division

Basecase N@cross{}M division is like long division done by hand, but in base
@m{2\GMPraise{@code{mp\_bits\_per\_limb}}, 2^mp_bits_per_limb}. See Knuth
section 4.3.1 algorithm D, and @file{mpn/generic/sb_divrem_mn.c}.

Briefly stated, while the dividend remains larger than the divisor, a high
quotient limb is formed and the M@cross{}1 product @m{qd,q*d} subtracted at
the top end of the dividend. With a normalized divisor (most significant bit
set), each quotient limb can be formed with a 2@cross{}1 division and a
1@cross{}1 multiplication plus some subtractions. The 2@cross{}1 division is
by the high limb of the divisor and is done either with a hardware divide or a
multiply by inverse (the same as in @ref{Single Limb Division}) whichever is
faster. Such a quotient is sometimes one too big, requiring an addback of the
divisor, but that happens rarely.

With Q=N@minus{}M being the number of quotient limbs, this is an
@m{O(QM),O(Q*M)} algorithm and will run at a speed similar to a basecase
Q@cross{}M multiplication, differing in fact only in the extra multiply and
divide for each of the Q quotient limbs.


@node Divide and Conquer Division, Exact Division, Basecase Division, Division Algorithms
@subsection Divide and Conquer Division

For divisors larger than @code{DC_THRESHOLD}, division is done by dividing,
or to be precise, by a recursive divide and conquer algorithm based on work by
Moenck and Borodin, Jebelean, and Burnikel and Ziegler (@pxref{References}).

The algorithm consists essentially of recognising that a 2N@cross{}N division
can be done with the basecase division algorithm (@pxref{Basecase Division}),
but using N/2 limbs as a base, not just a single limb. This way the
multiplications that arise are (N/2)@cross{}(N/2) and can take advantage of
Karatsuba and higher multiplication algorithms (@pxref{Multiplication
Algorithms}). The ``digits'' of the quotient are formed by recursive
N@cross{}(N/2) divisions.

If the (N/2)@cross{}(N/2) multiplies are done with a basecase multiplication
then the work is about the same as a basecase division, but with more function
call overheads and with some subtractions separated from the multiplies.
These overheads mean that it's only when N/2 is above
@code{KARATSUBA_MUL_THRESHOLD} that divide and conquer is of use.

@code{DC_THRESHOLD} is based on the divisor size N, so it will be somewhere
above twice @code{KARATSUBA_MUL_THRESHOLD}, but how much above depends on the
CPU. An optimized @code{mpn_mul_basecase} can lower @code{DC_THRESHOLD} a
little by offering a ready-made advantage over repeated @code{mpn_submul_1}
calls.

Divide and conquer is asymptotically @m{O(M(N)\log N),O(M(N)*log(N))} where
@ma{M(N)} is the time for an N@cross{}N multiplication done with FFTs. The
actual time is a sum over multiplications of the recursed sizes, as can be
seen near the end of section 2.2 of Burnikel and Ziegler. For example, within
the Toom-3 range, divide and conquer is @m{2.63M(N), 2.63*M(N)}. With higher
algorithms the @ma{M(N)} term improves and the multiplier tends to @m{\log N,
log(N)}. In practice, at moderate to large sizes, a 2N@cross{}N division is
about 2 to 4 times slower than an N@cross{}N multiplication.

Newton's method used for division is asymptotically @ma{O(M(N))} and should
therefore be superior to divide and conquer, but it's believed this would only
be for large to very large N.


@node Exact Division, Exact Remainder, Divide and Conquer Division, Division Algorithms
@subsection Exact Division

A so-called exact division is when the dividend is known to be an exact
multiple of the divisor. Jebelean's exact division algorithm uses this
knowledge to make some significant optimizations (@pxref{References}).

The idea can be illustrated in decimal for example with 368154 divided by
543. Because the low digit of the dividend is 4, the low digit of the
quotient must be 8. This is arrived at from @m{4 \mathord{\times} 7 \bmod 10,
4*7 mod 10}, using the fact 7 is the modular inverse of 3 (the low digit of
the divisor), since @m{3 \mathord{\times} 7 \mathop{\equiv} 1 \bmod 10, 3*7
@equiv{} 1 mod 10}. So @m{8\mathord{\times}543 = 4344,8*543=4344} can be
subtracted from the dividend leaving 363810. Notice the low digit has become
zero.

The procedure is repeated at the second digit, with the next quotient digit 7
(@m{1 \mathord{\times} 7 \bmod 10, 7 @equiv{} 1*7 mod 10}), subtracting
@m{7\mathord{\times}543 = 3801,7*543=3801}, leaving 325800. And finally at
the third digit with quotient digit 6 (@m{8 \mathord{\times} 7 \bmod 10, 8*7
mod 10}), subtracting @m{6\mathord{\times}543 = 3258,6*543=3258} leaving 0.
So the quotient is 678.

Notice however that the multiplies and subtractions don't need to extend past
the low three digits of the dividend, since that's enough to determine the
three quotient digits. For the last quotient digit no subtraction is needed
at all. On a 2N@cross{}N division like this one, only about half the work of
a normal basecase division is necessary.
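
The decimal example can be reproduced with the short Python sketch below. It
is only an illustration of the low-to-high digit procedure: the function name
is hypothetical, it assumes the divisor is coprime to the base, and for
simplicity it does a full subtraction at every step, so it does not show the
saving from truncating the subtractions. The three-argument @code{pow} with
exponent @ma{-1} needs Python 3.8 or later.

```python
def exact_divide_digits(a, d, base=10):
    """Quotient digits of a/d, low digit first, for a an exact multiple of d."""
    inverse = pow(d % base, -1, base)    # modular inverse of d's low digit
    digits = []
    while a != 0:
        q = (a % base) * inverse % base  # next quotient digit from the low digit
        digits.append(q)
        a = (a - q * d) // base          # low digit is now zero, drop it
    return digits
```

For 368154 and 543 this produces the digits 8, 7, 6 in that order, so the
quotient is 678 as above.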

For an N@cross{}M exact division producing Q=N@minus{}M quotient limbs, the
saving over a normal basecase division is in two parts. Firstly, each of the
Q quotient limbs needs only one multiply, not a 2@cross{}1 divide and
multiply. Secondly, the crossproducts are reduced when @ma{Q>M} to
@m{QM-M(M+1)/2,Q*M-M*(M+1)/2}, or when @ma{Q@le{}M} to @m{Q(Q-1)/2,
Q*(Q-1)/2}. Notice the savings are complementary. If Q is big then many
divisions are saved, or if Q is small then the crossproducts reduce to a small
number.

The modular inverse used is calculated efficiently by @code{modlimb_invert} in
@file{gmp-impl.h}. This does four multiplies for a 32-bit limb, or six for a
64-bit limb. @file{tune/modlinv.c} has some alternate implementations that
might suit processors better at bit twiddling than multiplying.
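
One way to arrive at those multiply counts is a Newton iteration that doubles
the number of correct low bits at each step, using two multiplies per step.
The sketch below is an assumption about the general technique, not a copy of
the @code{modlimb_invert} macro: the function name is reused only for
convenience, and the 8-bit starting value is found here by brute force
search.

```python
def modlimb_invert(d, limb_bits=32):
    """Inverse of odd d modulo 2^limb_bits, plus the number of multiplies used."""
    assert d & 1, "only odd values are invertible mod a power of 2"
    # 8-bit starting inverse, by exhaustive search over odd candidates.
    inv = next(i for i in range(1, 256, 2) if (i * d) & 0xFF == 1)
    bits, multiplies = 8, 0
    while bits < limb_bits:
        bits *= 2
        # Newton step: if d*inv == 1 mod 2^m then d*(2*inv - inv*inv*d) == 1 mod 2^(2m).
        inv = (2 * inv - inv * inv * d) & ((1 << bits) - 1)
        multiplies += 2
    return inv & ((1 << limb_bits) - 1), multiplies
```

Going from 8 to 16 to 32 bits takes two steps and hence four multiplies, and
one more step to 64 bits gives six. For @ma{d=3} and a 32-bit limb the result
is @code{0xAAAAAAAB}.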

The sub-quadratic exact division described by Jebelean in ``Exact Division
with Karatsuba Complexity'' is not currently implemented. It uses a
rearrangement similar to the divide and conquer for normal division
(@pxref{Divide and Conquer Division}), but operating from low to high. A
further possibility not currently implemented is ``Bidirectional Exact Integer
Division'' by Krandick and Jebelean which forms quotient limbs from both the
high and low ends of the dividend, and can halve once more the number of
crossproducts needed in a 2N@cross{}N division.

A special case exact division by 3 exists in @code{mpn_divexact_by3},
supporting Toom-3 multiplication and @code{mpq} canonicalizations. It forms
quotient digits with a multiply by the modular inverse of 3 (which is
@code{0xAA..AAB}) and uses two comparisons to determine a borrow for the next
limb. The multiplications don't need to be on the dependent chain, as long as
the effect of the borrows is applied. Only a few optimized assembler
implementations currently exist.


@node Exact Remainder, Small Quotient Division, Exact Division, Division Algorithms
@subsection Exact Remainder

If the exact division algorithm is done with a full subtraction at each stage
and the dividend isn't a multiple of the divisor, then low zero limbs are
produced but with a remainder in the high limbs. For dividend @ma{a}, divisor
@ma{d}, quotient @ma{q}, and @m{b = 2 \GMPraise{@code{mp\_bits\_per\_limb}}, b
= 2^mp_bits_per_limb}, this remainder @ma{r} is of the form

@tex
$$ a = qd + r b^n $$
@end tex
@ifnottex

@example
a = q*d + r*b^n
@end example

@end ifnottex

@ma{n} represents the number of zero limbs produced by the subtractions, that
being the number of limbs produced for @ma{q}. @ma{r} will be in the range
@ma{0@le{}r<d} and can be viewed as a remainder, but one shifted up by a
factor of @ma{b^n}.

Carrying out full subtractions at each stage means the same number of cross
products must be done as a normal division, but there's still some single limb
divisions saved. When @ma{d} is a single limb some simplifications arise,
providing good speedups on a number of processors.

@code{mpn_bdivmod}, @code{mpn_divexact_by3}, @code{mpn_modexact_1_odd} and the
@code{redc} function in @code{mpz_powm} differ subtly in how they return
@ma{r}, leading to some negations in the above formula, but all are
essentially the same.

Clearly @ma{r} is zero when @ma{a} is a multiple of @ma{d}, and this leads to
divisibility or congruence tests which are potentially more efficient than a
normal division.

The factor of @ma{b^n} on @ma{r} can be ignored in a GCD when @ma{d} is odd,
hence the use of @code{mpn_bdivmod} in @code{mpn_gcd}, and the use of
@code{mpn_modexact_1_odd} by @code{mpn_gcd_1} and @code{mpz_kronecker_ui} etc
(@pxref{Greatest Common Divisor Algorithms}).

Montgomery's REDC method for modular multiplications uses operands of the form
@m{xb^{-n}, x*b^-n} and @m{yb^{-n}, y*b^-n} and on calculating @m{(xb^{-n})
(yb^{-n}), (x*b^-n)*(y*b^-n)} uses the factor of @ma{b^n} in the exact
remainder to reach a product in the same form @m{(xy)b^{-n},
(x*y)*b^-n} (@pxref{Modular Powering Algorithm}).

Notice that @ma{r} generally gives no useful information about the ordinary
remainder @ma{a @bmod d} since @ma{b^n @bmod d} could be anything. If however
@ma{b^n @equiv{} 1 @bmod d}, then @ma{r} is the negative of the ordinary
remainder. This occurs whenever @ma{d} is a factor of @ma{b^n-1}, as for
example with 3 in @code{mpn_divexact_by3}. Other such factors include 5, 17
and 257, but no particular use has been found for this.


@node Small Quotient Division, , Exact Remainder, Division Algorithms
@subsection Small Quotient Division

An N@cross{}M division where the number of quotient limbs Q=N@minus{}M is
small can be optimized somewhat.

An ordinary basecase division normalizes the divisor by shifting it to make
the high bit set, shifting the dividend accordingly, and shifting the
remainder back down at the end of the calculation. This is wasteful if only a
few quotient limbs are to be formed. Instead a division of just the top
@m{\rm2Q,2*Q} limbs of the dividend by the top Q limbs of the divisor can be
used to form a trial quotient. This requires only those limbs normalized, not
the whole of the divisor and dividend.

A multiply and subtract then applies the trial quotient to the M@minus{}Q
unused limbs of the divisor and N@minus{}Q dividend limbs (which includes Q
limbs remaining from the trial quotient division). The starting trial
quotient can be 1 or 2 too big, but all cases of 2 too big and most cases of 1
too big are detected by first comparing the most significant limbs that will
arise from the subtraction. An addback is done if the quotient still turns
out to be 1 too big.

This whole procedure is essentially the same as one step of the basecase
algorithm done in a Q limb base, though with the trial quotient test done only
with the high limbs, not an entire Q limb ``digit'' product. The correctness
of this weaker test can be established by following the argument of Knuth
section 4.3.1 exercise 20 but with the @m{v_2 \GMPhat q > b \GMPhat r
+ u_2, v2*q>b*r+u2} condition appropriately relaxed.


@node Greatest Common Divisor Algorithms, Powering Algorithms, Division Algorithms, Algorithms
@section Greatest Common Divisor
@cindex Greatest common divisor algorithms

@menu
* Binary GCD::
* Accelerated GCD::
* Extended GCD::
* Jacobi Symbol::
@end menu


@node Binary GCD, Accelerated GCD, Greatest Common Divisor Algorithms, Greatest Common Divisor Algorithms
@subsection Binary GCD

At small sizes GMP uses an @ma{O(N^2)} binary style GCD. This is described in
many textbooks, for example Knuth section 4.5.2 algorithm B. It simply
consists of successively reducing operands @ma{a} and @ma{b} using
@ma{@gcd{}(a,b) = @gcd{}(@min{}(a,b),@abs{}(a-b))}, together with the fact
that if @ma{a} and @ma{b} are first made odd then @ma{@abs{}(a-b)} is even and
factors of two can be discarded.

Variants like letting @ma{a-b} become negative and doing a different next step
are of interest only as far as they suit particular CPUs, since on small
operands it's machine dependent factors that determine performance.
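
The basic reduction can be sketched in Python as follows. This is the
textbook form on arbitrary non-negative integers rather than GMP's limb-based
code, and the function name is hypothetical.

```python
def binary_gcd(a, b):
    """GCD of non-negative integers using only subtraction and shifts."""
    if a == 0:
        return b
    if b == 0:
        return a
    shift = 0
    while ((a | b) & 1) == 0:    # common factors of two belong to the GCD
        a >>= 1
        b >>= 1
        shift += 1
    while (a & 1) == 0:          # remaining factors of two can be discarded
        a >>= 1
    while (b & 1) == 0:
        b >>= 1
    while a != b:                # both odd here, so abs(a-b) is even
        a, b = min(a, b), abs(a - b)
        while (b & 1) == 0:
            b >>= 1
    return a << shift
```

For instance @code{binary_gcd(48, 18)} strips one common factor of two,
reduces 3 and 9 to 3, and returns 6.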

The Euclidean GCD algorithm, as per Knuth algorithms E and A, reduces using
@ma{a @bmod b} but this has so far been found to be slower everywhere. One
reason the binary method does well is that the implied quotient at each step
is usually small, so often only one or two subtractions are needed to get the
same effect as a division. Quotients 1, 2 and 3 for example occur about 67.8%
of the time, see Knuth section 4.5.3 Theorem E.

When the implied quotient is large, meaning @ma{b} is much smaller than
@ma{a}, then a division is worthwhile. This is the basis for the initial
@ma{a @bmod b} reductions in @code{mpn_gcd} and @code{mpn_gcd_1} (the latter
for both N@cross{}1 and 1@cross{}1 cases). But after that initial reduction,
big quotients occur too rarely to make it worth checking for them.


@node Accelerated GCD, Extended GCD, Binary GCD, Greatest Common Divisor Algorithms
@subsection Accelerated GCD

For sizes above @code{GCD_ACCEL_THRESHOLD}, GMP uses the Accelerated GCD
algorithm described independently by Weber and Jebelean (the latter as the
``Generalized Binary'' algorithm), @pxref{References}. This algorithm is
still @ma{O(N^2)}, but is much faster than the binary algorithm since it does
fewer multi-precision operations. It consists of alternating the @ma{k}-ary
reduction by Sorenson, and a ``dmod'' exact remainder reduction.

For operands @ma{u} and @ma{v} the @ma{k}-ary reduction replaces @ma{u} with
@m{nv-du,n*v-d*u} where @ma{n} and @ma{d} are single limb values chosen to
give two trailing zero limbs on that value, which can be stripped. @ma{n} and
@ma{d} are calculated using an algorithm similar to half of a two limb GCD
(see @code{find_a} in @file{mpn/generic/gcd.c}).

When @ma{u} and @ma{v} differ in size by more than a certain number of bits, a
dmod is performed to zero out bits at the low end of the larger. It consists
of an exact remainder style division applied to an appropriate number of bits
(@pxref{Exact Division}, and @pxref{Exact Remainder}). This is faster than a
@ma{k}-ary reduction but useful only when the operands differ in size.
There's a dmod after each @ma{k}-ary reduction, and if the dmod leaves the
operands still differing in size then it's repeated.

The @ma{k}-ary reduction step can introduce spurious factors into the GCD
calculated, and these are eliminated at the end by taking GCDs with the
original inputs @ma{@gcd{}(u,@gcd{}(v,g))} using the binary algorithm. Since
@ma{g} is almost always small this takes very little time.

At small sizes the algorithm needs a good implementation of @code{find_a}. At
larger sizes it's dominated by @code{mpn_addmul_1} applying @ma{n} and @ma{d}.


@node Extended GCD, Jacobi Symbol, Accelerated GCD, Greatest Common Divisor Algorithms
@subsection Extended GCD

The extended GCD calculates @ma{@gcd{}(a,b)} and also cofactors @ma{x} and
@ma{y} satisfying @m{ax+by=\gcd(a@C{}b), a*x+b*y=gcd(a@C{}b)}. Lehmer's
multi-step improvement of the extended Euclidean algorithm is used. See Knuth
section 4.5.2 algorithm L, and @file{mpn/generic/gcdext.c}. This is an
@ma{O(N^2)} algorithm.
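
To make the cofactor relation concrete, here is the plain extended Euclidean
algorithm in Python. Note this is the simple one-quotient-per-step form, not
Lehmer's multi-step improvement that GMP uses, and the function name is
hypothetical.

```python
def gcdext(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y == g."""
    # Invariants: a == A*x0 + B*y0 and b == A*x1 + B*y1 for the original A, B.
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0:
        q, r = divmod(a, b)
        a, b = b, r
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0
```

Lehmer's method gets its speed by working out several such quotients from the
leading limbs alone and then applying their combined effect to the full
operands in one pass.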

The multipliers at each step are found using single limb calculations for
sizes up to @code{GCDEXT_THRESHOLD}, or double limb calculations above that.
The single limb code is faster but doesn't produce full-limb multipliers,
hence not making full use of the @code{mpn_addmul_1} calls.

When a CPU has a data-dependent multiplier, meaning one which is faster on
operands with fewer bits, the extra work in the double-limb calculation might
only save some looping overheads, leading to a large @code{GCDEXT_THRESHOLD}.

Currently the single limb calculation doesn't optimize for the small quotients
that often occur, and this can lead to unusually low values of
@code{GCDEXT_THRESHOLD}, depending on the CPU.

An analysis of double-limb calculations can be found in ``A Double-Digit
Lehmer-Euclid Algorithm'' by Jebelean (@pxref{References}). The code in GMP
was developed independently.

It should be noted that when a double limb calculation is used, it's used for
the whole of that GCD; it doesn't fall back to single limb part way through.
This is because as the algorithm proceeds, the inputs @ma{a} and @ma{b} are
reduced, but the cofactors @ma{x} and @ma{y} grow, so the multipliers at each
step are applied to a roughly constant total number of limbs.


@node Jacobi Symbol, , Extended GCD, Greatest Common Divisor Algorithms
@subsection Jacobi Symbol

@code{mpz_jacobi} and @code{mpz_kronecker} are currently implemented with a
simple binary algorithm similar to that described for the GCDs (@pxref{Binary
GCD}). They're not very fast when both inputs are large. Lehmer's multi-step
improvement or a binary based multi-step algorithm is likely to be better.

When one operand fits a single limb, and that includes @code{mpz_kronecker_ui}
and friends, an initial reduction is done with either @code{mpn_mod_1} or
@code{mpn_modexact_1_odd}, followed by the binary algorithm on a single limb.
The binary algorithm is well suited to a single limb, and the whole
calculation in this case is quite efficient.

In all the routines sign changes for the result are accumulated using some bit
twiddling, avoiding table lookups or conditional jumps.


@node Powering Algorithms, Root Extraction Algorithms, Greatest Common Divisor Algorithms, Algorithms
@section Powering Algorithms
@cindex Powering algorithms

@menu
* Normal Powering Algorithm::
* Modular Powering Algorithm::
@end menu


@node Normal Powering Algorithm, Modular Powering Algorithm, Powering Algorithms, Powering Algorithms
@subsection Normal Powering

Normal @code{mpz} or @code{mpf} powering uses a simple binary algorithm,
successively squaring and then multiplying by the base when a 1 bit is seen in
the exponent, as per Knuth section 4.6.3. The ``left to right''
variant described there is used rather than algorithm A, since it's just as
easy and can be done with somewhat less temporary memory.
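
A minimal Python sketch of the left to right variant follows. The function
name is hypothetical and Python integers stand in for @code{mpz} values.

```python
def power_left_to_right(base, exponent):
    """base ** exponent by scanning the exponent bits from the top down."""
    assert exponent >= 0
    result = 1
    for bit in bin(exponent)[2:]:    # most significant bit first
        result *= result             # always square
        if bit == "1":
            result *= base           # multiply by the base on a 1 bit
    return result
```

Only @code{result} changes as the loop runs, whereas a right to left scan
also has to keep a running power of the base, which is presumably the
temporary memory saving referred to.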


@node Modular Powering Algorithm, , Normal Powering Algorithm, Powering Algorithms
@subsection Modular Powering

Modular powering is implemented using a @ma{2^k}-ary sliding window algorithm,
as per ``Handbook of Applied Cryptography'' algorithm 14.85
(@pxref{References}). @ma{k} is chosen according to the size of the exponent.
Larger exponents use larger values of @ma{k}, the choice being made to
minimize the average number of multiplications that must supplement the
squaring.

The modular multiplies and squares use either a simple division or the REDC
method by Montgomery (@pxref{References}). REDC is a little faster,
essentially saving N single limb divisions in a fashion similar to an exact
remainder (@pxref{Exact Remainder}). The current REDC has some limitations.
It's only @ma{O(N^2)} so above @code{POWM_THRESHOLD} division becomes faster
and is used. It doesn't attempt to detect small bases, but rather always uses
a REDC form, which is usually a full size operand. And lastly it's only
applied to odd moduli.
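
The sliding window idea can be sketched in Python as below, using plain
@code{%} reductions in place of REDC. The function name and the default
@ma{k=4} are illustrative assumptions; how GMP maps exponent sizes to @ma{k}
is not shown here.

```python
def powm_sliding(base, exponent, modulus, k=4):
    """base ** exponent mod modulus with a sliding window of at most k bits."""
    assert exponent >= 0 and modulus > 0 and k >= 1
    base %= modulus
    base_sq = base * base % modulus
    # odd_powers[i] holds base^(2i+1), for all odd powers below 2^k.
    odd_powers = [base]
    for _ in range((1 << (k - 1)) - 1):
        odd_powers.append(odd_powers[-1] * base_sq % modulus)
    result = 1 % modulus
    bits = bin(exponent)[2:] if exponent else ""
    i = 0
    while i < len(bits):
        if bits[i] == "0":
            result = result * result % modulus
            i += 1
            continue
        j = min(i + k, len(bits))
        while bits[j - 1] == "0":        # shrink so the window ends in a 1 bit
            j -= 1
        window = int(bits[i:j], 2)       # an odd value below 2^k
        for _ in range(j - i):
            result = result * result % modulus
        result = result * odd_powers[window >> 1] % modulus
        i = j
    return result
```

The squarings number about one per exponent bit whatever @ma{k} is. A larger
@ma{k} means fewer window multiplies but a bigger table of odd powers to
precompute, which is the trade-off behind choosing @ma{k} from the exponent
size.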


@node Root Extraction Algorithms, Radix Conversion Algorithms, Powering Algorithms, Algorithms
@section Root Extraction Algorithms
@cindex Root extraction algorithms

@menu
* Square Root Algorithm::
* Nth Root Algorithm::
* Perfect Square Algorithm::
* Perfect Power Algorithm::
@end menu


@node Square Root Algorithm, Nth Root Algorithm, Root Extraction Algorithms, Root Extraction Algorithms
@subsection Square Root

Square roots are taken using the ``Karatsuba Square Root'' algorithm by Paul
Zimmermann (@pxref{References}). This is expressed in a divide and conquer
form, but as noted in the paper it can also be viewed as a discrete variant of
Newton's method.

In the Karatsuba multiplication range this is an @m{O({3\over2}
M(N/2)),O(1.5*M(N/2))} algorithm, where @ma{M(n)} is the time to multiply two
numbers of @ma{n} limbs. In the FFT multiplication range this grows to a
bound of @m{O(6 M(N/2)),O(6*M(N/2))}. In practice a factor of about 1.5 to
1.8 is found in the Karatsuba and Toom-3 ranges, growing to 2 or 3 in the FFT
range.

The algorithm does all its calculations in integers and the resulting
@code{mpn_sqrtrem} is used for both @code{mpz_sqrt} and @code{mpf_sqrt}.
The extended precision given by @code{mpf_sqrt_ui} is obtained by
padding with zero limbs.


@node Nth Root Algorithm, Perfect Square Algorithm, Square Root Algorithm, Root Extraction Algorithms
@subsection Nth Root

Integer Nth roots are taken using Newton's method with the following
iteration, where @ma{A} is the input and @ma{n} is the root to be taken.

@tex
$$a_{i+1} = {1\over n} \left({A \over a_i^{n-1}} + (n-1)a_i \right)$$
@end tex
@ifnottex

@example
         1         A
a[i+1] = - * ( --------- + (n-1)*a[i] )
         n     a[i]^(n-1)
@end example

@end ifnottex

The initial approximation @m{a_1,a[1]} is generated bitwise by successively
powering a trial root with or without new 1 bits, aiming to be just above the
true root. The iteration converges quadratically when started from a good
approximation. When @ma{n} is large more initial bits are needed to get good
convergence. The current implementation is not particularly well optimized.
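
The iteration can be tried out with the Python sketch below, using integer
division throughout. As a simplification it starts from a power of two that
is at or above the true root, rather than the bitwise generated approximation
described above, so it may take more steps. The function name is
hypothetical.

```python
def integer_root(A, n):
    """Floor of the n-th root of A, for A >= 0 and n >= 1."""
    assert A >= 0 and n >= 1
    if A < 2:
        return A
    # 2^ceil(bits/n) is never below the true root, so iterates only decrease.
    start_bits = (A.bit_length() + n - 1) // n
    a = 1 << start_bits
    while True:
        nxt = (A // a ** (n - 1) + (n - 1) * a) // n
        if nxt >= a:         # no further decrease: a is the floor of the root
            return a
        a = nxt
```

Starting above the root matters because the iterates then decrease
monotonically, giving a simple stopping test.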


@node Perfect Square Algorithm, Perfect Power Algorithm, Nth Root Algorithm, Root Extraction Algorithms
@subsection Perfect Square

@code{mpz_perfect_square_p} is able to quickly exclude most non-squares by
checking whether the input is a quadratic residue modulo some small integers.

The first test is modulo 256 which means simply examining the least
significant byte. Only 44 different values occur as the low byte of a square,
so 82.8% of non-squares can be immediately excluded. Similar tests modulo
primes from 3 to 29 exclude 99.5% of those remaining, or if a limb is 64 bits
then primes up to 53 are used, excluding 99.99%. A single N@cross{}1
remainder using @code{PP} from @file{gmp-impl.h} quickly gives all these
remainders.
7258
A square root must still be taken for any value that passes the residue tests,
7259
to verify it's really a square and not one of the 0.086% (or 0.000156% for 64
7260
bits) non-squares that get through. @xref{Square Root Algorithm}.
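
The low byte test can be sketched as follows, for single 64-bit values only.
The names @code{init_sq_residue}, @code{isqrt} and @code{is_perfect_square}
are hypothetical and the table is built at run time purely for illustration.
The tests modulo small primes are left out, so more values reach the square
root here than would in the scheme described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static bool sq_residue[256];    /* sq_residue[r]: r is the low byte of a square */
static bool table_ready = false;

/* Build the table by squaring every byte value; returns how many residues occur. */
int init_sq_residue(void)
{
    int count = 0;
    for (unsigned r = 0; r < 256; r++)
        sq_residue[r] = false;
    for (unsigned x = 0; x < 256; x++)
        sq_residue[(x * x) & 0xFF] = true;
    for (unsigned r = 0; r < 256; r++)
        count += sq_residue[r] ? 1 : 0;
    table_ready = true;
    return count;
}

/* floor(sqrt(n)) by the classical bit-by-bit method. */
static uint64_t isqrt(uint64_t n)
{
    uint64_t root = 0;
    uint64_t bit = (uint64_t)1 << 62;
    while (bit > n)
        bit >>= 2;
    while (bit != 0) {
        if (n >= root + bit) {
            n -= root + bit;
            root = (root >> 1) + bit;
        } else {
            root >>= 1;
        }
        bit >>= 2;
    }
    return root;
}

bool is_perfect_square(uint64_t n)
{
    assert(table_ready);
    if (!sq_residue[n & 0xFF])
        return false;           /* cheap rejection on the low byte */
    uint64_t r = isqrt(n);      /* survivors still need a real square root */
    return r * r == n;
}
```

A value such as 17 has a low byte that does occur for squares, so it passes
the filter and is only rejected by the square root.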


@node Perfect Power Algorithm, , Perfect Square Algorithm, Root Extraction Algorithms
@subsection Perfect Power

Detecting perfect powers is required by some factorization algorithms.
Currently @code{mpz_perfect_power_p} is implemented using repeated Nth root
extractions, though naturally only prime roots need to be considered.
(@xref{Nth Root Algorithm}.)

If a prime divisor @ma{p} with multiplicity @ma{e} can be found, then only
roots which are divisors of @ma{e} need to be considered, much reducing the
work necessary. To this end divisibility by a set of small primes is checked.


@node Radix Conversion Algorithms, Other Algorithms, Root Extraction Algorithms, Algorithms
@section Radix Conversion
@cindex Radix conversion algorithms

Radix conversions are less important than other algorithms. A program
dominated by conversions should probably use a different data representation.

@menu
* Binary to Radix::
* Radix to Binary::
@end menu


@node Binary to Radix, Radix to Binary, Radix Conversion Algorithms, Radix Conversion Algorithms
@subsection Binary to Radix

Conversions from binary to a power-of-2 radix use a simple and fast @ma{O(N)}
bit extraction algorithm.

Conversions from binary to other radices use repeated divisions, first by the
biggest power of the radix that fits in a single limb, then by the radix on
the remainders. This is an @ma{O(N^2)} algorithm and can be quite
time-consuming on large inputs.
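
A minimal sketch of the repeated division scheme for radix 10, assuming
32-bit limbs stored least significant first. The helper names
@code{divrem_1} and @code{to_decimal} are hypothetical and this is not the
GMP code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Divide {p,n} (least significant limb first) by d in place and return the
   remainder: an N x 1 division. */
static uint32_t divrem_1(uint32_t *p, size_t n, uint32_t d)
{
    uint64_t rem = 0;
    for (size_t i = n; i-- > 0; ) {
        uint64_t cur = (rem << 32) | p[i];
        p[i] = (uint32_t)(cur / d);
        rem = cur % d;
    }
    return (uint32_t)rem;
}

/* Write {p,n} in decimal to out, most significant digit first, destroying
   {p,n}.  out needs room for 10*n + 1 characters.  Returns the digit count. */
size_t to_decimal(uint32_t *p, size_t n, char *out)
{
    const uint32_t big_base = 1000000000u;  /* 10^9, biggest power of 10 in a limb */
    size_t len = 0;

    assert(out != NULL);
    while (n > 0) {
        /* First divide by the big power of the radix ... */
        uint32_t chunk = divrem_1(p, n, big_base);
        while (n > 0 && p[n - 1] == 0)
            n--;                            /* strip high zero limbs */
        /* ... then by the radix itself on the single-limb remainder.  Every
           chunk gives 9 digits except the last, which has no leading zeros. */
        for (int i = 0; i < 9 && (n > 0 || chunk != 0); i++) {
            out[len++] = (char)('0' + chunk % 10);
            chunk /= 10;
        }
    }
    if (len == 0)
        out[len++] = '0';

    /* Digits were produced least significant first, so reverse them. */
    for (size_t i = 0, j = len - 1; i < j; i++, j--) {
        char t = out[i];
        out[i] = out[j];
        out[j] = t;
    }
    out[len] = '\0';
    return len;
}
```

Each pass of the outer loop is an N@cross{}1 division over all remaining
limbs, which is where the @ma{O(N^2)} cost comes from.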


@node Radix to Binary, , Binary to Radix, Radix Conversion Algorithms
@subsection Radix to Binary

Conversions from a power-of-2 radix into binary use a simple and fast
@ma{O(N)} bitwise concatenation algorithm.

Conversions from other radices use repeated multiplications, first
accumulating as many digits as fit in a limb, then doing an N@cross{}1
multi-precision multiplication. This is @ma{O(N^2)} and is certainly
sub-optimal on sizes above the Karatsuba multiply threshold.
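
The multiply-and-accumulate scheme might look like the following for decimal
input, again assuming 32-bit limbs stored least significant first. The names
@code{mul_1_add} and @code{from_decimal} are hypothetical and this is not the
GMP code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* {p,n} = {p,n} * m + add, returning the carry-out limb: an N x 1 multiply. */
static uint32_t mul_1_add(uint32_t *p, size_t n, uint32_t m, uint32_t add)
{
    uint64_t carry = add;
    for (size_t i = 0; i < n; i++) {
        uint64_t t = (uint64_t)p[i] * m + carry;
        p[i] = (uint32_t)t;
        carry = t >> 32;
    }
    return (uint32_t)carry;
}

/* Convert len decimal digits at s into limbs at p, least significant first.
   p needs room for len/9 + 1 limbs.  Returns the number of limbs used. */
size_t from_decimal(const char *s, size_t len, uint32_t *p)
{
    size_t n = 0;
    size_t i = 0;

    while (i < len) {
        /* Accumulate up to 9 digits, which always fit in a 32-bit limb,
           along with the matching power of 10. */
        uint32_t chunk = 0, scale = 1;
        for (int k = 0; k < 9 && i < len; k++, i++) {
            assert(s[i] >= '0' && s[i] <= '9');
            chunk = chunk * 10 + (uint32_t)(s[i] - '0');
            scale *= 10;
        }
        /* One N x 1 multiply per chunk folds those digits into the result. */
        uint32_t carry = mul_1_add(p, n, scale, chunk);
        if (carry != 0)
            p[n++] = carry;
    }
    return n;
}
```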



@node Other Algorithms, Assembler Coding, Radix Conversion Algorithms, Algorithms
@section Other Algorithms

@menu
* Factorial Algorithm::
* Binomial Coefficients Algorithm::
* Fibonacci Numbers Algorithm::
* Lucas Numbers Algorithm::
@end menu


@node Factorial Algorithm, Binomial Coefficients Algorithm, Other Algorithms, Other Algorithms
@subsection Factorial

Factorials @ma{n!} are calculated by a simple product from @ma{1} to @ma{n},
but arranged into certain sub-products.

First as many factors as fit in a limb are accumulated, then two of those
multiplied to give a 2-limb product. When two 2-limb products are ready
they're multiplied to a 4-limb product, and when two 4-limbs are ready they're
multiplied to an 8-limb product, etc. A stack of outstanding products is
built up, with two of the same size multiplied together when ready.

Arranging for multiplications to have operands the same (or nearly the same)
size means the Karatsuba and higher multiplication algorithms can be used.
And even on sizes below the Karatsuba threshold an N@cross{}N multiply will
give a basecase multiply more to work on.

An obvious improvement not currently implemented would be to strip factors of
2 from the products and apply them at the end with a bit shift. Another
possibility would be to determine the prime factorization of the result (which
can be done easily), and use a powering method, at each stage squaring then
multiplying in those primes with a 1 in their exponent at that point. The
advantage would be some multiplies turned into squares.


@node Binomial Coefficients Algorithm, Fibonacci Numbers Algorithm, Factorial Algorithm, Other Algorithms
@subsection Binomial Coefficients

Binomial coefficients @m{\left({n}\atop{k}\right), C(n@C{}k)} are calculated
by first arranging @ma{k @le{} n/2} using @m{\left({n}\atop{k}\right) =
\left({n}\atop{n-k}\right), C(n@C{}k) = C(n@C{}n-k)} if necessary, and then
evaluating the following product simply from @ma{i=2} to @ma{i=k}.

@tex
$$ \left({n}\atop{k}\right) = (n-k+1) \prod_{i=2}^{k} {{n-k+i} \over i} $$
@end tex
@ifnottex
@example
                        k  (n-k+i)
C(n,k) =  (n-k+1) * prod  -------
                       i=2    i
@end example
@end ifnottex

It's easy to show that each denominator @ma{i} will divide the product so far,
so the exact division algorithm is used (@pxref{Exact Division}).

The numerators @ma{n-k+i} and denominators @ma{i} are first accumulated into
as many as fit in a limb, to save multi-precision operations, though for
@code{mpz_bin_ui} this applies only to the divisors, since @ma{n} is an
@code{mpz_t} and @ma{n-k+i} in general won't fit in a limb at all.

An obvious improvement would be to strip factors of 2 from each multiplier and
divisor and count them separately, to be applied with a bit shift at the end.
Factors of 3 and perhaps 5 could even be handled similarly. Another
possibility, if @ma{n} is not too big, would be to determine the prime
factorization of the result based on the factorials involved, and power up
those primes appropriately. This would help most when @ma{k} is near
@ma{n/2}.
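
The product can be sketched on single 64-bit values as follows. The function
name @code{binomial} is hypothetical, nothing is accumulated into limb-sized
groups, and the caller is assumed to choose inputs small enough that the
intermediate product does not overflow. Each division is exact because the
running value after step @ma{i} is itself a binomial coefficient.

```c
#include <assert.h>
#include <stdint.h>

/* C(n,k) = (n-k+1) * prod_{i=2..k} (n-k+i)/i, for results fitting 64 bits. */
uint64_t binomial(uint64_t n, uint64_t k)
{
    assert(k <= n);
    if (k > n - k)
        k = n - k;              /* arrange k <= n/2 using C(n,k) = C(n,n-k) */
    if (k == 0)
        return 1;

    uint64_t c = n - k + 1;
    for (uint64_t i = 2; i <= k; i++) {
        /* Multiply first, then divide: i divides the product so far. */
        c = c * (n - k + i) / i;
    }
    return c;
}
```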


@node Fibonacci Numbers Algorithm, Lucas Numbers Algorithm, Binomial Coefficients Algorithm, Other Algorithms
@subsection Fibonacci Numbers

The Fibonacci functions @code{mpz_fib_ui} and @code{mpz_fib2_ui} are designed
for calculating isolated @m{F_n,F[n]} or @m{F_n,F[n]},@m{F_{n-1},F[n-1]}
values efficiently.

For small @ma{n}, a table of single limb values in @code{__gmp_fib_table} is
used. On a 32-bit limb this goes up to @m{F_{47},F[47]}, or on a 64-bit limb
up to @m{F_{93},F[93]}. For convenience the table starts at @m{F_{-1},F[-1]}.

Beyond the table, values are generated with a binary powering algorithm,
calculating a pair @m{F_n,F[n]} and @m{F_{n-1},F[n-1]} working from high to
low across the bits of @ma{n}. The formulas used are

@tex
$$\eqalign{
  F_{2k+1} &= 4F_k^2 - F_{k-1}^2 + 2(-1)^k \cr
  F_{2k-1} &= F_k^2 + F_{k-1}^2 \cr
  F_{2k} &= F_{2k+1} - F_{2k-1}
}$$
@end tex
@ifnottex
@example
F[2k+1] = 4*F[k]^2 - F[k-1]^2 + 2*(-1)^k
F[2k-1] =   F[k]^2 + F[k-1]^2

F[2k] = F[2k+1] - F[2k-1]
@end example
@end ifnottex

At each step, @ma{k} is the high @ma{b} bits of @ma{n}. If the next bit of
@ma{n} is 0 then @m{F_{2k},F[2k]},@m{F_{2k-1},F[2k-1]} is used, or if it's a 1
then @m{F_{2k+1},F[2k+1]},@m{F_{2k},F[2k]} is used, and the process repeated
until all bits of @ma{n} are incorporated. Notice these formulas require just
two squares per bit of @ma{n}.
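
The powering step can be sketched on 64-bit values, which covers @ma{n} up
to 93. The function name @code{fib2} is hypothetical and no table is used;
the sketch starts from @m{F_1,F[1]},@m{F_0,F[0]} at the top bit of @ma{n} and
applies the three formulas once per remaining bit.

```c
#include <assert.h>
#include <stdint.h>

/* Set *fn = F[n] and *fnm1 = F[n-1], for 1 <= n <= 93. */
void fib2(unsigned n, uint64_t *fn, uint64_t *fnm1)
{
    assert(n >= 1 && n <= 93);

    /* Position of the highest 1 bit of n. */
    int top = 0;
    while ((n >> (top + 1)) != 0)
        top++;

    uint64_t f = 1, f1 = 0;     /* F[k], F[k-1] with k = 1, the top bit of n */
    unsigned k = 1;

    for (int b = top - 1; b >= 0; b--) {
        uint64_t sq = f * f;                /* the two squares for this bit */
        uint64_t sq1 = f1 * f1;

        uint64_t f2km1 = sq + sq1;          /* F[2k-1] */
        uint64_t f2kp1 = 4 * sq - sq1;      /* F[2k+1], before the +-2 term */
        if (k & 1)
            f2kp1 -= 2;                     /* 2*(-1)^k for odd k */
        else
            f2kp1 += 2;                     /* 2*(-1)^k for even k */
        uint64_t f2k = f2kp1 - f2km1;       /* F[2k] */

        if ((n >> b) & 1) {
            f = f2kp1;  f1 = f2k;    k = 2 * k + 1;
        } else {
            f = f2k;    f1 = f2km1;  k = 2 * k;
        }
    }
    *fn = f;
    *fnm1 = f1;
}
```

For @ma{n=10} the pairs visited are @ma{k} = 1, 2, 5, 10, ending with
@ma{F[10]=55} and @ma{F[9]=34}.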

It'd be possible to handle the first few @ma{n} above the single limb table
with simple additions, using the defining Fibonacci recurrence @m{F_{k+1} =
F_k + F_{k-1}, F[k+1]=F[k]+F[k-1]}, but this is not done since it usually
turns out to be faster for only about 10 or 20 values of @ma{n}, and including
a block of code for just those doesn't seem worthwhile. If they really
mattered it'd be better to extend the data table.

Using a table avoids lots of calculations on small numbers, and makes small
@ma{n} go fast. A bigger table would make more small @ma{n} go fast, it's
just a question of balancing size against desired speed. For GMP the code is
kept compact, with the emphasis primarily on a good powering algorithm.

@code{mpz_fib2_ui} returns both @m{F_n,F[n]} and @m{F_{n-1},F[n-1]}, but
@code{mpz_fib_ui} is only interested in @m{F_n,F[n]}. In this case the last
step of the algorithm can become one multiply instead of two squares. One of
the following two formulas is used, according as @ma{n} is odd or even.

@tex
$$\eqalign{
  F_{2k} &= F_k (F_k + 2F_{k-1}) \cr
  F_{2k+1} &= (2F_k + F_{k-1}) (2F_k - F_{k-1}) + 2(-1)^k
}$$
@end tex
@ifnottex
@example
F[2k]   = F[k]*(F[k]+2F[k-1])

F[2k+1] = (2F[k]+F[k-1])*(2F[k]-F[k-1]) + 2*(-1)^k
@end example
@end ifnottex

@m{F_{2k+1},F[2k+1]} here is the same as above, just rearranged to be a
multiply. For interest, the @m{2(-1)^k, 2*(-1)^k} term both here and above
can be applied just to the low limb of the calculation, without a carry or
borrow into further limbs, which saves some code size. See comments with
@code{mpz_fib_ui} and the internal @code{mpn_fib2_ui} for how this is done.


@node Lucas Numbers Algorithm, , Fibonacci Numbers Algorithm, Other Algorithms
@subsection Lucas Numbers

@code{mpz_lucnum2_ui} derives a pair of Lucas numbers from a pair of Fibonacci
numbers with the following simple formulas.

@tex
$$\eqalign{
  L_k &= F_k + 2F_{k-1} \cr
  L_{k-1} &= 2F_k - F_{k-1}
}$$
@end tex
@ifnottex
@example
L[k]   =   F[k] + 2*F[k-1]
L[k-1] = 2*F[k] -   F[k-1]
@end example
@end ifnottex

@code{mpz_lucnum_ui} is only interested in @m{L_n,L[n]}, and some work can be
saved. Trailing zero bits on @ma{n} can be handled with a single square each.

@tex
$$ L_{2k} = L_k^2 - 2(-1)^k $$
@end tex
@ifnottex
@example
L[2k] = L[k]^2 - 2*(-1)^k
@end example
@end ifnottex

And the lowest 1 bit can be handled with one multiply of a pair of Fibonacci
numbers, similar to what @code{mpz_fib_ui} does.

@tex
$$ L_{2k+1} = 5F_{k-1} (2F_k + F_{k-1}) - 4(-1)^k $$
@end tex
@ifnottex
@example
L[2k+1] = 5*F[k-1]*(2*F[k]+F[k-1]) - 4*(-1)^k
@end example
@end ifnottex
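
These identities are easy to check on small values. The sketch below uses
hypothetical helper names and a plain additive @code{fib_pair}, not the
powering algorithm, purely to exercise the three Lucas formulas on 64-bit
integers.

```c
#include <assert.h>
#include <stdint.h>

/* *fn = F[n], *fnm1 = F[n-1] by the defining recurrence, for 1 <= n <= 90. */
static void fib_pair(unsigned n, int64_t *fn, int64_t *fnm1)
{
    int64_t a = 1, b = 0;       /* F[1], F[0] */
    for (unsigned i = 1; i < n; i++) {
        int64_t t = a + b;
        b = a;
        a = t;
    }
    *fn = a;
    *fnm1 = b;
}

/* L[n] = F[n] + 2*F[n-1]. */
int64_t lucas(unsigned n)
{
    int64_t f, f1;
    assert(n >= 1 && n <= 90);
    fib_pair(n, &f, &f1);
    return f + 2 * f1;
}

/* L[2k] = L[k]^2 - 2*(-1)^k: one square per trailing zero bit. */
int64_t lucas_double(int64_t lk, unsigned k)
{
    return lk * lk - ((k & 1) ? -2 : 2);
}

/* L[2k+1] = 5*F[k-1]*(2*F[k]+F[k-1]) - 4*(-1)^k: one multiply. */
int64_t lucas_odd(unsigned k)
{
    int64_t f, f1;
    assert(k >= 1 && k <= 40);
    fib_pair(k, &f, &f1);
    return 5 * f1 * (2 * f + f1) - ((k & 1) ? -4 : 4);
}
```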




@node Assembler Coding, , Other Algorithms, Algorithms
@section Assembler Coding

The assembler subroutines in GMP are the most significant source of speed at
small to moderate sizes. At larger sizes algorithm selection becomes more
important, but of course speedups in low level routines will still speed up
everything proportionally.

Carry handling and widening multiplies that are important for GMP can't be
easily expressed in C. GCC @code{asm} blocks help a lot and are provided in
@file{longlong.h}, but hand coding low level routines invariably offers a
speedup over generic C by a factor of anything from 2 to 10.

@menu
* Assembler Code Organisation::
* Assembler Basics::
* Assembler Carry Propagation::
* Assembler Cache Handling::
* Assembler Floating Point::
* Assembler SIMD Instructions::
* Assembler Software Pipelining::
* Assembler Loop Unrolling::
@end menu


@node Assembler Code Organisation, Assembler Basics, Assembler Coding, Assembler Coding
@subsection Code Organisation

The various @file{mpn} subdirectories contain machine-dependent code, written
in C or assembler. The @file{mpn/generic} subdirectory contains default code,
used when there's no machine-specific version of a particular file.

Each @file{mpn} subdirectory is for an ISA family. Generally 32-bit and
64-bit variants in a family cannot share code and will have separate
directories. Within a family further subdirectories may exist for CPU
variants.


@node Assembler Basics, Assembler Carry Propagation, Assembler Code Organisation, Assembler Coding
@subsection Assembler Basics

@code{mpn_addmul_1} and @code{mpn_submul_1} are the most important routines
for overall GMP performance. All multiplications and divisions come down to
repeated calls to these. @code{mpn_add_n}, @code{mpn_sub_n},
@code{mpn_lshift} and @code{mpn_rshift} are next most important.

On some CPUs assembler versions of the internal functions
@code{mpn_mul_basecase} and @code{mpn_sqr_basecase} give significant speedups,
mainly through avoiding function call overheads. They can also potentially
make better use of a wide superscalar processor.

The restrictions on overlaps between sources and destinations
(@pxref{Low-level Functions}) are designed to facilitate a variety of
implementations. For example, knowing @code{mpn_add_n} won't have partly
overlapping sources and destination means reading can be done far ahead of
writing on superscalar processors, and loops can be vectorized on a vector
processor, depending on the carry handling.


@node Assembler Carry Propagation, Assembler Cache Handling, Assembler Basics, Assembler Coding
@subsection Carry Propagation

The problem that presents most challenges in GMP is propagating carries from
one limb to the next. In functions like @code{mpn_addmul_1} and
@code{mpn_add_n}, carries are the only dependencies between limb operations.

On processors with carry flags, a straightforward CISC style @code{adc} is
generally best. AMD K6 @code{mpn_addmul_1} however is an example of an
unusual set of circumstances where a branch works out better.

On RISC processors generally an add and compare for overflow is used. This
sort of thing can be seen in @file{mpn/generic/aors_n.c}. Some carry
propagation schemes require 4 instructions, meaning at least 4 cycles per
limb, but other schemes may use just 1 or 2. On wide superscalar processors
performance may be completely determined by the number of dependent
instructions between carry-in and carry-out for each limb.

On vector processors good use can be made of the fact that a carry bit only
very rarely propagates more than one limb. When adding a single bit to a
limb, there's only a carry out if that limb was @code{0xFF...FF} which on
random data will be only 1 in @m{2\GMPraise{@code{mp\_bits\_per\_limb}},
2^mp_bits_per_limb}. @file{mpn/cray/add_n.c} is an example of this, it adds
all limbs in parallel, adds one set of carry bits in parallel and then only
rarely needs to fall through to a loop propagating further carries.

On the x86s, GCC (as of version 2.95.2) doesn't generate particularly good code
for the RISC style idioms that are necessary to handle carry bits in
C. Often conditional jumps are generated where @code{adc} or @code{sbb} forms
would be better. And so unfortunately almost any loop involving carry bits
needs to be coded in assembler for best results.
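
The add and compare idiom can be written in portable C along the following
lines. This is a sketch in the spirit of the generic code and not a copy of
it; @code{limb_t} and @code{add_n} are hypothetical names and a 64-bit limb
is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;

/* {rp,n} = {ap,n} + {bp,n}, returning the carry out (0 or 1).  rp may be the
   same as ap or bp, since each limb is read before it is written. */
limb_t add_n(limb_t *rp, const limb_t *ap, const limb_t *bp, size_t n)
{
    limb_t carry = 0;

    assert(n == 0 || (rp != NULL && ap != NULL && bp != NULL));
    for (size_t i = 0; i < n; i++) {
        limb_t a = ap[i];
        limb_t s = a + bp[i];
        limb_t c1 = s < a;      /* overflow from a + b */
        limb_t r = s + carry;
        limb_t c2 = r < s;      /* overflow from adding the carry-in */
        rp[i] = r;
        carry = c1 | c2;        /* the two overflows cannot both happen */
    }
    return carry;
}
```

The chain from carry-in to carry-out here is an add, a compare and an or,
which is the dependent sequence that limits speed on a wide processor.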


@node Assembler Cache Handling, Assembler Floating Point, Assembler Carry Propagation, Assembler Coding
@subsection Cache Handling

GMP aims to perform well both on operands that fit entirely in L1 cache and
those that don't. In the assembler subroutines this means prefetching, either
always or when large enough operands are presented.

Prefetching sources combines well with loop unrolling, since a prefetch can
be initiated once per unrolled loop (or more than once if the loop processes
more than one cache line).

Prefetching destinations won't be necessary if the CPU has a big enough store
queue. Older processors without a write-allocate L1 however will want
destination prefetching, to avoid repeated write-throughs, unless they can
keep up with the rate at which destination limbs are produced.

The distance ahead to prefetch will be determined by the rate data is
processed versus the time it takes to bring a line up to L1. Naturally the
net data rate from L2 or RAM will always limit the rate of data processing.
Prefetch distance may also be limited by the number of prefetches the
processor can have in progress at any one time.

If a special prefetch instruction doesn't exist then a plain load can be used,
so long as the CPU supports out-of-order loads. But this may mean having a
second copy of a loop so that the last few limbs can be processed without
prefetching, since reading past the end of an operand must be avoided.


@node Assembler Floating Point, Assembler SIMD Instructions, Assembler Cache Handling, Assembler Coding
@subsection Floating Point

Floating point arithmetic is used in GMP for multiplications on CPUs with poor
integer multipliers. Floating point generally doesn't suit other operations
like additions or shifts, due to difficulties implementing carry handling.

With IEEE 53-bit double precision floats, integer multiplications producing up
to 53 bits will give exact results. Breaking a multiplication into
16@cross{}@ma{32@rightarrow{}48} bit pieces is convenient. With some care
though three 21@cross{}@ma{32@rightarrow{}53} bit products can be used to do a
64@cross{}32 multiply, if one of those 21@cross{}32 parts uses the sign bit.

Generally limbs want to be treated as unsigned, but on some CPUs floating
point conversions only treat integers as signed. Copying through a zero
extended memory region or testing and adjusting for a sign bit may be
required.

Currently floating point FFTs aren't used for large multiplications. On some
processors they probably have a good chance of being worthwhile, if great care
is taken with precision control.


@node Assembler SIMD Instructions, Assembler Software Pipelining, Assembler Floating Point, Assembler Coding
@subsection SIMD Instructions

The single-instruction multiple-data support in current microprocessors is
aimed at signal processing algorithms where each data point can be treated
more or less independently. There's generally not much support for
propagating the sort of carries that arise in GMP.

SIMD multiplications of say four 16@cross{}16 bit multiplies only do as much
work as one 32@cross{}32 from GMP's point of view, and need some shifts and
adds besides. But of course if say the SIMD form is fully pipelined and uses
less instruction decoding then it may still be worthwhile.

On the 80x86 chips, MMX has so far found a use in @code{mpn_rshift} and
@code{mpn_lshift} since it allows 64-bit operations, and is used in a special
case for 16-bit multipliers in the P55 @code{mpn_mul_1}. 3DNow and SSE
haven't found a use so far.


@node Assembler Software Pipelining, Assembler Loop Unrolling, Assembler SIMD Instructions, Assembler Coding
@subsection Software Pipelining

Software pipelining consists of scheduling instructions around the branch
point in a loop. For example a loop taking a checksum of an array of limbs
might have a load and an add, but the load wouldn't be for that add, rather
for the one next time around the loop. Each load then is effectively
scheduled back in the previous iteration, allowing latency to be hidden.

Naturally this is wanted only when doing things like loads or multiplies that
take a few cycles to complete, and only where a CPU has multiple functional
units so that other work can be done while waiting.

A pipeline with several stages will have a data value in progress at each
stage and each loop iteration moves them along one stage. This is like
juggling.

Within the loop some moves between registers may be necessary to have the
right values in the right places for each iteration. Loop unrolling can help
this, with each unrolled block able to use different registers for different
values, even if some shuffling is still needed just before going back to the
start of the loop.
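
The checksum example can be written out in C to show the shape of a two-stage
pipeline. This is only a structural sketch with hypothetical names; a C
compiler is free to reschedule it, whereas in assembler the ordering would be
exactly as written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;

/* Sum of {p,n} modulo 2^64, with each load issued one iteration before the
   add that uses it. */
limb_t checksum(const limb_t *p, size_t n)
{
    assert(p != NULL || n == 0);
    if (n == 0)
        return 0;

    limb_t sum = 0;
    limb_t loaded = p[0];           /* setup: the first load has no add yet */
    for (size_t i = 1; i < n; i++) {
        limb_t next = p[i];         /* load for the following iteration */
        sum += loaded;              /* add the limb loaded last time around */
        loaded = next;              /* register move to line values up */
    }
    sum += loaded;                  /* finishup: the last add has no load */
    return sum;
}
```

The setup and finishup outside the loop keep the code from reading past the
end of the operand.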


@node Assembler Loop Unrolling, , Assembler Software Pipelining, Assembler Coding
@subsection Loop Unrolling

Loop unrolling consists of replicating code so that several limbs are
processed in each loop. At a minimum this reduces loop overheads by a
corresponding factor, but it can also allow better register usage, for example
alternately using one register combination and then another. Judicious use of
@command{m4} macros can help avoid lots of duplication in the source code.

Unrolling is commonly done to a power of 2 multiple so the number of unrolled
loops and the number of remaining limbs can be calculated with a shift and
mask. But other multiples can be used too, just by subtracting each @var{n}
limbs processed from a counter and waiting for less than @var{n} remaining (or
offsetting the counter by @var{n} so it goes negative when there's less than
@var{n} remaining).

The limbs not a multiple of the unrolling can be handled in various ways, for
example

@itemize @bullet
@item
A simple loop at the end (or the start) to process the excess. Care will be
wanted that it isn't too much slower than the unrolled part.

@item
A set of binary tests, for example after an 8-limb unrolling, test for 4 more
limbs to process, then a further 2 more or not, and finally 1 more or not.
This will probably take more code space than a simple loop.

@item
A @code{switch} statement, providing separate code for each possible excess,
for example an 8-limb unrolling would have separate code for 0 remaining, 1
remaining, etc, up to 7 remaining. This might take a lot of code, but may be
the best way to optimize all cases in combination with a deep pipelined loop.

@item
A computed jump into the middle of the loop, thus making the first iteration
handle the excess. This should make times smoothly increase with size, which
is attractive, but setups for the jump and adjustments for pointers can be
tricky and could become quite difficult in combination with deep pipelining.
@end itemize

One way to write the setups and finishups for a pipelined unrolled loop is
simply to duplicate the loop at the start and the end, then delete
instructions at the start which have no valid antecedents, and delete
instructions at the end whose results are unwanted. Sizes not a multiple of
the unrolling can then be handled as desired.
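
As a small illustration in C, a 4-way unrolling with the loop count and the
excess found by a shift and a mask, and the excess handled by the first of
the options listed above, a simple loop at the end. The names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;

/* Sum of {p,n} modulo 2^64, processing 4 limbs per loop. */
limb_t sum_unrolled(const limb_t *p, size_t n)
{
    assert(p != NULL || n == 0);

    size_t blocks = n >> 2;     /* number of unrolled loops, by a shift */
    size_t excess = n & 3;      /* limbs left over, by a mask */

    /* Separate accumulators stand in for alternating register use. */
    limb_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    while (blocks-- > 0) {
        s0 += p[0];
        s1 += p[1];
        s2 += p[2];
        s3 += p[3];
        p += 4;
    }

    /* Simple loop for the 0 to 3 limbs not covered by the unrolled part. */
    while (excess-- > 0)
        s0 += *p++;

    return s0 + s1 + s2 + s3;
}
```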


@node Internals, Contributors, Algorithms, Top
@chapter Internals

@strong{This chapter is provided only for informational purposes and the
various internals described here may change in future GMP releases.
Applications expecting to be compatible with future releases should use only
the documented interfaces described in previous chapters.}

@menu
* Integer Internals::
* Rational Internals::
* Float Internals::
* Raw Output Internals::
* C++ Interface Internals::
@end menu


@node Integer Internals, Rational Internals, Internals, Internals
@section Integer Internals

@code{mpz_t} variables represent integers using sign and magnitude, in space
dynamically allocated and reallocated. The fields are as follows.

@table @asis
@item @code{_mp_size}
The number of limbs, or the negative of that when representing a negative
integer. Zero is represented by @code{_mp_size} set to zero, in which case
the @code{_mp_d} data is unused.

@item @code{_mp_d}
A pointer to an array of limbs which is the magnitude. These are stored
``little endian'' as per the @code{mpn} functions, so @code{_mp_d[0]} is the
least significant limb and @code{_mp_d[ABS(_mp_size)-1]} is the most
significant. Whenever @code{_mp_size} is non-zero, the most significant limb
is non-zero.

Currently there's always at least one limb allocated, so for instance
@code{mpz_set_ui} never needs to reallocate, and @code{mpz_get_ui} can fetch
@code{_mp_d[0]} unconditionally (though its value is then only wanted if
@code{_mp_size} is non-zero).

@item @code{_mp_alloc}
@code{_mp_alloc} is the number of limbs currently allocated at @code{_mp_d},
and naturally @code{_mp_alloc >= ABS(_mp_size)}. When an @code{mpz} routine
is about to (or might be about to) increase @code{_mp_size}, it checks
@code{_mp_alloc} to see whether there's enough space, and reallocates if not.
@code{MPZ_REALLOC} is generally used for this.
@end table

The various bitwise logical functions like @code{mpz_and} behave as if
negative values were twos complement. But sign and magnitude is always used
internally, and necessary adjustments are made during the calculations.
Sometimes this isn't pretty, but sign and magnitude are best for other
routines.

Some internal temporary variables are set up with @code{MPZ_TMP_INIT} and these
have @code{_mp_d} space obtained from @code{TMP_ALLOC} rather than the memory
allocation functions. Care is taken to ensure that these are big enough that
no reallocation is necessary (since it would have unpredictable consequences).


@node Rational Internals, Float Internals, Integer Internals, Internals
@section Rational Internals

@code{mpq_t} variables represent rationals using an @code{mpz_t} numerator and
denominator (@pxref{Integer Internals}).

The canonical form adopted is denominator positive (and non-zero), no common
factors between numerator and denominator, and zero uniquely represented as
0/1.

It's believed that casting out common factors at each stage of a calculation
is best in general. A GCD is an @ma{O(N^2)} operation so it's better to do a
few small ones immediately than to delay and have to do a big one later.
Knowing the numerator and denominator have no common factors can be used for
example in @code{mpq_mul} to make only two cross GCDs necessary, not four.

This general approach to common factors is badly sub-optimal in the presence
of simple factorizations or little prospect for cancellation, but GMP has no
way to know when this will occur. As per @ref{Efficiency}, that's left to
applications. The @code{mpq_t} framework might still suit, with
@code{mpq_numref} and @code{mpq_denref} for direct access to the numerator and
denominator, or of course @code{mpz_t} variables can be used directly.


@node Float Internals, Raw Output Internals, Rational Internals, Internals
@section Float Internals

Efficient calculation is the primary aim of GMP floats and the use of whole
limbs and simple rounding facilitates this.

@code{mpf_t} floats have a variable precision mantissa and a single machine
word signed exponent. The mantissa is represented using sign and magnitude.

@example
   most                   least
significant            significant
   limb                   limb

                            _mp_d
 |---- _mp_exp --->           |
  _____ _____ _____ _____ _____
 |_____|_____|_____|_____|_____|
                   . <------------ radix point

  <-------- _mp_size --------->
@end example

The fields are as follows.

@table @asis
@item @code{_mp_size}
The number of limbs currently in use, or the negative of that when
representing a negative value. Zero is represented by @code{_mp_size} and
@code{_mp_exp} both set to zero, and in that case the @code{_mp_d} data is
unused. (In the future @code{_mp_exp} might be undefined when representing
zero.)

@item @code{_mp_prec}
The precision of the mantissa, in limbs. In any calculation the aim is to
produce @code{_mp_prec} limbs of result (the most significant being non-zero).

@item @code{_mp_d}
A pointer to the array of limbs which is the absolute value of the mantissa.
These are stored ``little endian'' as per the @code{mpn} functions, so
@code{_mp_d[0]} is the least significant limb and
@code{_mp_d[ABS(_mp_size)-1]} the most significant.

The most significant limb is always non-zero, but there are no other
restrictions on its value, in particular the highest 1 bit can be anywhere
within the limb.

@code{_mp_prec+1} limbs are allocated to @code{_mp_d}, the extra limb being
for convenience (see below). There are no reallocations during a calculation,
only in a change of precision with @code{mpf_set_prec}.

@item @code{_mp_exp}
The exponent, in limbs, determining the location of the implied radix point.
Zero means the radix point is just above the most significant limb. Positive
values mean a radix point offset towards the lower limbs and hence a value
@ma{@ge{} 1}, as for example in the diagram above. Negative exponents mean a
radix point further above the highest limb.

Naturally the exponent can be any value, it doesn't have to fall within the
limbs as the diagram shows, it can be a long way above or a long way below.
Limbs other than those included in the @code{@{_mp_d,_mp_size@}} data
are treated as zero.
@end table
7964
The following various points should be noted.
7968
The least significant limbs @code{_mp_d[0]} etc can be zero, though such low
7969
zeros can always be ignored. Routines likely to produce low zeros check and
7970
avoid them to save time in subsequent calculations, but for most routines
7971
they're quite unlikely and aren't checked.
7973
@item Mantissa Size Range
7974
The @code{_mp_size} count of limbs in use can be less than @code{_mp_prec} if
7975
the value can be represented in less. This means low precision values or
7976
small integers stored in a high precision @code{mpf_t} can still be operated
7979
@code{_mp_size} can also be greater than @code{_mp_prec}. Firstly a value is
7980
allowed to use all of the @code{_mp_prec+1} limbs available at @code{_mp_d},
7981
and secondly when @code{mpf_set_prec_raw} lowers @code{_mp_prec} it leaves
7982
@code{_mp_size} unchanged and so the size can be arbitrarily bigger than
7986
All rounding is done on limb boundaries. Calculating @code{_mp_prec} limbs
7987
with the high non-zero will ensure the application requested minimum precision
7990
The use of simple ``trunc'' rounding towards zero is efficient, since there's
7991
no need to examine extra limbs and increment or decrement.
7994
Since the exponent is in limbs, there are no bit shifts in basic operations
7995
like @code{mpf_add} and @code{mpf_mul}. When differing exponents are
7996
encountered all that's needed is to adjust pointers to line up the relevant
7999
Of course @code{mpf_mul_2exp} and @code{mpf_div_2exp} will require bit shifts,
8000
but the choice is between an exponent in limbs which requires shifts there, or
8001
one in bits which requires them almost everywhere else.
8003
@item Use of @code{_mp_prec+1} Limbs
8004
The extra limb on @code{_mp_d} (@code{_mp_prec+1} rather than just
8005
@code{_mp_prec}) helps when an @code{mpf} routine might get a carry from its
8006
operation. @code{mpf_add} for instance will do an @code{mpn_add} of
8007
@code{_mp_prec} limbs. If there's no carry then that's the result, but if
8008
there is a carry then it's stored in the extra limb of space and
8009
@code{_mp_size} becomes @code{_mp_prec+1}.

Whenever @code{_mp_prec+1} limbs are held in a variable, the low limb is not
needed for the intended precision, only the @code{_mp_prec} high limbs. But
zeroing it out or moving the rest down is unnecessary. Subsequent routines
reading the value will simply take the high limbs they need, and this will be
@code{_mp_prec} if their target has that same precision. This is no more than
a pointer adjustment, and must be checked anyway since the destination
precision can be different from that of the sources.

Copy functions like @code{mpf_set} will retain a full @code{_mp_prec+1} limbs
if available. This ensures that a variable which has @code{_mp_size} equal to
@code{_mp_prec+1} will get its full exact value copied. Strictly speaking
this is unnecessary since only @code{_mp_prec} limbs are needed for the
application's requested precision, but it's considered that an @code{mpf_set}
from one variable into another of the same precision ought to produce an exact
copy.

@item Application Precisions
@code{__GMPF_BITS_TO_PREC} converts an application requested precision to an
@code{_mp_prec}. The value in bits is rounded up to a whole limb then an
extra limb is added since the most significant limb of @code{_mp_d} is only
guaranteed to be non-zero and therefore might contain only one bit.

@code{__GMPF_PREC_TO_BITS} does the reverse conversion, and removes the extra
limb from @code{_mp_prec} before converting to bits. The net effect of
reading back with @code{mpf_get_prec} is simply the precision rounded up to a
multiple of @code{mp_bits_per_limb}.

Note that the extra limb added here, to allow for the high limb being only
non-zero, is in addition to the extra limb allocated to @code{_mp_d}. For
example with a 32-bit limb, an application request for 250 bits will be
rounded up to 8 limbs, then an extra limb added for the high limb being only
non-zero, giving an @code{_mp_prec} of 9. @code{_mp_d} then gets 10 limbs
allocated. Reading back with @code{mpf_get_prec} will take @code{_mp_prec},
subtract 1 limb and multiply by 32, giving 256 bits.

Strictly speaking, the fact that the high limb has at least one bit means that
a float with, say, 3 limbs of 32 bits each will be holding at least 65 bits,
but for the purposes of @code{mpf_t} it's considered simply to be 64 bits, a
nice multiple of the limb size.
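
The arithmetic of the 250-bit example above can be checked with a small C++
sketch. The helper functions below are hypothetical stand-ins written only
from the description in this section. The real @code{__GMPF_BITS_TO_PREC} and
@code{__GMPF_PREC_TO_BITS} macros may differ in detail, so treat this as an
illustration rather than the library's definition.

```cpp
// Hypothetical helpers modelling the conversions described in the text.
// They are not the GMP macros themselves.

// Round the requested bits up to whole limbs, then add one extra limb
// because the high limb is only known to be non-zero.
long bits_to_prec(long bits, long bits_per_limb)
{
  return (bits + bits_per_limb - 1) / bits_per_limb + 1;
}

// Reverse conversion: drop the extra limb, then convert limbs to bits.
long prec_to_bits(long prec, long bits_per_limb)
{
  return (prec - 1) * bits_per_limb;
}

// Space given to the limb data: one more limb than the precision,
// leaving room for a carry.
long limbs_allocated(long prec)
{
  return prec + 1;
}
```

With a 32-bit limb, @code{bits_to_prec(250, 32)} gives 9,
@code{limbs_allocated(9)} gives 10 and @code{prec_to_bits(9, 32)} gives 256,
matching the figures quoted above.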

@node Raw Output Internals, C++ Interface Internals, Float Internals, Internals
@section Raw Output Internals

@code{mpz_out_raw} uses the following format.

@example
+------+------------------------+
| size |       data bytes       |
+------+------------------------+
@end example

The size is 4 bytes written most significant byte first, being the number of
subsequent data bytes, or the two's complement negative of that when a
negative integer is represented. The data bytes are the absolute value of the
integer, written most significant byte first.

The most significant data byte is always non-zero, so the output is the same
on all systems, irrespective of limb size.

In GMP 1, leading zero bytes were written to pad the data bytes to a multiple
of the limb size. @code{mpz_inp_raw} will still accept this, for
compatibility.

The use of ``big endian'' for both the size and data fields is deliberate: it
makes the data easy to read in a hex dump of a file. Unfortunately it also
means that the limb data must be reversed when reading or writing, so neither
a big endian nor little endian system can just read and write @code{_mp_d}.
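
To make the layout concrete, here is a minimal C++ sketch that produces the
same byte sequence for a value small enough to fit in a @code{long long}.
The function @code{raw_encode} is a hypothetical helper written from the
description above. It does not use GMP and is not a substitute for
@code{mpz_out_raw}.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: encode value in the raw layout described above,
// a 4-byte big endian size followed by the magnitude, most significant
// byte first.
std::vector<unsigned char> raw_encode(long long value)
{
  // Absolute value, formed in unsigned arithmetic so that the most
  // negative long long is handled without overflow.
  unsigned long long mag = value < 0
    ? 0ULL - static_cast<unsigned long long>(value)
    : static_cast<unsigned long long>(value);

  // Data bytes, most significant first, with no leading zero byte.
  std::vector<unsigned char> data;
  for (; mag != 0; mag >>= 8)
    data.insert(data.begin(), static_cast<unsigned char>(mag & 0xFF));

  // Size field: the byte count, or its two's complement negative for a
  // negative value.
  std::uint32_t size = static_cast<std::uint32_t>(data.size());
  if (value < 0)
    size = 0u - size;

  std::vector<unsigned char> out;
  for (int shift = 24; shift >= 0; shift -= 8)
    out.push_back(static_cast<unsigned char>((size >> shift) & 0xFF));
  out.insert(out.end(), data.begin(), data.end());
  return out;
}
```

For instance @code{raw_encode(0x1234)} gives the bytes
@code{00 00 00 02 12 34}, and @code{raw_encode(-1)} gives
@code{FF FF FF FF 01}. Zero has no data bytes in this sketch, so only the
four zero size bytes are produced.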

@node C++ Interface Internals, , Raw Output Internals, Internals
@section C++ Interface Internals

A system of expression templates is used to ensure something like @code{a=b+c}
turns into a simple call to @code{mpz_add} etc. For @code{mpf_class} and
@code{mpfr_class} the scheme also ensures the precision of the final
destination is used for any temporaries within a statement like
@code{f=w*x+y*z}. These are important features which a naive implementation
cannot provide.

A simplified description of the scheme follows. The true scheme is
complicated by the fact that expressions have different return types. For
detailed information, refer to the source code.

To perform an operation, say, addition, we first define a ``function object''
evaluating it,

@example
struct __gmp_binary_plus
@{
  static void eval(mpf_t f, mpf_t g, mpf_t h) @{ mpf_add(f, g, h); @}
@};
@end example

And an ``additive expression'' object,

@example
__gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >
operator+(const mpf_class &f, const mpf_class &g)
@{
  return __gmp_expr
    <__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >(f, g);
@}
@end example

The seemingly redundant @code{__gmp_expr<__gmp_binary_expr<...>>} is used to
encapsulate any possible kind of expression into a single template type. In
fact even @code{mpf_class} etc are @code{typedef} specializations of
@code{__gmp_expr}.

Next we define assignment of @code{__gmp_expr} to @code{mpf_class}.

@example
template <class T>
mpf_class & mpf_class::operator=(const __gmp_expr<T> &expr)
@{
  expr.eval(this->get_mpf_t(), this->precision());
  return *this;
@}

template <class Op>
void __gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, Op> >::eval
(mpf_t f, unsigned long int precision)
@{
  Op::eval(f, expr.val1.get_mpf_t(), expr.val2.get_mpf_t());
@}
@end example

where @code{expr.val1} and @code{expr.val2} are references to the expression's
operands (here @code{expr} is the @code{__gmp_binary_expr} stored within the
@code{__gmp_expr}).

This way, the expression is actually evaluated only at the time of assignment,
when the required precision (that of @code{f}) is known. Furthermore the
target @code{mpf_t} is now available, thus we can call @code{mpf_add} directly
with @code{f} as the output argument.

Compound expressions are handled by defining operators taking subexpressions
as their arguments, like this:

@example
template <class T, class U>
__gmp_expr
<__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> >
operator+(const __gmp_expr<T> &expr1, const __gmp_expr<U> &expr2)
@{
  return __gmp_expr
    <__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> >
    (expr1, expr2);
@}
@end example

And the corresponding specializations of @code{__gmp_expr::eval}:

@example
template <class T, class U, class Op>
void __gmp_expr
<__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, Op> >::eval
(mpf_t f, unsigned long int precision)
@{
  // declare two temporaries
  mpf_class temp1(expr.val1, precision), temp2(expr.val2, precision);
  Op::eval(f, temp1.get_mpf_t(), temp2.get_mpf_t());
@}
@end example

The expression is thus recursively evaluated to any level of complexity and
all subexpressions are evaluated to the precision of @code{f}.
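
The idea can be tried out without GMP using a self-contained toy. In the
sketch below every name (@code{Num}, @code{BinExpr}, @code{Plus},
@code{Times}, @code{truncate_to}) is hypothetical, a @code{double} stands in
for @code{mpf_t}, and ``precision'' is modelled as a number of fractional
bits kept after truncation. It only illustrates deferred evaluation at the
destination's precision and is far simpler than the scheme in
@file{gmpxx.h}.

```cpp
#include <cmath>

// Toy model of truncating a result to a given number of fractional bits.
inline double truncate_to(double x, int precision)
{
  const double scale = std::ldexp(1.0, precision);
  return std::trunc(x * scale) / scale;
}

// "Function objects" naming the operations.
struct Plus  { static double apply(double a, double b) { return a + b; } };
struct Times { static double apply(double a, double b) { return a * b; } };

// An unevaluated "lhs Op rhs", holding its operands by reference.
template <class L, class R, class Op>
struct BinExpr
{
  const L &lhs;
  const R &rhs;

  double eval(int precision) const
  {
    // Both temporaries are computed at the destination's precision.
    const double t1 = lhs.eval(precision);
    const double t2 = rhs.eval(precision);
    return truncate_to(Op::apply(t1, t2), precision);
  }
};

// Toy stand-in for mpf_class: a value and the precision of the variable.
struct Num
{
  double value;
  int precision;

  Num(double v, int p) : value(v), precision(p) {}
  double eval(int) const { return value; }

  // Evaluation happens only here, once the destination is known.
  template <class L, class R, class Op>
  Num &operator=(const BinExpr<L, R, Op> &expr)
  {
    value = expr.eval(precision);
    return *this;
  }
};

// Operators on plain variables build expression objects.
inline BinExpr<Num, Num, Plus> operator+(const Num &a, const Num &b)
{ return {a, b}; }
inline BinExpr<Num, Num, Times> operator*(const Num &a, const Num &b)
{ return {a, b}; }

// Operator on two subexpressions, for compound expressions.
template <class L1, class R1, class O1, class L2, class R2, class O2>
BinExpr<BinExpr<L1, R1, O1>, BinExpr<L2, R2, O2>, Plus>
operator+(const BinExpr<L1, R1, O1> &a, const BinExpr<L2, R2, O2> &b)
{ return {a, b}; }
```

With @code{w}, @code{x}, @code{y} and @code{z} holding 1.5, 2.5, 0.25 and
0.5, the statement @code{f = w * x + y * z} stores 3.5 when @code{f} keeps 1
fractional bit and 3.875 when it keeps 8, because each subexpression is
truncated to the precision of the destination. The expression objects hold
references to temporaries, so in this sketch they must be assigned within the
same statement that creates them.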

@node Contributors, References, Internals, Top
@comment node-name, next, previous, up
@appendix Contributors
@cindex Contributors

Torbjorn Granlund wrote the original GMP library and is still developing and
maintaining it. Several other individuals and organizations have contributed
to GMP in various ways. Here is a list in chronological order:

Gunnar Sjoedin and Hans Riesel helped with mathematical problems in early
versions of the library.

Richard Stallman contributed to the interface design and revised the first
version of this manual.

Brian Beuning and Doug Lea helped with testing of early versions of the
library and made creative suggestions.

John Amanatides of York University in Canada contributed the function
@code{mpz_probab_prime_p}.

Paul Zimmermann of Inria sparked the development of GMP 2, with his
comparisons between bignum packages.

Ken Weber (Kent State University, Universidade Federal do Rio Grande do Sul)
contributed @code{mpz_gcd}, @code{mpz_divexact}, @code{mpn_gcd}, and
@code{mpn_bdivmod}, partially supported by CNPq (Brazil) grant 301314194-2.

Per Bothner of Cygnus Support helped to set up GMP to use Cygnus' configure.
He has also made valuable suggestions and tested numerous intermediary
releases.

Joachim Hollman was involved in the design of the @code{mpf} interface, and in
the @code{mpz} design revisions for version 2.

Bennet Yee contributed the initial versions of @code{mpz_jacobi} and
@code{mpz_legendre}.

Andreas Schwab contributed the files @file{mpn/m68k/lshift.S} and
@file{mpn/m68k/rshift.S} (now in @file{.asm} form).

The development of the floating point functions of GNU MP 2 was supported in
part by the ESPRIT-BRA (Basic Research Activities) 6846 project POSSO
(POlynomial System SOlving).

GNU MP 2 was finished and released by SWOX AB, SWEDEN, in cooperation with the
IDA Center for Computing Sciences, USA.

Robert Harley of Inria, France and David Seal of ARM, England, suggested clever
improvements for population count.

Robert Harley also wrote highly optimized Karatsuba and 3-way Toom
multiplication functions for GMP 3. He also contributed the ARM assembly
code.

Torsten Ekedahl of the mathematics department of Stockholm University provided
significant inspiration during several phases of the GMP development. His
mathematical expertise helped improve several algorithms.

Paul Zimmermann wrote the Divide and Conquer division code, the REDC code, the
REDC-based @code{mpz_powm} code, the FFT multiply code, and the Karatsuba
square root. The ECMNET project Paul is organizing was a driving force behind
many of the optimizations in GMP 3.

Linus Nordberg wrote the new configure system based on autoconf and
implemented the new random functions.

Kent Boortz made the Macintosh port.

Kevin Ryde worked on a number of things: optimized x86 code, m4 asm macros,
parameter tuning, speed measuring, the configure system, function inlining,
divisibility tests, bit scanning, Jacobi symbols, Fibonacci and Lucas number
functions, printf and scanf functions, perl interface, demo expression parser,
the algorithms chapter in the manual, gmpasm-mode.el, and various
miscellaneous improvements elsewhere.

Steve Root helped write the optimized alpha 21264 assembly code.

Gerardo Ballabio wrote the @file{gmpxx.h} C++ class interface and the C++
istream input routines.

GNU MP 4.0.1 was finished and released by Torbjorn Granlund and Kevin Ryde.
Torbjorn's work was partially funded by the IDA Center for Computing Sciences,
USA.

(This list is chronological, not ordered by significance. If you have
contributed to GMP but are not listed above, please tell @email{tege@@swox.com}
about the omission!)

@node References, GNU Free Documentation License, Contributors, Top
@comment node-name, next, previous, up
@appendix References

@c FIXME: In tex, the @uref's are unhyphenated, which is good for clarity,
@c but being long words they upset paragraph formatting (the preceding line
@c can get badly stretched). Would like a conditional @* style line break
@c if the uref is too long to fit on the last line of the paragraph, but it's
@c not clear how to do that. For now explicit @texlinebreak{}s are used on
@c paragraphs that come out bad.

Henri Cohen, ``A Course in Computational Algebraic Number Theory'', Graduate
Texts in Mathematics number 138, Springer-Verlag, 1993.
@texlinebreak{} @uref{http://www.math.u-bordeaux.fr/~cohen}

Donald E. Knuth, ``The Art of Computer Programming'', volume 2,
``Seminumerical Algorithms'', 3rd edition, Addison-Wesley, 1998.
@texlinebreak{} @uref{http://www-cs-faculty.stanford.edu/~knuth/taocp.html}

John D. Lipson, ``Elements of Algebra and Algebraic Computing'',
The Benjamin Cummings Publishing Company Inc, 1981.

Alfred J. Menezes, Paul C. van Oorschot and Scott A. Vanstone, ``Handbook of
Applied Cryptography'', @uref{http://www.cacr.math.uwaterloo.ca/hac/}

Richard M. Stallman, ``Using and Porting GCC'', Free Software Foundation, 1999,
available online @uref{http://www.gnu.org/software/gcc/onlinedocs/}, and in
the GCC package @uref{ftp://ftp.gnu.org/gnu/gcc/}

Christoph Burnikel and Joachim Ziegler, ``Fast Recursive Division'',
Max-Planck-Institut fuer Informatik Research Report MPI-I-98-1-022, @texlinebreak{}
@uref{http://www.mpi-sb.mpg.de/~ziegler/TechRep.ps.gz}

Torbjorn Granlund and Peter L. Montgomery, ``Division by Invariant Integers
using Multiplication'', in Proceedings of the SIGPLAN PLDI'94 Conference, June
1994. Also available @uref{ftp://ftp.cwi.nl/pub/pmontgom/divcnst.psa4.gz}

Peter L. Montgomery, ``Modular Multiplication Without Trial Division'', in
Mathematics of Computation, volume 44, number 170, April 1985.

``An algorithm for exact division'',
Journal of Symbolic Computation,
volume 15, 1993, pp. 169-180.
Research report version available @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-35.ps.gz}

Tudor Jebelean, ``Exact Division with Karatsuba Complexity - Extended
Abstract'', RISC-Linz technical report 96-31, @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-31.ps.gz}

Tudor Jebelean, ``Practical Integer Division with Karatsuba Complexity'',
ISSAC 97, pp. 339-341. Technical report available @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-29.ps.gz}

Tudor Jebelean, ``A Generalization of the Binary GCD Algorithm'', ISSAC 93,
pp. 111-116. Technical report version available @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1993/93-01.ps.gz}

Tudor Jebelean, ``A Double-Digit Lehmer-Euclid Algorithm for Finding the GCD
of Long Integers'', Journal of Symbolic Computation, volume 19, 1995,
pp. 145-157. Technical report version also available @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-69.ps.gz}

Werner Krandick and Tudor Jebelean, ``Bidirectional Exact Integer Division'',
Journal of Symbolic Computation, volume 21, 1996, pp. 441-455. Early
technical report version also available
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1994/94-50.ps.gz}

R. Moenck and A. Borodin, ``Fast Modular Transforms via Division'',
Proceedings of the 13th Annual IEEE Symposium on Switching and Automata
Theory, October 1972, pp. 90-96. Reprinted as ``Fast Modular Transforms'',
Journal of Computer and System Sciences, volume 8, number 3, June 1974,

Arnold Sch@"onhage and Volker Strassen, ``Schnelle Multiplikation grosser
Zahlen'', Computing 7, 1971, pp. 281-292.

Kenneth Weber, ``The accelerated integer GCD algorithm'',
ACM Transactions on Mathematical Software,
volume 21, number 1, March 1995, pp. 111-122.

Paul Zimmermann, ``Karatsuba Square Root'', INRIA Research Report 3805,
November 1999, @uref{http://www.inria.fr/RRRT/RR-3805.html}

Paul Zimmermann, ``A Proof of GMP Fast Division and Square Root
Implementations'', @texlinebreak{}
@uref{http://www.loria.fr/~zimmerma/papers/proof-div-sqrt.ps.gz}

Dan Zuras, ``On Squaring and Multiplying Large Integers'', ARITH-11: IEEE
Symposium on Computer Arithmetic, 1993, pp. 260-271. Reprinted as ``More
on Multiplying and Squaring Large Integers'', IEEE Transactions on Computers,
volume 43, number 8, August 1994, pp. 899-908.

@node GNU Free Documentation License, Concept Index, References, Top
@appendix GNU Free Documentation License
@cindex GNU Free Documentation License

@node Concept Index, Function Index, GNU Free Documentation License, Top
@comment node-name, next, previous, up
@unnumbered Concept Index

@node Function Index, , Concept Index, Top
@comment node-name, next, previous, up
@unnumbered Function and Type Index

@c compile-command: "make gmp.info"