1 Copyright 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012 Free Software Foundation, Inc. 2 Contributed by the AriC and Caramel projects, INRIA. 3 4 This file is part of the GNU MPFR Library. 5 6 The GNU MPFR Library is free software; you can redistribute it and/or modify 7 it under the terms of the GNU Lesser General Public License as published by 8 the Free Software Foundation; either version 3 of the License, or (at your 9 option) any later version. 10 11 The GNU MPFR Library is distributed in the hope that it will be useful, but 12 WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY 13 or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public 14 License for more details. 15 16 You should have received a copy of the GNU Lesser General Public License 17 along with the GNU MPFR Library; see the file COPYING.LESSER. If not, see 18 http://www.gnu.org/licenses/ or write to the Free Software Foundation, Inc., 19 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA. 20 21 Table of contents: 22 1. Documentation 23 2. Installation 24 3. Changes in existing functions 25 4. New functions to implement 26 5. Efficiency 27 6. Miscellaneous 28 7. Portability 29 30 ############################################################################## 31 1. Documentation 32 ############################################################################## 33 34 - add a description of the algorithms used + proof of correctness 35 36 ############################################################################## 37 2. Installation 38 ############################################################################## 39 40 - if we want to distinguish GMP and MPIR, we can check at configure time 41 the following symbols which are only defined in MPIR: 42 43 #define __MPIR_VERSION 0 44 #define __MPIR_VERSION_MINOR 9 45 #define __MPIR_VERSION_PATCHLEVEL 0 46 47 There is also a library symbol mpir_version, which should match VERSION, set 48 by configure, for example 0.9.0. 49 50 ############################################################################## 51 3. Changes in existing functions 52 ############################################################################## 53 54 - export mpfr_overflow and mpfr_underflow as public functions 55 56 - many functions currently taking into account the precision of the *input* 57 variable to set the initial working precison (acosh, asinh, cosh, ...). 58 This is nonsense since the "average" working precision should only depend 59 on the precision of the *output* variable (and maybe on the *value* of 60 the input in case of cancellation). 61 -> remove those dependencies from the input precision. 62 63 - mpfr_can_round: 64 change the meaning of the 2nd argument (err). Currently the error is 65 at most 2^(MPFR_EXP(b)-err), i.e. err is the relative shift wrt the 66 most significant bit of the approximation. I propose that the error 67 is now at most 2^err ulps of the approximation, i.e. 68 2^(MPFR_EXP(b)-MPFR_PREC(b)+err). 69 70 - mpfr_set_q first tries to convert the numerator and the denominator 71 to mpfr_t. But this convertion may fail even if the correctly rounded 72 result is representable. New way to implement: 73 Function q = a/b. nq = PREC(q) na = PREC(a) nb = PREC(b) 74 If na < nb 75 a <- a*2^(nb-na) 76 n <- na-nb+ (HIGH(a,nb) >= b) 77 if (n >= nq) 78 bb <- b*2^(n-nq) 79 a = q*bb+r --> q has exactly n bits. 80 else 81 aa <- a*2^(nq-n) 82 aa = q*b+r --> q has exactly n bits. 83 If RNDN, takes nq+1 bits. (See also the new division function). 84 85 86 ############################################################################## 87 4. New functions to implement 88 ############################################################################## 89 90 - implement mpfr_q_sub, mpfr_z_div, mpfr_q_div? 91 - implement functions for random distributions, see for example 92 https://sympa.inria.fr/sympa/arc/mpfr/2010-01/msg00034.html 93 (suggested by Charles Karney <ckarney (a] Sarnoff.com>, 18 Jan 2010): 94 * a Bernoulli distribution with prob p/q (exact) 95 * a general discrete distribution (i with prob w[i]/sum(w[i]) (Walker 96 algorithm, but make it exact) 97 * a uniform distribution in (a,b) 98 * exponential distribution (mean lambda) (von Neumann's method?) 99 * normal distribution (mean m, s.d. sigma) (ratio method?) 100 - wanted for Magma [John Cannon <john (a] maths.usyd.edu.au>, Tue, 19 Apr 2005]: 101 HypergeometricU(a,b,s) = 1/gamma(a)*int(exp(-su)*u^(a-1)*(1+u)^(b-a-1), 102 u=0..infinity) 103 JacobiThetaNullK 104 PolylogP, PolylogD, PolylogDold: see http://arxiv.org/abs/math.CA/0702243 105 and the references herein. 106 JBessel(n, x) = BesselJ(n+1/2, x) 107 IncompleteGamma [also wanted by <keith.briggs (a] bt.com> 4 Feb 2008: Gamma(a,x), 108 gamma(a,x), P(a,x), Q(a,x); see A&S 6.5, ref. [Smith01] in algorithms.bib] 109 KBessel, KBessel2 [2nd kind] 110 JacobiTheta 111 LogIntegral 112 ExponentialIntegralE1 113 E1(z) = int(exp(-t)/t, t=z..infinity), |arg z| < Pi 114 mpfr_eint1: implement E1(x) for x > 0, and Ei(-x) for x < 0 115 E1(NaN) = NaN 116 E1(+Inf) = +0 117 E1(-Inf) = -Inf 118 E1(+0) = +Inf 119 E1(-0) = -Inf 120 DawsonIntegral 121 GammaD(x) = Gamma(x+1/2) 122 - functions defined in the LIA-2 standard 123 + minimum and maximum (5.2.2): max, min, max_seq, min_seq, mmax_seq 124 and mmin_seq (mpfr_min and mpfr_max correspond to mmin and mmax); 125 + rounding_rest, floor_rest, ceiling_rest (5.2.4); 126 + remr (5.2.5): x - round(x/y) y; 127 + error functions from 5.2.7 (if useful in MPFR); 128 + power1pm1 (5.3.6.7): (1 + x)^y - 1; 129 + logbase (5.3.6.12): \log_x(y); 130 + logbase1p1p (5.3.6.13): \log_{1+x}(1+y); 131 + rad (5.3.9.1): x - round(x / (2 pi)) 2 pi = remr(x, 2 pi); 132 + axis_rad (5.3.9.1) if useful in MPFR; 133 + cycle (5.3.10.1): rad(2 pi x / u) u / (2 pi) = remr(x, u); 134 + axis_cycle (5.3.10.1) if useful in MPFR; 135 + sinu, cosu, tanu, cotu, secu, cscu, cossinu, arcsinu, arccosu, 136 arctanu, arccotu, arcsecu, arccscu (5.3.10.{2..14}): 137 sin(x 2 pi / u), etc.; 138 [from which sinpi(x) = sin(Pi*x), ... are trivial to implement, with u=2.] 139 + arcu (5.3.10.15): arctan2(y,x) u / (2 pi); 140 + rad_to_cycle, cycle_to_rad, cycle_to_cycle (5.3.11.{1..3}). 141 - From GSL, missing special functions (if useful in MPFR): 142 (cf http://www.gnu.org/software/gsl/manual/gsl-ref.html#Special-Functions) 143 + The Airy functions Ai(x) and Bi(x) defined by the integral representations: 144 * Ai(x) = (1/\pi) \int_0^\infty \cos((1/3) t^3 + xt) dt 145 * Bi(x) = (1/\pi) \int_0^\infty (e^(-(1/3) t^3) + \sin((1/3) t^3 + xt)) dt 146 * Derivatives of Airy Functions 147 + The Bessel functions for n integer and n fractional: 148 * Regular Modified Cylindrical Bessel Functions I_n 149 * Irregular Modified Cylindrical Bessel Functions K_n 150 * Regular Spherical Bessel Functions j_n: j_0(x) = \sin(x)/x, 151 j_1(x)= (\sin(x)/x-\cos(x))/x & j_2(x)= ((3/x^2-1)\sin(x)-3\cos(x)/x)/x 152 Note: the "spherical" Bessel functions are solutions of 153 x^2 y'' + 2 x y' + [x^2 - n (n+1)] y = 0 and satisfy 154 j_n(x) = sqrt(Pi/(2x)) J_{n+1/2}(x). They should not be mixed with the 155 classical Bessel Functions, also noted j0, j1, jn, y0, y1, yn in C99 156 and mpfr. 157 Cf http://en.wikipedia.org/wiki/Bessel_function#Spherical_Bessel_functions 158 *Irregular Spherical Bessel Functions y_n: y_0(x) = -\cos(x)/x, 159 y_1(x)= -(\cos(x)/x+\sin(x))/x & 160 y_2(x)= (-3/x^3+1/x)\cos(x)-(3/x^2)\sin(x) 161 * Regular Modified Spherical Bessel Functions i_n: 162 i_l(x) = \sqrt{\pi/(2x)} I_{l+1/2}(x) 163 * Irregular Modified Spherical Bessel Functions: 164 k_l(x) = \sqrt{\pi/(2x)} K_{l+1/2}(x). 165 + Clausen Function: 166 Cl_2(x) = - \int_0^x dt \log(2 \sin(t/2)) 167 Cl_2(\theta) = \Im Li_2(\exp(i \theta)) (dilogarithm). 168 + Dawson Function: \exp(-x^2) \int_0^x dt \exp(t^2). 169 + Debye Functions: D_n(x) = n/x^n \int_0^x dt (t^n/(e^t - 1)) 170 + Elliptic Integrals: 171 * Definition of Legendre Forms: 172 F(\phi,k) = \int_0^\phi dt 1/\sqrt((1 - k^2 \sin^2(t))) 173 E(\phi,k) = \int_0^\phi dt \sqrt((1 - k^2 \sin^2(t))) 174 P(\phi,k,n) = \int_0^\phi dt 1/((1 + n \sin^2(t))\sqrt(1 - k^2 \sin^2(t))) 175 * Complete Legendre forms are denoted by 176 K(k) = F(\pi/2, k) 177 E(k) = E(\pi/2, k) 178 * Definition of Carlson Forms 179 RC(x,y) = 1/2 \int_0^\infty dt (t+x)^(-1/2) (t+y)^(-1) 180 RD(x,y,z) = 3/2 \int_0^\infty dt (t+x)^(-1/2) (t+y)^(-1/2) (t+z)^(-3/2) 181 RF(x,y,z) = 1/2 \int_0^\infty dt (t+x)^(-1/2) (t+y)^(-1/2) (t+z)^(-1/2) 182 RJ(x,y,z,p) = 3/2 \int_0^\infty dt 183 (t+x)^(-1/2) (t+y)^(-1/2) (t+z)^(-1/2) (t+p)^(-1) 184 + Elliptic Functions (Jacobi) 185 + N-relative exponential: 186 exprel_N(x) = N!/x^N (\exp(x) - \sum_{k=0}^{N-1} x^k/k!) 187 + exponential integral: 188 E_2(x) := \Re \int_1^\infty dt \exp(-xt)/t^2. 189 Ei_3(x) = \int_0^x dt \exp(-t^3) for x >= 0. 190 Ei(x) := - PV(\int_{-x}^\infty dt \exp(-t)/t) 191 + Hyperbolic/Trigonometric Integrals 192 Shi(x) = \int_0^x dt \sinh(t)/t 193 Chi(x) := Re[ \gamma_E + \log(x) + \int_0^x dt (\cosh[t]-1)/t] 194 Si(x) = \int_0^x dt \sin(t)/t 195 Ci(x) = -\int_x^\infty dt \cos(t)/t for x > 0 196 AtanInt(x) = \int_0^x dt \arctan(t)/t 197 [ \gamma_E is the Euler constant ] 198 + Fermi-Dirac Function: 199 F_j(x) := (1/r\Gamma(j+1)) \int_0^\infty dt (t^j / (\exp(t-x) + 1)) 200 + Pochhammer symbol (a)_x := \Gamma(a + x)/\Gamma(a) : see [Smith01] in 201 algorithms.bib 202 logarithm of the Pochhammer symbol 203 + Gegenbauer Functions 204 + Laguerre Functions 205 + Eta Function: \eta(s) = (1-2^{1-s}) \zeta(s) 206 Hurwitz zeta function: \zeta(s,q) = \sum_0^\infty (k+q)^{-s}. 207 + Lambert W Functions, W(x) are defined to be solutions of the equation: 208 W(x) \exp(W(x)) = x. 209 This function has multiple branches for x < 0 (2 funcs W0(x) and Wm1(x)) 210 + Trigamma Function psi'(x). 211 and Polygamma Function: psi^{(m)}(x) for m >= 0, x > 0. 212 213 - from gnumeric (www.gnome.org/projects/gnumeric/doc/function-reference.html): 214 - beta 215 - betaln 216 - degrees 217 - radians 218 - sqrtpi 219 220 - mpfr_inp_raw, mpfr_out_raw (cf mail "Serialization of mpfr_t" from Alexey 221 and answer from Granlund on mpfr list, May 2007) 222 - [maybe useful for SAGE] implement companion frac_* functions to the rint_* 223 functions. For example mpfr_frac_floor(x) = x - floor(x). (The current 224 mpfr_frac function corresponds to mpfr_rint_trunc.) 225 - scaled erfc (https://sympa.inria.fr/sympa/arc/mpfr/2009-05/msg00054.html) 226 - asec, acsc, acot, asech, acsch and acoth (mail from Bjrn Terelius on mpfr 227 list, 18 June 2009) 228 229 ############################################################################## 230 5. Efficiency 231 ############################################################################## 232 233 - implement a mpfr_sqrthigh algorithm based on Mulders' algorithm, with a 234 basecase variant 235 - use mpn_div_q to speed up mpfr_div. However mpn_div_q, which is new in 236 GMP 5, is not documented in the GMP manual, thus we are not sure it 237 guarantees to return the same quotient as mpn_tdiv_qr. 238 Also mpfr_div uses the remainder computed by mpn_divrem. A workaround would 239 be to first try with mpn_div_q, and if we cannot (easily) compute the 240 rounding, then use the current code with mpn_divrem. 241 - compute exp by using the series for cosh or sinh, which has half the terms 242 (see Exercise 4.11 from Modern Computer Arithmetic, version 0.3) 243 The same method can be used for log, using the series for atanh, i.e., 244 atanh(x) = 1/2*log((1+x)/(1-x)). 245 - improve mpfr_gamma (see http://code.google.com/p/fastfunlib/). A possible 246 idea is to implement a fast algorithm for the argument reconstruction 247 gamma(x+k). One could also use the series for 1/gamma(x), see for example 248 http://dlmf.nist.gov/5/7/ or formula (36) from 249 http://mathworld.wolfram.com/GammaFunction.html 250 - fix regression with mpfr_mpz_root (from Keith Briggs, 5 July 2006), for 251 example on 3Ghz P4 with gmp-4.2, x=12.345: 252 prec=50000 k=2 k=3 k=10 k=100 253 mpz_root 0.036 0.072 0.476 7.628 254 mpfr_mpz_root 0.004 0.004 0.036 12.20 255 See also mail from Carl Witty on mpfr list, 09 Oct 2007. 256 - implement Mulders algorithm for squaring and division 257 - for sparse input (say x=1 with 2 bits), mpfr_exp is not faster than for 258 full precision when precision <= MPFR_EXP_THRESHOLD. The reason is 259 that argument reduction kills sparsity. Maybe avoid argument reduction 260 for sparse input? 261 - speed up const_euler for large precision [for x=1.1, prec=16610, it takes 262 75% of the total time of eint(x)!] 263 - speed up mpfr_atan for large arguments (to speed up mpc_log) 264 [from Mark Watkins on Fri, 18 Mar 2005] 265 Also mpfr_atan(x) seems slower (by a factor of 2) for x near from 1. 266 Example on a Athlon for 10^5 bits: x=1.1 takes 3s, whereas 2.1 takes 1.8s. 267 The current implementation does not give monotonous timing for the following: 268 mpfr_random (x); for (i = 0; i < k; i++) mpfr_atan (y, x, MPFR_RNDN); 269 for precision 300 and k=1000, we get 1070ms, and 500ms only for p=400! 270 - improve mpfr_sin on values like ~pi (do not compute sin from cos, because 271 of the cancellation). For instance, reduce the input modulo pi/2 in 272 [-pi/4,pi/4], and define auxiliary functions for which the argument is 273 assumed to be already reduced (so that the sin function can avoid 274 unnecessary computations by calling the auxiliary cos function instead of 275 the full cos function). This will require a native code for sin, for 276 example using the reduction sin(3x)=3sin(x)-4sin(x)^3. 277 See https://sympa.inria.fr/sympa/arc/mpfr/2007-08/msg00001.html and 278 the following messages. 279 - improve generic.c to work for number of terms <> 2^k 280 - rewrite mpfr_greater_p... as native code. 281 282 - mpf_t uses a scheme where the number of limbs actually present can 283 be less than the selected precision, thereby allowing low precision 284 values (for instance small integers) to be stored and manipulated in 285 an mpf_t efficiently. 286 287 Perhaps mpfr should get something similar, especially if looking to 288 replace mpf with mpfr, though it'd be a major change. Alternately 289 perhaps those mpfr routines like mpfr_mul where optimizations are 290 possible through stripping low zero bits or limbs could check for 291 that (this would be less efficient but easier). 292 293 - try the idea of the paper "Reduced Cancellation in the Evaluation of Entire 294 Functions and Applications to the Error Function" by W. Gawronski, J. Mueller 295 and M. Reinhard, to be published in SIAM Journal on Numerical Analysis: to 296 avoid cancellation in say erfc(x) for x large, they compute the Taylor 297 expansion of erfc(x)*exp(x^2/2) instead (which has less cancellation), 298 and then divide by exp(x^2/2) (which is simpler to compute). 299 300 - replace the *_THRESHOLD macros by global (TLS) variables that can be 301 changed at run time (via a function, like other variables)? One benefit 302 is that users could use a single MPFR binary on several machines (e.g., 303 a library provided by binary packages or shared via NFS) with different 304 thresholds. On the default values, this would be a bit less efficient 305 than the current code, but this isn't probably noticeable (this should 306 be tested). Something like: 307 long *mpfr_tune_get(void) to get the current values (the first value 308 is the size of the array). 309 int mpfr_tune_set(long *array) to set the tune values. 310 int mpfr_tune_run(long level) to find the best values (the support 311 for this feature is optional, this can also be done with an 312 external function). 313 314 - better distinguish different processors (for example Opteron and Core 2) 315 and use corresponding default tuning parameters (as in GMP). This could be 316 done in configure.ac to avoid hacking config.guess, for example define 317 MPFR_HAVE_CORE2. 318 Note (VL): the effect on cross-compilation (that can be a processor 319 with the same architecture, e.g. compilation on a Core 2 for an 320 Opteron) is not clear. The choice should be consistent with the 321 build target (e.g. -march or -mtune value with gcc). 322 Also choose better default values. For instance, the default value of 323 MPFR_MUL_THRESHOLD is 40, while the best values that have been found 324 are between 11 and 19 for 32 bits and between 4 and 10 for 64 bits! 325 326 - during the Many Digits competition, we noticed that (our implantation of) 327 Mulders short product was slower than a full product for large sizes. 328 This should be precisely analyzed and fixed if needed. 329 330 ############################################################################## 331 6. Miscellaneous 332 ############################################################################## 333 334 - [suggested by Tobias Burnus <burnus(at)net-b.de> and 335 Asher Langton <langton(at)gcc.gnu.org>, Wed, 01 Aug 2007] 336 support quiet and signaling NaNs in mpfr: 337 * functions to set/test a quiet/signaling NaN: mpfr_set_snan, mpfr_snan_p, 338 mpfr_set_qnan, mpfr_qnan_p 339 * correctly convert to/from double (if encoding of s/qNaN is fixed in 754R) 340 341 - check again coverage: on 2007-07-27, Patrick Pelissier reports that the 342 following files are not tested at 100%: add1.c, atan.c, atan2.c, 343 cache.c, cmp2.c, const_catalan.c, const_euler.c, const_log2.c, cos.c, 344 gen_inverse.h, div_ui.c, eint.c, exp3.c, exp_2.c, expm1.c, fma.c, fms.c, 345 lngamma.c, gamma.c, get_d.c, get_f.c, get_ld.c, get_str.c, get_z.c, 346 inp_str.c, jn.c, jyn_asympt.c, lngamma.c, mpfr-gmp.c, mul.c, mul_ui.c, 347 mulders.c, out_str.c, pow.c, print_raw.c, rint.c, root.c, round_near_x.c, 348 round_raw_generic.c, set_d.c, set_ld.c, set_q.c, set_uj.c, set_z.c, sin.c, 349 sin_cos.c, sinh.c, sqr.c, stack_interface.c, sub1.c, sub1sp.c, subnormal.c, 350 uceil_exp2.c, uceil_log2.c, ui_pow_ui.c, urandomb.c, yn.c, zeta.c, zeta_ui.c. 351 352 - check the constants mpfr_set_emin (-16382-63) and mpfr_set_emax (16383) in 353 get_ld.c and the other constants, and provide a testcase for large and 354 small numbers. 355 356 - from Kevin Ryde <user42 (a] zip.com.au>: 357 Also for pi.c, a pre-calculated compiled-in pi to a few thousand 358 digits would be good value I think. After all, say 10000 bits using 359 1250 bytes would still be small compared to the code size! 360 Store pi in round to zero mode (to recover other modes). 361 362 - add a new rounding mode: round to nearest, with ties away from zero 363 (this is roundTiesToAway in 754-2008, could be used by mpfr_round) 364 - add a new roundind mode: round to odd. If the result is not exactly 365 representable, then round to the odd mantissa. This rounding 366 has the nice property that for k > 1, if: 367 y = round(x, p+k, TO_ODD) 368 z = round(y, p, TO_NEAREST_EVEN), then 369 z = round(x, p, TO_NEAREST_EVEN) 370 so it avoids the double-rounding problem. 371 372 - add tests of the ternary value for constants 373 374 - When doing Extensive Check (--enable-assert=full), since all the 375 functions use a similar use of MACROS (ZivLoop, ROUND_P), it should 376 be possible to do such a scheme: 377 For the first call to ROUND_P when we can round. 378 Mark it as such and save the approximated rounding value in 379 a temporary variable. 380 Then after, if the mark is set, check if: 381 - we still can round. 382 - The rounded value is the same. 383 It should be a complement to tgeneric tests. 384 385 - in div.c, try to find a case for which cy != 0 after the line 386 cy = mpn_sub_1 (sp + k, sp + k, qsize, cy); 387 (which should be added to the tests), e.g. by having {vp, k} = 0, or 388 prove that this cannot happen. 389 390 - add a configure test for --enable-logging to ignore the option if 391 it cannot be supported. Modify the "configure --help" description 392 to say "on systems that support it". 393 394 - add generic bad cases for functions that don't have an inverse 395 function that is implemented (use a single Newton iteration). 396 397 - add bad cases for the internal error bound (by using a dichotomy 398 between a bad case for the correct rounding and some input value 399 with fewer Ziv iterations?). 400 401 - add an option to use a 32-bit exponent type (int) on LP64 machines, 402 mainly for developers, in order to be able to test the case where the 403 extended exponent range is the same as the default exponent range, on 404 such platforms. 405 Tests can be done with the exp-int branch (added on 2010-12-17, and 406 many tests fail at this time). 407 408 - test underflow/overflow detection of various functions (in particular 409 mpfr_exp) in reduced exponent ranges, including ranges that do not 410 contain 0. 411 412 - add an internal macro that does the equivalent of the following? 413 MPFR_IS_ZERO(x) || MPFR_GET_EXP(x) <= value 414 415 - check whether __gmpfr_emin and __gmpfr_emax could be replaced by 416 a constant (see README.dev). Also check the use of MPFR_EMIN_MIN 417 and MPFR_EMAX_MAX. 418 419 420 ############################################################################## 421 7. Portability 422 ############################################################################## 423 424 - add a web page with results of builds on different architectures 425 426 - support the decimal64 function without requiring --with-gmp-build 427 428 - [Kevin about texp.c long strings] 429 For strings longer than c99 guarantees, it might be cleaner to 430 introduce a "tests_strdupcat" or something to concatenate literal 431 strings into newly allocated memory. I thought I'd done that in a 432 couple of places already. Arrays of chars are not much fun. 433 434 - use http://gcc.gnu.org/viewcvs/trunk/config/stdint.m4 for mpfr-gmp.h 435