1 :mod:`random` --- Generate pseudo-random numbers 2 ================================================ 3 4 .. module:: random 5 :synopsis: Generate pseudo-random numbers with various common distributions. 6 7 **Source code:** :source:`Lib/random.py` 8 9 -------------- 10 11 This module implements pseudo-random number generators for various 12 distributions. 13 14 For integers, there is uniform selection from a range. For sequences, there is 15 uniform selection of a random element, a function to generate a random 16 permutation of a list in-place, and a function for random sampling without 17 replacement. 18 19 On the real line, there are functions to compute uniform, normal (Gaussian), 20 lognormal, negative exponential, gamma, and beta distributions. For generating 21 distributions of angles, the von Mises distribution is available. 22 23 Almost all module functions depend on the basic function :func:`.random`, which 24 generates a random float uniformly in the semi-open range [0.0, 1.0). Python 25 uses the Mersenne Twister as the core generator. It produces 53-bit precision 26 floats and has a period of 2\*\*19937-1. The underlying implementation in C is 27 both fast and threadsafe. The Mersenne Twister is one of the most extensively 28 tested random number generators in existence. However, being completely 29 deterministic, it is not suitable for all purposes, and is completely unsuitable 30 for cryptographic purposes. 31 32 The functions supplied by this module are actually bound methods of a hidden 33 instance of the :class:`random.Random` class. You can instantiate your own 34 instances of :class:`Random` to get generators that don't share state. 35 36 Class :class:`Random` can also be subclassed if you want to use a different 37 basic generator of your own devising: in that case, override the :meth:`~Random.random`, 38 :meth:`~Random.seed`, :meth:`~Random.getstate`, and :meth:`~Random.setstate` methods. 39 Optionally, a new generator can supply a :meth:`~Random.getrandbits` method --- this 40 allows :meth:`randrange` to produce selections over an arbitrarily large range. 41 42 The :mod:`random` module also provides the :class:`SystemRandom` class which 43 uses the system function :func:`os.urandom` to generate random numbers 44 from sources provided by the operating system. 45 46 .. warning:: 47 48 The pseudo-random generators of this module should not be used for 49 security purposes. For security or cryptographic uses, see the 50 :mod:`secrets` module. 51 52 .. seealso:: 53 54 M. Matsumoto and T. Nishimura, "Mersenne Twister: A 623-dimensionally 55 equidistributed uniform pseudorandom number generator", ACM Transactions on 56 Modeling and Computer Simulation Vol. 8, No. 1, January pp.3--30 1998. 57 58 59 `Complementary-Multiply-with-Carry recipe 60 <https://code.activestate.com/recipes/576707/>`_ for a compatible alternative 61 random number generator with a long period and comparatively simple update 62 operations. 63 64 65 Bookkeeping functions 66 --------------------- 67 68 .. function:: seed(a=None, version=2) 69 70 Initialize the random number generator. 71 72 If *a* is omitted or ``None``, the current system time is used. If 73 randomness sources are provided by the operating system, they are used 74 instead of the system time (see the :func:`os.urandom` function for details 75 on availability). 76 77 If *a* is an int, it is used directly. 78 79 With version 2 (the default), a :class:`str`, :class:`bytes`, or :class:`bytearray` 80 object gets converted to an :class:`int` and all of its bits are used. 81 82 With version 1 (provided for reproducing random sequences from older versions 83 of Python), the algorithm for :class:`str` and :class:`bytes` generates a 84 narrower range of seeds. 85 86 .. versionchanged:: 3.2 87 Moved to the version 2 scheme which uses all of the bits in a string seed. 88 89 .. function:: getstate() 90 91 Return an object capturing the current internal state of the generator. This 92 object can be passed to :func:`setstate` to restore the state. 93 94 95 .. function:: setstate(state) 96 97 *state* should have been obtained from a previous call to :func:`getstate`, and 98 :func:`setstate` restores the internal state of the generator to what it was at 99 the time :func:`getstate` was called. 100 101 102 .. function:: getrandbits(k) 103 104 Returns a Python integer with *k* random bits. This method is supplied with 105 the MersenneTwister generator and some other generators may also provide it 106 as an optional part of the API. When available, :meth:`getrandbits` enables 107 :meth:`randrange` to handle arbitrarily large ranges. 108 109 110 Functions for integers 111 ---------------------- 112 113 .. function:: randrange(stop) 114 randrange(start, stop[, step]) 115 116 Return a randomly selected element from ``range(start, stop, step)``. This is 117 equivalent to ``choice(range(start, stop, step))``, but doesn't actually build a 118 range object. 119 120 The positional argument pattern matches that of :func:`range`. Keyword arguments 121 should not be used because the function may use them in unexpected ways. 122 123 .. versionchanged:: 3.2 124 :meth:`randrange` is more sophisticated about producing equally distributed 125 values. Formerly it used a style like ``int(random()*n)`` which could produce 126 slightly uneven distributions. 127 128 .. function:: randint(a, b) 129 130 Return a random integer *N* such that ``a <= N <= b``. Alias for 131 ``randrange(a, b+1)``. 132 133 134 Functions for sequences 135 ----------------------- 136 137 .. function:: choice(seq) 138 139 Return a random element from the non-empty sequence *seq*. If *seq* is empty, 140 raises :exc:`IndexError`. 141 142 .. function:: choices(population, weights=None, *, cum_weights=None, k=1) 143 144 Return a *k* sized list of elements chosen from the *population* with replacement. 145 If the *population* is empty, raises :exc:`IndexError`. 146 147 If a *weights* sequence is specified, selections are made according to the 148 relative weights. Alternatively, if a *cum_weights* sequence is given, the 149 selections are made according to the cumulative weights (perhaps computed 150 using :func:`itertools.accumulate`). For example, the relative weights 151 ``[10, 5, 30, 5]`` are equivalent to the cumulative weights 152 ``[10, 15, 45, 50]``. Internally, the relative weights are converted to 153 cumulative weights before making selections, so supplying the cumulative 154 weights saves work. 155 156 If neither *weights* nor *cum_weights* are specified, selections are made 157 with equal probability. If a weights sequence is supplied, it must be 158 the same length as the *population* sequence. It is a :exc:`TypeError` 159 to specify both *weights* and *cum_weights*. 160 161 The *weights* or *cum_weights* can use any numeric type that interoperates 162 with the :class:`float` values returned by :func:`random` (that includes 163 integers, floats, and fractions but excludes decimals). 164 165 For a given seed, the :func:`choices` function with equal weighting 166 typically produces a different sequence than repeated calls to 167 :func:`choice`. The algorithm used by :func:`choices` uses floating 168 point arithmetic for internal consistency and speed. The algorithm used 169 by :func:`choice` defaults to integer arithmetic with repeated selections 170 to avoid small biases from round-off error. 171 172 .. versionadded:: 3.6 173 174 175 .. function:: shuffle(x[, random]) 176 177 Shuffle the sequence *x* in place. 178 179 The optional argument *random* is a 0-argument function returning a random 180 float in [0.0, 1.0); by default, this is the function :func:`.random`. 181 182 To shuffle an immutable sequence and return a new shuffled list, use 183 ``sample(x, k=len(x))`` instead. 184 185 Note that even for small ``len(x)``, the total number of permutations of *x* 186 can quickly grow larger than the period of most random number generators. 187 This implies that most permutations of a long sequence can never be 188 generated. For example, a sequence of length 2080 is the largest that 189 can fit within the period of the Mersenne Twister random number generator. 190 191 192 .. function:: sample(population, k) 193 194 Return a *k* length list of unique elements chosen from the population sequence 195 or set. Used for random sampling without replacement. 196 197 Returns a new list containing elements from the population while leaving the 198 original population unchanged. The resulting list is in selection order so that 199 all sub-slices will also be valid random samples. This allows raffle winners 200 (the sample) to be partitioned into grand prize and second place winners (the 201 subslices). 202 203 Members of the population need not be :term:`hashable` or unique. If the population 204 contains repeats, then each occurrence is a possible selection in the sample. 205 206 To choose a sample from a range of integers, use a :func:`range` object as an 207 argument. This is especially fast and space efficient for sampling from a large 208 population: ``sample(range(10000000), k=60)``. 209 210 If the sample size is larger than the population size, a :exc:`ValueError` 211 is raised. 212 213 Real-valued distributions 214 ------------------------- 215 216 The following functions generate specific real-valued distributions. Function 217 parameters are named after the corresponding variables in the distribution's 218 equation, as used in common mathematical practice; most of these equations can 219 be found in any statistics text. 220 221 222 .. function:: random() 223 224 Return the next random floating point number in the range [0.0, 1.0). 225 226 227 .. function:: uniform(a, b) 228 229 Return a random floating point number *N* such that ``a <= N <= b`` for 230 ``a <= b`` and ``b <= N <= a`` for ``b < a``. 231 232 The end-point value ``b`` may or may not be included in the range 233 depending on floating-point rounding in the equation ``a + (b-a) * random()``. 234 235 236 .. function:: triangular(low, high, mode) 237 238 Return a random floating point number *N* such that ``low <= N <= high`` and 239 with the specified *mode* between those bounds. The *low* and *high* bounds 240 default to zero and one. The *mode* argument defaults to the midpoint 241 between the bounds, giving a symmetric distribution. 242 243 244 .. function:: betavariate(alpha, beta) 245 246 Beta distribution. Conditions on the parameters are ``alpha > 0`` and 247 ``beta > 0``. Returned values range between 0 and 1. 248 249 250 .. function:: expovariate(lambd) 251 252 Exponential distribution. *lambd* is 1.0 divided by the desired 253 mean. It should be nonzero. (The parameter would be called 254 "lambda", but that is a reserved word in Python.) Returned values 255 range from 0 to positive infinity if *lambd* is positive, and from 256 negative infinity to 0 if *lambd* is negative. 257 258 259 .. function:: gammavariate(alpha, beta) 260 261 Gamma distribution. (*Not* the gamma function!) Conditions on the 262 parameters are ``alpha > 0`` and ``beta > 0``. 263 264 The probability distribution function is:: 265 266 x ** (alpha - 1) * math.exp(-x / beta) 267 pdf(x) = -------------------------------------- 268 math.gamma(alpha) * beta ** alpha 269 270 271 .. function:: gauss(mu, sigma) 272 273 Gaussian distribution. *mu* is the mean, and *sigma* is the standard 274 deviation. This is slightly faster than the :func:`normalvariate` function 275 defined below. 276 277 278 .. function:: lognormvariate(mu, sigma) 279 280 Log normal distribution. If you take the natural logarithm of this 281 distribution, you'll get a normal distribution with mean *mu* and standard 282 deviation *sigma*. *mu* can have any value, and *sigma* must be greater than 283 zero. 284 285 286 .. function:: normalvariate(mu, sigma) 287 288 Normal distribution. *mu* is the mean, and *sigma* is the standard deviation. 289 290 291 .. function:: vonmisesvariate(mu, kappa) 292 293 *mu* is the mean angle, expressed in radians between 0 and 2\*\ *pi*, and *kappa* 294 is the concentration parameter, which must be greater than or equal to zero. If 295 *kappa* is equal to zero, this distribution reduces to a uniform random angle 296 over the range 0 to 2\*\ *pi*. 297 298 299 .. function:: paretovariate(alpha) 300 301 Pareto distribution. *alpha* is the shape parameter. 302 303 304 .. function:: weibullvariate(alpha, beta) 305 306 Weibull distribution. *alpha* is the scale parameter and *beta* is the shape 307 parameter. 308 309 310 Alternative Generator 311 --------------------- 312 313 .. class:: SystemRandom([seed]) 314 315 Class that uses the :func:`os.urandom` function for generating random numbers 316 from sources provided by the operating system. Not available on all systems. 317 Does not rely on software state, and sequences are not reproducible. Accordingly, 318 the :meth:`seed` method has no effect and is ignored. 319 The :meth:`getstate` and :meth:`setstate` methods raise 320 :exc:`NotImplementedError` if called. 321 322 323 Notes on Reproducibility 324 ------------------------ 325 326 Sometimes it is useful to be able to reproduce the sequences given by a pseudo 327 random number generator. By re-using a seed value, the same sequence should be 328 reproducible from run to run as long as multiple threads are not running. 329 330 Most of the random module's algorithms and seeding functions are subject to 331 change across Python versions, but two aspects are guaranteed not to change: 332 333 * If a new seeding method is added, then a backward compatible seeder will be 334 offered. 335 336 * The generator's :meth:`~Random.random` method will continue to produce the same 337 sequence when the compatible seeder is given the same seed. 338 339 .. _random-examples: 340 341 Examples and Recipes 342 -------------------- 343 344 Basic examples:: 345 346 >>> random() # Random float: 0.0 <= x < 1.0 347 0.37444887175646646 348 349 >>> uniform(2.5, 10.0) # Random float: 2.5 <= x < 10.0 350 3.1800146073117523 351 352 >>> expovariate(1 / 5) # Interval between arrivals averaging 5 seconds 353 5.148957571865031 354 355 >>> randrange(10) # Integer from 0 to 9 inclusive 356 7 357 358 >>> randrange(0, 101, 2) # Even integer from 0 to 100 inclusive 359 26 360 361 >>> choice(['win', 'lose', 'draw']) # Single random element from a sequence 362 'draw' 363 364 >>> deck = 'ace two three four'.split() 365 >>> shuffle(deck) # Shuffle a list 366 >>> deck 367 ['four', 'two', 'ace', 'three'] 368 369 >>> sample([10, 20, 30, 40, 50], k=4) # Four samples without replacement 370 [40, 10, 50, 30] 371 372 Simulations:: 373 374 >>> # Six roulette wheel spins (weighted sampling with replacement) 375 >>> choices(['red', 'black', 'green'], [18, 18, 2], k=6) 376 ['red', 'green', 'black', 'black', 'red', 'black'] 377 378 >>> # Deal 20 cards without replacement from a deck of 52 playing cards 379 >>> # and determine the proportion of cards with a ten-value 380 >>> # (a ten, jack, queen, or king). 381 >>> deck = collections.Counter(tens=16, low_cards=36) 382 >>> seen = sample(list(deck.elements()), k=20) 383 >>> seen.count('tens') / 20 384 0.15 385 386 >>> # Estimate the probability of getting 5 or more heads from 7 spins 387 >>> # of a biased coin that settles on heads 60% of the time. 388 >>> def trial(): 389 ... return choices('HT', cum_weights=(0.60, 1.00), k=7).count('H') >= 5 390 ... 391 >>> sum(trial() for i in range(10000)) / 10000 392 0.4169 393 394 >>> # Probability of the median of 5 samples being in middle two quartiles 395 >>> def trial(): 396 ... return 2500 <= sorted(choices(range(10000), k=5))[2] < 7500 397 ... 398 >>> sum(trial() for i in range(10000)) / 10000 399 0.7958 400 401 Example of `statistical bootstrapping 402 <https://en.wikipedia.org/wiki/Bootstrapping_(statistics)>`_ using resampling 403 with replacement to estimate a confidence interval for the mean of a sample of 404 size five:: 405 406 # http://statistics.about.com/od/Applications/a/Example-Of-Bootstrapping.htm 407 from statistics import mean 408 from random import choices 409 410 data = 1, 2, 4, 4, 10 411 means = sorted(mean(choices(data, k=5)) for i in range(20)) 412 print(f'The sample mean of {mean(data):.1f} has a 90% confidence ' 413 f'interval from {means[1]:.1f} to {means[-2]:.1f}') 414 415 Example of a `resampling permutation test 416 <https://en.wikipedia.org/wiki/Resampling_(statistics)#Permutation_tests>`_ 417 to determine the statistical significance or `p-value 418 <https://en.wikipedia.org/wiki/P-value>`_ of an observed difference 419 between the effects of a drug versus a placebo:: 420 421 # Example from "Statistics is Easy" by Dennis Shasha and Manda Wilson 422 from statistics import mean 423 from random import shuffle 424 425 drug = [54, 73, 53, 70, 73, 68, 52, 65, 65] 426 placebo = [54, 51, 58, 44, 55, 52, 42, 47, 58, 46] 427 observed_diff = mean(drug) - mean(placebo) 428 429 n = 10000 430 count = 0 431 combined = drug + placebo 432 for i in range(n): 433 shuffle(combined) 434 new_diff = mean(combined[:len(drug)]) - mean(combined[len(drug):]) 435 count += (new_diff >= observed_diff) 436 437 print(f'{n} label reshufflings produced only {count} instances with a difference') 438 print(f'at least as extreme as the observed difference of {observed_diff:.1f}.') 439 print(f'The one-sided p-value of {count / n:.4f} leads us to reject the null') 440 print(f'hypothesis that there is no difference between the drug and the placebo.') 441 442 Simulation of arrival times and service deliveries in a single server queue:: 443 444 from random import expovariate, gauss 445 from statistics import mean, median, stdev 446 447 average_arrival_interval = 5.6 448 average_service_time = 5.0 449 stdev_service_time = 0.5 450 451 num_waiting = 0 452 arrivals = [] 453 starts = [] 454 arrival = service_end = 0.0 455 for i in range(20000): 456 if arrival <= service_end: 457 num_waiting += 1 458 arrival += expovariate(1.0 / average_arrival_interval) 459 arrivals.append(arrival) 460 else: 461 num_waiting -= 1 462 service_start = service_end if num_waiting else arrival 463 service_time = gauss(average_service_time, stdev_service_time) 464 service_end = service_start + service_time 465 starts.append(service_start) 466 467 waits = [start - arrival for arrival, start in zip(arrivals, starts)] 468 print(f'Mean wait: {mean(waits):.1f}. Stdev wait: {stdev(waits):.1f}.') 469 print(f'Median wait: {median(waits):.1f}. Max wait: {max(waits):.1f}.') 470 471 .. seealso:: 472 473 `Statistics for Hackers <https://www.youtube.com/watch?v=Iq9DzN6mvYA>`_ 474 a video tutorial by 475 `Jake Vanderplas <https://us.pycon.org/2016/speaker/profile/295/>`_ 476 on statistical analysis using just a few fundamental concepts 477 including simulation, sampling, shuffling, and cross-validation. 478 479 `Economics Simulation 480 <http://nbviewer.jupyter.org/url/norvig.com/ipython/Economics.ipynb>`_ 481 a simulation of a marketplace by 482 `Peter Norvig <http://norvig.com/bio.html>`_ that shows effective 483 use of many of the tools and distributions provided by this module 484 (gauss, uniform, sample, betavariate, choice, triangular, and randrange). 485 486 `A Concrete Introduction to Probability (using Python) 487 <http://nbviewer.jupyter.org/url/norvig.com/ipython/Probability.ipynb>`_ 488 a tutorial by `Peter Norvig <http://norvig.com/bio.html>`_ covering 489 the basics of probability theory, how to write simulations, and 490 how to perform data analysis using Python. 491