1 :mod:`random` --- Generate pseudo-random numbers 2 ================================================ 3 4 .. module:: random 5 :synopsis: Generate pseudo-random numbers with various common distributions. 6 7 **Source code:** :source:`Lib/random.py` 8 9 -------------- 10 11 This module implements pseudo-random number generators for various 12 distributions. 13 14 For integers, there is uniform selection from a range. For sequences, there is 15 uniform selection of a random element, a function to generate a random 16 permutation of a list in-place, and a function for random sampling without 17 replacement. 18 19 On the real line, there are functions to compute uniform, normal (Gaussian), 20 lognormal, negative exponential, gamma, and beta distributions. For generating 21 distributions of angles, the von Mises distribution is available. 22 23 Almost all module functions depend on the basic function :func:`.random`, which 24 generates a random float uniformly in the semi-open range [0.0, 1.0). Python 25 uses the Mersenne Twister as the core generator. It produces 53-bit precision 26 floats and has a period of 2\*\*19937-1. The underlying implementation in C is 27 both fast and threadsafe. The Mersenne Twister is one of the most extensively 28 tested random number generators in existence. However, being completely 29 deterministic, it is not suitable for all purposes, and is completely unsuitable 30 for cryptographic purposes. 31 32 The functions supplied by this module are actually bound methods of a hidden 33 instance of the :class:`random.Random` class. You can instantiate your own 34 instances of :class:`Random` to get generators that don't share state. 35 36 Class :class:`Random` can also be subclassed if you want to use a different 37 basic generator of your own devising: in that case, override the :meth:`~Random.random`, 38 :meth:`~Random.seed`, :meth:`~Random.getstate`, and :meth:`~Random.setstate` methods. 39 Optionally, a new generator can supply a :meth:`~Random.getrandbits` method --- this 40 allows :meth:`randrange` to produce selections over an arbitrarily large range. 41 42 The :mod:`random` module also provides the :class:`SystemRandom` class which 43 uses the system function :func:`os.urandom` to generate random numbers 44 from sources provided by the operating system. 45 46 .. warning:: 47 48 The pseudo-random generators of this module should not be used for 49 security purposes. For security or cryptographic uses, see the 50 :mod:`secrets` module. 51 52 .. seealso:: 53 54 M. Matsumoto and T. Nishimura, "Mersenne Twister: A 623-dimensionally 55 equidistributed uniform pseudorandom number generator", ACM Transactions on 56 Modeling and Computer Simulation Vol. 8, No. 1, January pp.3--30 1998. 57 58 59 `Complementary-Multiply-with-Carry recipe 60 <https://code.activestate.com/recipes/576707/>`_ for a compatible alternative 61 random number generator with a long period and comparatively simple update 62 operations. 63 64 65 Bookkeeping functions 66 --------------------- 67 68 .. function:: seed(a=None, version=2) 69 70 Initialize the random number generator. 71 72 If *a* is omitted or ``None``, the current system time is used. If 73 randomness sources are provided by the operating system, they are used 74 instead of the system time (see the :func:`os.urandom` function for details 75 on availability). 76 77 If *a* is an int, it is used directly. 78 79 With version 2 (the default), a :class:`str`, :class:`bytes`, or :class:`bytearray` 80 object gets converted to an :class:`int` and all of its bits are used. 81 82 With version 1 (provided for reproducing random sequences from older versions 83 of Python), the algorithm for :class:`str` and :class:`bytes` generates a 84 narrower range of seeds. 85 86 .. versionchanged:: 3.2 87 Moved to the version 2 scheme which uses all of the bits in a string seed. 88 89 .. function:: getstate() 90 91 Return an object capturing the current internal state of the generator. This 92 object can be passed to :func:`setstate` to restore the state. 93 94 95 .. function:: setstate(state) 96 97 *state* should have been obtained from a previous call to :func:`getstate`, and 98 :func:`setstate` restores the internal state of the generator to what it was at 99 the time :func:`getstate` was called. 100 101 102 .. function:: getrandbits(k) 103 104 Returns a Python integer with *k* random bits. This method is supplied with 105 the MersenneTwister generator and some other generators may also provide it 106 as an optional part of the API. When available, :meth:`getrandbits` enables 107 :meth:`randrange` to handle arbitrarily large ranges. 108 109 110 Functions for integers 111 ---------------------- 112 113 .. function:: randrange(stop) 114 randrange(start, stop[, step]) 115 116 Return a randomly selected element from ``range(start, stop, step)``. This is 117 equivalent to ``choice(range(start, stop, step))``, but doesn't actually build a 118 range object. 119 120 The positional argument pattern matches that of :func:`range`. Keyword arguments 121 should not be used because the function may use them in unexpected ways. 122 123 .. versionchanged:: 3.2 124 :meth:`randrange` is more sophisticated about producing equally distributed 125 values. Formerly it used a style like ``int(random()*n)`` which could produce 126 slightly uneven distributions. 127 128 .. function:: randint(a, b) 129 130 Return a random integer *N* such that ``a <= N <= b``. Alias for 131 ``randrange(a, b+1)``. 132 133 134 Functions for sequences 135 ----------------------- 136 137 .. function:: choice(seq) 138 139 Return a random element from the non-empty sequence *seq*. If *seq* is empty, 140 raises :exc:`IndexError`. 141 142 .. function:: choices(population, weights=None, *, cum_weights=None, k=1) 143 144 Return a *k* sized list of elements chosen from the *population* with replacement. 145 If the *population* is empty, raises :exc:`IndexError`. 146 147 If a *weights* sequence is specified, selections are made according to the 148 relative weights. Alternatively, if a *cum_weights* sequence is given, the 149 selections are made according to the cumulative weights (perhaps computed 150 using :func:`itertools.accumulate`). For example, the relative weights 151 ``[10, 5, 30, 5]`` are equivalent to the cumulative weights 152 ``[10, 15, 45, 50]``. Internally, the relative weights are converted to 153 cumulative weights before making selections, so supplying the cumulative 154 weights saves work. 155 156 If neither *weights* nor *cum_weights* are specified, selections are made 157 with equal probability. If a weights sequence is supplied, it must be 158 the same length as the *population* sequence. It is a :exc:`TypeError` 159 to specify both *weights* and *cum_weights*. 160 161 The *weights* or *cum_weights* can use any numeric type that interoperates 162 with the :class:`float` values returned by :func:`random` (that includes 163 integers, floats, and fractions but excludes decimals). 164 165 .. versionadded:: 3.6 166 167 168 .. function:: shuffle(x[, random]) 169 170 Shuffle the sequence *x* in place. 171 172 The optional argument *random* is a 0-argument function returning a random 173 float in [0.0, 1.0); by default, this is the function :func:`.random`. 174 175 To shuffle an immutable sequence and return a new shuffled list, use 176 ``sample(x, k=len(x))`` instead. 177 178 Note that even for small ``len(x)``, the total number of permutations of *x* 179 can quickly grow larger than the period of most random number generators. 180 This implies that most permutations of a long sequence can never be 181 generated. For example, a sequence of length 2080 is the largest that 182 can fit within the period of the Mersenne Twister random number generator. 183 184 185 .. function:: sample(population, k) 186 187 Return a *k* length list of unique elements chosen from the population sequence 188 or set. Used for random sampling without replacement. 189 190 Returns a new list containing elements from the population while leaving the 191 original population unchanged. The resulting list is in selection order so that 192 all sub-slices will also be valid random samples. This allows raffle winners 193 (the sample) to be partitioned into grand prize and second place winners (the 194 subslices). 195 196 Members of the population need not be :term:`hashable` or unique. If the population 197 contains repeats, then each occurrence is a possible selection in the sample. 198 199 To choose a sample from a range of integers, use a :func:`range` object as an 200 argument. This is especially fast and space efficient for sampling from a large 201 population: ``sample(range(10000000), k=60)``. 202 203 If the sample size is larger than the population size, a :exc:`ValueError` 204 is raised. 205 206 Real-valued distributions 207 ------------------------- 208 209 The following functions generate specific real-valued distributions. Function 210 parameters are named after the corresponding variables in the distribution's 211 equation, as used in common mathematical practice; most of these equations can 212 be found in any statistics text. 213 214 215 .. function:: random() 216 217 Return the next random floating point number in the range [0.0, 1.0). 218 219 220 .. function:: uniform(a, b) 221 222 Return a random floating point number *N* such that ``a <= N <= b`` for 223 ``a <= b`` and ``b <= N <= a`` for ``b < a``. 224 225 The end-point value ``b`` may or may not be included in the range 226 depending on floating-point rounding in the equation ``a + (b-a) * random()``. 227 228 229 .. function:: triangular(low, high, mode) 230 231 Return a random floating point number *N* such that ``low <= N <= high`` and 232 with the specified *mode* between those bounds. The *low* and *high* bounds 233 default to zero and one. The *mode* argument defaults to the midpoint 234 between the bounds, giving a symmetric distribution. 235 236 237 .. function:: betavariate(alpha, beta) 238 239 Beta distribution. Conditions on the parameters are ``alpha > 0`` and 240 ``beta > 0``. Returned values range between 0 and 1. 241 242 243 .. function:: expovariate(lambd) 244 245 Exponential distribution. *lambd* is 1.0 divided by the desired 246 mean. It should be nonzero. (The parameter would be called 247 "lambda", but that is a reserved word in Python.) Returned values 248 range from 0 to positive infinity if *lambd* is positive, and from 249 negative infinity to 0 if *lambd* is negative. 250 251 252 .. function:: gammavariate(alpha, beta) 253 254 Gamma distribution. (*Not* the gamma function!) Conditions on the 255 parameters are ``alpha > 0`` and ``beta > 0``. 256 257 The probability distribution function is:: 258 259 x ** (alpha - 1) * math.exp(-x / beta) 260 pdf(x) = -------------------------------------- 261 math.gamma(alpha) * beta ** alpha 262 263 264 .. function:: gauss(mu, sigma) 265 266 Gaussian distribution. *mu* is the mean, and *sigma* is the standard 267 deviation. This is slightly faster than the :func:`normalvariate` function 268 defined below. 269 270 271 .. function:: lognormvariate(mu, sigma) 272 273 Log normal distribution. If you take the natural logarithm of this 274 distribution, you'll get a normal distribution with mean *mu* and standard 275 deviation *sigma*. *mu* can have any value, and *sigma* must be greater than 276 zero. 277 278 279 .. function:: normalvariate(mu, sigma) 280 281 Normal distribution. *mu* is the mean, and *sigma* is the standard deviation. 282 283 284 .. function:: vonmisesvariate(mu, kappa) 285 286 *mu* is the mean angle, expressed in radians between 0 and 2\*\ *pi*, and *kappa* 287 is the concentration parameter, which must be greater than or equal to zero. If 288 *kappa* is equal to zero, this distribution reduces to a uniform random angle 289 over the range 0 to 2\*\ *pi*. 290 291 292 .. function:: paretovariate(alpha) 293 294 Pareto distribution. *alpha* is the shape parameter. 295 296 297 .. function:: weibullvariate(alpha, beta) 298 299 Weibull distribution. *alpha* is the scale parameter and *beta* is the shape 300 parameter. 301 302 303 Alternative Generator 304 --------------------- 305 306 .. class:: SystemRandom([seed]) 307 308 Class that uses the :func:`os.urandom` function for generating random numbers 309 from sources provided by the operating system. Not available on all systems. 310 Does not rely on software state, and sequences are not reproducible. Accordingly, 311 the :meth:`seed` method has no effect and is ignored. 312 The :meth:`getstate` and :meth:`setstate` methods raise 313 :exc:`NotImplementedError` if called. 314 315 316 Notes on Reproducibility 317 ------------------------ 318 319 Sometimes it is useful to be able to reproduce the sequences given by a pseudo 320 random number generator. By re-using a seed value, the same sequence should be 321 reproducible from run to run as long as multiple threads are not running. 322 323 Most of the random module's algorithms and seeding functions are subject to 324 change across Python versions, but two aspects are guaranteed not to change: 325 326 * If a new seeding method is added, then a backward compatible seeder will be 327 offered. 328 329 * The generator's :meth:`~Random.random` method will continue to produce the same 330 sequence when the compatible seeder is given the same seed. 331 332 .. _random-examples: 333 334 Examples and Recipes 335 -------------------- 336 337 Basic examples:: 338 339 >>> random() # Random float: 0.0 <= x < 1.0 340 0.37444887175646646 341 342 >>> uniform(2.5, 10.0) # Random float: 2.5 <= x < 10.0 343 3.1800146073117523 344 345 >>> expovariate(1 / 5) # Interval between arrivals averaging 5 seconds 346 5.148957571865031 347 348 >>> randrange(10) # Integer from 0 to 9 inclusive 349 7 350 351 >>> randrange(0, 101, 2) # Even integer from 0 to 100 inclusive 352 26 353 354 >>> choice(['win', 'lose', 'draw']) # Single random element from a sequence 355 'draw' 356 357 >>> deck = 'ace two three four'.split() 358 >>> shuffle(deck) # Shuffle a list 359 >>> deck 360 ['four', 'two', 'ace', 'three'] 361 362 >>> sample([10, 20, 30, 40, 50], k=4) # Four samples without replacement 363 [40, 10, 50, 30] 364 365 Simulations:: 366 367 >>> # Six roulette wheel spins (weighted sampling with replacement) 368 >>> choices(['red', 'black', 'green'], [18, 18, 2], k=6) 369 ['red', 'green', 'black', 'black', 'red', 'black'] 370 371 >>> # Deal 20 cards without replacement from a deck of 52 playing cards 372 >>> # and determine the proportion of cards with a ten-value 373 >>> # (a ten, jack, queen, or king). 374 >>> deck = collections.Counter(tens=16, low_cards=36) 375 >>> seen = sample(list(deck.elements()), k=20) 376 >>> seen.count('tens') / 20 377 0.15 378 379 >>> # Estimate the probability of getting 5 or more heads from 7 spins 380 >>> # of a biased coin that settles on heads 60% of the time. 381 >>> trial = lambda: choices('HT', cum_weights=(0.60, 1.00), k=7).count('H') >= 5 382 >>> sum(trial() for i in range(10000)) / 10000 383 0.4169 384 385 >>> # Probability of the median of 5 samples being in middle two quartiles 386 >>> trial = lambda : 2500 <= sorted(choices(range(10000), k=5))[2] < 7500 387 >>> sum(trial() for i in range(10000)) / 10000 388 0.7958 389 390 Example of `statistical bootstrapping 391 <https://en.wikipedia.org/wiki/Bootstrapping_(statistics)>`_ using resampling 392 with replacement to estimate a confidence interval for the mean of a sample of 393 size five:: 394 395 # http://statistics.about.com/od/Applications/a/Example-Of-Bootstrapping.htm 396 from statistics import mean 397 from random import choices 398 399 data = 1, 2, 4, 4, 10 400 means = sorted(mean(choices(data, k=5)) for i in range(20)) 401 print(f'The sample mean of {mean(data):.1f} has a 90% confidence ' 402 f'interval from {means[1]:.1f} to {means[-2]:.1f}') 403 404 Example of a `resampling permutation test 405 <https://en.wikipedia.org/wiki/Resampling_(statistics)#Permutation_tests>`_ 406 to determine the statistical significance or `p-value 407 <https://en.wikipedia.org/wiki/P-value>`_ of an observed difference 408 between the effects of a drug versus a placebo:: 409 410 # Example from "Statistics is Easy" by Dennis Shasha and Manda Wilson 411 from statistics import mean 412 from random import shuffle 413 414 drug = [54, 73, 53, 70, 73, 68, 52, 65, 65] 415 placebo = [54, 51, 58, 44, 55, 52, 42, 47, 58, 46] 416 observed_diff = mean(drug) - mean(placebo) 417 418 n = 10000 419 count = 0 420 combined = drug + placebo 421 for i in range(n): 422 shuffle(combined) 423 new_diff = mean(combined[:len(drug)]) - mean(combined[len(drug):]) 424 count += (new_diff >= observed_diff) 425 426 print(f'{n} label reshufflings produced only {count} instances with a difference') 427 print(f'at least as extreme as the observed difference of {observed_diff:.1f}.') 428 print(f'The one-sided p-value of {count / n:.4f} leads us to reject the null') 429 print(f'hypothesis that there is no difference between the drug and the placebo.') 430 431 Simulation of arrival times and service deliveries in a single server queue:: 432 433 from random import expovariate, gauss 434 from statistics import mean, median, stdev 435 436 average_arrival_interval = 5.6 437 average_service_time = 5.0 438 stdev_service_time = 0.5 439 440 num_waiting = 0 441 arrivals = [] 442 starts = [] 443 arrival = service_end = 0.0 444 for i in range(20000): 445 if arrival <= service_end: 446 num_waiting += 1 447 arrival += expovariate(1.0 / average_arrival_interval) 448 arrivals.append(arrival) 449 else: 450 num_waiting -= 1 451 service_start = service_end if num_waiting else arrival 452 service_time = gauss(average_service_time, stdev_service_time) 453 service_end = service_start + service_time 454 starts.append(service_start) 455 456 waits = [start - arrival for arrival, start in zip(arrivals, starts)] 457 print(f'Mean wait: {mean(waits):.1f}. Stdev wait: {stdev(waits):.1f}.') 458 print(f'Median wait: {median(waits):.1f}. Max wait: {max(waits):.1f}.') 459 460 .. seealso:: 461 462 `Statistics for Hackers <https://www.youtube.com/watch?v=Iq9DzN6mvYA>`_ 463 a video tutorial by 464 `Jake Vanderplas <https://us.pycon.org/2016/speaker/profile/295/>`_ 465 on statistical analysis using just a few fundamental concepts 466 including simulation, sampling, shuffling, and cross-validation. 467 468 `Economics Simulation 469 <http://nbviewer.jupyter.org/url/norvig.com/ipython/Economics.ipynb>`_ 470 a simulation of a marketplace by 471 `Peter Norvig <http://norvig.com/bio.html>`_ that shows effective 472 use of many of the tools and distributions provided by this module 473 (gauss, uniform, sample, betavariate, choice, triangular, and randrange). 474 475 `A Concrete Introduction to Probability (using Python) 476 <http://nbviewer.jupyter.org/url/norvig.com/ipython/Probability.ipynb>`_ 477 a tutorial by `Peter Norvig <http://norvig.com/bio.html>`_ covering 478 the basics of probability theory, how to write simulations, and 479 how to perform data analysis using Python. 480