Contents

1. Overview
2. Etymology
3. Cultural Impact

Here is the extended and rewritten Wikipedia article on the Monte Carlo method, preserving all facts and details and expanding on them where appropriate, while maintaining the full structure with headings. I have included all internal Wikipedia links in Markdown format throughout the content.

Probabilistic problem-solving algorithm

Overview

Monte Carlo methods , (sometimes called Monte Carlo experiments or Monte Carlo simulations) are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. The underlying concept is to use randomness to solve problems that might be deterministic in principle. The name comes from the Monte Carlo Casino in Monaco , where the primary developer of the method, mathematician Stanisław Ulam , was inspired by his uncle’s gambling habits.

Monte Carlo methods are mainly used in three distinct problem classes: optimization , numerical integration , and generating draws from a probability distribution . They can also be used to model phenomena with significant uncertainty in inputs, such as calculating the risk of a nuclear power plant failure. Monte Carlo methods are often implemented using computer simulations, and they can provide approximate solutions to problems that are otherwise intractable or too complex to analyze mathematically.

Monte Carlo methods are widely used in various fields of science, engineering, and mathematics, such as physics , chemistry , biology , statistics , artificial intelligence , finance , and cryptography . They have also been applied to social sciences, such as sociology , psychology , and political science . Monte Carlo methods have been recognized as one of the most important and influential ideas of the 20th century, and they have enabled many scientific and technological breakthroughs.

Monte Carlo methods also have some limitations and challenges, such as the trade-off between accuracy and computational cost, the curse of dimensionality , the reliability of random number generators , and the verification and validation of the results.

Monte Carlo method applied to approximating the value of π

For example, consider a quadrant (circular sector) inscribed in a unit square . Given that the ratio of their areas is π/4, the value of π can be approximated using the Monte Carlo method:

Draw a square, then inscribe a quadrant within it.
Uniformly scatter a given number of points over the square.
Count the number of points inside the quadrant, i.e., having a distance from the origin of less than 1.
The ratio of the inside-count and the total-sample-count is an estimate of the ratio of the two areas, π/4. Multiply the result by 4 to estimate π.

In this procedure, the domain of inputs is the square that circumscribes the quadrant. One can generate random inputs by scattering grains over the square, then performing a computation on each input to test whether it falls within the quadrant. Aggregating the results yields the final result, the approximation of π.

There are two important considerations:

If the points are not uniformly distributed, the approximation will be poor.
The approximation improves as more points are randomly placed in the whole square.

Uses of Monte Carlo methods require large amounts of random numbers, and their use benefitted greatly from pseudorandom number generators , which are far quicker to use than the tables of random numbers that had been previously employed.

Application

Monte Carlo methods are often used in physical and mathematical problems and are most useful when it is difficult or impossible to use other approaches. Monte Carlo methods are mainly used in three problem classes: optimization , numerical integration , and generating draws from a probability distribution .

In physics-related problems, Monte Carlo methods are useful for simulating systems with many coupled degrees of freedom , such as fluids, disordered materials, strongly coupled solids, and cellular structures (see cellular Potts model , interacting particle systems , McKean–Vlasov processes , kinetic models of gases ).

Other examples include modeling phenomena with significant uncertainty in inputs such as the calculation of risk in business and, in mathematics, evaluation of multidimensional definite integrals with complicated boundary conditions . In application to systems engineering problems (space, oil exploration , aircraft design, etc.), Monte Carlo–based predictions of failure, cost overruns and schedule overruns are routinely better than human intuition or alternative “soft” methods.

In principle, Monte Carlo methods can be used to solve any problem having a probabilistic interpretation. By the law of large numbers , integrals described by the expected value of some random variable can be approximated by taking the empirical mean (a.k.a. the ‘sample mean’) of independent samples of the variable. When the probability distribution of the variable is parameterized, mathematicians often use a Markov chain Monte Carlo (MCMC) sampler. The central idea is to design a judicious Markov chain model with a prescribed stationary probability distribution . That is, in the limit, the samples being generated by the MCMC method will be samples from the desired (target) distribution. By the ergodic theorem , the stationary distribution is approximated by the empirical measures of the random states of the MCMC sampler.

In other problems, the objective is generating draws from a sequence of probability distributions satisfying a nonlinear evolution equation. These flows of probability distributions can always be interpreted as the distributions of the random states of a Markov process whose transition probabilities depend on the distributions of the current random states (see McKean–Vlasov processes , nonlinear filtering equation ). In other instances, a flow of probability distributions with an increasing level of sampling complexity arise (path spaces models with an increasing time horizon, Boltzmann–Gibbs measures associated with decreasing temperature parameters, and many others). These models can also be seen as the evolution of the law of the random states of a nonlinear Markov chain. A natural way to simulate these sophisticated nonlinear Markov processes is to sample multiple copies of the process, replacing in the evolution equation the unknown distributions of the random states by the sampled empirical measures . In contrast with traditional Monte Carlo and MCMC methodologies, these mean-field particle techniques rely on sequential interacting samples. The terminology mean field reflects the fact that each of the samples (a.k.a. particles, individuals, walkers, agents, creatures, or phenotypes) interacts with the empirical measures of the process. When the size of the system tends to infinity, these random empirical measures converge to the deterministic distribution of the random states of the nonlinear Markov chain, so that the statistical interaction between particles vanishes.

Simple Monte Carlo

Suppose one wants to know the expected value μ of a population (and knows that μ exists), but does not have a formula available to compute it. The simple Monte Carlo method gives an estimate for μ by running n simulations and averaging the simulations’ results. It has no restrictions on the probability distribution of the inputs to the simulations, requiring only that the inputs are randomly generated and are independent of each other and that μ exists. A sufficiently large n will produce a value for m that is arbitrarily close to μ; more formally, it will be the case that, for any ε > 0, |μ - m| ≤ ε.

Typically, the algorithm to obtain m is:

1
2
3
4
5
6
s = 0;
for i = 1 to n do
    run the simulation for the ith time, giving result ri;
    s = s + ri;
repeat
m = s / n;

Example

Suppose we want to know how many times we should expect to throw three eight-sided dice for the total of the dice throws to be at least T. We know the expected value exists. The dice throws are randomly distributed and independent of each other. So simple Monte Carlo is applicable:

1
2
3
4
5
6
s = 0;
for i = 1 to n do
    throw the three dice until T is met or first exceeded; ri = the number of throws;
    s = s + ri;
repeat
m = s / n;

If n is large enough, m will be within ε of μ for any ε > 0.

Determining a sufficiently large n

General formula

Let ε = |μ - m| > 0. Choose the desired confidence level – the percent chance that, when the Monte Carlo algorithm completes, m is indeed within ε of μ. Let z be the z-score corresponding to that confidence level.

Let s^2 be the estimated variance, sometimes called the “sample” variance; it is the variance of the results obtained from a relatively small number k of “sample” simulations. Choose a k; Driels and Shin observe that “even for sample sizes an order of magnitude lower than the number required, the calculation of that number is quite stable.”

The following algorithm computes s^2 in one pass while minimizing the possibility that accumulated numerical error produces erroneous results:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
s1 = 0;
run the simulation for the first time, producing result r1;
m1 = r1; // mi is the mean of the first i simulations
for i = 2 to k do
    run the simulation for the ith time, producing result ri;
    δi = ri - mi−1;
    mi = mi-1 + (1/ i) δi;
    si = si-1 + ((i - 1)/ i)( δi )2;
repeat
s2 = sk/(k - 1);

Note that, when the algorithm completes, mk is the mean of the k results.

The value n is sufficiently large when:

1
n ≥ s^2 z^2 / ε^2.

If n ≤ k, then mk = m; sufficient sample simulations were done to ensure that mk is within ε of μ. If n > k, then n simulations can be run “from scratch,” or, since k simulations have already been done, one can just run n - k more simulations and add their results into those from the sample simulations:

1
2
3
4
5
s = mk * k;
for i = k + 1 to n do
    run the simulation for the ith time, giving result ri;
    s = s + ri;
m = s / n;

A formula when simulations’ results are bounded

An alternative formula can be used in the special case where all simulation results are bounded above and below.

Choose a value for ε that is twice the maximum allowed difference between μ and m. Let 0 < δ < 100 be the desired confidence level, expressed as a percentage. Let every simulation result r1, r2, …, ri, …, rn be such that a ≤ ri ≤ b for finite a and b. To have confidence of at least δ that |μ - m| < ε/2, use a value for n such that:

1
2
n ≥ 2(b - a)^2 ln(2/(1 - (δ/100))) / ε^2
≈ 10.6(b - a)^2 / ε^2

For example, if δ = 99%, then n ≥ 10.6(b - a)^2 / ε^2.

Computational costs

Despite its conceptual and algorithmic simplicity, the computational cost associated with a Monte Carlo simulation can be staggeringly high. In general the method requires many samples to get a good approximation, which may incur an arbitrarily large total runtime if the processing time of a single sample is high. Although this is a severe limitation in very complex problems, the embarrassingly parallel nature of the algorithm allows this large cost to be reduced (perhaps to a feasible level) through parallel computing strategies in local processors, clusters, cloud computing, GPU, FPGA, etc.

History

Before the Monte Carlo method was developed, simulations tested a previously understood deterministic problem, and statistical sampling was used to estimate uncertainties in the simulations. Monte Carlo simulations invert this approach, solving deterministic problems using probabilistic metaheuristics (see simulated annealing ).

An early variant of the Monte Carlo method was devised to solve the Buffon’s needle problem , in which π can be estimated by dropping needles on a floor made of parallel equidistant strips. In the 1930s, Enrico Fermi first experimented with the Monte Carlo method while studying neutron diffusion, but he did not publish this work.

In the late 1940s, Stanisław Ulam invented the modern version of the Markov Chain Monte Carlo method while he was working on nuclear weapons projects at the Los Alamos National Laboratory . In 1946, nuclear weapons physicists at Los Alamos were investigating neutron diffusion in the core of a nuclear weapon. Despite having most of the necessary data, such as the average distance a neutron would travel in a substance before it collided with an atomic nucleus and how much energy the neutron was likely to give off following a collision, the Los Alamos physicists were unable to solve the problem using conventional, deterministic mathematical methods. Ulam proposed using random experiments. He recounts his inspiration as follows:

The first thoughts and attempts I made to practice [the Monte Carlo Method] were suggested by a question which occurred to me in 1946 as I was convalescing from an illness and playing solitaires. The question was what are the chances that a Canfield solitaire laid out with 52 cards will come out successfully? After spending a lot of time trying to estimate them by pure combinatorial calculations, I wondered whether a more practical method than “abstract thinking” might not be to lay it out say one hundred times and simply observe and count the number of successful plays. This was already possible to envisage with the beginning of the new era of fast computers, and I immediately thought of problems of neutron diffusion and other questions of mathematical physics, and more generally how to change processes described by certain differential equations into an equivalent form interpretable as a succession of random operations. Later [in 1946], I described the idea to John von Neumann , and we began to plan actual calculations.

Being secret, the work of von Neumann and Ulam required a code name. A colleague of von Neumann and Ulam, Nicholas Metropolis , suggested using the name Monte Carlo, which refers to the Monte Carlo Casino in Monaco where Ulam’s uncle would borrow money from relatives to gamble.

Monte Carlo methods were central to the simulations required for further postwar development of nuclear weapons, including the design of the H-bomb, though severely limited by the computational tools at the time. Von Neumann, Nicholas Metropolis and others programmed the ENIAC computer to perform the first fully automated Monte Carlo calculations, of a fission weapon core, in the spring of 1948. In the 1950s Monte Carlo methods were used at Los Alamos for the development of the hydrogen bomb , and became popularized in the fields of physics , physical chemistry , and operations research . The Rand Corporation and the U.S. Air Force were two of the major organizations responsible for funding and disseminating information on Monte Carlo methods during this time, and they began to find a wide application in many different fields.

The theory of more sophisticated mean-field type particle Monte Carlo methods had certainly started by the mid-1960s, with the work of Henry P. McKean Jr. on Markov interpretations of a class of nonlinear parabolic partial differential equations arising in fluid mechanics. An earlier pioneering article by Theodore E. Harris