- 1. Overview
- 2. Etymology
- 3. Cultural Impact
Producing sequences that defy prediction, that’s the crux of it. A dice roll, for instance, a simple cubical one, hands you a random number between one and six. Nothing more, nothing less. Itâs a tangible example of a hardware random number generator at work.
Random number generation, at its core, is the art of producing a string of numbers or symbols through a random number generator (RNG) that is, by definition, impossible to predict with any certainty beyond sheer chance. This doesn’t mean the outcomes are entirely devoid of patterns; you might spot them in retrospect, but trying to foresee them? Futile. True random number generators leverage the inherent chaos of the physical world. They tap into attributes of a constantly fluctuating physical environment, a dance of particles and forces so complex itâs practically beyond modeling. This stands in stark contrast to the charade of pseudorandom number generators (PRNGs). These are merely algorithms, masquerading as random. Their sequences are predetermined, a predictable script played out, all you need is the initial state, the seed , and the algorithm itself. Then there are the non-physical true random number generators (NPTRNGs), a clever breed that scavenges entropy from the digital ether, from the subtle, ever-present fluctuations within a computer system itself, without relying on a dedicated hardware source. Itâs all about the entropy, the measure of unpredictability.
The quest for randomness has spawned a diverse array of methods, each born from a specific need. Some techniques are ancient, etched into the fabric of history: the roll of dice , the flip of a coin , the meticulous shuffling of a deck of cards . Even the intricate casting of yarrow stalks for divination in the I Ching fall into this category. These methods, rooted in mechanical action, were hardly efficient for generating the vast quantities of random numbers needed for statistical analysis. The process was laborious, and the results were often compiled and distributed as random number tables , a testament to the effort involved.
Computational approaches to pseudorandomness are plentiful. While they strive for unpredictability, they invariably fall short of true randomness. They might pass certain statistical tests for randomness , fooling some of the detectors, but they are generally unsuitable for critical applications like cryptography . However, the landscape isn’t entirely bleak. There are meticulously crafted cryptographically secure pseudorandom number generators (CSPRNGs) designed with the specific demands of cryptography in mind.
Practical applications and uses
The utility of random number generators is vast, permeating fields from the thrill of gambling to the rigor of statistical sampling , the complexity of computer simulation , the ironclad security of cryptography , and the controlled design of experiments in completely randomized design . Anywhere an unpredictable outcome is paramount, RNGs come into play. For applications where unpredictability is the absolute highest priority, like in security, hardware generators are the preferred choice, provided they are feasible.
Pseudorandom number generators are indispensable for Monte Carlo-method simulations. Their reproducibility is a boon for debugging ; the same sequence of “random” numbers can be generated by simply restarting with the identical random seed . They also find a vital role in cryptography, provided the seed remains a closely guarded secret. Sender and receiver can, in essence, pre-agree on a set of numbers to serve as keys, generated automatically.
The generation of pseudorandom numbers is a fundamental task in programming. While cryptography and certain numerical algorithms demand a high caliber of randomness, many other operations require a more modest degree of unpredictability. Think of a “random quote of the day” feature, or the seemingly unpredictable movements of an AI opponent in a video game. Weaker forms of randomness are also employed in hash algorithms and in the construction of amortized searching and sorting algorithms .
Some applications that initially appear amenable to randomization reveal a subtler complexity upon closer inspection. A music player that “randomly” selects tracks, for instance, needs to appear random, perhaps even allowing some user control over the selection. A truly random system, however, would have no qualms about playing the same track multiple times in a row.
True vs. pseudo-random numbers
The generation of random numbers broadly follows two distinct paths.
Method 1: Physical Phenomena
The first method involves measuring some physical phenomenon that is presumed to possess inherent randomness. This is followed by a process to correct for any predictable biases that might arise from the measurement itself. Potential sources are as varied as atmospheric noise , the subtle fluctuations of thermal noise, and other external electromagnetic and quantum effects. For example, cosmic background radiation or the decay of radioactive isotopes, when observed over short intervals, can serve as sources of natural entropy â the measure of an information source’s unpredictability.
The rate at which entropy can be harvested from these natural sources is dictated by the underlying physical processes. Consequently, these sources of true entropy are often described as blocking , meaning they operate at a limited rate, only releasing randomness when sufficient entropy has been gathered to meet the demand. On certain Unix-like systems, including most Linux distributions , the special device file /dev/random will indeed block until adequate entropy is collected from the environment. This blocking behavior can make large, bulk requests for random data, such as filling an entire hard disk drive with random bits, a surprisingly slow affair on systems reliant on such entropy sources.
Method 2: Algorithmic Generation
The second method employs computational algorithms to generate extended sequences of numbers that appear random. In reality, these sequences are entirely determined by a shorter initial value, known as a seed or key . If this seed value is known, the entire seemingly random sequence can be precisely reproduced. This type of generator is commonly referred to as a pseudorandom number generator . These generators typically do not depend on natural entropy sources, though they might be seeded periodically by them. The significant advantage here is that they are non-blocking, meaning they aren’t constrained by external events, making large bulk reads a practical possibility.
A common strategy in modern cryptography involves a hybrid approach: true randomness, harvested from natural sources, is used to seed a cryptographically secure pseudorandom number generator (CSPRNG). While hardware random number generators are generally limited in the volume of random bits they can produce per second, they are crucial for generating the “seed ” for a faster PRNG. The PRNG then serves to “anonymize” the noise source by smoothing out any identifying characteristics and performing entropy extraction . When paired with a suitable PRNG algorithm (specifically a CSPRNG), this combination can meet stringent requirements set by standards like the Federal Information Processing Standards and Common Criteria .
Generation methods
Physical methods
The earliest methods for generating random numbersâdice , coin flipping , and roulette wheelsâare still in use today, primarily in games and gambling . However, for most applications in statistics and cryptography, they are far too slow.
Hardware Random Number Generators
A USB-pluggable hardware true random number generator
Numerous natural phenomena produce low-level, statistically random “noise ” signals. These include thermal and shot noise, the jitter and metastability inherent in electronic circuits, the seemingly erratic path of Brownian motion , and the unpredictable nature of atmospheric noise . Researchers have also harnessed the photoelectric effect using beam splitters , other quantum phenomena, and even nuclear decay . While the latter two are generally impractical for widespread use due to various constraints, they illustrate the diverse physical sources tapped for randomness. It’s worth noting that “classical” (non-quantum) phenomena, while not truly random, often exhibit enough unpredictability to be acceptable sources of randomness, leading to the interchangeable use of the terms “true” and “physical.”
A hardware random number generator is expected to produce near-perfect random numbers, often referred to as “full entropy .” However, a raw physical process rarely achieves this ideal. A practical TRNG typically comprises several key components:
- A noise source: This is where the physical process generating the entropy is implemented. Since these processes are usually analog , a digitizer is required to convert the analog output into a binary representation.
- A conditioner: Often called a randomness extractor , this component refines the quality of the raw random bits, improving their statistical properties.
- Health tests: TRNGs are frequently employed in cryptographic algorithms, which can be catastrophically compromised if the random numbers lack sufficient entropy. Therefore, built-in testing functionality is a crucial feature.
Computational methods
For reasons of both speed and security, most random number generation systems incorporate PRNGs. This architectural choice is even mandated by industrial standards like NIST SP 800-90A .
Pseudorandom Number Generators
A pseudorandom number generator (PRNG), also known by the more descriptive term deterministic random bit generator (DRBG), is an algorithm designed to produce sequences of numbers that mimic the characteristics of truly random sequences. The output of a PRNG is not genuinely random, as it is entirely dictated by an initial value, the PRNG’s seed (which itself might be derived from a truly random source). While hardware random number generators can produce sequences closer to true randomness, PRNGs are favored in many practical scenarios due to their superior speed and the crucial advantage of reproducibility.
PRNGs are fundamental to numerous applications, including simulations (such as those employing the Monte Carlo method ), electronic games (for tasks like procedural generation ), and cryptography . Cryptographic applications impose a stringent requirement: the output must be inherently unpredictable based on previous outputs. To meet this demand, more elaborate algorithms are necessary, algorithms that avoid the inherent linearity of simpler PRNGs.
A core requirement for any PRNG is that its output should possess good statistical properties. Confidence in a PRNG’s ability to generate numbers sufficiently close to random for a given purpose generally requires rigorous mathematical analysis. Even the pioneering John von Neumann cautioned against mistaking PRNGs for true random generators, wryly remarking that “Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin.”
By humans
Human input can also be a source of randomness, where various inputs from end users are collected and used for randomization. However, studies consistently show that humans struggle to produce truly random sequences of digits or letters. They tend to exhibit patterns, often alternating choices too frequently compared to a well-designed random generator. Consequently, this method is not widely adopted. Paradoxically, this very human failing in generating randomness can be exploited as a tool to gain insights into brain functions that might otherwise remain inaccessible.
Post-processing and statistical checks
Even when starting with a seemingly reliable source of random numbers, perhaps derived from quantum mechanics, ensuring the output is perfectly unbiased requires careful attention. The behavior of these generators can fluctuate due to external factors like temperature, power supply voltage, device age, or other interference.
Therefore, generated random numbers are often subjected to statistical tests before use. These tests serve a dual purpose: verifying that the underlying source is functioning correctly and then post-processing the numbers to enhance their statistical properties. For instance, the TRNG9803 hardware random number generator uses an entropy measurement as a hardware test and then applies a shift register stream cipher for post-processing. It’s notoriously difficult to definitively validate the randomness of generated numbers using statistical tests alone. Researchers like Wang and Nicol have proposed distance-based statistical testing techniques to identify weaknesses in various random generators. Li and Wang, on the other hand, developed a method for testing random numbers derived from laser chaotic entropy sources, leveraging the properties of Brownian motion.
Statistical tests are also employed to build confidence that the final, post-processed output from a random number generator exhibits true unbiasedness. A multitude of randomness test suites have been developed for this purpose.
Other considerations
Reshaping the distribution
Uniform distributions
Most random number generators inherently produce integers or individual bits. To achieve the standard uniform distribution between 0 and 1, an additional step is necessary. This process is more complex than simply dividing the integer by its maximum possible value. Several factors must be considered:
- The integer used in the transformation must possess sufficient bits to meet the required precision.
- The nature of floating-point arithmetic means that precision is not uniform; it increases as the number approaches zero. This extra precision is often unused due to the sheer number of bits required.
- Rounding errors during division can introduce bias into the result. In the worst-case scenario, a boundary that should be excluded might be drawn, contrary to expectations based on real-number mathematics.
The prevailing algorithm, utilized by OpenJDK , Rust , and NumPy , is detailed in a proposal for C++ ’s STL. It doesn’t leverage the full precision and exhibits bias only in the last bit due to round-to-even behavior. Other numerical considerations arise when shifting this canonical uniform distribution to a different range. A proposed method for the Swift programming language claims to utilize full precision throughout.
Uniformly distributed integers are commonly used in algorithms such as the FisherâYates shuffle . Again, a naive implementation can introduce a modulo bias into the result, necessitating more sophisticated algorithms. A method that rarely employs division was described in 2018 by Daniel Lemire. The current state-of-the-art is the 2021 “optimal algorithm” by Stephen Canon of Apple Inc. , inspired by arithmetic encoding.
Most RNGs that generate numbers between 0 and 1 include 0 but exclude 1. Others may include or exclude both endpoints.
Other distributions
Given a source of uniform random numbers, several methods can be employed to create a new random source conforming to a specific probability density function . One technique, known as the inversion method , involves integrating up to an area that is greater than or equal to the random number (which should be generated between 0 and 1 for proper distributions). A second method, the acceptance-rejection method , selects an x and y value and checks if the function of x exceeds the y value. If it does, the x value is accepted; otherwise, it is rejected, and the algorithm restarts.
As an illustration of rejection sampling, consider generating a pair of statistically independent standard normally distributed random numbers (x, y). One could first generate the polar coordinates (r, θ), where r² follows a chi-squared distribution with 2 degrees of freedom (Ď²â ) and θ is uniformly distributed between 0 and 2Ď (UNIFORM(0,2Ď) ), as described in the BoxâMuller transform .
Whitening
The outputs from multiple independent RNGs can be combined, for instance, through a bitwise XOR operation. This process, known as software whitening , yields a combined RNG that is at least as robust as the best individual RNG used.
Computational and hardware random number generators are sometimes integrated to leverage the strengths of both. Computational generators can produce pseudorandom numbers much faster than their physical counterparts, while physical generators provide true randomness.
Low-discrepancy sequences as an alternative
Certain computations that rely on random number generators can be simplified by calculating a total or average value, such as in the Monte Carlo method for computing integrals. For these problems, the use of low-discrepancy sequences , also referred to as quasirandom numbers, can sometimes lead to more accurate solutions. These sequences possess a definite pattern that, in a qualitative sense, fills gaps evenly. In contrast, a truly random sequence might, and often does, leave larger gaps.
Activities and demonstrations
Several online resources offer samples of random numbers for exploration and use:
- The SOCR (Statistics Online Computational Resource) provides interactive activities and demonstrations of random number generation, often utilizing Java applets.
- The Quantum Optics Group at the ANU (Australian National University) generates random numbers sourced from quantum vacuum fluctuations. Samples are accessible via their quantum random number generator research page.
- Random.org offers random numbers derived from the atmospheric noise phenomenon.
- The Quantum Random Bit Generator Service at the RuÄer BoĹĄkoviÄ Institute harvests randomness from the quantum process of photonic emission in semiconductors. They provide various methods for data retrieval, including libraries for multiple programming languages.
- A group at the Taiyuan University of Technology generates random numbers from a chaotic laser source. Samples are available through their physical random number generator service.
Backdoors
The security of many cryptographic systems hinges on the use of a cryptographically secure random number generator for tasks like key generation and the creation of cryptographic nonces . If such a generator can be made predictable, it transforms into a potent backdoor , allowing an attacker to compromise the encryption.
Reports suggest that the NSA may have deliberately introduced a backdoor into the NIST -certified cryptographically secure pseudorandom number generator known as Dual EC DRBG . According to cryptographer Matthew Green , this backdoor could have enabled the NSA to ascertain the internal state of the random number generator, thereby potentially decrypting all data transmitted over an SSL connection established using it. Despite clear indications of Dual_EC_DRBG’s flaws and potential backdoor long before its confirmation in 2013, it saw considerable use, even by prominent security firms like RSA Security . Subsequently, accusations arose that RSA Security knowingly incorporated an NSA backdoor into its products, possibly as part of the Bullrun program . RSA has consistently denied knowingly embedding any backdoors.
Theorists have also posited that hardware RNGs could be surreptitiously modified to possess less entropy than advertised, rendering encryption reliant on them vulnerable. One proposed method involves altering the dopant mask of the chip, a modification undetectable by optical reverse-engineering. Consequently, in Linux, it is considered prudent not to rely solely on Intel’s RDRAND
hardware RNG for random number generation without mixing its output with other entropy sources. This caution is especially heightened following the revelations about the NSA’s Bullrun program. Theodore Ts’o, a prominent figure in the Linux kernel community, famously expressed his resistance to pressure from Intel engineers to make /dev/random dependent solely on the RDRAND instruction.
In a striking instance of malfeasance, a U.S. lottery draw was rigged in 2010. The information security director of the Multi-State Lottery Association (MUSL) secretly installed backdoor malware on MUSL’s secure RNG computer during a routine maintenance visit. This elaborate scheme allowed the perpetrator to win a total of $16,500,000 over several years.
See also
- Flipism
- League of entropy
- List of random number generators
- PP (complexity)
- Procedural generation
- Randomized algorithm
- Random password generator
- Random variable , a variable whose value is a numerical outcome of a random phenomenon.