Differential Evolution - Sarcasm Wiki

Contents
1. Overview
2. Etymology
3. Cultural Impact
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
# Differential Evolution

**Differential Evolution** (DE) is a method of mathematical optimization that belongs to the family of [evolutionary algorithms](/Evolutionary_algorithm). It is designed to optimize a problem by iteratively improving a candidate solution with regard to a given measure of quality. DE is a type of [metaheuristic](/Metaheuristic), which means it makes few or no assumptions about the problem being optimized and can search very large spaces of candidate solutions. However, like other metaheuristics, DE does not guarantee that an optimal solution will be found.

DE is particularly useful for optimizing multidimensional real-valued [functions](/Function_(mathematics)) and does not rely on the [gradient](/Gradient) of the problem being optimized. This characteristic makes DE applicable to problems that are not differentiable, which is a requirement for classic optimization methods such as [gradient descent](/Gradient_descent) and [quasi-Newton methods](/Quasi-newton_methods). Consequently, DE can be used on optimization problems that are not continuous, are noisy, or change over time.

## History

Differential Evolution was introduced by Rainer Storn and Kenneth Price in 1995. Since its inception, DE has been the subject of extensive research and application. Books have been published on various aspects of DE, including its use in [parallel computing](/Parallel_computing), [multiobjective optimization](/Multiobjective_optimization), and [constrained optimization](/Constrained_optimization). These books also contain surveys of application areas, providing a comprehensive overview of the algorithm's versatility and effectiveness.

## Algorithm

The basic variant of the DE algorithm operates by maintaining a population of candidate solutions, referred to as agents. These agents are moved around in the search-space using simple mathematical formulae to combine the positions of existing agents from the population. If the new position of an agent represents an improvement, it is accepted and becomes part of the population; otherwise, the new position is discarded. This process is repeated iteratively, with the hope that a satisfactory solution will eventually be discovered.

### Formal Description

Let \( f: \mathbb{R}^n \to \mathbb{R} \) be the fitness function that needs to be minimized. Note that maximization can be performed by considering the function \( h := -f \) instead. The function \( f \) takes a candidate solution as an argument in the form of a vector of real numbers and produces a real number as output, indicating the fitness of the given candidate solution. The gradient of \( f \) is not known. The goal is to find a solution \( \mathbf{m} \) for which \( f(\mathbf{m}) \leq f(\mathbf{p}) \) for all \( \mathbf{p} \) in the search-space, meaning that \( \mathbf{m} \) is the global minimum.

Let \( \mathbf{x} \in \mathbb{R}^n \) designate a candidate solution (agent) in the population. The basic DE algorithm can be described as follows:

1. **Parameter Selection**:
   - Choose the parameters \( \text{NP} \geq 4 \), \( \text{CR} \in [0, 1] \), and \( F \in [0, 2] \).
   - \( \text{NP} \): The population size, i.e., the number of candidate agents or "parents".
   - \( \text{CR} \): The crossover probability.
   - \( F \): The differential weight.
   - Typical settings are \( \text{NP} = 10n \), \( \text{CR} = 0.9 \), and \( F = 0.8 \).

2. **Initialization**:
   - Initialize all agents \( \mathbf{x} \) with random positions in the search-space.

3. **Iteration**:
   - Until a termination criterion is met (e.g., number of iterations performed, or adequate fitness reached), repeat the following:
     - For each agent \( \mathbf{x} \) in the population:
       - Pick three agents \( \mathbf{a}, \mathbf{b}, \) and \( \mathbf{c} \) from the population at random, ensuring they are distinct from each other and from agent \( \mathbf{x} \). \( \mathbf{a} \) is called the "base" vector.
       - Pick a random index \( R \in \{1, \ldots, n\} \), where \( n \) is the dimensionality of the problem being optimized.
       - Compute the agent's potentially new position \( \mathbf{y} = [y_1, \ldots, y_n] \) as follows:
         - For each \( i \in \{1, \ldots, n\} \), pick a uniformly distributed random number \( r_i \sim U(0, 1) \).
         - If \( r_i < \text{CR} \) or \( i = R \), set \( y_i = a_i + F \times (b_i - c_i) \); otherwise, set \( y_i = x_i \).
       - If \( f(\mathbf{y}) \leq f(\mathbf{x}) \), replace the agent \( \mathbf{x} \) in the population with the improved or equal candidate solution \( \mathbf{y} \).

4. **Termination**:
   - Pick the agent from the population that has the best fitness and return it as the best found candidate solution.

## Parameter Selection

The choice of DE parameters \( \text{NP} \), \( \text{CR} \), and \( F \) can significantly impact optimization performance. Selecting the DE parameters that yield good performance has been the subject of much research. Rules of thumb for parameter selection were devised by Storn et al. and Liu and Lampinen. Mathematical convergence analysis regarding parameter selection was done by Zaharie.

## Constraint Handling

Differential Evolution can be utilized for constrained optimization as well. A common method involves modifying the target function to include a penalty for any violation of constraints, expressed as:

\[ f{\tilde {(}}x) = f(x) + \rho \times \mathrm{CV}(x) \]

Here, \( \mathrm{CV}(x) \) represents either a constraint violation (an L1 penalty) or the square of a constraint violation (an L2 penalty). However, this method has certain drawbacks, such as the appropriate selection of the penalty coefficient \( \rho \). If \( \rho \) is set too low, it may not effectively enforce constraints; if it's too high, it can greatly slow down or even halt the convergence process. Despite these challenges, this approach remains widely used due to its simplicity and because it doesn't require altering the differential evolution algorithm itself.

Alternative strategies, such as projecting onto a feasible set or reducing dimensionality, can be used for box-constrained or linearly constrained cases. However, in the context of general nonlinear constraints, the most reliable methods typically involve penalty functions.

## Variants

Variants of the DE algorithm are continually being developed to improve optimization performance. These developments include:

- New schemes for performing crossover and mutation of agents.
- Various strategies for handling constraints.
- Adaptive strategies that dynamically adjust population size, \( F \), and \( \text{CR} \) parameters.
- Specialized algorithms for large-scale optimization.
- Multi-objective and many-objective algorithms.
- Techniques for handling binary/integer variables.

## See Also

- [Artificial bee colony algorithm](/Artificial_bee_colony_algorithm)
- [CMA-ES](/CMA-ES)
- [Evolution strategy](/Evolution_strategy)
- [Genetic algorithm](/Genetic_algorithm)