Unit 4 - Notes

CSE275

Unit 4: Advanced Nature-Inspired Optimization

1. Firefly Algorithm (FA)

1.1. Introduction and Inspiration

The Firefly Algorithm (FA) is a metaheuristic algorithm inspired by the flashing behavior of fireflies. Developed by Xin-She Yang in 2008, it is based on the idealized behavior of fireflies, specifically how the intensity and timing of their flashes are used for communication and attracting mates.

Core Idea: Fireflies are attracted to brighter fireflies. This attraction is proportional to the brightness of the other firefly and decreases as the distance between them increases.

1.2. Key Concepts and Mathematical Formulation

The algorithm is based on three idealized rules:

  1. Unisex Attraction: All fireflies are unisex, so any firefly can be attracted to any other firefly.
  2. Attractiveness & Brightness: Attractiveness is proportional to brightness. For any two flashing fireflies, the less bright one will move towards the brighter one. Attractiveness decreases as the distance between them increases. If there is no brighter firefly, a firefly will move randomly.
  3. Brightness & Objective Function: The brightness of a firefly is determined by the landscape of the objective function. For a maximization problem, brightness is proportional to the value of the objective function.

Mathematical Formulation:

  • Light Intensity (Brightness): The brightness I of a firefly at a location x is associated with the objective function f(x). For a maximization problem, I(x) ∝ f(x).

  • Attractiveness (β): The attractiveness of a firefly is relative and depends on the distance r between it and another firefly. It follows a decreasing function, commonly an exponential decay:

    TEXT
        β(r) = β₀ * e^(-γ * r²)
        

    Where:

    • β₀ is the attractiveness at r = 0 (maximum attractiveness).
    • γ is the light absorption coefficient, which controls the rate of decrease.
    • r (written rᵢⱼ) is the Cartesian distance between firefly i and firefly j.
  • Movement Equation: The movement of a firefly i towards a more attractive (brighter) firefly j is determined by:

    TEXT
        xᵢ(t+1) = xᵢ(t) + β(r) * (xⱼ(t) - xᵢ(t)) + α(t) * εᵢ
        

    Where:

    • xᵢ(t) is the current position of firefly i.
    • xⱼ(t) is the position of a brighter firefly j.
    • The second term represents the attraction.
    • The third term is a randomization term for exploration.
      • α(t) is a randomization parameter, often decreasing over time.
      • εᵢ is a vector of random numbers drawn from a uniform or Gaussian distribution.
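As a minimal sketch, the attractiveness and movement equations above can be written in Python. The parameter defaults (β₀ = 1, γ = 1, α = 0.2) are illustrative choices, not values prescribed by the notes:

```python
import math
import random

def attractiveness(r, beta0=1.0, gamma=1.0):
    """beta(r) = beta0 * exp(-gamma * r^2)."""
    return beta0 * math.exp(-gamma * r * r)

def move_firefly(x_i, x_j, alpha=0.2, beta0=1.0, gamma=1.0):
    """Move firefly i one step towards a brighter firefly j.

    Implements: x_i(t+1) = x_i(t) + beta(r) * (x_j(t) - x_i(t)) + alpha * eps
    where eps is uniform random noise in [-0.5, 0.5] per dimension.
    """
    r = math.dist(x_i, x_j)                    # Cartesian distance r_ij
    beta = attractiveness(r, beta0, gamma)
    return [xi + beta * (xj - xi) + alpha * (random.uniform(0, 1) - 0.5)
            for xi, xj in zip(x_i, x_j)]
```

With α = 0, the new position lies on the line segment between xᵢ and xⱼ, pulled a fraction β(r) of the way towards the brighter firefly; the randomization term then adds exploration on top of that deterministic pull.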

1.3. Algorithm Pseudocode

TEXT
Algorithm: Firefly Algorithm (FA)

1.  Initialize algorithm parameters:
    - Population size (n)
    - Maximum generations (MaxGen)
    - Attractiveness at r=0 (β₀)
    - Light absorption coefficient (γ)
    - Randomization parameter (α)

2.  Initialize a population of n fireflies with random positions xᵢ (i=1, 2, ..., n).
3.  Determine the light intensity Iᵢ for each firefly based on the objective function f(xᵢ).

4.  While (t < MaxGen):
5.      For i = 1 to n (each firefly):
6.          For j = 1 to n (each firefly):
7.              If (Iⱼ > Iᵢ): // If firefly j is brighter than i
8.                  Calculate distance rᵢⱼ between xᵢ and xⱼ.
9.                  Calculate attractiveness β using β(r) = β₀ * e^(-γ * r²).
10.                 Move firefly i towards j using the movement equation:
                    xᵢ(t+1) = xᵢ(t) + β * (xⱼ(t) - xᵢ(t)) + α * (rand - 0.5)
11.                 Evaluate new solution and update light intensity Iᵢ.
12.             End If
13.         End For
14.     End For
15.     Rank fireflies and find the current best solution.
16.     t = t + 1
17. End While

18. Post-process results and visualization.
19. Return the best solution found.

1.4. Strengths and Weaknesses

Strengths:

  • Automatic Subdivision: The population can automatically subdivide into subgroups, with each group swarming around a local optimum. This allows FA to find multiple optima simultaneously.
  • Good for Multimodal Problems: Due to the subdivision capability, it performs well on complex, multimodal optimization problems.
  • Simplicity: The core concept and implementation are relatively straightforward.

Weaknesses:

  • Parameter Tuning: The performance is sensitive to the parameters β₀, γ, and α. Poor tuning can lead to slow convergence or getting stuck.
  • Slow Convergence for some problems: The "brightest" firefly does not move, which can slow down the exploitation of the best-found region.
  • Computational Cost: The main loop has a complexity of O(n²), as each firefly is compared with every other firefly.

2. Whale Optimization Algorithm (WOA)

2.1. Introduction and Inspiration

The Whale Optimization Algorithm (WOA) is a metaheuristic developed by Seyedali Mirjalili and Andrew Lewis in 2016. It is inspired by the social behavior of humpback whales, specifically their unique hunting strategy known as the "bubble-net feeding method."

Core Idea: The algorithm mimics the search for prey and the bubble-net attacking behavior of humpback whales. The best solution found so far is assumed to be the target prey.

2.2. Key Concepts and Mathematical Formulation

WOA has two main phases: Exploitation (attacking prey) and Exploration (searching for prey).

2.2.1. Exploitation Phase (Bubble-net Attacking Method)

This phase has two mechanisms:

  1. Shrinking Encircling Mechanism: Humpback whales encircle their prey. WOA models this by updating the position of search agents towards the best-known solution.

    TEXT
        D = |C · X*(t) - X(t)|
        X(t+1) = X*(t) - A · D
        

    Where:

    • t is the current iteration.
    • X*(t) is the position vector of the best solution obtained so far.
    • X(t) is the position vector of a whale.
    • A and C are coefficient vectors calculated as:
      • A = 2 · a · r₁ - a
      • C = 2 · r₂
      • a is a control parameter linearly decreased from 2 to 0 over the course of iterations.
      • r₁, r₂ are random vectors in [0, 1].
    • By decreasing a, the fluctuation range of A is also decreased, causing the whales to "shrink" their circle around the prey.
  2. Spiral Updating Position: Whales attack the prey in a spiral-shaped path. This is modeled simultaneously with the shrinking circle.

    TEXT
        X(t+1) = D' · e^(bl) · cos(2πl) + X*(t)
        

    Where:

    • D' = |X*(t) - X(t)| is the distance of the whale to the prey.
    • b is a constant defining the shape of the logarithmic spiral.
    • l is a random number in [-1, 1].

During optimization, a 50% probability is used to choose between the shrinking encircling mechanism and the spiral model at each position update.

2.2.2. Exploration Phase (Search for Prey)

This phase is based on the random search behavior of whales. Instead of updating positions based on the best agent, a whale updates its position based on a randomly chosen whale. This forces the algorithm to explore new areas of the search space.

This is achieved by using the vector A with random values greater than 1 or less than -1.

TEXT
D = |C · X_rand(t) - X(t)|
X(t+1) = X_rand(t) - A · D

Where X_rand is a random position vector (a random whale) from the current population.
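The three update rules (shrinking encircling, random search, and spiral) can be combined into one per-agent step, as sketched below. For simplicity this sketch treats A and C as scalars rather than per-dimension random vectors, and the spiral constant b = 1 is an illustrative choice:

```python
import math
import random

def woa_update(x, x_best, pop, a, b=1.0):
    """One WOA position update for a single agent x (positions are lists).

    a is the control parameter decreased from 2 to 0 over iterations.
    """
    r1, r2 = random.random(), random.random()
    A = 2 * a * r1 - a                # A = 2*a*r1 - a
    C = 2 * r2                        # C = 2*r2
    p = random.random()
    if p < 0.5:
        if abs(A) < 1:                # exploitation: shrink circle around best
            target = x_best
        else:                         # exploration: move relative to random whale
            target = random.choice(pop)
        # X(t+1) = target - A * |C * target - X(t)|, applied per dimension
        return [tg - A * abs(C * tg - xi) for xi, tg in zip(x, target)]
    # exploitation: logarithmic spiral around the best solution
    l = random.uniform(-1, 1)
    return [abs(bx - xi) * math.exp(b * l) * math.cos(2 * math.pi * l) + bx
            for xi, bx in zip(x, x_best)]
```

Note how the same coefficient A serves double duty: its magnitude selects between exploitation (|A| < 1) and exploration (|A| >= 1), and since a shrinks linearly from 2 to 0, the exploration branch becomes progressively less likely as the run proceeds.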

2.3. Algorithm Pseudocode

TEXT
Algorithm: Whale Optimization Algorithm (WOA)

1.  Initialize the whale population Xᵢ (i = 1, 2, ..., n).
2.  Calculate the fitness of each search agent.
3.  Set X* = the best search agent.

4.  While (t < MaxGen):
5.      For each search agent:
6.          Update a, A, C, l, and p. // p is a random number in [0,1]
7.
8.          If (p < 0.5):
9.              If (|A| < 1): // Exploitation: Shrinking Encircling
10.                 Update the position using the shrinking encircling equation.
11.             Else if (|A| >= 1): // Exploration: Search for Prey
12.                 Select a random search agent X_rand.
13.                 Update the position using the random search equation.
14.             End If
15.         Else if (p >= 0.5): // Exploitation: Spiral Updating
16.             Update the position using the spiral equation.
17.         End If
18.
19.     End For
20.
21.     Check for and correct any search agents that go beyond the search space.
22.     Calculate the fitness of each search agent.
23.     Update X* if a better solution is found.
24.     t = t + 1
25. End While

26. Return X*.

2.4. Strengths and Weaknesses

Strengths:

  • Good Balance: The parameter a provides a smooth transition between exploration (|A| >= 1) and exploitation (|A| < 1), leading to a good balance.
  • Fewer Parameters: WOA has only two main internal parameters to adjust (a and b), making it simpler to implement and tune compared to other algorithms.
  • Strong Exploitation: The spiral movement mechanism provides a robust exploitation capability around the best solution.

Weaknesses:

  • Premature Convergence: Like many metaheuristics, it can sometimes converge prematurely to a local optimum, especially in very high-dimensional problems.
  • Stagnation: In later stages, the population may lose diversity, leading to stagnation where little improvement is made.

3. Grey Wolf Optimization (GWO)

3.1. Introduction and Inspiration

The Grey Wolf Optimizer (GWO) is a metaheuristic algorithm introduced by Seyedali Mirjalili et al. in 2014. It is inspired by the social hierarchy and hunting mechanism of grey wolves.

Core Idea: The algorithm simulates the leadership hierarchy and hunting behavior. The population of solutions is divided into four groups: alpha (α), beta (β), delta (δ), and omega (ω). The search is guided by the three fittest wolves (α, β, and δ).

3.2. Key Concepts and Mathematical Formulation

  1. Social Hierarchy:

    • Alpha (α): The leader of the pack. The fittest solution in the population.
    • Beta (β): The second-best decision-makers. The second fittest solution.
    • Delta (δ): Subordinates to α and β. The third fittest solution.
    • Omega (ω): The remaining wolves. They follow the guidance of α, β, and δ.
  2. Hunting Behavior:

    • Encircling Prey: Wolves encircle prey during the hunt. This is modeled mathematically as:

      TEXT
              D = |C · X_p(t) - X(t)|
              X(t+1) = X_p(t) - A · D
              

      Where t is the iteration, X_p is the position of the prey, X is the position of a wolf, and A and C are coefficient vectors similar to those in WOA.

    • Hunting: The hunt is guided by the alpha, beta, and delta wolves as they are presumed to have the best knowledge about the location of the prey. The omega wolves update their positions based on the positions of these three leaders.

      TEXT
              // Calculate distances from alpha, beta, and delta
              D_α = |C₁ · X_α - X|
              D_β = |C₂ · X_β - X|
              D_δ = |C₃ · X_δ - X|
      
              // Calculate steps towards alpha, beta, and delta
              X₁ = X_α - A₁ · D_α
              X₂ = X_β - A₂ · D_β
              X₃ = X_δ - A₃ · D_δ
      
              // Update the wolf's position (average of the three steps)
              X(t+1) = (X₁ + X₂ + X₃) / 3
              

      The vectors A and C are calculated as in WOA, providing a mechanism for exploration and exploitation. a is linearly decreased from 2 to 0. When |A| > 1, the wolves diverge from the prey (exploration). When |A| < 1, they converge towards the prey (exploitation).
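The hunting equations above translate directly into a per-wolf update, sketched below with scalar A and C coefficients drawn fresh per dimension and per leader (a simplification of the vector form):

```python
import random

def gwo_update(x, x_alpha, x_beta, x_delta, a):
    """Update one wolf's position using the alpha, beta, and delta leaders.

    For each dimension, compute a step towards each leader and average them:
    X(t+1) = (X1 + X2 + X3) / 3.
    """
    new = []
    for d in range(len(x)):
        steps = []
        for leader in (x_alpha, x_beta, x_delta):
            A = 2 * a * random.random() - a   # A = 2*a*r1 - a
            C = 2 * random.random()           # C = 2*r2
            D = abs(C * leader[d] - x[d])     # distance to this leader
            steps.append(leader[d] - A * D)   # X_k = X_leader - A_k * D_k
        new.append(sum(steps) / 3)            # average of X1, X2, X3
    return new
```

One useful observation: when a has decayed to 0, every A is 0 and the wolf lands exactly on the centroid of the three leaders, which is why the pack converges tightly in the final iterations.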

3.3. Algorithm Pseudocode

TEXT
Algorithm: Grey Wolf Optimizer (GWO)

1.  Initialize the grey wolf population Xᵢ (i = 1, 2, ..., n).
2.  Initialize parameters a, A, and C.
3.  Calculate the fitness of each search agent.
4.  Identify the three best solutions as X_α, X_β, and X_δ.

5.  While (t < MaxGen):
6.      For each search agent (wolf):
7.          Update its position using the GWO position update equations based on X_α, X_β, and X_δ.
8.      End For
9.
10.     Update a, A, and C.
11.     Calculate the fitness of all search agents.
12.     Update X_α, X_β, and X_δ based on the new fitness values.
13.     t = t + 1
14. End While

15. Return X_α (the best solution found).

3.4. Strengths and Weaknesses

Strengths:

  • Simple Structure: GWO has a simple concept and is easy to implement.
  • Fewer Parameters: Similar to WOA, it has few parameters to tune.
  • Good Exploration-Exploitation Balance: The adaptive parameter a and the use of the top three solutions provide a robust mechanism for balancing global and local search.

Weaknesses:

  • Premature Convergence: Can converge prematurely, especially on complex multimodal problems where the top three solutions might cluster in the same local optimum.
  • Low Solution Accuracy: May struggle to find highly precise solutions in some complex numerical problems.
  • Slow Convergence: Convergence speed can be slow in the later stages of the search process.

4. Grasshopper Optimization Algorithm (GOA)

4.1. Introduction and Inspiration

The Grasshopper Optimization Algorithm (GOA) is a metaheuristic proposed by Saremi, Mirjalili, and Lewis in 2017. It is inspired by the swarming behavior of grasshoppers in nature, which is characterized by both long-range (migration) and short-range (foraging) movements.

Core Idea: The algorithm models the social interaction (repulsion and attraction) between grasshoppers in a swarm. The movement is influenced by social interaction, gravity, and wind advection.

4.2. Key Concepts and Mathematical Formulation

The position update equation for a grasshopper is:

TEXT
Xᵢ = Sᵢ + Gᵢ + Aᵢ

Where:

  • Xᵢ is the position of the i-th grasshopper.
  • Sᵢ is the social interaction component.
  • Gᵢ is the gravity force component.
  • Aᵢ is the wind advection component.

A more practical version used in the algorithm is:

TEXT
Xᵢᵈ(t+1) = c * ( Σ [from j=1 to N, j≠i] c * (ubᵈ - lbᵈ)/2 * s( |xⱼᵈ - xᵢᵈ| ) * (xⱼ - xᵢ)/dᵢⱼ ) + Tᵈ

Where:

  • ubᵈ and lbᵈ are the upper and lower bounds in the d-th dimension.
  • Tᵈ is the position of the target (the best solution found so far).
  • c is a decreasing coefficient that balances exploration and exploitation.
  • s(r) = f * e^(-r/l) - e^(-r) is a function that defines the social forces (attraction and repulsion). f and l are parameters for the intensity and length scale of attraction.

Key Features:

  • Social Interaction (S): The function s(r) models how grasshoppers interact. At short distances, they repel each other to avoid collision. At larger distances, they attract each other to form a swarm.
  • Adaptive Coefficient c: This is the most important parameter. It is decreased over iterations to balance exploration and exploitation.
    TEXT
        c = c_max - t * (c_max - c_min) / MaxGen
        
    • Initially, c is large, which causes grasshoppers to make large steps (exploration).
    • As iterations proceed, c becomes smaller, reducing the step size and encouraging movement towards the target (exploitation).
  • Target (T): The term Tᵈ represents the best solution found so far (the "food source"). As the algorithm progresses, the swarm converges towards this target.

4.3. Algorithm Pseudocode

TEXT
Algorithm: Grasshopper Optimization Algorithm (GOA)

1.  Initialize algorithm parameters:
    - Population size (N)
    - Max iterations (MaxGen)
    - c_max, c_min (coefficients for adaptive parameter c)

2.  Initialize the grasshopper population Xᵢ (i = 1, 2, ..., N).
3.  Calculate the fitness of each grasshopper.
4.  Set T = the position of the best grasshopper (the target).

5.  While (t < MaxGen):
6.      Update the coefficient c.
7.      For each grasshopper i:
8.          Normalize the distances between grasshoppers.
9.          Calculate the sum of social interactions (Sᵢ) with all other grasshoppers.
10.         Update the position of grasshopper i using the GOA position update equation towards the target T.
11.     End For
12.
13.     Bring any grasshoppers that go outside the boundaries back in.
14.     Calculate the fitness of the updated population.
15.     Update T if a better solution is found.
16.     t = t + 1
17. End While

18. Return T.

4.4. Strengths and Weaknesses

Strengths:

  • Strong Exploration: The repulsion mechanism between grasshoppers helps prevent them from getting trapped in local optima in the early stages.
  • Smooth Transition: The adaptive coefficient c provides a gradual shift from exploration to exploitation.
  • Good Performance on Structural Problems: Has shown promising results in structural design optimization problems.

Weaknesses:

  • Slow Convergence: The algorithm tends to converge slowly, especially towards the end of the search, as it has to evaluate the interaction between all pairs of grasshoppers.
  • Parameter Sensitivity: The parameters f and l in the social interaction function s(r) can be sensitive and difficult to tune.

5. Conceptual Grouping of Metaheuristics

Metaheuristics can be classified based on various criteria. Understanding these groupings helps in selecting an appropriate algorithm for a given problem.

5.1. Classification by Solution Strategy

  • Single-Solution Based: These algorithms work on a single candidate solution at a time. They are also known as trajectory-based methods.

    • Examples: Simulated Annealing (SA), Tabu Search (TS), Iterated Local Search (ILS).
    • Characteristics: Strong at exploitation (local search), but can be weaker at exploration.
  • Population-Based: These algorithms maintain and improve a population of candidate solutions in each iteration.

    • Examples: Genetic Algorithm (GA), Particle Swarm Optimization (PSO), Ant Colony Optimization (ACO), and all algorithms discussed in this unit (FA, WOA, GWO, GOA).
    • Characteristics: Inherently better at exploration due to the diversity of the population.

5.2. Classification by Inspiration Source

  • Evolutionary Algorithms (EAs): Inspired by biological evolution. They use mechanisms like selection, crossover, and mutation.

    • Examples: Genetic Algorithm (GA), Differential Evolution (DE), Genetic Programming (GP).
  • Swarm Intelligence (SI) Algorithms: Inspired by the collective behavior of decentralized, self-organized systems, typically social animals.

    • Examples: Particle Swarm Optimization (PSO), Ant Colony Optimization (ACO), Firefly Algorithm (FA), Whale Optimization Algorithm (WOA), Grey Wolf Optimizer (GWO), Grasshopper Optimization Algorithm (GOA).
  • Physics-Based Algorithms: Inspired by physical laws and processes.

    • Examples: Simulated Annealing (SA) (inspired by annealing in metallurgy), Gravitational Search Algorithm (GSA).
  • Human-Based Algorithms: Inspired by human behaviors and interactions.

    • Examples: Harmony Search (HS) (inspired by music improvisation), Teaching-Learning-Based Optimization (TLBO).

5.3. Summary Table of Groupings

  • Solution Strategy: Single-Solution Based (e.g., SA, TS) vs. Population-Based (e.g., GA, PSO, WOA)
  • Inspiration Source: Evolutionary (e.g., GA, DE), Swarm Intelligence (e.g., PSO, FA, GWO), Physics-Based (e.g., SA), Human-Based (e.g., TLBO)

6. Comparison of Metaheuristic Algorithms

No single metaheuristic is universally superior for all types of optimization problems. This is formalized by the No Free Lunch (NFL) Theorem, which states that for any algorithm, any elevated performance over one class of problems is paid for with diminished performance over another class. Therefore, comparison is problem-dependent.

6.1. Comparative Table of Advanced Algorithms

  • Inspiration:
    • FA: Flashing behavior of fireflies
    • WOA: Bubble-net hunting of humpback whales
    • GWO: Social hierarchy & hunting of grey wolves
    • GOA: Swarming behavior of grasshoppers
  • Main Idea:
    • FA: Attraction based on brightness (objective value) and distance
    • WOA: Encircling & spiral attack on the best solution (prey)
    • GWO: Search guided by the top 3 solutions (alpha, beta, delta)
    • GOA: Social attraction/repulsion & movement towards the best target
  • Exploration:
    • FA: Random movement term (α·εᵢ) and attraction to distant bright fireflies
    • WOA: Random search when |A| >= 1
    • GWO: Divergence from prey when |A| > 1
    • GOA: Large steps due to high initial c value and grasshopper repulsion
  • Exploitation:
    • FA: Movement towards brighter, nearby fireflies
    • WOA: Shrinking circle (|A| < 1) and spiral attack path
    • GWO: Convergence towards prey when |A| < 1
    • GOA: Small steps due to low final c value and attraction to target
  • Key Parameters:
    • FA: β₀ (attractiveness), γ (absorption), α (randomness)
    • WOA: a (controls A), b (spiral shape)
    • GWO: a (controls A)
    • GOA: c (adaptive coefficient), f, l (social forces)
  • Complexity:
    • FA: High, O(n²), due to pairwise comparisons
    • WOA: Low, O(n) per iteration
    • GWO: Low, O(n) per iteration
    • GOA: High, O(n²), due to pairwise social force calculation
  • Pros:
    • FA: Good for multimodal problems, automatic subgrouping
    • WOA: Simple, few parameters, good exploitation
    • GWO: Simple, few parameters, good balance
    • GOA: Strong exploration, avoids early trapping
  • Cons:
    • FA: High computational cost, sensitive parameters
    • WOA: Can suffer from premature convergence
    • GWO: Low precision on some problems, can stagnate
    • GOA: Slow convergence, sensitive social force parameters

7. Scalability and Convergence Issues in Optimization

7.1. Scalability and the Curse of Dimensionality

Scalability refers to how an algorithm's performance (in terms of solution quality and computational time) is affected as the size of the problem increases. For optimization, this usually means an increase in the number of decision variables (dimensions).

The Curse of Dimensionality describes the phenomenon where the search space grows exponentially with the number of dimensions.

  • Impact on Metaheuristics:
    1. Vast Search Space: The volume of the search space becomes too large to explore effectively. A fixed-size population covers a progressively smaller fraction of the space.
    2. Increased Local Optima: The number of local optima often increases with dimensionality, making it harder to find the global optimum.
    3. Computational Cost: The time required to evaluate the objective function for one solution often increases with more dimensions, slowing down the entire algorithm.
    4. Stagnation: With a vast space to search, the algorithm may wander for many iterations without finding better solutions.
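The exponential growth of the search space can be made concrete with a short calculation: if a "promising region" covers 90% of the range in each individual dimension, the probability that a uniform random point lands inside it in all d dimensions simultaneously is 0.9^d, which collapses rapidly as d grows:

```python
def hit_probability(d, per_dim=0.9):
    """Probability that a uniform random point satisfies a 90%-of-range
    constraint in every one of d independent dimensions: per_dim ** d."""
    return per_dim ** d

for d in (2, 10, 100):
    print(d, hit_probability(d))
```

At d = 2 the probability is 0.81, but by d = 100 it has dropped below one in ten thousand; this is why a fixed-size population that samples the space adequately in low dimensions covers a vanishing fraction of it in high dimensions.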

7.2. Convergence Issues

Convergence is the process by which the algorithm's population moves towards a single point or region in the search space.

  1. Premature Convergence:

    • What it is: The algorithm gets trapped in a local optimum and loses its ability to explore other regions of the search space. The population's diversity decreases rapidly, and all solutions converge to a suboptimal point.
    • Causes:
      • An overly aggressive exploitation mechanism.
      • Poor balance between exploration and exploitation.
      • Loss of diversity in the population.
      • Incorrect parameter settings.
    • Example: In GWO, if the alpha, beta, and delta wolves all get stuck in the same local basin, the entire pack will be drawn there, unable to escape.
  2. Stagnation:

    • What it is: The algorithm makes very little or no improvement over a long series of iterations. It is not necessarily converged, but its search progress has stalled.
    • Causes:
      • The step sizes of the search agents become too small.
      • The algorithm has thoroughly explored the current region but lacks a mechanism to jump to a new, promising region.
    • Example: In FA, if all fireflies have similar brightness levels and are far apart, the attraction force becomes negligible, and movement is dominated by small random steps, leading to stagnation.
  3. Convergence Rate:

    • What it is: The speed at which an algorithm approaches an optimum.
    • Trade-off: A very fast convergence rate is often a sign of strong exploitation, which increases the risk of premature convergence. A very slow rate indicates excessive exploration or stagnation. The ideal is a rapid initial exploration followed by a focused exploitation and convergence to the global optimum.

7.3. Addressing These Issues

  • Parameter Control/Tuning:

    • Adaptive Parameters: Instead of fixed parameters, use parameters that change over the course of the run. WOA, GWO, and GOA all use this concept (e.g., decreasing a or c) to shift from exploration to exploitation.
    • Self-Adaptive Parameters: Encode the parameters into the solution itself and let them evolve along with the solution (common in Evolutionary Algorithms).
  • Maintaining Diversity:

    • Introduce mechanisms to prevent the population from becoming too homogeneous.
    • Example: In GA, a higher mutation rate can reintroduce diversity. In PSO, some variants introduce "rogue" particles that explore randomly.
  • Hybridization:

    • Combine two or more algorithms to leverage their respective strengths.
    • Global-Local Hybrid: Use a global search algorithm (like GWO) to identify promising regions and then apply a local search algorithm (like Hill Climbing) to finely tune the solution within that region.
    • Algorithm Fusion: Combine operators from different algorithms, e.g., using a crossover operator from GA within a PSO algorithm.
  • Modified Algorithm Structures:

    • Multi-population/Sub-swarm approaches: Divide the population into multiple sub-populations that explore different areas and share information periodically. This can help avoid a single premature convergence point.
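The diversity loss that drives premature convergence and stagnation can be monitored directly during a run. One simple (and commonly used) measure, sketched here, is the mean distance of the agents from the population centroid; the 1e-3 threshold in the usage note is a hypothetical example value:

```python
import math

def diversity(pop):
    """Mean Euclidean distance of each agent from the population centroid.

    Near 0 means the population has collapsed to a point (converged or
    prematurely converged); large values indicate an exploring population.
    """
    dim = len(pop[0])
    centroid = [sum(x[d] for x in pop) / len(pop) for d in range(dim)]
    return sum(math.dist(x, centroid) for x in pop) / len(pop)
```

A typical use is to check this value each generation and, for example, re-randomize a fraction of the population whenever diversity(pop) falls below some threshold such as 1e-3 while the iteration budget is not yet exhausted.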