Genetic Algorithm#

A genetic algorithm is an optimization technique used to find the maximum or minimum of a function within a specified domain. In this section, we will first implement a genetic algorithm from scratch. Then, we will solve the same kind of problem using the geneticalgorithm library.

Genetic Algorithm from Scratch#

Optimization Problem#

The following question is from OpenStax Calculus-III, Example 4.40. Find the maximum value of the given function within the specified domain: \( f(x,y) = x^2-2xy+4y^2-4x-2y+24, \,\,\, 0\le x \le 4, \,\,\, 0\le y \le 2\).
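For reference, this problem can also be solved analytically, which gives us a target to check the algorithm against. Setting the partial derivatives to zero,

\[ f_x = 2x - 2y - 4 = 0, \qquad f_y = -2x + 8y - 2 = 0, \]

gives the interior critical point \((3, 1)\), where \(f(3,1) = 17\) (a minimum). Checking \(f\) along the four boundary edges shows that the absolute maximum on the domain is \(f(0,2) = 36\), attained at the corner \((0, 2)\).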

Initial Population#

The population is a set of randomly chosen pairs of possible \((x,y)\) values within the given domain.

  • This population will be updated throughout the genetic algorithm process.

Coding: The following code generates an initial population of size \(N=100\), i.e., 100 pairs of \((x,y)\) values. In each pair, \(x\) is drawn uniformly at random between 0 and 4, and \(y\) between 0 and 2.

import numpy as np

population_size = 100

def population(pop_size=population_size):
    # Draw pop_size random (x, y) pairs inside the domain
    # 0 <= x <= 4, 0 <= y <= 2.
    init_pop = []
    for i in range(pop_size):
        init_pop.append((np.random.uniform(0,4), np.random.uniform(0,2)))
    return init_pop

np.random.seed(0)  # fix the seed so the run is reproducible
initial_population = population(population_size)
initial_population[:5]
[(2.195254015709299, 1.430378732744839),
 (2.4110535042865755, 1.0897663659937937),
 (1.6946191973556188, 1.2917882261333122),
 (1.75034884505077, 1.7835460015641595),
 (3.854651042004117, 0.7668830376515554)]
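The same population can also be drawn in a vectorized way with NumPy, as sketched below; the loop version above is kept for readability.

# Vectorized draw of population_size (x, y) pairs over the same domain.
xs = np.random.uniform(0, 4, population_size)
ys = np.random.uniform(0, 2, population_size)
initial_population_vec = list(zip(xs, ys))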

Fitness Function#

The fitness function is the function that we aim to maximize.

  • In this question, the function \(f(x,y)\) is the fitness function.

def f(x,y):
    # Fitness function: the objective we aim to maximize.
    return x**2-2*x*y+4*y**2-4*x-2*y+24
f_values = [f(x,y) for (x,y) in initial_population]
f_values[:5]
[19.081210627429286,
 17.484825150038,
 19.806370013216757,
 22.97572368695568,
 18.346269785696855]

Parent Selection#

Using randomness and fitness function values, parents will be selected from the current population.

  • Selected parents will then be used to generate new child pairs in the next step.

  • There are various methods for choosing parents, but the main principle is that pairs with higher fitness function values are selected with greater probability.

  • Pairs with lower fitness values still have a chance to be chosen, but with a smaller probability, as the sketch below makes explicit.
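One way to implement this principle directly is fitness-proportionate (roulette-wheel) selection, sketched here for contrast; the function name select_parent_roulette is illustrative and not part of the algorithm used in this section.

def select_parent_roulette(pop_data, f_val):
    # Shift fitness values so that all weights are positive, then draw an
    # index with probability proportional to its weight.
    f_arr = np.asarray(f_val, dtype=float)
    weights = f_arr - f_arr.min() + 1e-9
    probs = weights / weights.sum()
    idx = np.random.choice(len(pop_data), p=probs)
    return pop_data[idx]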

In this section, we will use the tournament method for parent selection.

  • In this method, a specified number of \((x, y)\) pairs are selected randomly, and the pair with the highest fitness function value among them will be chosen as a parent.

def select_parent(pop_data, f_val, random_size=20):
    # Tournament selection: sample random_size individuals at random
    # (indices drawn over the full population) and return the one with
    # the highest fitness value.
    random_ind = np.random.randint(0, len(pop_data), random_size)
    f_v = [f_val[i] for i in random_ind]
    max_f_v_index = np.argmax(f_v)
    return pop_data[random_ind[max_f_v_index]]
parents = [select_parent(initial_population, f_values, random_size=5) for i in range(population_size)]
parents[:5]
[(1.7434597010625072, 1.7838467100313442),
 (0.25658998539513744, 1.3849442387400397),
 (0.40090754924920446, 1.838965227489347),
 (0.40090754924920446, 1.838965227489347),
 (0.0767727932373341, 0.6031496333490987)]
len(parents)
100

Crossover#

In this step, one new child pair will be produced from two parent pairs.

  • The coordinates of the child will be generated using the mean values of the \(x\) and \(y\) coordinates of the parent pairs.

  • To introduce more randomness, we will also include a probability condition that allows one of the parents to be retained as the child.

def crossover(parent1, parent2, prob_crossover=0.1):
    # With probability prob_crossover, return one of the parents unchanged;
    # otherwise return the coordinate-wise mean of the two parents.
    prob = np.random.uniform(0,1)
    if prob < prob_crossover:
        return [parent1, parent2][np.random.randint(0,2)]
    else:
        return [(parent1[0]+parent2[0])/2, (parent1[1]+parent2[1])/2]
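A common variant is arithmetic (blend) crossover, which mixes the parents with a random weight instead of a fixed 50/50 average. The sketch below is illustrative and not part of the algorithm used in this section.

def crossover_blend(parent1, parent2):
    # Child = alpha*parent1 + (1-alpha)*parent2 with a random weight alpha.
    alpha = np.random.uniform(0, 1)
    return (alpha*parent1[0] + (1 - alpha)*parent2[0],
            alpha*parent1[1] + (1 - alpha)*parent2[1])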

Mutation#

Mutation slightly perturbs the child coordinates, mimicking random mutation in nature.

  • Each coordinate will be increased or decreased by the percentage defined by pct_mutation, producing a small change while keeping the values within the domain.

  • A probability condition will be included to determine whether mutation is applied.

def mutation(child, prob_mutation=0.3, pct_mutation=0.1):
    # With probability prob_mutation, scale both coordinates by
    # (1 +/- pct_mutation), clipping at the upper bounds x <= 4 and y <= 2.
    prob = np.random.uniform(0,1)
    c = np.random.choice([-1,1])
    if prob < prob_mutation:
        return [min(child[0]*(1+c*pct_mutation),4), min(child[1]*(1+c*pct_mutation),2)]
    else:
        return child
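A frequently used alternative is additive Gaussian mutation, sketched below for comparison; the function name and the sigma parameter are illustrative, not part of the algorithm above.

def mutation_gaussian(child, prob_mutation=0.3, sigma=0.1):
    # Add zero-mean Gaussian noise to each coordinate, then clip the
    # result back into the domain 0 <= x <= 4, 0 <= y <= 2.
    if np.random.uniform(0, 1) < prob_mutation:
        return [float(np.clip(child[0] + np.random.normal(0, sigma), 0, 4)),
                float(np.clip(child[1] + np.random.normal(0, sigma), 0, 2))]
    return child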

Next Generation#

The next (second) generation will be created from the selected parents by performing crossover on consecutive pairs of parents, wrapping around at the end of the list.

  • Mutation will also be applied after producing the child pairs.

def next_generation(parent_list):
    children_list = []
    for i in range(len(parent_list)):
        # Cross each parent with the next one in the list; the last
        # parent wraps around and is crossed with the first.
        if i != len(parent_list)-1:
            child = crossover(parent_list[i], parent_list[i+1], prob_crossover=0.1)
        else:
            child = crossover(parent_list[i], parent_list[0], prob_crossover=0.1)

        children_list.append(mutation(child))
    return children_list
next_generation(parents)[:5]
[[1.0000248432288223, 1.5843954743856918],
 [0.36081679432428404, 1.6550687047404125],
 [0.44099830417412494, 2],
 [0.23884017124326928, 1.221057430419223],
 [0.07094087234938665, 1.0207751960002387]]

Iteration#

In this part, we will produce new generations iteratively, with the goal of finding the \((x, y)\) pair that has the highest fitness function value.

We will follow the steps outlined above:

  1. Initialize Population: Generate an initial population of \((x, y)\) pairs within the specified domain.

  2. Parent Selection: Select parent pairs using the tournament method based on fitness values.

  3. Crossover: Generate child pairs from the parents by averaging their coordinates, with a probability condition to retain one parent as a child.

  4. Mutation: Slightly alter child coordinates by a set percentage, with a probability condition for mutation.

  5. Iteration: Repeat these steps to produce new generations, improving toward the \((x, y)\) pair with the highest fitness function value.

Each iteration will bring the population closer to an optimal solution based on the fitness function.

population_size = 1000

initial_population = population(population_size)
f_values = [f(x,y) for (x,y) in initial_population]
parents = initial_population

max_ind = np.argmax(f_values)
print(f'Iteration: {0:<3} --- x:{initial_population[max_ind][0]:.2f}, y:{initial_population[max_ind][1]:.2f}, f(x,y): {f_values[max_ind]:.2f}')

for i in range(1,101):

    # Recompute fitness for the current population so that tournament
    # selection always uses up-to-date values, then select and breed.
    f_values = [f(x,y) for (x,y) in parents]
    parents = [select_parent(parents, f_values, random_size=100) for _ in range(population_size)]
    parents = next_generation(parents)

    if i%5 == 0:
        f_values = [f(x,y) for (x,y) in parents]
        max_ind = np.argmax(f_values)
        print(f'Iteration: {i:<3} --- x:{parents[max_ind][0]:.2f}, y:{parents[max_ind][1]:.2f}, f(x,y): {f_values[max_ind]:.2f}')
Iteration: 0   --- x:0.20, y:1.87, f(x,y): 35.57
Iteration: 5   --- x:0.90, y:1.63, f(x,y): 25.62
Iteration: 10  --- x:1.01, y:1.84, f(x,y): 27.14
Iteration: 15  --- x:1.09, y:2.00, f(x,y): 28.49
Iteration: 20  --- x:1.11, y:1.99, f(x,y): 28.27
Iteration: 25  --- x:1.13, y:2.00, f(x,y): 28.23
Iteration: 30  --- x:1.17, y:2.00, f(x,y): 27.98
Iteration: 35  --- x:1.21, y:2.00, f(x,y): 27.76
Iteration: 40  --- x:1.22, y:2.00, f(x,y): 27.72
Iteration: 45  --- x:1.24, y:2.00, f(x,y): 27.60
Iteration: 50  --- x:1.25, y:2.00, f(x,y): 27.55
Iteration: 55  --- x:1.27, y:2.00, f(x,y): 27.43
Iteration: 60  --- x:1.29, y:2.00, f(x,y): 27.33
Iteration: 65  --- x:1.30, y:2.00, f(x,y): 27.27
Iteration: 70  --- x:1.30, y:2.00, f(x,y): 27.27
Iteration: 75  --- x:1.34, y:2.00, f(x,y): 27.09
Iteration: 80  --- x:1.35, y:2.00, f(x,y): 27.00
Iteration: 85  --- x:1.37, y:1.99, f(x,y): 26.82
Iteration: 90  --- x:1.40, y:2.00, f(x,y): 26.76
Iteration: 95  --- x:1.44, y:2.00, f(x,y): 26.56
Iteration: 100 --- x:1.44, y:2.00, f(x,y): 26.52

geneticalgorithm Library#

The geneticalgorithm library is a useful tool for finding the minimum value of a function.

  • You can install it in a Jupyter Notebook using the following command:

!pip install geneticalgorithm

To find the maximum value of a function, simply minimize its negative: maximizing \(f(x)\) is equivalent to minimizing \(-f(x)\).

from geneticalgorithm import geneticalgorithm as ga

def f(X):
    x, y = X
    return -(x**2-2*x*y+4*y**2-4*x-2*y+24)

model = ga(function=f, dimension=2, variable_type='real', variable_boundaries=np.array([[0,4], [0,2]]), progress_bar=False)

model.run()
 The best solution found:
 [0.00315964 1.99964525]

 Objective function:
 -35.969769178617085
[Figure: convergence curve of the objective value over the iterations, produced by model.run().]

The output_dict attribute provides the results of the optimization problem in a dictionary format.

  • The value associated with the "function" key represents the minimum value of the negative of the function. Thus, the negative of this value is the maximum of the original function.

  • The value associated with the "variable" key contains the \(x\) and \(y\) coordinates of the point where the minimum/maximum value is achieved.

model.output_dict
{'variable': array([0.00315964, 1.99964525]), 'function': -35.969769178617085}
print(f'Maximum value: {-model.output_dict["function"]:.2f}')
Maximum value: 35.97
print(f'Maximum at x: {model.output_dict["variable"][0]:.2f},  y: {model.output_dict["variable"][1]:.2f}')
Maximum at x: 0.00,  y: 2.00
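The search itself can be tuned through the ga constructor's algorithm_parameters argument. The keys below follow the geneticalgorithm documentation; the values shown are only illustrative starting points, not recommended settings.

params = {'max_num_iteration': 500,
          'population_size': 100,
          'mutation_probability': 0.1,
          'elit_ratio': 0.01,
          'crossover_probability': 0.5,
          'parents_portion': 0.3,
          'crossover_type': 'uniform',
          'max_iteration_without_improv': None}

# Rebuild the model with the custom settings and rerun the optimization.
model = ga(function=f, dimension=2, variable_type='real',
           variable_boundaries=np.array([[0, 4], [0, 2]]),
           algorithm_parameters=params, progress_bar=False)
model.run()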