当前位置：首页 > news >正文

别再死记硬背流程图了！用Python从零实现一个遗传算法（附完整代码）

news 2026/7/13 3:37:27

用Python从零构建遗传算法：告别理论，直接动手编码

记得第一次接触遗传算法时，我被各种流程图和专业术语搞得晕头转向。直到有一天，我决定抛开所有理论资料，直接用Python从头实现一个最简单的遗传算法，那些抽象概念才突然变得清晰可见。今天，我们就来复现这个学习过程——不是看别人画好的流程图，而是用代码亲手搭建每个组件，感受算法如何从零开始进化。

遗传算法(GA)本质上是一种模拟自然选择的优化方法，特别适合解决那些传统数学方法难以处理的复杂问题。与大多数教程不同，本文将完全采用"做中学"的方式，我们会：

用Python实现一个完整的GA框架
解决实际的函数极值问题
在编码过程中理解选择、交叉、变异等核心机制
分享实际调试中遇到的典型问题及解决方案

1. 环境准备与问题定义

在开始编码前，我们需要明确两个关键点：开发环境和要解决的优化问题。本文使用Python 3.8+环境，主要依赖numpy进行数值计算。安装非常简单：

pip install numpy matplotlib

我们选择一个经典的优化问题作为示例：寻找函数f(x) = x²在区间[0, 31]上的最小值。虽然这个问题用枚举法就能解决，但非常适合用来演示GA的基本原理。

为什么选择这个问题？

输入是一维的，便于可视化理解
解空间足够小（32个整数），方便调试
可以轻松扩展到更复杂函数

先定义我们的目标函数：

def fitness_function(x): return x ** 2 # 越小越好

2. 构建遗传算法核心组件

2.1 种群初始化

遗传算法从一组随机解（称为种群）开始。我们需要决定两个关键参数：

种群大小：通常20-100之间
编码方式：这里采用5位二进制编码（2⁵=32，刚好覆盖0-31）

import numpy as np def initialize_population(pop_size, chromosome_length): return np.random.randint(2, size=(pop_size, chromosome_length))

参数选择经验：

种群太小会导致多样性不足
种群太大会增加计算成本
二进制编码便于后续的交叉和变异操作

2.2 适应度评估与选择

评估每个个体的适应度（即解的质量），然后按照适应度比例进行选择。这里使用轮盘赌选择法：

def evaluate_fitness(population): # 将二进制转换为十进制 decimal = binary_to_decimal(population) # 计算适应度（这里用倒数，因为原函数是求最小） return 1 / (fitness_function(decimal) + 1e-6) # 避免除零 def selection(population, fitness): # 按适应度比例选择 idx = np.random.choice( len(population), size=len(population), p=fitness/fitness.sum() ) return population[idx]

注意：轮盘赌选择可能导致优秀个体被意外淘汰，实践中常配合精英保留策略

2.3 交叉与变异

交叉（crossover）模拟生物繁殖时的基因交换，是GA探索新解的主要方式：

def crossover(parent1, parent2, crossover_rate): if np.random.rand() < crossover_rate: point = np.random.randint(1, len(parent1)) child1 = np.concatenate([parent1[:point], parent2[point:]]) child2 = np.concatenate([parent2[:point], parent1[:point]]) return child1, child2 return parent1.copy(), parent2.copy()

变异（mutation）引入随机变化，维持种群多样性：

def mutation(individual, mutation_rate): for i in range(len(individual)): if np.random.rand() < mutation_rate: individual[i] = 1 - individual[i] # 位翻转 return individual

参数调优技巧：

交叉率通常0.6-0.9
变异率通常0.001-0.01
太高/太低的参数都会影响算法性能

3. 完整算法实现与可视化

现在我们将所有组件组合成完整的遗传算法：

def genetic_algorithm( generations=100, pop_size=20, chrom_length=5, crossover_rate=0.8, mutation_rate=0.01 ): # 初始化 population = initialize_population(pop_size, chrom_length) best_fitness = [] for gen in range(generations): # 评估 fitness = evaluate_fitness(population) best_idx = np.argmax(fitness) best_fitness.append(fitness[best_idx]) # 选择 selected = selection(population, fitness) # 交叉 next_pop = [] for i in range(0, len(selected), 2): if i+1 >= len(selected): next_pop.append(selected[i]) break child1, child2 = crossover(selected[i], selected[i+1], crossover_rate) next_pop.extend([child1, child2]) # 变异 population = np.array([mutation(ind, mutation_rate) for ind in next_pop]) return best_fitness

添加可视化让我们更直观地观察算法收敛过程：

import matplotlib.pyplot as plt fitness_history = genetic_algorithm() plt.plot(1/np.array(fitness_history)) # 转换回原始函数值 plt.xlabel('Generation') plt.ylabel('Best f(x)') plt.title('GA Optimization Progress') plt.show()

4. 实战调试与性能优化

在实际运行中，你可能会遇到以下典型问题：

问题1：早熟收敛

现象：种群很快收敛到次优解
解决方案：
- 增加变异率
- 采用锦标赛选择代替轮盘赌
- 引入移民策略增加多样性

问题2：收敛速度慢

现象：需要很多代才能找到好解
解决方案：
- 调整选择压力（如使用线性排名选择）
- 尝试不同的交叉算子（如多点交叉）
- 检查编码方式是否合适

改进版选择算子示例（锦标赛选择）：

def tournament_selection(population, fitness, tournament_size=3): selected = [] for _ in range(len(population)): # 随机选择k个个体参赛 candidates = np.random.choice(len(population), tournament_size) # 选择其中适应度最高的 winner = candidates[np.argmax(fitness[candidates])] selected.append(population[winner]) return np.array(selected)

进阶技巧：

自适应参数：让交叉率和变异率随着进化动态调整
多种群并行：独立进化的种群间偶尔交换个体
精英保留：确保每代最优个体不被淘汰

5. 扩展应用与进阶方向

掌握了基本实现后，你可以尝试以下扩展：

扩展到更复杂问题：

多峰函数优化
组合优化问题（如TSP）
神经网络超参数调优

改进编码方式：

# 实数编码示例 def real_encoding(pop_size, dim, low, high): return np.random.uniform(low, high, (pop_size, dim))

多目标优化：

使用NSGA-II等算法
维护Pareto前沿解集
设计新的适应度评价方法

在真实项目中，你可能需要处理：

高维优化问题（维度灾难）
计算昂贵的适应度函数
动态变化的目标函数

6. 完整代码整合

以下是整合后的完整实现，包含所有关键组件和可视化：

import numpy as np import matplotlib.pyplot as plt # 目标函数 def fitness_func(x): return x ** 2 # 辅助函数 def binary_to_decimal(binary): return np.sum(binary * 2**np.arange(binary.shape[1])[::-1], axis=1) # GA组件 def initialize_population(pop_size, chrom_length): return np.random.randint(2, size=(pop_size, chrom_length)) def evaluate_fitness(population): decimal = binary_to_decimal(population) return 1 / (fitness_func(decimal) + 1e-6) def tournament_selection(population, fitness, k=3): selected = [] for _ in range(len(population)): candidates = np.random.choice(len(population), k) winner = candidates[np.argmax(fitness[candidates])] selected.append(population[winner]) return np.array(selected) def crossover(p1, p2, rate): if np.random.rand() < rate: pt = np.random.randint(1, len(p1)) c1 = np.concatenate([p1[:pt], p2[pt:]]) c2 = np.concatenate([p2[:pt], p1[pt:]]) return c1, c2 return p1.copy(), p2.copy() def mutation(ind, rate): for i in range(len(ind)): if np.random.rand() < rate: ind[i] = 1 - ind[i] return ind # 主算法 def run_ga(generations=100, pop_size=20, chrom_length=5, crossover_rate=0.8, mutation_rate=0.01): pop = initialize_population(pop_size, chrom_length) history = [] for _ in range(generations): fitness = evaluate_fitness(pop) history.append(np.max(fitness)) selected = tournament_selection(pop, fitness) next_pop = [] for i in range(0, len(selected), 2): if i+1 >= len(selected): next_pop.append(selected[i]) break c1, c2 = crossover(selected[i], selected[i+1], crossover_rate) next_pop.extend([c1, c2]) pop = np.array([mutation(ind, mutation_rate) for ind in next_pop]) return history # 运行与可视化 if __name__ == "__main__": history = run_ga(generations=50) plt.plot(1/np.array(history)) plt.xlabel('Generation') plt.ylabel('Best f(x)') plt.title('Genetic Algorithm Optimization') plt.show()

运行这段代码，你会看到算法如何逐步逼近最优解0。试着调整参数观察不同设置下的收敛行为，这是理解GA调参最好的方式。

查看全文

http://www.jsqmd.com/news/855979/