Reproducing the CUMCM Problem B "Crossing the Desert" in Python: A Step-by-Step Guide to Writing an Optimal Path-Planning Algorithm
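What happens when a mathematical modeling problem meets Python programming? This article takes Problem B, "Crossing the Desert," from the 2020 Higher Education Press Cup China Undergraduate Mathematical Contest in Modeling (CUMCM) as its example and builds a path-planning solution from scratch. Rather than staying with theoretical analysis, it focuses on computing the optimal strategy in code, so that the abstract mathematical model becomes an executable program.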
1. Problem Analysis and Modeling Approach
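At its core, Crossing the Desert is an optimal path-planning problem under resource constraints. Within a limited number of days, the player spends initial funds on water and food, moves across the desert map, can earn extra funds by mining, and tries to reach the end point with as much money left as possible.
The core challenges are:
- Optimization objective: maximize the funds remaining at the end point, which means balancing supply costs against mining income
- Complex constraints: the load limit, weather effects, and the mining income rules
- Large state space: each day's position, supplies, and funds all have to be tracked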
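To get a feel for how large the state space is, the following rough count can help. It is only a sketch: it assumes the level 1 figures that appear later in this article (27 regions, 30 days, a 1200 kg load limit, 3 kg per box of water and 2 kg per box of food), and the names `count_supply_pairs` and `upper_bound` are made up for illustration.

```python
# Rough upper bound on the number of (day, region, water, food) states.
# All constants are the level 1 values used later in the article.
MAX_LOAD_KG = 1200
WATER_KG, FOOD_KG = 3, 2   # weight per box
REGIONS, DAYS = 27, 30


def count_supply_pairs(max_load=MAX_LOAD_KG, water_kg=WATER_KG, food_kg=FOOD_KG):
    """Count (water, food) box pairs whose total weight fits the load limit."""
    total = 0
    for water in range(max_load // water_kg + 1):
        remaining = max_load - water * water_kg
        total += remaining // food_kg + 1  # food can be 0 .. remaining // food_kg
    return total


pairs = count_supply_pairs()
upper_bound = pairs * REGIONS * (DAYS + 1)  # days 0..30 inclusive
print(pairs, upper_bound)
```

Under these assumptions there are 120,601 feasible supply combinations and about 10^8 (day, region, water, food) tuples, before funds are added as a further dimension. Most of these are never reachable, which is why pruning matters.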
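We can model it as a constrained graph search problem:
- Map regions become the nodes of a graph
- Moves between adjacent regions become the edges
- The state at each node contains: position, day, water and food on hand, and current funds
- Edge weights are determined by the movement cost and the weather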
```python
class GameState:
    def __init__(self, position, day, water, food, money, path):
        self.position = position  # current region
        self.day = day            # current day
        self.water = water        # water left (boxes)
        self.food = food          # food left (boxes)
        self.money = money        # funds left
        self.path = path          # regions visited so far
```

2. Data Structure Design and Preprocessing
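Efficient data structures are the foundation of the implementation. We first need to turn the game map and rules into a form the program can work with.
2.1 Map Representation: Adjacency Matrix
We use an adjacency matrix to represent connectivity between regions. An entry of 1 marks two adjacent regions, `inf` marks non-adjacent ones, and the diagonal is 0: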
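```python
def build_adjacency_matrix():
    # Example: adjacency matrix for the 27 regions of level 1
    size = 27
    adj = [[float('inf')] * size for _ in range(size)]

    # Mark adjacent regions (example; index = region number - 1)
    adj[0][1] = 1   # regions 1 and 2 are adjacent
    adj[0][24] = 1  # regions 1 and 25 are adjacent
    adj[1][2] = 1   # regions 2 and 3 are adjacent
    # ... other connections

    # Make the matrix symmetric and zero the diagonal
    for i in range(size):
        for j in range(size):
            if adj[i][j] == 1:
                adj[j][i] = 1
            if i == j:
                adj[i][j] = 0
    return adj
```

2.2 Weather and Consumption Rules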
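Resource consumption differs markedly by weather, so we need a lookup table of consumption rules. The water and food columns give the base daily consumption, and moving multiplies them by the factor in the last column:
| Weather | Water (boxes/day) | Food (boxes/day) | Movement multiplier |
|---|---|---|---|
| Sunny | 5 | 7 | 2 |
| Hot | 8 | 6 | 2 |
| Sandstorm | 10 | 10 | No movement allowed |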
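To make the table concrete, here is a small sketch that turns one day's weather and action into boxes consumed, weight, and purchase value. The base figures are copied from the table. The mining multiplier of 3, the box weights (3 kg and 2 kg), and the base prices (5 and 10) are the values used in the code later in this article. The names `BASE`, `ACTION_FACTOR`, and `daily_need` are hypothetical.

```python
# Base consumption per day (boxes), copied from the table above
BASE = {
    "sunny": {"water": 5, "food": 7},
    "hot": {"water": 8, "food": 6},
    "sandstorm": {"water": 10, "food": 10},
}
# Multipliers on base consumption: stay 1x, move 2x, mine 3x
ACTION_FACTOR = {"stay": 1, "move": 2, "mine": 3}
WATER_KG, FOOD_KG = 3, 2          # weight per box (kg)
WATER_PRICE, FOOD_PRICE = 5, 10   # base price per box


def daily_need(weather, action):
    """Return the boxes, weight and base-price value consumed in one day."""
    if weather == "sandstorm" and action == "move":
        raise ValueError("movement is not allowed during a sandstorm")
    factor = ACTION_FACTOR[action]
    water = BASE[weather]["water"] * factor
    food = BASE[weather]["food"] * factor
    return {
        "water": water,
        "food": food,
        "weight_kg": water * WATER_KG + food * FOOD_KG,
        "cost": water * WATER_PRICE + food * FOOD_PRICE,
    }


print(daily_need("hot", "move"))
```

For example, moving on a hot day uses 16 boxes of water and 12 boxes of food, which is 72 kg of load and 200 in supplies at base prices.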
```python
weather_consumption = {
    'sunny': {'water': 5, 'food': 7, 'move_factor': 2},
    'hot': {'water': 8, 'food': 6, 'move_factor': 2},
    'sandstorm': {'water': 10, 'food': 10, 'move_factor': 0},  # 0 = cannot move
}
```

3. Core Algorithm Implementation
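3.1 Floyd Shortest-Path Algorithm
We start by implementing Floyd's algorithm to compute the shortest paths between all pairs of regions, which serves as a reference for the strategies that follow: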
```python
def floyd(adj_matrix):
    n = len(adj_matrix)
    dist = [[float('inf')] * n for _ in range(n)]
    path = [[-1] * n for _ in range(n)]  # path[i][j] = predecessor of j on the i -> j path

    # Initialization
    for i in range(n):
        for j in range(n):
            dist[i][j] = adj_matrix[i][j]
            if i != j and adj_matrix[i][j] < float('inf'):
                path[i][j] = i

    # Core of Floyd's algorithm: try every region k as an intermediate stop
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    path[i][j] = path[k][j]
    return dist, path
```

3.2 Dynamic Programming Solution
Because the problem is a multi-stage decision process, dynamic programming is a natural fit.
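One way to write down the recurrence that the code in this section implements is shown here. The notation is introduced only for this note: $M_d(v)$ is the best funds recorded for region $v$ at the start of day $d$, $E$ is the set of adjacent region pairs, and $\mathcal{M}$ is the set of mine regions.

$$
M_{d+1}(v) = \max\Big\{\, \underbrace{M_d(v)}_{\text{stay}},\;\; \underbrace{M_d(v) + 1000}_{\text{mine, only if } v \in \mathcal{M}},\;\; \underbrace{\max_{u:\,(u,v)\in E} M_d(u)}_{\text{move, only if day } d \text{ is not a sandstorm}} \Big\}
$$

Each option is admissible only if water and food remain non-negative after subtracting $k$ times the day's base consumption, with $k = 1$ for staying, $k = 2$ for moving, and $k = 3$ for mining. The table stores the water, food, and path that belong to the maximizing option alongside $M_{d+1}(v)$.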
```python
def dp_solution(adj_matrix, weather_sequence, max_days, initial_money,
                initial_water=0, initial_food=0, mines=frozenset()):
    # initial_water / initial_food: starting supplies; they should come from the
    #   initial-purchase optimization (0 is only a placeholder)
    # mines: set of mine region indices (stands in for the undefined is_mine helper)
    n = len(adj_matrix)
    # DP table: dp[day][position] = (max_money, water, food, path)
    dp = [[(-1, 0, 0, []) for _ in range(n)] for _ in range(max_days + 1)]
    # Initial state: day 0 at the start region
    dp[0][0] = (initial_money, initial_water, initial_food, [0])

    for day in range(max_days):
        weather = weather_sequence[day]
        consume = weather_consumption[weather]
        for pos in range(n):
            current_money, water, food, path = dp[day][pos]
            if current_money == -1:  # invalid state
                continue

            # Option 1: stay
            new_water = water - consume['water']
            new_food = food - consume['food']
            if new_water >= 0 and new_food >= 0:
                if dp[day + 1][pos][0] < current_money:
                    dp[day + 1][pos] = (current_money, new_water, new_food, path + [pos])

            # Option 2: move (not allowed in a sandstorm)
            if weather != 'sandstorm':
                move_water = water - consume['water'] * consume['move_factor']
                move_food = food - consume['food'] * consume['move_factor']
                if move_water >= 0 and move_food >= 0:
                    for neighbor in range(n):
                        if adj_matrix[pos][neighbor] == 1:  # adjacent region
                            if dp[day + 1][neighbor][0] < current_money:
                                dp[day + 1][neighbor] = (current_money, move_water,
                                                         move_food, path + [neighbor])

            # Option 3: mine (only at a mine)
            if pos in mines:
                mine_water = water - consume['water'] * 3  # mining consumes 3x base
                mine_food = food - consume['food'] * 3
                mine_money = current_money + 1000  # base income
                if mine_water >= 0 and mine_food >= 0:
                    if dp[day + 1][pos][0] < mine_money:
                        dp[day + 1][pos] = (mine_money, mine_water, mine_food, path + [pos])

    # Pick the best state at the end point
    best_money = -1
    best_solution = None
    for day in range(max_days + 1):
        if dp[day][n - 1][0] > best_money:  # assumes the end point is the last region
            best_money = dp[day][n - 1][0]
            best_solution = dp[day][n - 1]
    return best_solution
```

4. Optimization Strategies and Implementation Tips
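4.1 State Pruning and Optimization
The state space of the plain DP can grow too large, so it needs pruning. One approach is to store, for each (day, position), only the non-dominated states, meaning those for which no other state has at least as much money, water, and food.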
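The optimized DP in this section calls a helper, `generate_new_states`, that the article does not define. The following is one possible sketch, not the author's implementation. It assumes the same three actions as the plain DP (stay, move, mine), a module-level adjacency matrix `ADJ` and mine set `MINES` (both hypothetical, shown here for a 4-region toy map), and it infers the day from the path length because the call does not pass the day explicitly.

```python
INF = float("inf")

# Hypothetical toy map: 0 (start) - 1 - 2 (mine) - 3 (end point)
ADJ = [
    [0, 1, INF, INF],
    [1, 0, 1, INF],
    [INF, 1, 0, 1],
    [INF, INF, 1, 0],
]
MINES = {2}
MINE_INCOME = 1000  # base income per mining day, as in the plain DP


def generate_new_states(pos, weather, consume, money, water, food, path):
    """Return a list of (new_day, new_pos, (money, water, food, path)) tuples."""
    # The path holds the start region plus one entry per elapsed day,
    # so its length equals the index of the next day.
    new_day = len(path)
    states = []

    def try_add(new_pos, factor, income=0):
        new_water = water - consume["water"] * factor
        new_food = food - consume["food"] * factor
        if new_water >= 0 and new_food >= 0:  # supplies may never go negative
            states.append(
                (new_day, new_pos, (money + income, new_water, new_food, path + [new_pos]))
            )

    try_add(pos, 1)  # stay: base consumption
    if weather != "sandstorm":  # move: not allowed in a sandstorm
        for neighbor, hop in enumerate(ADJ[pos]):
            if hop == 1:
                try_add(neighbor, consume["move_factor"])
    if pos in MINES:  # mine: 3x base consumption, earns income
        try_add(pos, 3, MINE_INCOME)
    return states
```

Because the helper keeps the call signature used in `optimized_dp`, it can be dropped in as long as `ADJ` and `MINES` are set to match the real map.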
```python
def optimized_dp(adj_matrix, weather_sequence, max_days, initial_money,
                 init_water=0, init_food=0):
    # init_water / init_food were undefined in the original listing and are
    # exposed here as parameters. generate_new_states must be defined separately
    # and return (new_day, new_pos, new_state) tuples.
    # Store non-dominated states in a dict: dp[day][pos] = [(money, water, food, path), ...]
    dp = {0: {0: [(initial_money, init_water, init_food, [0])]}}

    for day in range(max_days):
        for pos in dp.get(day, {}):
            for state in dp[day][pos]:
                current_money, water, food, path = state
                weather = weather_sequence[day]
                consume = weather_consumption[weather]

                # Generate successor states
                new_states = generate_new_states(pos, weather, consume,
                                                 current_money, water, food, path)

                # Update the DP table
                for new_day, new_pos, new_state in new_states:
                    bucket = dp.setdefault(new_day, {}).setdefault(new_pos, [])

                    # Dominance check
                    add_state = True
                    for existing in bucket[:]:
                        if (existing[0] >= new_state[0] and
                                existing[1] >= new_state[1] and
                                existing[2] >= new_state[2]):
                            add_state = False  # an existing state is at least as good
                            break
                        if (existing[0] <= new_state[0] and
                                existing[1] <= new_state[1] and
                                existing[2] <= new_state[2]):
                            bucket.remove(existing)  # the new state dominates it
                    if add_state:
                        bucket.append(new_state)

    # Pick the best state at the end point
    goal = len(adj_matrix) - 1  # end point region
    best_money = -1
    best_solution = None
    for day in dp:
        for state in dp[day].get(goal, []):
            if state[0] > best_money:
                best_money = state[0]
                best_solution = state
    return best_solution
```

4.2 Optimizing the Initial Purchase
How the initial purchase is split between water and food directly shapes the later strategy:
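```python
from math import ceil

from scipy.optimize import linprog


def optimize_initial_purchase(max_load, prices, min_required=(0, 0)):
    """
    Compute the initial purchase.
    :param max_load: maximum load (kg)
    :param prices: (water price, food price)
    :param min_required: (min water, min food) in boxes. This is an assumed
        input, e.g. the consumption of a planned route. Without a lower bound,
        minimizing cost simply returns (0, 0).
    :return: (water boxes, food boxes)
    """
    # Objective coefficients: minimize the initial spend
    c = [prices[0], prices[1]]
    # Constraint: 3x + 2y <= max_load (load limit, 1200 kg in the example)
    A = [[3, 2]]  # unit weights of water and food (kg/box)
    b = [max_load]
    # Variable bounds: buy at least the required amounts
    x_bounds = (min_required[0], None)
    y_bounds = (min_required[1], None)

    # Solve
    res = linprog(c, A_ub=A, b_ub=b, bounds=[x_bounds, y_bounds], method='highs')
    if not res.success:
        raise ValueError('no purchase satisfies both the load limit and the minimum needs')

    # Round up to whole boxes so the minimum requirement is still met
    water = ceil(res.x[0] - 1e-9)
    food = ceil(res.x[1] - 1e-9)
    # Re-check the load limit after rounding
    if 3 * water + 2 * food > max_load:
        raise ValueError('rounded purchase exceeds the load limit')
    return water, food
```

5. Complete Solution and Testing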
We now combine the modules into a complete solution:
```python
class DesertCrossingSolver:
    def __init__(self, adj_matrix, weather_seq, max_days, initial_money, max_load):
        self.adj_matrix = adj_matrix
        self.weather_seq = weather_seq
        self.max_days = max_days
        self.initial_money = initial_money
        self.max_load = max_load
        self.water_price = 5
        self.food_price = 10

    def solve(self):
        # Step 1: compute the initial purchase
        init_water, init_food = self.optimize_initial_purchase()
        init_cost = init_water * self.water_price + init_food * self.food_price
        remaining_money = self.initial_money - init_cost

        # Step 2: run the dynamic programming algorithm
        solution = self.run_dp_solution(init_water, init_food, remaining_money)

        # Step 3: post-processing (value of leftover supplies at the end point,
        # counted at half the base price)
        final_money = (solution[0]
                       + solution[1] * self.water_price / 2
                       + solution[2] * self.food_price / 2)
        solution = (final_money,) + solution[1:]
        return solution

    def optimize_initial_purchase(self):
        # ... same implementation as above ...
        raise NotImplementedError

    def run_dp_solution(self, init_water, init_food, init_money):
        # ... dynamic programming implementation ...
        raise NotImplementedError

    def visualize_solution(self, solution):
        import matplotlib.pyplot as plt

        path = solution[3]
        regions_x = [pos % 5 for pos in path]   # example coordinate mapping
        regions_y = [pos // 5 for pos in path]

        plt.figure(figsize=(10, 8))
        plt.plot(regions_x, regions_y, 'bo-')
        plt.title('Optimal Path Visualization')
        plt.xlabel('X Coordinate')
        plt.ylabel('Y Coordinate')

        # Mark special regions
        mines = [12, 30, 55]     # example mine positions
        villages = [15, 39, 62]  # example village positions
        for mine in mines:
            mx, my = mine % 5, mine // 5
            plt.plot(mx, my, 'rs', markersize=10,
                     label='Mine' if mine == mines[0] else "")
        for village in villages:
            vx, vy = village % 5, village // 5
            plt.plot(vx, vy, 'g^', markersize=10,
                     label='Village' if village == villages[0] else "")

        plt.legend()
        plt.grid(True)
        plt.show()
```

Test case:
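```python
import random

# Example weather sequence (30 days)
weather_types = ['sunny', 'hot', 'sandstorm']
weather_seq = [random.choice(weather_types) for _ in range(30)]

# Build the solver
adj_matrix = build_adjacency_matrix()
solver = DesertCrossingSolver(adj_matrix, weather_seq, 30, 10000, 1200)

# Solve and visualize
solution = solver.solve()
print(f"Best solution: remaining funds {solution[0]}, path length {len(solution[3])}")
solver.visualize_solution(solution)
```

6. Advanced Optimization and Extensions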
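6.1 Heuristic Search Strategies
For larger instances, the following directions are worth considering:
A* search: design a suitable heuristic function

```python
def heuristic(position, goal, day_remaining):
    # Use the precomputed Floyd shortest distance as the heuristic value.
    # floyd_distances and min_consumption_per_day are assumed to be
    # module-level values computed beforehand.
    shortest_dist = floyd_distances[position][goal]
    return shortest_dist * min_consumption_per_day
```

Monte Carlo tree search (MCTS): suited to the case where only part of the weather information is visible
Genetic algorithms: for optimizing the path sequence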
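As an illustration of how Floyd distances can feed a search heuristic, the sketch below derives simple lower bounds and uses the time bound to discard states that can no longer reach the end point. It is an assumption-laden example rather than part of the original solution: `all_pairs_hops`, `lower_bounds`, and `is_hopeless` are hypothetical names, and the per-move minimums (10 boxes of water on a sunny day, 12 boxes of food on a hot day) are the table's base values doubled for movement.

```python
INF = float("inf")


def all_pairs_hops(adj):
    """Floyd-Warshall on a 0/1/inf adjacency matrix; returns hop counts."""
    n = len(adj)
    dist = [row[:] for row in adj]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def lower_bounds(pos, goal, dist, min_move_water=10, min_move_food=12):
    """Minimum days, water and food needed to walk from pos to goal.

    Each hop takes at least one day and at least the cheapest per-move
    consumption, so these values never overestimate the true need.
    """
    hops = dist[pos][goal]
    return hops, hops * min_move_water, hops * min_move_food


def is_hopeless(pos, goal, days_left, dist):
    """True if the goal cannot be reached in the remaining days."""
    return dist[pos][goal] > days_left


# Toy chain map 0 - 1 - 2 - 3 (hypothetical)
toy = [
    [0, 1, INF, INF],
    [1, 0, 1, INF],
    [INF, 1, 0, 1],
    [INF, INF, 1, 0],
]
toy_dist = all_pairs_hops(toy)
print(lower_bounds(0, 3, toy_dist), is_hopeless(0, 3, 2, toy_dist))
```

The time check is safe to use as a pruning rule in any of the searches above. The supply bounds are only informational when villages allow restocking along the way, since a state with too little water now may still buy more later.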
6.2 Multi-Player Game Strategies
The third question of the problem (the multi-player case) calls for ideas from game theory:
```python
class MultiPlayerSolver:
    def __init__(self, num_players, adj_matrix, weather_seq, max_days):
        self.num_players = num_players
        self.adj_matrix = adj_matrix
        self.weather_seq = weather_seq
        self.max_days = max_days

    def solve_cooperative(self):
        # Cooperative-game solution,
        # using concepts such as the Nash bargaining solution
        pass

    def solve_competitive(self):
        # Competitive-game solution,
        # using a game tree or reinforcement learning
        pass
```

6.3 Machine Learning Enhancements
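For the case where only part of the weather information is visible (the second question of the problem), reinforcement learning can be applied: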
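```python
import random

import numpy as np


class RLAgent:
    def __init__(self, state_space, action_space):
        self.action_space = action_space  # number of actions (missing in the original listing)
        self.q_table = np.zeros((state_space, action_space))
        self.learning_rate = 0.1
        self.discount_factor = 0.95

    def choose_action(self, state, epsilon):
        # Epsilon-greedy: explore with probability epsilon, otherwise exploit
        if random.uniform(0, 1) < epsilon:
            return random.randint(0, self.action_space - 1)
        else:
            return np.argmax(self.q_table[state])

    def learn(self, state, action, reward, next_state):
        # Tabular Q-learning update
        predict = self.q_table[state, action]
        target = reward + self.discount_factor * np.max(self.q_table[next_state])
        self.q_table[state, action] += self.learning_rate * (target - predict)
```

In a real project, implementing this kind of path-planning algorithm involves more engineering detail and leaves further room for optimization. Turning a mathematical modeling problem into executable code helps us understand the problem better and also lets us check how different strategies actually perform.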
