
Don't Just Stare at CNNs and RNNs! A Hands-On Toy DBN in Python and NumPy (Full Code Included)

news 2026/7/27 8:27:56

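Building a Deep Belief Network from Scratch: MNIST Handwritten Digit Recognition with NumPy

The deep belief network (DBN) is an important milestone in the history of deep learning, and it remains an excellent teaching example for understanding layer-wise feature extraction in neural networks. This article walks through implementing a DBN with a configurable number of layers in pure NumPy and applying it to handwritten digit recognition on the MNIST dataset. Unlike calling a ready-made deep learning framework, "reinventing the wheel" this way lets us really grasp the core mechanics of a DBN, from contrastive-divergence training of RBMs to how layer-wise feature abstractions form.

1. Environment Setup and Data Loading

Before building the DBN, we need a Python environment and the MNIST dataset. We use NumPy as the core computation library: it provides efficient matrix operations, and it lets us avoid the "black box" of deep learning frameworks and implement every key step by hand.

First, install the required libraries: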

pip install numpy matplotlib scikit-learn

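The complete code for loading and preprocessing MNIST is as follows:

import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split

def load_mnist(binarize=True):
    # as_frame=False returns NumPy arrays instead of a pandas DataFrame,
    # which the index-based batching later in this article relies on
    mnist = fetch_openml('mnist_784', version=1, as_frame=False)
    X = mnist.data.astype('float32') / 255.0
    y = mnist.target.astype('int')
    if binarize:
        # Binarize the pixel values, which suits the binary units of an RBM
        X = (X > 0.5).astype(np.float32)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)
    return X_train, X_test, y_train, y_test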

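Note: raw MNIST pixel values range from 0 to 255, and we normalize them to the range 0 to 1. For the RBM layers of a DBN, binarizing the input is usually recommended, because it matches the binary stochastic units of an RBM.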
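To make the binarization step concrete, here is a minimal sketch on a handful of made-up pixel intensities. The thresholding function mirrors what `load_mnist` does. The stochastic variant is not used in this article; it is shown only as a common alternative, and both function names are hypothetical.

```python
import numpy as np

def binarize_threshold(x, threshold=0.5):
    """Deterministic binarization: 1 where the pixel exceeds the threshold."""
    return (x > threshold).astype(np.float32)

def binarize_stochastic(x, rng):
    """Stochastic binarization (an alternative, not used in this article):
    treat each normalized intensity as the probability that the pixel is 1."""
    return (rng.random(x.shape) < x).astype(np.float32)

# Made-up normalized pixel intensities in [0, 1]
pixels = np.array([0.0, 0.2, 0.5, 0.8, 1.0], dtype=np.float32)
rng = np.random.default_rng(0)

hard = binarize_threshold(pixels)
soft = binarize_stochastic(pixels, rng)
print("threshold: ", hard)
print("stochastic:", soft)
```

With thresholding, a pixel of exactly 0.5 maps to 0 because the comparison is strict. The stochastic version gives a different binary image on every call, which some implementations use as a mild form of data augmentation.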

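After loading the data, we can check its dimensions:

X_train, X_test, y_train, y_test = load_mnist()
print(f"Training set shape: {X_train.shape}, test set shape: {X_test.shape}")

The output should look like this:

Training set shape: (56000, 784), test set shape: (14000, 784)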

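2. Implementing the Restricted Boltzmann Machine (RBM)

The RBM is the basic building block of a DBN, so understanding its implementation is the key to understanding DBNs. An RBM consists of a visible layer and a hidden layer; the two layers are fully connected to each other, and there are no connections within a layer. We will implement the contrastive divergence (CD) algorithm to train the RBM.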
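For reference, the standard binary RBM can be written down as follows. The symbols are notation introduced here, not taken from the article: $\mathbf{a}$ is the visible bias (`v_bias` in the code), $\mathbf{b}$ is the hidden bias (`h_bias`), $W$ is the weight matrix, and $\sigma$ is the logistic sigmoid.

```latex
\[
E(\mathbf{v}, \mathbf{h})
  = -\mathbf{a}^{\top}\mathbf{v} - \mathbf{b}^{\top}\mathbf{h} - \mathbf{v}^{\top} W \mathbf{h}
\]
\[
p(h_j = 1 \mid \mathbf{v}) = \sigma\Big(b_j + \sum_i v_i W_{ij}\Big),
\qquad
p(v_i = 1 \mid \mathbf{h}) = \sigma\Big(a_i + \sum_j W_{ij} h_j\Big)
\]
```

Because there are no connections within a layer, each conditional factorizes over units. This is why a whole layer can be computed with one matrix product followed by a sigmoid.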

2.1 RBM Class Structure

class RBM:
    def __init__(self, n_visible, n_hidden, learning_rate=0.01, k=1):
        self.n_visible = n_visible
        self.n_hidden = n_hidden
        self.lr = learning_rate
        self.k = k  # number of Gibbs steps in CD-k
        # Initialize the weights and biases
        self.W = np.random.normal(0, 0.01, size=(n_visible, n_hidden))
        self.h_bias = np.zeros(n_hidden)
        self.v_bias = np.zeros(n_visible)

    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))
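The `sigmoid` above returns correct values, but `np.exp(-x)` can emit overflow warnings when `x` is a large negative number. If that becomes noisy, one possible replacement is a piecewise form that only ever exponentiates non-positive values. This is a sketch for array inputs, and `stable_sigmoid` is a hypothetical helper name, not part of the article's class.

```python
import numpy as np

def stable_sigmoid(x):
    """Sigmoid for array inputs that avoids overflow in np.exp."""
    x = np.asarray(x, dtype=np.float64)
    out = np.empty_like(x)
    pos = x >= 0
    # For x >= 0, exp(-x) is at most 1, so it cannot overflow
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    # For x < 0, use the equivalent form exp(x) / (1 + exp(x))
    exp_x = np.exp(x[~pos])
    out[~pos] = exp_x / (1.0 + exp_x)
    return out

values = stable_sigmoid(np.array([-1000.0, 0.0, 1000.0]))
print(values)
```

Both branches compute the same function, so this could be swapped in for the method body without changing training behaviour.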

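2.2 Contrastive Divergence Training

The core steps of the CD-k algorithm are:

Forward pass: compute the hidden-layer probabilities
Sample the hidden states
Backward pass: reconstruct the visible layer
Repeat the Gibbs sampling k times
Update the parameters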
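Written as formulas, the parameter update at the end of these steps looks like the following for a mini-batch of $N$ rows. The notation is introduced here: $V_0$ is the data batch, $V_k$ is the batch after $k$ Gibbs steps, $P_0 = \sigma(V_0 W + \mathbf{b})$ and $P_k = \sigma(V_k W + \mathbf{b})$ are the corresponding hidden probabilities, $\mathbf{a}$ and $\mathbf{b}$ are the visible and hidden biases, and $\eta$ is the learning rate.

```latex
\[
\Delta W = \frac{\eta}{N}\left(V_0^{\top} P_0 - V_k^{\top} P_k\right)
\]
\[
\Delta \mathbf{a} = \frac{\eta}{N}\sum_{n=1}^{N}\left(\mathbf{v}_0^{(n)} - \mathbf{v}_k^{(n)}\right),
\qquad
\Delta \mathbf{b} = \frac{\eta}{N}\sum_{n=1}^{N}\left(\mathbf{p}_0^{(n)} - \mathbf{p}_k^{(n)}\right)
\]
```

The first term in each difference is the "positive" statistic measured on the data, and the second is the "negative" statistic measured on the reconstruction. CD-k is an approximation to the log-likelihood gradient, not the exact gradient.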

    # Add to the RBM class
    def contrastive_divergence(self, v0):
        # Forward pass: hidden probabilities and a sample given the data
        h0_prob = self.sigmoid(np.dot(v0, self.W) + self.h_bias)
        h0_sample = (np.random.random(size=h0_prob.shape) < h0_prob).astype(np.float32)

        # Gibbs sampling chain, started at the data
        vk = v0.copy()
        for _ in range(self.k):
            hk_prob = self.sigmoid(np.dot(vk, self.W) + self.h_bias)
            hk_sample = (np.random.random(size=hk_prob.shape) < hk_prob).astype(np.float32)
            vk_prob = self.sigmoid(np.dot(hk_sample, self.W.T) + self.v_bias)
            vk = (np.random.random(size=vk_prob.shape) < vk_prob).astype(np.float32)

        # Positive and negative statistics for the gradient estimate
        hk_prob = self.sigmoid(np.dot(vk, self.W) + self.h_bias)
        positive_grad = np.dot(v0.T, h0_prob)
        negative_grad = np.dot(vk.T, hk_prob)

        # Update the parameters
        self.W += self.lr * (positive_grad - negative_grad) / v0.shape[0]
        self.v_bias += self.lr * np.mean(v0 - vk, axis=0)
        self.h_bias += self.lr * np.mean(h0_prob - hk_prob, axis=0)

        # Reconstruction error: mean over the batch of the per-sample squared error
        reconstruction = self.sigmoid(np.dot(h0_sample, self.W.T) + self.v_bias)
        error = np.mean(np.sum((v0 - reconstruction) ** 2, axis=1))
        return error

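2.3 Monitoring RBM Training

To monitor the training process, we can add a visualization method:

    # Add to the RBM class
    def visualize_weights(self, n_rows=10, n_cols=10):
        import matplotlib.pyplot as plt
        fig, axes = plt.subplots(n_rows, n_cols, figsize=(10, 10))
        for i, ax in enumerate(axes.flat):
            if i < self.n_hidden:
                # Each column of W is the 28x28 filter of one hidden unit;
                # this reshape only applies to an RBM with 784 visible units
                ax.imshow(self.W[:, i].reshape(28, 28), cmap='gray')
            ax.axis('off')
        plt.show()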

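Example code for training an RBM:

rbm = RBM(n_visible=784, n_hidden=256, learning_rate=0.01, k=1)
n_epochs = 20
batch_size = 100

for epoch in range(n_epochs):
    # Shuffle through an index permutation instead of shuffling X_train in
    # place, so that X_train stays aligned with y_train for fine-tuning
    order = np.random.permutation(X_train.shape[0])
    total_error = 0.0
    n_batches = 0
    for i in range(0, X_train.shape[0], batch_size):
        batch = X_train[order[i:i+batch_size]]
        total_error += rbm.contrastive_divergence(batch)
        n_batches += 1
    # Average over the number of batches, not the last start index
    print(f"Epoch {epoch+1}/{n_epochs}, Reconstruction Error: {total_error/n_batches:.4f}")
    if epoch % 5 == 0:
        rbm.visualize_weights()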

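3. Building the Deep Belief Network (DBN)

A DBN consists of several stacked RBMs. It learns a hierarchical feature representation through layer-by-layer unsupervised pre-training, and the whole network is then fine-tuned with a supervised method.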
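The stacking idea can be illustrated without any training: the hidden activations of one layer become the visible input of the next. The sketch below uses random, untrained stand-in weights and a made-up batch of five inputs purely to show how the shapes flow through a 784-256-128-64 stack.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
layer_sizes = [784, 256, 128, 64]
batch = rng.random((5, layer_sizes[0]))  # five made-up inputs

shapes = [batch.shape]
activation = batch
for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    # Untrained stand-in parameters; a real DBN would use each RBM's W and h_bias
    W = rng.normal(0, 0.01, size=(n_in, n_out))
    h_bias = np.zeros(n_out)
    # The hidden activations of this layer are the input of the next one
    activation = sigmoid(activation @ W + h_bias)
    shapes.append(activation.shape)

print(shapes)  # [(5, 784), (5, 256), (5, 128), (5, 64)]
```

In greedy pre-training, each layer's RBM is trained on the output of the already-trained layers beneath it, which are held fixed.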

3.1 DBN Class Structure

class DBN:
    def __init__(self, layer_sizes):
        self.rbms = []
        for i in range(len(layer_sizes)-1):
            rbm = RBM(n_visible=layer_sizes[i],
                      n_hidden=layer_sizes[i+1],
                      learning_rate=0.01)
            self.rbms.append(rbm)
        # Classifier weights used in the fine-tuning stage
        self.W = np.random.normal(0, 0.01, size=(layer_sizes[-1], 10))
        self.b = np.zeros(10)

    def pretrain(self, X, epochs=10, batch_size=100):
        # Work on a copy so that shuffling does not touch the caller's data
        input_data = X.copy()
        for i, rbm in enumerate(self.rbms):
            print(f"Pre-training RBM layer {i+1}/{len(self.rbms)}")
            for epoch in range(epochs):
                np.random.shuffle(input_data)
                for j in range(0, input_data.shape[0], batch_size):
                    batch = input_data[j:j+batch_size]
                    rbm.contrastive_divergence(batch)
            # The hidden representation of this RBM is the input of the next layer
            input_data = rbm.sigmoid(np.dot(input_data, rbm.W) + rbm.h_bias)

3.2 Forward Pass and Feature Extraction

    # Add to the DBN class
    def transform(self, X):
        # Propagate the input through every RBM and return the top-layer features
        hidden_activation = X.copy()
        for rbm in self.rbms:
            hidden_activation = rbm.sigmoid(
                np.dot(hidden_activation, rbm.W) + rbm.h_bias)
        return hidden_activation

    def predict_proba(self, X):
        hidden_activation = self.transform(X)
        output = np.dot(hidden_activation, self.W) + self.b
        return self.softmax(output)

    def softmax(self, x):
        # Subtract the row maximum before exponentiating for numerical stability
        exp_x = np.exp(x - np.max(x, axis=1, keepdims=True))
        return exp_x / np.sum(exp_x, axis=1, keepdims=True)
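The `softmax` method subtracts each row's maximum before exponentiating. The small standalone sketch below, with made-up logits, shows why: softmax is unchanged by adding a constant to a row, so the shift costs nothing, but without it `np.exp(1000.0)` would overflow to `inf`.

```python
import numpy as np

def softmax(x):
    # Same computation as the DBN method, written as a free function
    exp_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return exp_x / np.sum(exp_x, axis=1, keepdims=True)

# Made-up logits: one row with very large values, one row of zeros
logits = np.array([[1000.0, 1001.0, 1002.0],
                   [0.0, 0.0, 0.0]])
probs = softmax(logits)
shifted = softmax(logits - 1000.0)  # same rows, shifted by a constant

print(probs)
print(np.allclose(probs, shifted))
```

Each row of `probs` sums to 1, and the row of equal logits yields a uniform distribution.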

3.3 Implementing the Fine-Tuning Stage

The fine-tuning stage uses backpropagation to adjust the weights of all layers:
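Hand-written backpropagation is easy to get subtly wrong, so a finite-difference gradient check on a tiny network is a useful sanity test before training on MNIST. The sketch below is a standalone illustration with one sigmoid hidden layer and a softmax output. The sizes, data, and function names are all made up for the check and are not part of the DBN class.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def loss_fn(W1, b1, W2, b2, X, Y):
    """Mean cross-entropy of a sigmoid hidden layer followed by softmax."""
    h = sigmoid(X @ W1 + b1)
    p = softmax(h @ W2 + b2)
    return -np.mean(np.sum(Y * np.log(p + 1e-12), axis=1))

def analytic_grad_W1(W1, b1, W2, b2, X, Y):
    """Backpropagated gradient of the loss with respect to W1."""
    h = sigmoid(X @ W1 + b1)
    p = softmax(h @ W2 + b2)
    output_error = (p - Y) / X.shape[0]
    # Chain rule through the sigmoid: multiply by h * (1 - h)
    delta = (output_error @ W2.T) * h * (1 - h)
    return X.T @ delta

def numeric_grad_W1(W1, b1, W2, b2, X, Y, eps=1e-5):
    """Central finite differences, one weight at a time."""
    grad = np.zeros_like(W1)
    for idx in np.ndindex(*W1.shape):
        W_plus, W_minus = W1.copy(), W1.copy()
        W_plus[idx] += eps
        W_minus[idx] -= eps
        grad[idx] = (loss_fn(W_plus, b1, W2, b2, X, Y)
                     - loss_fn(W_minus, b1, W2, b2, X, Y)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
X = rng.random((4, 6))                      # 4 made-up samples, 6 features
Y = np.eye(3)[rng.integers(0, 3, size=4)]   # random one-hot labels, 3 classes
W1, b1 = rng.normal(0, 0.5, (6, 5)), np.zeros(5)
W2, b2 = rng.normal(0, 0.5, (5, 3)), np.zeros(3)

analytic = analytic_grad_W1(W1, b1, W2, b2, X, Y)
numeric = numeric_grad_W1(W1, b1, W2, b2, X, Y)
max_diff = np.max(np.abs(analytic - numeric))
print(f"max abs difference: {max_diff:.2e}")
```

If the sigmoid-derivative factor `h * (1 - h)` is left out of `delta`, the difference becomes large, which makes this kind of check a quick way to catch a missing term.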

    # Add to the DBN class
    def finetune(self, X, y, epochs=20, batch_size=100, learning_rate=0.1):
        y_onehot = np.eye(10)[y]  # convert the labels to one-hot encoding
        for epoch in range(epochs):
            indices = np.random.permutation(X.shape[0])
            total_loss = 0.0
            correct = 0
            n_batches = 0
            for i in range(0, X.shape[0], batch_size):
                # Select the current batch
                idx = indices[i:i+batch_size]
                X_batch = X[idx]
                y_batch = y_onehot[idx]

                # Forward pass, keeping the activation of every layer
                hidden_activations = [X_batch]
                for rbm in self.rbms:
                    hidden_activation = rbm.sigmoid(
                        np.dot(hidden_activations[-1], rbm.W) + rbm.h_bias)
                    hidden_activations.append(hidden_activation)
                output = np.dot(hidden_activations[-1], self.W) + self.b
                proba = self.softmax(output)

                # Loss and accuracy
                loss = -np.mean(np.sum(y_batch * np.log(proba + 1e-10), axis=1))
                total_loss += loss
                n_batches += 1
                correct += np.sum(np.argmax(proba, axis=1) == np.argmax(y_batch, axis=1))

                # Backward pass. deltas[j] is the gradient of the loss with respect
                # to the pre-activation of the hidden layer produced by rbms[j].
                # Divide by the actual batch size (the last batch may be smaller).
                output_error = (proba - y_batch) / X_batch.shape[0]
                grad_act = output_error.dot(self.W.T)
                deltas = [None] * len(self.rbms)
                for j in range(len(self.rbms)-1, -1, -1):
                    act = hidden_activations[j+1]
                    deltas[j] = grad_act * act * (1 - act)  # sigmoid derivative
                    grad_act = deltas[j].dot(self.rbms[j].W.T)

                # Update the classifier weights
                self.W -= learning_rate * np.dot(hidden_activations[-1].T, output_error)
                self.b -= learning_rate * np.sum(output_error, axis=0)

                # Update the RBM weights
                for j in range(len(self.rbms)):
                    grad_W = np.dot(hidden_activations[j].T, deltas[j])
                    grad_h_bias = np.sum(deltas[j], axis=0)
                    self.rbms[j].W -= learning_rate * grad_W
                    self.rbms[j].h_bias -= learning_rate * grad_h_bias

            accuracy = correct / X.shape[0]
            print(f"Epoch {epoch+1}/{epochs}, Loss: {total_loss/n_batches:.4f}, "
                  f"Accuracy: {accuracy:.4f}")

4. Training and Evaluating the Complete DBN

Now we can combine all of these components and train a complete DBN model:

# Define the DBN architecture: 784-256-128-64
dbn = DBN([784, 256, 128, 64])

# Pre-train each RBM layer
print("Starting pre-training...")
dbn.pretrain(X_train, epochs=15)

# Fine-tune the whole network
print("Starting fine-tuning...")
dbn.finetune(X_train, y_train, epochs=30)

# Evaluate on the test set
test_proba = dbn.predict_proba(X_test)
test_pred = np.argmax(test_proba, axis=1)
accuracy = np.mean(test_pred == y_test)
print(f"Test accuracy: {accuracy:.4f}")
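A single accuracy number hides which digits get confused with each other. As a possible follow-up to the evaluation above, the sketch below builds a confusion matrix and per-class accuracy in plain NumPy. The helper names are hypothetical, and the labels are a tiny made-up example, not results from the model.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes=10):
    """cm[t, p] counts samples with true label t that were predicted as p."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    # np.add.at accumulates correctly even when index pairs repeat
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

def per_class_accuracy(cm):
    """Fraction of each true class predicted correctly (0 for empty classes)."""
    totals = cm.sum(axis=1)
    return np.divide(np.diag(cm), totals,
                     out=np.zeros(len(cm)), where=totals > 0)

# Tiny made-up example with three classes
y_true = np.array([0, 0, 1, 1, 2])
y_pred = np.array([0, 1, 1, 1, 2])

cm = confusion_matrix(y_true, y_pred, n_classes=3)
acc = per_class_accuracy(cm)
print(cm)
print(acc)
```

With the article's variables, the call would be `confusion_matrix(y_test, test_pred)`, assuming both are integer NumPy arrays.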

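4.1 Hyperparameter Tuning Suggestions

In practice, a few key parameters affect the performance of a DBN:

Network architecture: usually start with a wide bottom layer and shrink the hidden layers gradually
- Example configurations: 784-500-200-100 or 784-1024-512-256
Learning rate:
- Pre-training stage: 0.01-0.1
- Fine-tuning stage: 0.001-0.01 (usually smaller than in pre-training)
Batch size: between 32 and 256; larger batches are usually more stable
The k in CD-k: k=1 usually works well; increasing k may improve quality but slows training down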

4.2 Visualizing Intermediate Features

It is important to understand the feature representations that the DBN has learned:

import matplotlib.pyplot as plt

# Visualize the weights of the first-layer RBM
plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.imshow(dbn.rbms[0].W[:,i].reshape(28,28), cmap='gray')
    plt.axis('off')
plt.suptitle('First Layer Learned Features', fontsize=16)
plt.show()

# Visualize the top-layer hidden activations. dbn.transform returns the
# output of the last RBM (64 units here), shown as an 8x8 grid.
sample_idx = np.random.choice(X_test.shape[0], 5)
hidden_activations = dbn.transform(X_test[sample_idx])
plt.figure(figsize=(15,3))
for i in range(5):
    plt.subplot(1,5,i+1)
    plt.imshow(hidden_activations[i].reshape(8,8), cmap='viridis')
    plt.title(f"Label: {y_test[sample_idx[i]]}")
    plt.axis('off')
plt.suptitle('Top Layer Hidden Activations', fontsize=16)
plt.show()

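4.3 Troubleshooting Common Problems

You may run into the following problems when implementing a DBN:

Reconstruction error does not decrease:
- Lower the learning rate
- Increase the k used in Gibbs sampling
- Check that the data preprocessing is correct
Low accuracy in the fine-tuning stage:
- Pre-train for longer
- Try a different network architecture
- Adjust the fine-tuning learning rate
Unstable training:
- Reduce the batch size
- Add weight-decay regularization
- Use a smaller standard deviation for weight initialization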

# Example: initializing the RBM weights with a smaller standard deviation
self.W = np.random.normal(0, 0.001, size=(n_visible, n_hidden))
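Weight decay itself acts on the parameter update rather than on the initialization. A minimal sketch of an L2-regularized version of the CD weight update follows. The function name `cd_weight_update` and the `weight_decay` parameter are hypothetical; the article's `RBM` class has no such option.

```python
import numpy as np

def cd_weight_update(W, positive_grad, negative_grad, batch_size,
                     lr=0.01, weight_decay=1e-4):
    """Return W after one CD step with an L2 weight-decay term.

    positive_grad and negative_grad are the summed statistics over the batch
    (v0.T @ h0_prob and vk.T @ hk_prob in the article's code).
    """
    grad = (positive_grad - negative_grad) / batch_size
    # The decay term pulls every weight toward zero in proportion to its size
    return W + lr * (grad - weight_decay * W)

# With identical positive and negative statistics, only the decay acts
W = np.ones((2, 3))
stats = np.zeros((2, 3))
W_new = cd_weight_update(W, stats, stats, batch_size=100,
                         lr=0.1, weight_decay=0.01)
print(W_new)  # every entry equals 1 - 0.1 * 0.01 = 0.999
```

Decay is usually applied to the weight matrix only, not to the biases. The value of `weight_decay` is a tuning choice, and the defaults above are placeholders.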
