
CNN Architecture Evolution: From LeNet to EfficientNet

1. Technical Analysis

1.1 The Evolution of CNN Architectures

Convolutional neural networks have evolved from simple to increasingly sophisticated designs:

LeNet (1998) → AlexNet (2012) → VGG (2014) → ResNet (2015) → EfficientNet (2019)

1.2 Comparison of Classic CNN Architectures

| Model        | Layers | Params  | Top-1 Accuracy | Key Feature              |
|--------------|--------|---------|----------------|--------------------------|
| LeNet        | 5      | 60K     | 99% (MNIST)    | Classic baseline         |
| AlexNet      | 8      | 60M     | 83% (ImageNet) | ReLU + Dropout           |
| VGG          | 16/19  | 138M    | 92%            | Uniform 3x3 convolutions |
| ResNet       | 152    | 60M     | 96%            | Residual connections     |
| EfficientNet | B0-B7  | 5M-66M  | 77%-88%        | Compound scaling         |

1.3 Core CNN Components

  • Convolutional layers: feature extraction
  • Pooling layers: downsampling
  • Activation functions: non-linear transformation
  • Fully connected layers: classification
  • Residual connections: gradient propagation
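The spatial size after a convolution or pooling layer follows one formula: out = floor((in + 2·padding − kernel) / stride) + 1. As a rough illustration, the sketch below traces a 28x28 MNIST image through two 5x5 convolutions and two 2x2 max-pools (the layer sizes match the LeNet code in section 2.1):

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

size = 28                    # MNIST input
size = conv_out(size, 5)     # 5x5 conv, no padding -> 24
size = conv_out(size, 2, 2)  # 2x2 max-pool, stride 2 -> 12
size = conv_out(size, 5)     # second 5x5 conv -> 8
size = conv_out(size, 2, 2)  # second pool -> 4
print(size)  # 4, hence the 16 * 4 * 4 flatten in LeNet
```

The same arithmetic explains why padding=1 with a 3x3 kernel (as in ResNet below) preserves spatial size.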

2. Core Implementations

2.1 LeNet Implementation

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LeNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 6, kernel_size=5)
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5)
        # For 28x28 inputs: conv1 -> 24x24, pool -> 12x12, conv2 -> 8x8, pool -> 4x4
        self.fc1 = nn.Linear(16 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, num_classes)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(-1, 16 * 4 * 4)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)
```
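The table in section 1.2 quotes about 60K parameters for LeNet; the classic LeNet-5 (32x32 inputs, so fc1 takes 16·5·5 features) lands near 61K, while the leaner 28x28 variant above comes in lower. Tallying this implementation by hand (a conv layer has out_ch·(in_ch·k·k) + out_ch parameters, a linear layer n_in·n_out + n_out):

```python
def conv_params(in_ch, out_ch, k):
    return out_ch * (in_ch * k * k) + out_ch  # weights + biases

def linear_params(n_in, n_out):
    return n_in * n_out + n_out

total = (conv_params(1, 6, 5)              # conv1: 156
         + conv_params(6, 16, 5)           # conv2: 2,416
         + linear_params(16 * 4 * 4, 120)  # fc1: 30,840
         + linear_params(120, 84)          # fc2: 10,164
         + linear_params(84, 10))          # fc3: 850
print(total)  # 44426
```

Most of the budget sits in fc1, which is why later architectures replace large fully connected layers with global pooling.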

2.2 ResNet Implementation

```python
class ResidualBlock(nn.Module):
    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                               stride=stride, padding=1)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3,
                               stride=1, padding=1)
        self.bn2 = nn.BatchNorm2d(out_channels)
        # Projection shortcut when the shape changes; identity otherwise
        self.shortcut = nn.Sequential()
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride),
                nn.BatchNorm2d(out_channels)
            )

    def forward(self, x):
        residual = self.shortcut(x)
        x = F.relu(self.bn1(self.conv1(x)))
        x = self.bn2(self.conv2(x))
        x += residual
        return F.relu(x)

class ResNet(nn.Module):
    def __init__(self, block, num_blocks, num_classes=10):
        super().__init__()
        self.in_channels = 64
        # 3x3 stride-1 stem (CIFAR-style); ImageNet variants use a 7x7 stride-2 stem
        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1)
        self.bn1 = nn.BatchNorm2d(64)
        self.layer1 = self._make_layer(block, 64, num_blocks[0], stride=1)
        self.layer2 = self._make_layer(block, 128, num_blocks[1], stride=2)
        self.layer3 = self._make_layer(block, 256, num_blocks[2], stride=2)
        self.layer4 = self._make_layer(block, 512, num_blocks[3], stride=2)
        self.fc = nn.Linear(512, num_classes)

    def _make_layer(self, block, out_channels, num_blocks, stride):
        # Only the first block of a stage may downsample
        strides = [stride] + [1] * (num_blocks - 1)
        layers = []
        for stride in strides:
            layers.append(block(self.in_channels, out_channels, stride))
            self.in_channels = out_channels
        return nn.Sequential(*layers)

    def forward(self, x):
        x = F.relu(self.bn1(self.conv1(x)))
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        x = F.avg_pool2d(x, 4)  # 4x4 feature map remains for 32x32 inputs
        x = x.view(x.size(0), -1)
        return self.fc(x)

def ResNet18(num_classes=10):
    return ResNet(ResidualBlock, [2, 2, 2, 2], num_classes)

def ResNet34(num_classes=10):
    return ResNet(ResidualBlock, [3, 4, 6, 3], num_classes)
```
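The names ResNet-18 and ResNet-34 count weighted layers: the stem convolution, two convolutions per residual block, and the final fully connected layer. A quick sketch of that bookkeeping, using the per-stage block counts from the constructors above:

```python
def resnet_depth(num_blocks, convs_per_block=2):
    """Weighted-layer count: stem conv + block convs + final fc."""
    return 1 + convs_per_block * sum(num_blocks) + 1

print(resnet_depth([2, 2, 2, 2]))  # 18
print(resnet_depth([3, 4, 6, 3]))  # 34
```

Projection convolutions in the shortcuts are conventionally not counted toward the depth.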

2.3 EfficientNet Implementation

```python
class MBConv(nn.Module):
    def __init__(self, in_channels, out_channels, expansion_factor=6, stride=1):
        super().__init__()
        hidden_dim = in_channels * expansion_factor
        # The skip connection is only valid when input and output shapes match
        self.use_residual = stride == 1 and in_channels == out_channels
        self.conv = nn.Sequential(
            # 1x1 expansion
            nn.Conv2d(in_channels, hidden_dim, kernel_size=1),
            nn.BatchNorm2d(hidden_dim),
            nn.ReLU6(),
            # 3x3 depthwise convolution (groups=hidden_dim)
            nn.Conv2d(hidden_dim, hidden_dim, kernel_size=3, stride=stride,
                      padding=1, groups=hidden_dim),
            nn.BatchNorm2d(hidden_dim),
            nn.ReLU6(),
            # 1x1 projection, no activation
            nn.Conv2d(hidden_dim, out_channels, kernel_size=1),
            nn.BatchNorm2d(out_channels)
        )

    def forward(self, x):
        out = self.conv(x)
        if self.use_residual:
            out = out + x
        return out

class EfficientNet(nn.Module):
    def __init__(self, width_mult=1.0, depth_mult=1.0, num_classes=1000):
        super().__init__()
        base_channels = int(32 * width_mult)
        self.stem = nn.Sequential(
            nn.Conv2d(3, base_channels, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(base_channels),
            nn.ReLU6()
        )
        self.blocks = self._make_blocks(width_mult, depth_mult)
        head_channels = int(1280 * width_mult)
        self.head = nn.Sequential(
            nn.Conv2d(self._get_last_channels(width_mult), head_channels, kernel_size=1),
            nn.BatchNorm2d(head_channels),
            nn.ReLU6(),
            nn.AdaptiveAvgPool2d(1)
        )
        self.fc = nn.Linear(head_channels, num_classes)

    def _make_blocks(self, width_mult, depth_mult):
        blocks = []
        # (expansion, out_channels, repeats, kernel_size) per stage;
        # this simplified MBConv always uses a 3x3 depthwise kernel
        config = [
            (1, 16, 1, 3),
            (6, 24, 2, 3),
            (6, 40, 2, 5),
            (6, 80, 3, 3),
            (6, 112, 3, 5),
            (6, 192, 4, 5),
            (6, 320, 1, 3)
        ]
        in_channels = int(32 * width_mult)
        for exp_factor, out_channels, repeats, kernel_size in config:
            out_channels = int(out_channels * width_mult)
            repeats = max(1, int(repeats * depth_mult))  # never drop a stage entirely
            for i in range(repeats):
                # Downsample on the first block of a stage that changes channels
                stride = 2 if i == 0 and in_channels != out_channels else 1
                blocks.append(MBConv(in_channels, out_channels, exp_factor, stride))
                in_channels = out_channels
        return nn.Sequential(*blocks)

    def _get_last_channels(self, width_mult):
        return int(320 * width_mult)

    def forward(self, x):
        x = self.stem(x)
        x = self.blocks(x)
        x = self.head(x)
        x = x.view(x.size(0), -1)
        return self.fc(x)
```
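MBConv's depthwise convolution (groups=hidden_dim) is what keeps the parameter count low: a depthwise 3x3 filters each channel independently, so its cost scales with C rather than C². A rough parameter comparison for a single 3x3 layer on C channels (biases ignored; C=192 picked to match a later MBConv stage):

```python
def regular_conv_params(channels, k=3):
    # every output channel sees every input channel
    return channels * channels * k * k

def depthwise_conv_params(channels, k=3):
    # one k*k filter per channel
    return channels * k * k

c = 192
print(regular_conv_params(c))    # 331776
print(depthwise_conv_params(c))  # 1728, a 192x reduction
```

The savings ratio is exactly the channel count, which is why the reduction grows in the deeper, wider stages.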

3. Performance Comparison

3.1 CNN Model Comparison

| Model           | Params (M) | FLOPs (G) | Top-1 | Top-5 |
|-----------------|------------|-----------|-------|-------|
| VGG-16          | 138        | 15.5      | 71.5% | 90.1% |
| ResNet-50       | 25         | 4.1       | 76.1% | 92.8% |
| ResNet-152      | 60         | 11.6      | 78.5% | 94.1% |
| EfficientNet-B0 | 5.3        | 0.39      | 77.3% | 93.3% |
| EfficientNet-B7 | 66         | 37        | 84.4% | 97.1% |
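One way to read the comparison above is accuracy per million parameters. A quick calculation over three of the rows (numbers copied from the table) shows why EfficientNet-B0 is attractive for constrained deployments:

```python
models = {
    'VGG-16':          (138.0, 71.5),  # (params in M, Top-1 %)
    'ResNet-50':       (25.0, 76.1),
    'EfficientNet-B0': (5.3, 77.3),
}
for name, (params_m, top1) in models.items():
    print(f"{name}: {top1 / params_m:.2f} Top-1 points per M params")
# VGG-16 -> 0.52, ResNet-50 -> 3.04, EfficientNet-B0 -> 14.58
```

This metric is only a rough proxy, since accuracy does not scale linearly with parameters, but the order-of-magnitude gap is real.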

3.2 Effect of Model Scaling

| EfficientNet | Width | Depth | Resolution | Top-1 |
|--------------|-------|-------|------------|-------|
| B0           | 1.0   | 1.0   | 224        | 77.3% |
| B1           | 1.0   | 1.1   | 240        | 79.1% |
| B2           | 1.1   | 1.2   | 260        | 80.1% |
| B3           | 1.2   | 1.4   | 300        | 81.6% |
| B4           | 1.4   | 1.8   | 380        | 83.0% |
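The Width/Depth/Resolution columns follow EfficientNet's compound scaling rule: choose base coefficients α (depth), β (width), γ (resolution) under the constraint α·β²·γ² ≈ 2, then raise each to a shared exponent φ. Using the coefficients reported in the EfficientNet paper (α=1.2, β=1.1, γ=1.15), a sketch of the rule; note the published B1-B7 settings are hand-tuned and rounded, so the table above deviates slightly from the raw formula:

```python
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution coefficients

# Constraint: raising phi by 1 should roughly double FLOPs
print(round(ALPHA * BETA**2 * GAMMA**2, 2))  # 1.92, close to 2

def multipliers(phi):
    """Depth, width, resolution multipliers for compound coefficient phi."""
    return ALPHA**phi, BETA**phi, GAMMA**phi

d, w, r = multipliers(1)
print(round(d, 2), round(w, 2), round(224 * r))  # 1.2 1.1 258
```

FLOPs scale roughly linearly with depth and quadratically with width and resolution, which is where the β² and γ² in the constraint come from.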

3.3 Inference Speed Comparison

| Model           | Speed (imgs/s) | Memory (GB) |
|-----------------|----------------|-------------|
| ResNet-18       | 1000           | 1.2         |
| ResNet-50       | 600            | 2.0         |
| EfficientNet-B0 | 1500           | 0.8         |
| EfficientNet-B3 | 800            | 1.5         |

4. Best Practices

4.1 Choosing a CNN Model

```python
def select_cnn_model(task_type, constraints):
    # Small EfficientNet when speed matters, a large one when accuracy matters
    if constraints.get('speed', False):
        return EfficientNet(width_mult=1.0, depth_mult=1.0)
    elif constraints.get('accuracy', False):
        return EfficientNet(width_mult=1.8, depth_mult=2.6)
    else:
        # ResNet50 is assumed to be defined analogously to ResNet18/34 above
        return ResNet50()

class CNNFactory:
    @staticmethod
    def create(config):
        if config['type'] == 'resnet':
            return ResNet18(num_classes=config['num_classes'])
        elif config['type'] == 'efficientnet':
            return EfficientNet(
                width_mult=config.get('width_mult', 1.0),
                depth_mult=config.get('depth_mult', 1.0),
                num_classes=config['num_classes']
            )
```

4.2 CNN Training Loop

```python
class CNNTrainer:
    def __init__(self, model, optimizer, scheduler, loss_fn, device='cuda'):
        self.model = model.to(device)
        self.optimizer = optimizer
        self.scheduler = scheduler
        self.loss_fn = loss_fn
        self.device = device

    def train_step(self, inputs, targets):
        self.optimizer.zero_grad()
        inputs = inputs.to(self.device)
        targets = targets.to(self.device)
        outputs = self.model(inputs)
        loss = self.loss_fn(outputs, targets)
        loss.backward()
        self.optimizer.step()
        # Per-batch scheduler step; suits OneCycleLR or cosine-style schedules
        self.scheduler.step()
        return loss.item()

    def evaluate(self, dataloader):
        self.model.eval()
        correct = 0
        total = 0
        with torch.no_grad():
            for inputs, targets in dataloader:
                inputs = inputs.to(self.device)
                targets = targets.to(self.device)
                outputs = self.model(inputs)
                predictions = torch.argmax(outputs, dim=1)
                correct += (predictions == targets).sum().item()
                total += targets.size(0)
        return correct / total
```
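The trainer above steps the scheduler once per batch. To make the effect concrete, here is the cosine-annealing formula many such schedulers implement, lr_t = lr_min + ½(lr_max − lr_min)(1 + cos(πt/T)), sketched in plain Python; the values lr_max=0.1, lr_min=0.001, T=1000 are illustrative, not taken from the article:

```python
import math

def cosine_lr(step, total_steps, lr_max=0.1, lr_min=0.001):
    """Cosine annealing from lr_max down to lr_min over total_steps."""
    cos_term = math.cos(math.pi * step / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + cos_term)

print(cosine_lr(0, 1000))     # ~0.1 (starts at lr_max)
print(cosine_lr(500, 1000))   # ~0.0505 (midpoint)
print(cosine_lr(1000, 1000))  # ~0.001 (ends at lr_min)
```

The curve decays slowly at first and near the end, spending most of its steep descent in the middle of training.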

5. Summary

CNN architectures have evolved steadily, with each generation improving on the last:

  1. LeNet: the starting point of CNNs, laying the groundwork
  2. AlexNet: introduced ReLU and Dropout
  3. VGG: standardized on uniform 3x3 convolution kernels
  4. ResNet: residual connections to combat vanishing gradients
  5. EfficientNet: compound scaling strategy

Key takeaways from the comparison data:

  • EfficientNet-B7 reaches 84.4% Top-1 accuracy
  • Compound scaling is more effective than increasing depth or width alone
  • At equal parameter counts, EfficientNet is more accurate than ResNet
  • Choose the model scale that fits your resource budget