Building YOLOv5's C3 Module from Scratch: Reproducing the Core Components in PyTorch and Running a Weather-Classification Demo
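In computer vision, the YOLO family of algorithms has attracted wide attention for its real-time detection performance. YOLOv5, one of the most widely used members of the family, strikes a strong balance between accuracy and speed through its network design. This article digs into one of YOLOv5's core building blocks, the C3 module: we implement it from scratch in PyTorch and then build a complete weather-classification model around it. Rather than simply calling a ready-made framework, we start from a bare nn.Module and assemble the network piece by piece, so that you understand how each module is designed.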
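1. Environment setup and basic building blocks
1.1 Configuring the PyTorch environment
Make sure a recent version of PyTorch (≥ 1.8.0) and torchvision are installed. Creating an isolated environment with conda is recommended:

```bash
conda create -n yolov5 python=3.8
conda activate yolov5
pip install torch torchvision torchaudio
```

1.2 Implementing the auto-padding function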
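Keeping the feature-map size unchanged is a common requirement in convolutional networks. We start with a small helper that picks the padding automatically:

```python
def autopad(kernel_size, padding=None):
    """Pick the padding that keeps input and output sizes equal at stride 1."""
    if padding is None:
        # k // 2 preserves the spatial size exactly for odd kernels at stride 1;
        # an even kernel cannot be padded symmetrically to keep the size.
        padding = (kernel_size // 2 if isinstance(kernel_size, int)
                   else [k // 2 for k in kernel_size])
    return padding
```

1.3 Basic convolution module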
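Build a composite module that combines convolution, batch normalization, and an activation function:

```python
import torch
import torch.nn as nn


class Conv(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=1, stride=1,
                 padding=None, activation=True, groups=1):
        super().__init__()
        # bias=False because the following BatchNorm has its own shift term
        self.conv = nn.Conv2d(
            in_channels, out_channels, kernel_size, stride,
            autopad(kernel_size, padding), groups=groups, bias=False
        )
        self.bn = nn.BatchNorm2d(out_channels)
        self.act = nn.SiLU() if activation else nn.Identity()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))
```

Tip: groups=1 gives a standard convolution. groups=in_channels gives a depthwise convolution, which becomes a depthwise separable convolution only when it is followed by a 1×1 pointwise convolution.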
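The effect of groups is easiest to see in the weight count. The sketch below is plain arithmetic rather than a PyTorch call: `conv_weight_count` is a hypothetical helper introduced here for illustration, and it assumes a square kernel and bias=False, as in the Conv module above. BatchNorm parameters are not counted.

```python
def conv_weight_count(in_channels, out_channels, kernel_size, groups=1):
    """Number of weights in a bias-free 2D convolution with a square kernel.

    Hypothetical helper for illustration; not part of PyTorch.
    """
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    # Each output channel only sees in_channels // groups input channels.
    return out_channels * (in_channels // groups) * kernel_size * kernel_size


standard = conv_weight_count(64, 64, 3, groups=1)    # ordinary 3x3 convolution
depthwise = conv_weight_count(64, 64, 3, groups=64)  # one 3x3 filter per channel
pointwise = conv_weight_count(64, 64, 1, groups=1)   # 1x1 channel mixing

print(standard)               # 36864
print(depthwise)              # 576
print(depthwise + pointwise)  # 4672, the depthwise separable total
```

For this 64-channel layer the depthwise separable pair needs roughly an eighth of the weights of the standard convolution, which is why grouped convolutions are a common choice for lightweight variants.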
2. Implementing the core components
2.1 Bottleneck module
Bottleneck is the basic unit of C3. It supports two modes: with a residual shortcut (only when the input and output channel counts match), or as a plain stack of two convolutions:

```python
class Bottleneck(nn.Module):
    def __init__(self, in_channels, out_channels, expansion=0.5,
                 shortcut=True, groups=1):
        super().__init__()
        hidden_channels = int(out_channels * expansion)
        self.conv1 = Conv(in_channels, hidden_channels, 1, 1)
        self.conv2 = Conv(hidden_channels, out_channels, 3, 1, groups=groups)
        # The residual add is only valid when the shapes match
        self.use_shortcut = shortcut and in_channels == out_channels

    def forward(self, x):
        identity = x
        out = self.conv2(self.conv1(x))
        return out + identity if self.use_shortcut else out
```

2.2 The C3 module in detail
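The C3 module splits the input into two branches: one passes through a 1×1 convolution followed by a stack of Bottlenecks, the other through a single 1×1 convolution. The two are concatenated along the channel dimension and fused by a final 1×1 convolution, which combines features with different receptive fields:

```python
class C3(nn.Module):
    def __init__(self, in_channels, out_channels, num_bottlenecks=1,
                 shortcut=True, groups=1, expansion=0.5):
        super().__init__()
        hidden_channels = int(out_channels * expansion)
        self.cv1 = Conv(in_channels, hidden_channels, 1, 1)
        self.cv2 = Conv(in_channels, hidden_channels, 1, 1)
        self.m = nn.Sequential(
            *[Bottleneck(hidden_channels, hidden_channels, expansion=1,
                         shortcut=shortcut, groups=groups)
              for _ in range(num_bottlenecks)]
        )
        self.cv3 = Conv(2 * hidden_channels, out_channels, 1, 1)

    def forward(self, x):
        branch1 = self.m(self.cv1(x))   # deeper branch
        branch2 = self.cv2(x)           # shallow branch
        return self.cv3(torch.cat((branch1, branch2), dim=1))
```

Module structure comparison: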
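| Component | Input channels | Output channels | Core operation |
|---|---|---|---|
| Conv (cv1, cv2) | c1 (in_channels) | c_ (hidden_channels) | 1×1 convolution |
| Bottleneck | c_ | c_ | 1×1 → 3×3 convolution |
| C3 | c1 | c2 (out_channels) | Two-branch feature fusion |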
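The channel bookkeeping in the table can be traced without building any layers. The following is a minimal sketch in plain Python; `c3_channel_plan` is a hypothetical helper introduced here, and it only mirrors the arithmetic of the C3 constructor above (hidden width = int(out_channels * expansion)).

```python
def c3_channel_plan(in_channels, out_channels, expansion=0.5):
    """Return the (input, output) channel widths of each stage of a C3 block.

    Hypothetical helper for illustration; it mirrors the C3 constructor's arithmetic.
    """
    hidden = int(out_channels * expansion)
    return {
        "cv1": (in_channels, hidden),        # deeper branch entry, 1x1 conv
        "bottlenecks": (hidden, hidden),     # every Bottleneck keeps the width
        "cv2": (in_channels, hidden),        # shallow branch, 1x1 conv
        "concat": (2 * hidden, 2 * hidden),  # two hidden-wide tensors joined on dim=1
        "cv3": (2 * hidden, out_channels),   # fusion, 1x1 conv
    }


plan = c3_channel_plan(128, 256)
for stage, (c_in, c_out) in plan.items():
    print(f"{stage:12s} {c_in:4d} -> {c_out:4d}")
```

With the default expansion of 0.5 the concatenated width equals out_channels, so cv3 mixes channels without changing their number.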
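3. Network integration and weather-classification practice
3.1 Preparing the dataset
We use a weather-classification dataset with four classes (sunny, rainy, snowy, cloudy) and split it 8:2 into training and test sets:

```python
from torchvision import transforms, datasets
from torch.utils.data import DataLoader, random_split

transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])

dataset = datasets.ImageFolder('weather_dataset/', transform=transform)
# Integer lengths work on every PyTorch version; fractional lengths such as
# [0.8, 0.2] are only accepted by random_split from PyTorch 1.13 onward.
train_size = int(0.8 * len(dataset))
train_set, test_set = random_split(dataset, [train_size, len(dataset) - train_size])
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
test_loader = DataLoader(test_set, batch_size=32)
```

3.2 Network architecture design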
Build a complete classification network that contains C3 modules:

```python
class WeatherClassifier(nn.Module):
    def __init__(self, num_classes=4):
        super().__init__()
        # Shape comments are (batch, channels, height, width)
        # for a batch of 32 images at 224x224.
        self.backbone = nn.Sequential(
            Conv(3, 32, 3, 2),                 # [32, 32, 112, 112]
            C3(32, 64, num_bottlenecks=1),     # [32, 64, 112, 112]
            Conv(64, 128, 3, 2),               # [32, 128, 56, 56]
            C3(128, 256, num_bottlenecks=2)    # [32, 256, 56, 56]
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(256, num_classes)
        )

    def forward(self, x):
        features = self.backbone(x)
        return self.head(features)
```

3.3 Training and evaluation
Implement the full training and evaluation loop:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()


def train(model, device, train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % 10 == 0:
            print(f'Train Epoch: {epoch} [{batch_idx}/{len(train_loader)}] '
                  f'Loss: {loss.item():.4f}')


def test(model, device, test_loader):
    model.eval()
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            pred = output.argmax(dim=1, keepdim=True)
            correct += pred.eq(target.view_as(pred)).sum().item()
    accuracy = 100. * correct / len(test_loader.dataset)
    print(f'Test Accuracy: {accuracy:.2f}%')
    return accuracy


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = WeatherClassifier().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

for epoch in range(1, 11):
    train(model, device, train_loader, optimizer, epoch)
    test(model, device, test_loader)
```

4. Performance optimization tips
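4.1 Model compression strategy
Shrinking the C3 module, here by fixing it to a single Bottleneck with a narrower hidden layer, trades some accuracy for efficiency:

```python
# Lightweight configuration
class LiteC3(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        hidden_channels = out_channels // 2
        self.cv1 = Conv(in_channels, hidden_channels, 1)
        self.cv2 = Conv(in_channels, hidden_channels, 1)
        self.m = nn.Sequential(
            *[Bottleneck(hidden_channels, hidden_channels, expansion=0.5)
              for _ in range(1)]
        )
        self.cv3 = Conv(2 * hidden_channels, out_channels, 1)

    def forward(self, x):
        # Same two-branch fusion as the full C3 module
        return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), dim=1))
```

4.2 Mixed-precision training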
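NVIDIA's Apex library can speed up training through mixed precision. Note that apex.amp is deprecated; PyTorch's built-in torch.cuda.amp (available since PyTorch 1.6) provides the same functionality without an extra install. The Apex version looks like this:

```python
from apex import amp  # requires NVIDIA Apex, installed separately

model = WeatherClassifier().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

# Inside the training loop, replace loss.backward() with:
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()
```

4.3 Improved data augmentation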
Add a richer set of data augmentations for the training set, and keep the deterministic Resize/CenterCrop pipeline for evaluation:

```python
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    transforms.RandomRotation(15),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])
```
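Because random_split returns views of a single underlying dataset, a transform passed to ImageFolder applies to both splits. One possible way to give each split its own pipeline is a thin wrapper, sketched here in plain Python. `TransformedSubset` is a hypothetical helper, not part of torch or torchvision, and the toy list below merely stands in for an ImageFolder created without a transform.

```python
class TransformedSubset:
    """Map-style dataset view: selected indices of `base`, with its own transform.

    Hypothetical helper for illustration; not part of torch or torchvision.
    """

    def __init__(self, base, indices, transform=None):
        self.base = base              # any object where base[i] -> (sample, label)
        self.indices = list(indices)  # positions in `base` that belong to this split
        self.transform = transform

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, i):
        sample, label = self.base[self.indices[i]]
        if self.transform is not None:
            sample = self.transform(sample)
        return sample, label


# Toy stand-in for an untransformed dataset: (sample, label) pairs.
base = [(x, x % 4) for x in range(10)]
train_view = TransformedSubset(base, range(0, 8), transform=lambda x: x * 10)
test_view = TransformedSubset(base, range(8, 10), transform=lambda x: x + 0.5)

print(len(train_view), len(test_view))  # 8 2
print(train_view[3], test_view[0])      # (30, 3) (8.5, 0)
```

With the real data, `base` would be `datasets.ImageFolder('weather_dataset/')` without a transform, the index lists could be taken from the `.indices` attribute of the subsets returned by random_split, and train_transform would be passed only to the training view. Since the wrapper implements `__len__` and `__getitem__`, it should be usable with DataLoader like any map-style dataset.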