Step by Step: Converting DOTA Remote Sensing Annotations to COCO Format (with Complete Python Code)

news 2026/6/3 2:13:05

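Vehicle detection in remote sensing imagery is a core task in scenarios such as intelligent transportation and port monitoring. DOTA is one of the most influential benchmark datasets in remote sensing, but its annotation format differs substantially from the COCO format that general-purpose detection frameworks such as MMDetection and Detectron2 commonly expect. This article walks through the conversion logic between the two formats and provides a field-tested Python solution.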

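1. Why convert the annotation format?

DOTA uses oriented bounding boxes (OBB): each object is described by its four corner points, which captures the orientation and shape of objects in remote sensing imagery more precisely. COCO uses horizontal bounding boxes (HBB), stored as the top-left corner plus a width and a height. The two formats differ in three main ways:

Geometric representation:
- DOTA: (x1,y1,x2,y2,x3,y3,x4,y4), four (x, y) corner pairs
- COCO: [x_min,y_min,width,height] in absolute pixel coordinates (not normalized)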

Data structure:

```
# DOTA annotation example (one object per line)
"x1 y1 x2 y2 x3 y3 x4 y4 class_name difficulty"

# COCO annotation structure
{
  "images": [{"file_name": "1.jpg", "id": 1,...}],
  "annotations": [{"bbox": [x,y,w,h], "category_id": 1,...}],
  "categories": [{"id": 1, "name": "car"},...]
}
```

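Use cases:
- DOTA: tailored to aerial imagery, suited to oriented object detection
- COCO: general-purpose detection benchmark, supported natively by mainstream frameworks

Tip: with frameworks such as MMDetection or Detectron2 (for example their Faster R-CNN implementations), COCO-format annotations plug directly into most open-source code and data augmentation pipelines. YOLOv5 reads its own normalized txt label format, so it needs one further conversion step from COCO.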
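To make the geometric difference described above concrete, here is a minimal sketch that turns one DOTA-style quadrilateral into a COCO-style box in absolute pixels. The function name `obb_to_hbb` and the sample coordinates are invented for illustration and are not part of either dataset's tooling.

```python
def obb_to_hbb(points):
    """Return [x_min, y_min, width, height] of the axis-aligned box
    enclosing a flat list [x1, y1, x2, y2, x3, y3, x4, y4] (pixels)."""
    xs = points[0::2]  # every even index is an x coordinate
    ys = points[1::2]  # every odd index is a y coordinate
    x_min, y_min = min(xs), min(ys)
    return [x_min, y_min, max(xs) - x_min, max(ys) - y_min]


# Hypothetical rotated box, corners listed in order.
quad = [10.0, 20.0, 50.0, 10.0, 60.0, 40.0, 20.0, 50.0]
print(obb_to_hbb(quad))  # [10.0, 10.0, 50.0, 40.0]
```

The enclosing box is always at least as large as the rotated box, so some background is included; that loss of tightness is the price of moving from OBB to HBB.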

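2. Core conversion logic and code

2.1 Key steps

The conversion has to solve three core problems:

- Coordinate conversion: turn each rotated box into its enclosing horizontal rectangle
- Category mapping: match DOTA class names to COCO category IDs
- File restructuring: move from one annotation file per image to a single centralized JSON file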
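For the category mapping step, one possible way to build the name-to-ID dictionary is sketched below. The helper `build_class_mapping` is a hypothetical name, and the class list reflects the 15 DOTA v1.0 categories as I understand them; check the names against your own copy of the annotation files before relying on it.

```python
# The 15 DOTA v1.0 category names (assumed spelling; verify against your data).
DOTA_V1_CLASSES = [
    "plane", "baseball-diamond", "bridge", "ground-track-field",
    "small-vehicle", "large-vehicle", "ship", "tennis-court",
    "basketball-court", "storage-tank", "soccer-ball-field",
    "roundabout", "harbor", "swimming-pool", "helicopter",
]


def build_class_mapping(class_names, first_id=1):
    """Map each class name to a unique, consecutive integer ID."""
    return {name: first_id + i for i, name in enumerate(class_names)}


# All categories, or only the ones a vehicle/ship project cares about.
full_mapping = build_class_mapping(DOTA_V1_CLASSES)
vehicle_mapping = build_class_mapping(["small-vehicle", "large-vehicle", "ship"])
print(vehicle_mapping)
```

Generating the IDs from a list rather than typing them by hand rules out duplicate IDs, and any class left out of the list is simply dropped during conversion.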

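2.2 Complete conversion code

The code below implements the conversion end to end. It skips malformed lines, classes missing from the mapping, and images without an annotation file; visual verification is covered in section 3.3.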

```python
import json
import os

from PIL import Image


class DOTA2COCOConverter:
    def __init__(self, class_mapping=None):
        """
        :param class_mapping: custom dict mapping class names to category IDs
        """
        self.class_mapping = class_mapping or {
            'small-vehicle': 1,
            'large-vehicle': 2,
            'ship': 3
        }

    def _get_enclosing_bbox(self, points):
        """Convert a rotated box to its enclosing horizontal box [x, y, w, h]."""
        x_coords = points[::2]
        y_coords = points[1::2]
        x_min, x_max = min(x_coords), max(x_coords)
        y_min, y_max = min(y_coords), max(y_coords)
        return [x_min, y_min, x_max - x_min, y_max - y_min]

    def parse_dota_annotation(self, txt_path):
        """Parse a single DOTA annotation file."""
        annotations = []
        with open(txt_path, 'r') as f:
            for line in f:
                if line.strip() == '':
                    continue
                parts = line.strip().split()
                # Lines with fewer than 9 fields (e.g. metadata headers) are skipped.
                if len(parts) < 9:
                    continue
                points = list(map(float, parts[:8]))
                class_name = parts[8]
                # Parsed for completeness; not written to the output.
                difficulty = int(parts[9]) if len(parts) > 9 else 0
                if class_name not in self.class_mapping:
                    continue
                bbox = self._get_enclosing_bbox(points)
                annotation = {
                    "bbox": bbox,
                    "category_id": self.class_mapping[class_name],
                    "iscrowd": 0,
                    "area": bbox[2] * bbox[3]  # area of the horizontal box
                }
                annotations.append(annotation)
        return annotations

    def convert(self, img_dir, ann_dir, output_json):
        """Run the batch conversion."""
        coco_data = {
            "images": [],
            "annotations": [],
            "categories": [
                {"id": cat_id, "name": name}
                for name, cat_id in self.class_mapping.items()
            ]
        }
        annotation_id = 1
        # Sort so that image IDs are reproducible across runs.
        for img_name in sorted(os.listdir(img_dir)):
            if not img_name.lower().endswith(('.png', '.jpg', '.jpeg')):
                continue
            img_path = os.path.join(img_dir, img_name)
            base_name = os.path.splitext(img_name)[0]
            txt_path = os.path.join(ann_dir, base_name + '.txt')
            if not os.path.exists(txt_path):
                continue
            with Image.open(img_path) as img:
                width, height = img.size
            image_id = len(coco_data["images"]) + 1
            coco_data["images"].append({
                "id": image_id,
                "file_name": img_name,
                "width": width,
                "height": height
            })
            annotations = self.parse_dota_annotation(txt_path)
            for ann in annotations:
                ann.update({
                    "image_id": image_id,
                    "id": annotation_id
                })
                coco_data["annotations"].append(ann)
                annotation_id += 1
        with open(output_json, 'w') as f:
            json.dump(coco_data, f, indent=2)
```
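The converter above stores the area of the enclosing horizontal box and discards the original quadrilateral. If a downstream tool needs the polygon or a tighter area, one option is sketched below. This is an optional extension and an assumption on my part, not something the converter does: the helper names `polygon_area` and `add_polygon_fields` are hypothetical, the area is computed with the shoelace formula, and the polygon is stored in COCO's list-of-flat-lists `segmentation` layout.

```python
def polygon_area(points):
    """Area of a simple polygon given as a flat list [x1, y1, x2, y2, ...],
    computed with the shoelace formula."""
    xs = points[0::2]
    ys = points[1::2]
    n = len(xs)
    twice_area = 0.0
    for i in range(n):
        j = (i + 1) % n  # next vertex, wrapping around to the first
        twice_area += xs[i] * ys[j] - xs[j] * ys[i]
    return abs(twice_area) / 2.0


def add_polygon_fields(annotation, points):
    """Return a copy of `annotation` that also carries the original
    quadrilateral as a COCO polygon and uses the polygon's area."""
    enriched = dict(annotation)
    enriched["segmentation"] = [list(points)]
    enriched["area"] = polygon_area(points)
    return enriched


# Hypothetical example: a 2x2 axis-aligned square.
square = [0.0, 0.0, 2.0, 0.0, 2.0, 2.0, 0.0, 2.0]
print(add_polygon_fields({"bbox": [0.0, 0.0, 2.0, 2.0]}, square))
```

For strongly rotated, elongated objects such as ships, the polygon area can be much smaller than the horizontal box area, which matters if a framework buckets objects into small, medium, and large by `area`.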

3. Practical use and verification

3.1 Typical directory layout

The following directory layout is recommended:

```
DOTA_dataset/
├── images/
│   ├── 0001.png
│   └── 0002.png
└── annotations/
    ├── 0001.txt
    └── 0002.txt
```

3.2 Running the conversion

```python
converter = DOTA2COCOConverter()
converter.convert(
    img_dir='DOTA_dataset/images',
    ann_dir='DOTA_dataset/annotations',
    output_json='coco_annotations.json'
)
```

3.3 Verifying the result

Use the COCO API to check the quality of the conversion:

```python
import os

import matplotlib.pyplot as plt
from PIL import Image
from pycocotools.coco import COCO

coco = COCO('coco_annotations.json')
img_ids = coco.getImgIds()
img_info = coco.loadImgs(img_ids[0])[0]

# Show the first image, then overlay its converted boxes.
plt.imshow(Image.open(os.path.join('DOTA_dataset/images', img_info['file_name'])))
ann_ids = coco.getAnnIds(imgIds=img_info['id'])
annotations = coco.loadAnns(ann_ids)
for ann in annotations:
    bbox = ann['bbox']  # [x_min, y_min, width, height] in pixels
    plt.gca().add_patch(plt.Rectangle(
        (bbox[0], bbox[1]), bbox[2], bbox[3],
        fill=False, edgecolor='red', linewidth=2
    ))
plt.show()
```
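Visual inspection covers only a few images, so it can help to pair it with an automated pass over the whole file. The sketch below is one possible set of checks and uses only the standard library; `check_coco` is a hypothetical helper name, and the rules (unique IDs, known image and category references, positive box size, box inside the image) are my own choice rather than anything required by the COCO API.

```python
import json


def check_coco(coco):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    images = {img["id"]: img for img in coco["images"]}
    category_ids = {cat["id"] for cat in coco["categories"]}
    seen_ids = set()
    for ann in coco["annotations"]:
        ann_id = ann["id"]
        if ann_id in seen_ids:
            problems.append(f"annotation {ann_id}: duplicate id")
        seen_ids.add(ann_id)
        img = images.get(ann["image_id"])
        if img is None:
            problems.append(f"annotation {ann_id}: unknown image_id")
            continue
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann_id}: unknown category_id")
        x, y, w, h = ann["bbox"]
        if w <= 0 or h <= 0:
            problems.append(f"annotation {ann_id}: non-positive box size")
        if x < 0 or y < 0 or x + w > img["width"] or y + h > img["height"]:
            problems.append(f"annotation {ann_id}: box outside image")
    return problems


# Usage, assuming the file name from section 3.2:
# with open("coco_annotations.json") as f:
#     for problem in check_coco(json.load(f)):
#         print(problem)
```

A box reported as outside the image is not necessarily a conversion bug; corner points near the image border can legitimately fall slightly outside it, in which case clipping the box may be the appropriate response.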

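4. Common problems and optimization tips

4.1 Troubleshooting typical errors

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| Output JSON contains no images or annotations | Wrong paths or class names that do not match | Check that the paths exist and that class_mapping covers every class you need |
| Boxes are offset | Coordinate normalization | Use the original pixel coordinates and do not normalize |
| Category IDs are mixed up | Duplicate entries in the category mapping | Make sure every class in class_mapping has a unique ID |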
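When the output turns out empty or unexpectedly small, a quick way to tell a path problem from a class-name mismatch is to count which class names actually occur in the annotation files. The sketch below assumes the same nine-or-more-field line layout used throughout this article; `count_class_names` and `unmapped_classes` are hypothetical helper names.

```python
import os
from collections import Counter


def count_class_names(ann_dir):
    """Count how often each class name occurs across all .txt files in ann_dir."""
    counts = Counter()
    for name in sorted(os.listdir(ann_dir)):
        if not name.endswith(".txt"):
            continue
        with open(os.path.join(ann_dir, name), "r", encoding="utf-8") as f:
            for line in f:
                parts = line.split()
                if len(parts) >= 9:  # 8 coordinates + class name (+ difficulty)
                    counts[parts[8]] += 1
    return counts


def unmapped_classes(ann_dir, class_mapping):
    """Return {class_name: count} for classes present in the data
    but absent from class_mapping (these are silently dropped)."""
    counts = count_class_names(ann_dir)
    return {n: c for n, c in counts.items() if n not in class_mapping}


# Usage, assuming the directory layout from section 3.1:
# print(unmapped_classes("DOTA_dataset/annotations",
#                        {"small-vehicle": 1, "large-vehicle": 2, "ship": 3}))
```

If the counter comes back empty, the annotation directory or file extension is wrong; if it is populated but every name is unmapped, the spelling in class_mapping is the likely culprit.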

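4.2 Performance tips

Parallel processing: use multiple processes to speed up large datasets

```python
from multiprocessing import Pool


def process_image(args):
    img_path, ann_path = args
    # processing logic...


if __name__ == "__main__":
    file_pairs = []  # (img_path, ann_path) tuples, to be filled in by the caller
    with Pool(4) as p:
        p.map(process_image, file_pairs)
```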
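One detail the parallel pattern has to get right is ID assignment: if workers hand out annotation IDs themselves, IDs can collide. A possible arrangement, sketched here under my own assumptions, is to let workers only parse files and have the parent process assign IDs afterwards. All function names below are hypothetical, and the parsing mirrors the converter from section 2.2 in simplified form.

```python
from multiprocessing import Pool


def parse_file(txt_path, class_mapping):
    """Parse one DOTA txt file into COCO-style annotations without IDs."""
    anns = []
    with open(txt_path, "r", encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            if len(parts) < 9 or parts[8] not in class_mapping:
                continue
            pts = [float(v) for v in parts[:8]]
            xs, ys = pts[0::2], pts[1::2]
            w, h = max(xs) - min(xs), max(ys) - min(ys)
            anns.append({
                "bbox": [min(xs), min(ys), w, h],
                "category_id": class_mapping[parts[8]],
                "iscrowd": 0,
                "area": w * h,
            })
    return anns


def _worker(args):
    # Pool.map passes a single argument, so unpack the tuple here.
    txt_path, class_mapping = args
    return parse_file(txt_path, class_mapping)


def merge_results(per_file_annotations):
    """Assign image_id and annotation id sequentially in the parent process."""
    merged = []
    next_id = 1
    for image_id, anns in enumerate(per_file_annotations, start=1):
        for ann in anns:
            merged.append({**ann, "image_id": image_id, "id": next_id})
            next_id += 1
    return merged


def parse_parallel(txt_paths, class_mapping, workers=4):
    """Parse files in parallel; Pool.map returns results in input order."""
    with Pool(workers) as pool:
        results = pool.map(_worker, [(p, class_mapping) for p in txt_paths])
    return merge_results(results)
```

Because `Pool.map` preserves input order, the resulting IDs are the same as in a sequential run over the same file list. On platforms that start workers by spawning a fresh interpreter, call `parse_parallel` from under an `if __name__ == "__main__":` guard.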

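Incremental writing: avoid running out of memory when processing very large datasets (the snippet is schematic only)

```python
with open(output_json, 'w') as f:
    f.write('{"images": [], "annotations": [], "categories": [...]}\n')
    # append data in batches
```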
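Appending to a JSON document after its closing brace does not produce valid JSON, so incremental writing needs a little more structure than the schematic suggests. One possible approach, sketched below, streams the `images` array straight to the output while spooling annotations to a temporary file, then stitches the two together. The function name `write_coco_streaming` and the `(image, annotations)` record shape are assumptions for this example.

```python
import json
import tempfile


def write_coco_streaming(output_json, categories, records):
    """Write a COCO JSON file without holding all annotations in memory.

    `records` is any iterable yielding (image_dict, [annotation_dict, ...]).
    """
    with tempfile.TemporaryFile("w+", encoding="utf-8") as tmp, \
            open(output_json, "w", encoding="utf-8") as out:
        out.write('{"categories": ' + json.dumps(categories) + ', "images": [')
        first_image = True
        for image, annotations in records:
            if not first_image:
                out.write(",")
            out.write(json.dumps(image))
            first_image = False
            # Spool annotations to disk, one JSON object per line.
            for ann in annotations:
                tmp.write(json.dumps(ann) + "\n")
        out.write('], "annotations": [')
        tmp.seek(0)  # rewind the spool file and copy it into the array
        first_ann = True
        for line in tmp:
            if not first_ann:
                out.write(",")
            out.write(line.strip())
            first_ann = False
        out.write("]}")
```

Passing a generator as `records` keeps peak memory close to the size of a single image's annotations. This only helps the writer: tools such as pycocotools still load the whole file when reading it back.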

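Visual verification: during development, manually review about 10% of the samples

In real vehicle detection projects, this conversion usually needs to run only once. Archive the converted COCO file together with the original data and record the conversion parameters in the README so the result can be reproduced later.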

