From Labelme to DOTA: A Step-by-Step Guide to Converting a Custom Dataset for Rotated Object Detection in Remote Sensing Images

Published: 2026/7/25 10:56:48

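In remote sensing image analysis, rotated object detection (OBB detection) is becoming a key technique for handling non-horizontal targets in aerial and satellite imagery. Unlike conventional horizontal-box detection, rotated boxes capture targets at arbitrary orientations more precisely, which makes them especially suitable for strongly directional objects such as airplanes, ships, and vehicles. This article walks through converting rotated-box annotations made in Labelme into the standard DOTA format and integrating them into the MMDetection framework for training.

1. Rotated Object Detection Basics and Tool Selection

The core idea of rotated object detection is to replace the traditional horizontal rectangle with an oriented bounding box (OBB). The box is described either by its four vertex coordinates (x1, y1, x2, y2, x3, y3, x4, y4) or by a center point plus a rotation angle, and in both cases it encloses tilted targets more tightly.

Comparison of mainstream tools:

| Tool | Annotation type | Output format | Typical use |
|---|---|---|---|
| Labelme | Polygon / rotated box | JSON | Small-scale custom annotation |
| CVAT | Rotated box | XML / COCO | Collaborative team annotation |
| Official DOTA tool | Rotated box | DOTA-format TXT | Professional remote sensing annotation |

For users who already have Labelme annotations, converting to DOTA format makes full use of the existing ecosystem:

- Use the image splitting and evaluation tools in DOTA_devkit directly
- Stay compatible with mainstream rotated detectors such as S2ANet and ReDet
- Reuse the DOTA data processing pipeline in MMDetection

Tip: Labelme's rotated boxes are actually drawn as polygons. During conversion, make sure the vertices are in clockwise order, as the DOTA standard requires.

2. Complete Labelme-to-DOTA Conversion Workflow

2.1 Parsing and Cleaning the Data Structure

A Labelme JSON annotation file contains the following key fields:

```json
{
  "version": "4.5.6",
  "flags": {},
  "shapes": [
    {
      "label": "airplane",
      "points": [[x1, y1], [x2, y2], [x3, y3], [x4, y4]],
      "shape_type": "polygon"
    }
  ],
  "imagePath": "image_001.jpg",
  "imageData": null
}
```

The conversion script has to handle the following core logic:

- Verify that each polygon is a quadrilateral (the basic requirement for a rotated box)
- Check that the vertices are in clockwise order
- Filter out invalid annotations (for example, targets whose area is too small)
- Map class names (for example, unify plane to airplane)

2.2 Core Coordinate Conversion Code

The following Python script demonstrates the key conversion steps:

```python
import json
import os

import numpy as np


def is_clockwise(points):
    """Return True if the quadrilateral's vertices run clockwise on screen.

    Uses the cross product of the first two edges, which is sufficient for a
    convex quadrilateral. In image coordinates (y axis pointing down), a
    positive cross product means clockwise.
    """
    edge1 = points[1] - points[0]
    edge2 = points[2] - points[1]
    return edge1[0] * edge2[1] - edge1[1] * edge2[0] > 0


def labelme_to_dota(json_path, output_dir):
    with open(json_path, encoding="utf-8") as f:
        data = json.load(f)

    # Name the label file after the image: image_001.jpg -> image_001.txt
    stem = os.path.splitext(os.path.basename(data["imagePath"]))[0]
    os.makedirs(output_dir, exist_ok=True)
    output_path = os.path.join(output_dir, stem + ".txt")

    with open(output_path, "w") as f_out:
        for shape in data["shapes"]:
            # Only four-point polygons can be written as rotated boxes
            if shape["shape_type"] != "polygon" or len(shape["points"]) != 4:
                continue
            points = np.array(shape["points"], dtype=float)
            if not is_clockwise(points):
                points = points[::-1]  # reverse to clockwise order
            # DOTA line: x1 y1 x2 y2 x3 y3 x4 y4 category difficult
            line = " ".join(f"{x:.1f} {y:.1f}" for x, y in points)
            line += f" {shape['label']} 0\n"  # difficult is set to 0
            f_out.write(line)
```

2.3 Batch Processing and Validation

Real projects involve hundreds or thousands of images, so the following optimizations are recommended:

- Process files in parallel with multiprocessing.Pool
- Log which files succeeded and which failed
- Generate a visualization script to spot-check the converted labels

Images can then be split with DOTA_devkit. The flags below may need adapting to your copy of the devkit, since the upstream ImgSplit.py is normally driven from Python rather than from the command line:

```bash
python DOTA_devkit/ImgSplit.py --base_json label_config.json --src_img_dir ./images --dst_img_dir ./split_images
```

Solutions to common problems:

- Wrong vertex order: use OpenCV's cv2.convexHull to produce a consistent ordering
- Coordinates out of bounds: use np.clip to limit them to the image size
- Inconsistent class names: build a mapping table to enforce one naming convention

3. Training and Deployment with MMDetection

3.1 Environment Setup Best Practices

Docker is recommended to keep the environment consistent:

```dockerfile
FROM nvidia/cuda:11.1-base
RUN apt-get update && apt-get install -y python3.8 python3-pip git
RUN pip3 install torch==1.9.0+cu111 torchvision==0.10.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html
RUN pip3 install mmcv-full==1.4.0 -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.9.0/index.html
RUN git clone https://github.com/open-mmlab/mmdetection.git && cd mmdetection && pip3 install -r requirements/build.txt
```

Verified version combinations:

- CUDA 11.1 + PyTorch 1.9 + MMDetection 2.20
- Python 3.8 + MMCV 1.4.0

3.2 Key Configuration Parameters

Taking S2ANet as an example, modify the following in configs/s2anet/s2anet_r50_fpn_1x_dota.py. Note that DOTADataset and the rotated transforms used in the training pipeline (RResize, RRandomFlip) are not part of stock MMDetection; they are provided by rotated-detection codebases built on top of it, such as MMRotate, so install the codebase your config belongs to.

```python
dataset_type = 'DOTADataset'
data = dict(
    samples_per_gpu=4,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file='data/dota/train/labelTxt/',
        img_prefix='data/dota/train/images/'),
    val=dict(
        type=dataset_type,
        ann_file='data/dota/val/labelTxt/',
        img_prefix='data/dota/val/images/'),
    test=dict(
        type=dataset_type,
        ann_file='data/dota/test/labelTxt/',
        img_prefix='data/dota/test/images/'))

model = dict(
    bbox_head=dict(
        num_classes=15))  # change to match your actual number of classes
```

3.3 Training Optimization Tips

Improvements tailored to the characteristics of remote sensing images:

Multi-scale training

```python
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53],
    std=[58.395, 57.12, 57.375],
    to_rgb=True)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='RResize', img_scale=(1024, 1024)),
    dict(type='RRandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]
```

Custom data augmentation

- Random rotation (0 to 360 degrees)
- Color jitter (to cover different lighting conditions)
- Random cropping (while keeping targets intact)

Learning rate schedule

```python
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=1.0 / 3,
    step=[8, 11])
```

4. Advanced Optimization and Deployment in Practice

4.1 Strategies for Large Images

Typical ways to process a 4096×4096 remote sensing image. The tile counts below assume 1024×1024 tiles.

Comparison of tiling strategies:

| Method | Overlap (pixels) | Number of tiles | Advantage | Drawback |
|---|---|---|---|---|
| Non-overlapping tiles | 0 | 16 | Simple to process | Targets are badly cut at tile borders |
| 50% overlapping tiles | 512 | 49 | Targets stay intact | More data to process |
| Sliding window (stride 256) | 768 | 169 | Highest detection accuracy | Very high compute cost |
| Multi-scale pyramid | Variable | 100+ | Adapts to targets of different scales | Complex to implement |

Recommended implementation:

```python
from DOTA_devkit.ImgSplit import splitbase

# Upstream DOTA_devkit exposes the splitter as the splitbase class; adapt the
# call if your fork differs. The source directory is expected to contain
# images/ and labelTxt/ subdirectories.
split = splitbase(
    'images/original',  # source directory
    'images/split',     # output directory
    gap=200,            # overlap in pixels
    subsize=1024        # tile size
)
split.splitdata(1)  # rate 1 keeps the original resolution
```

4.2 Speeding Up Model Inference

TensorRT deployment:

```bash
python tools/deployment/pytorch2onnx.py configs/s2anet/s2anet_r50_fpn_1x_dota.py checkpoints/s2anet.pth --output-file s2anet.onnx
trtexec --onnx=s2anet.onnx --saveEngine=s2anet.engine --fp16
```

Batch inference:

```python
# Modify the test pipeline in the config
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1024, 1024),
        flip=False,
        transforms=[
            dict(type='RResize'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img'])
        ])
]
```

Result post-processing:

- Use NMS (non-maximum suppression) to merge overlapping boxes
- Fuse multi-scale detection results with a weighted average
- Filter out low-confidence detections (score_thr=0.3)

4.3 Lessons from Real Projects

Across several remote sensing detection projects, the following practices proved the most effective:

- For high-resolution images, use two-stage detection: locate candidate regions quickly on a downsampled image, then run fine-grained detection on each ROI
- For small targets (<20 pixels), add feature pyramid levels (P6/P7 of the FPN)
- Reduce color perturbation in data augmentation to preserve the true spectral characteristics of remote sensing images
- At test time, use model ensembling (for example S2ANet + ReDet) to improve robustness

A typical directory layout for DOTA data:

```
dota_dataset/
├── train/
│   ├── images/
│   └── labelTxt/
├── val/
│   ├── images/
│   └── labelTxt/
└── test/
    ├── images/
    └── labelTxt/
```

For final deployment, it is advisable to wrap preprocessing (image tiling) and post-processing (result merging) into the service as well, so that it forms an end-to-end solution. For remote sensing data that keeps being updated, you can build an automated training pipeline that periodically fine-tunes the model on new data.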
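Wrapping tiling and result merging into one service comes down to two pieces of bookkeeping: where each tile starts, and how a tile-local polygon maps back to full-image coordinates. The following is a minimal sketch of that bookkeeping, assuming square tiles and a fixed overlap. The function names `tile_starts`, `tile_origins`, and `to_global` are hypothetical helpers written for this illustration and are not part of DOTA_devkit or MMDetection.

```python
from typing import List, Tuple


def tile_starts(length: int, tile: int, gap: int) -> List[int]:
    """Start offsets along one axis for tiles of size `tile` overlapping by `gap`."""
    if gap >= tile:
        raise ValueError("gap must be smaller than the tile size")
    if length <= tile:
        return [0]
    stride = tile - gap
    starts = list(range(0, length - tile + 1, stride))
    # Add a final tile flush with the border if the stride does not reach it.
    if starts[-1] + tile < length:
        starts.append(length - tile)
    return starts


def tile_origins(width: int, height: int, tile: int = 1024, gap: int = 200) -> List[Tuple[int, int]]:
    """Top-left (x, y) corner of every tile, in row-major order."""
    return [(x, y)
            for y in tile_starts(height, tile, gap)
            for x in tile_starts(width, tile, gap)]


def to_global(polygon: List[float], origin: Tuple[int, int]) -> List[float]:
    """Shift a tile-local polygon (x1, y1, ..., x4, y4) into full-image coordinates."""
    ox, oy = origin
    # Even indices are x values, odd indices are y values.
    return [v + (ox if i % 2 == 0 else oy) for i, v in enumerate(polygon)]


# Tile counts for a 4096x4096 image with 1024-pixel tiles
for gap in (0, 512, 768):
    print(gap, len(tile_origins(4096, 4096, tile=1024, gap=gap)))
# prints: 0 16, then 512 49, then 768 169
```

After every tile's detections have been shifted with `to_global`, targets that fall in an overlap zone appear more than once, which is why the merged result still has to go through NMS before it is returned.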