Step-by-Step Tutorial: Converting the OPIXray/HIXray Datasets to YOLO Format with One Python Script (Full Code Included)

Hands-On from Scratch: Automating the Conversion of X-ray Security Datasets to YOLO Format with a Python Script

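In computer vision, object detection on X-ray security images is both challenging and practically valuable. For researchers and students new to the field, the first step is usually not model training but data preparation: converting the raw dataset into a format that an object detection framework such as YOLO can consume. This article walks you step by step through a Python script that automates the format conversion for the OPIXray and HIXray datasets; even programming beginners should be able to follow along.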

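1. Environment Setup and Getting the Data

Before converting anything, we need some basic preparation. First make sure Python (3.7 or later recommended) and the required libraries are installed on your machine: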

pip install opencv-python numpy

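OPIXray and HIXray are two public X-ray security image datasets, each focused on detecting a different kind of item:

OPIXray: contains 5 classes of knives and cutting tools (straight knives, folding knives, scissors, and so on)
HIXray: contains 8 classes of electronics and everyday items (mobile phones, laptops, power banks, and so on)

You can get the datasets from the following official links:

OPIXray: [GitHub repository link]
HIXray: [GitHub repository link]

After downloading, we suggest organizing your folders like this: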

X-Ray/
├── imgs/          # original images
├── labels/        # original annotation files
└── new_labels/    # converted YOLO-format annotations

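2. Understanding the Format Differences

Before writing any code, we need to be clear about how the original format differs from the target format. OPIXray and HIXray typically use a VOC-like annotation scheme with corner coordinates in pixels. The annotations for each image are stored in a text file, one object per line, in the form:

image_name class x1 y1 x2 y2

YOLO format instead expects:

class_index x_center y_center width height

Here the center coordinates and the box size are normalized by the image width and height, so every value lies between 0 and 1. The conversion changes the numeric representation and also drops the image name from each line, which is the layout YOLO-family training pipelines expect.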
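To make the arithmetic concrete, here is a minimal worked example. The image size (400 × 500), the box coordinates, and the class index 0 are made-up values chosen for illustration; they are not taken from either dataset.

```python
# Hypothetical values for illustration only.
img_w, img_h = 400, 500               # image width and height in pixels
x1, y1, x2, y2 = 100, 50, 300, 250    # corner-format box in pixels

x_center = (x1 + x2) / 2 / img_w      # (100 + 300) / 2 / 400 = 0.5
y_center = (y1 + y2) / 2 / img_h      # (50 + 250) / 2 / 500 = 0.3
width = (x2 - x1) / img_w             # 200 / 400 = 0.5
height = (y2 - y1) / img_h            # 200 / 500 = 0.4

# Class index 0 is an arbitrary placeholder.
yolo_line = f"0 {x_center} {y_center} {width} {height}"
print(yolo_line)  # 0 0.5 0.3 0.5 0.4
```

Note that x values are divided by the image width and y values by the image height; mixing the two up is a common source of shifted boxes.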

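3. Core Code Walkthrough

Below we go through the key parts of the conversion script step by step. The full code comes at the end, but it is important to understand what each function does.

3.1 Getting the Image Size

First we need the width and height of the original image, which are used later to normalize the coordinates: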

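import os
import cv2 as cv

def get_imgs_h_w(img_name, img_dir):
    img_path = os.path.join(img_dir, img_name)
    image = cv.imread(img_path)
    if image is None:  # cv.imread returns None on failure instead of raising
        raise FileNotFoundError(f"Could not read image: {img_path}")
    h, w = image.shape[:2]  # height and width
    return h, w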

Note: the shape of an image read by OpenCV is ordered (height, width), the reverse of the (width, height) order most people expect.

3.2 Converting VOC to YOLO Format

This is the core conversion function. It turns a bounding box from (x1, y1, x2, y2) into the YOLO form (x_center, y_center, width, height):

def voc_to_yolo(w, h, box):
    # Convert the coordinates to floats
    x1, y1, x2, y2 = [float(x) for x in box]
    # Compute the center point and the box size
    x_center = (x1 + x2) / 2.0
    y_center = (y1 + y2) / 2.0
    width = x2 - x1
    height = y2 - y1
    # Normalize by the image width and height
    x_center /= w
    y_center /= h
    width /= w
    height /= h
    return [x_center, y_center, width, height]
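One way to gain confidence in the conversion is a round-trip check: convert a box to YOLO format, convert it back, and confirm that you recover the original corners. The sketch below repeats the conversion in compact form so that it runs on its own. `yolo_to_voc` is a hypothetical helper introduced here for illustration, and the 640 × 480 image size and box values are made up.

```python
import math

def voc_to_yolo(w, h, box):
    """Same conversion as above: pixel corners -> normalized center and size."""
    x1, y1, x2, y2 = [float(v) for v in box]
    return [(x1 + x2) / 2.0 / w, (y1 + y2) / 2.0 / h,
            (x2 - x1) / w, (y2 - y1) / h]

def yolo_to_voc(w, h, yolo_box):
    """Hypothetical inverse: normalized center and size -> pixel corners."""
    xc, yc, bw, bh = yolo_box
    return [(xc - bw / 2.0) * w, (yc - bh / 2.0) * h,
            (xc + bw / 2.0) * w, (yc + bh / 2.0) * h]

# Made-up box on a made-up 640 x 480 image.
original = [120, 80, 360, 240]
restored = yolo_to_voc(640, 480, voc_to_yolo(640, 480, original))

# Compare with a small tolerance because of floating-point rounding.
round_trip_ok = all(math.isclose(a, b, abs_tol=1e-6)
                    for a, b in zip(original, restored))
print(round_trip_ok)  # True
```

If the check fails for your own data, the usual suspect is a swapped width and height argument.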

3.3 Mapping Classes to Indices

YOLO format uses numeric indices to represent classes, so we need a mapping from class names to indices:

def get_class_index(class_name, dataset='OPIXray'):
    # OPIXray class mapping
    if dataset == 'OPIXray':
        class_dict = {
            'Straight_Knife': 0,
            'Folding_Knife': 1,
            'Scissor': 2,
            'Utility_Knife': 3,
            'Multi-tool_Knife': 4
        }
    # HIXray class mapping
    else:
        class_dict = {
            'Mobile_Phone': 0,
            'Laptop': 1,
            'Portable_Charger_2': 2,
            'Portable_Charger_1': 3,
            'Tablet': 4,
            'Cosmetic': 5,
            'Water': 6,
            'Nonmetallic_Lighter': 7
        }
    # Returns None when the class name is not in the dictionary
    return class_dict.get(class_name)
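Because an unknown class name simply yields `None`, it can help to list the class names that actually occur in your annotation files before converting. The following is a rough sketch under the assumption that the class name is the second whitespace-separated field, as in the format described earlier. The helper names, the sample file names, and the sample lines are invented for illustration.

```python
from collections import Counter

def count_class_names(lines):
    """Count class names (second field) across annotation lines."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) >= 6:  # image name, class, x1, y1, x2, y2
            counts[parts[1]] += 1
    return counts

def find_unmapped(counts, class_dict):
    """Return the class names that have no entry in class_dict."""
    return sorted(name for name in counts if name not in class_dict)

# Subset of the OPIXray mapping from the article, enough for the demo.
class_dict = {'Straight_Knife': 0, 'Folding_Knife': 1, 'Scissor': 2}

# Made-up annotation lines; the last one is lowercase on purpose.
sample_lines = [
    "sample_0001.jpg Straight_Knife 100 50 300 250",
    "sample_0002.jpg Scissor 10 20 110 220",
    "sample_0003.jpg scissor 15 25 115 225",
]

counts = count_class_names(sample_lines)
print(find_unmapped(counts, class_dict))  # ['scissor']
```

On real data you would feed in the lines of every file in the labels folder and fix either the dictionary or the annotations until the list is empty.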

4. Full Script and Usage Guide

Combining the functions above gives the full conversion script:

import os
import cv2 as cv

# Paste get_imgs_h_w, voc_to_yolo, and get_class_index from Section 3 here.

def parse_annotation_line(line):
    """Parse one line of the original annotation."""
    parts = line.strip().split()
    return {
        'img_name': parts[0],
        'class_name': parts[1],
        'box': [parts[2], parts[3], parts[4], parts[5]]
    }

def convert_dataset(txt_dir, img_dir, save_dir, dataset_type):
    """Main conversion function."""
    os.makedirs(save_dir, exist_ok=True)

    for txt_name in os.listdir(txt_dir):
        txt_path = os.path.join(txt_dir, txt_name)
        save_path = os.path.join(save_dir, txt_name)

        with open(txt_path, 'r') as f_in, open(save_path, 'w') as f_out:
            for line in f_in:
                if not line.strip():
                    continue  # skip blank lines
                # Parse the original annotation
                ann = parse_annotation_line(line)
                # Get the image size
                img_h, img_w = get_imgs_h_w(ann['img_name'], img_dir)
                # Get the class index; skip classes that are not in the mapping
                class_idx = get_class_index(ann['class_name'], dataset_type)
                if class_idx is None:
                    continue
                # Convert the coordinate format
                yolo_box = voc_to_yolo(img_w, img_h, ann['box'])
                # Write the line in the new format
                f_out.write(f"{class_idx} {' '.join([str(x) for x in yolo_box])}\n")

if __name__ == '__main__':
    # Set your own paths here
    config = {
        'dataset_type': 'OPIXray',  # or 'HIXray'
        'txt_dir': 'path/to/your/labels',
        'img_dir': 'path/to/your/images',
        'save_dir': 'path/to/save/new_labels'
    }
    convert_dataset(**config)

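To use the script, you only need to change the paths and the dataset type in the config dictionary at the end. The script then processes every annotation file and writes the corresponding YOLO-format annotations.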
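If you would rather not edit the script each time, one option is to read the same four settings from the command line. This is only a sketch: the flag names below are my own choice and are not part of the original script. They are picked so that the parsed result has exactly the keys that `convert_dataset` expects.

```python
import argparse

def build_parser():
    """Build a parser whose fields mirror the keys of the config dictionary."""
    parser = argparse.ArgumentParser(
        description="Convert OPIXray/HIXray annotations to YOLO format")
    parser.add_argument("--dataset-type", choices=["OPIXray", "HIXray"],
                        default="OPIXray", help="which class mapping to use")
    parser.add_argument("--txt-dir", required=True,
                        help="folder with the original annotation files")
    parser.add_argument("--img-dir", required=True,
                        help="folder with the original images")
    parser.add_argument("--save-dir", required=True,
                        help="folder for the converted YOLO labels")
    return parser

# Demo: parse an explicit argument list instead of the real command line.
args = build_parser().parse_args(
    ["--txt-dir", "labels", "--img-dir", "imgs", "--save-dir", "new_labels"])
config = vars(args)  # argparse turns --txt-dir into the key 'txt_dir'
print(config)

# In the full script you would then call: convert_dataset(**config)
```

In the real script, call `parse_args()` without arguments so that the values come from the actual command line.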

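5. Common Problems and Solutions

In practice you may run into the following problems:

Image fails to load
- Check that the image path is correct
- Make sure the image file is not corrupted
- Verify that OpenCV supports the image format
Class mapping fails
- Check that the class names in the original annotations exactly match the dictionary in the code; the script silently skips any line whose class is not in the mapping
- Watch for differences in case and special characters
Abnormal coordinate values
- Make sure the bounding box coordinates do not exceed the image dimensions
- Check that the coordinate values are non-negative
Performance optimization
- For large datasets, consider using multiprocessing to speed up processing
- Test the script on a small sample first to confirm that it works correctly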

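# Multiprocessing example (optional)
import os
from multiprocessing import Pool

def process_file(args):
    txt_name, txt_dir, img_dir, save_dir, dataset_type = args
    # Code for processing a single file...

if __name__ == '__main__':
    # Build the argument list; txt_dir, img_dir, save_dir, and dataset_type
    # are assumed to be defined as in the config of the full script
    args_list = [(name, txt_dir, img_dir, save_dir, dataset_type)
                 for name in os.listdir(txt_dir)]
    # Process the files in parallel with 4 worker processes
    with Pool(4) as p:
        p.map(process_file, args_list)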
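For the "abnormal coordinate values" item in the list above, one possible safeguard is to clamp each box to the image before converting it. The helper below is a hypothetical addition and not part of the original script. It assumes pixel corner coordinates, reorders swapped corners, and returns `None` for boxes that have no area left after clamping. Whether to clamp or to discard such boxes is a judgment call that depends on your data.

```python
def clip_box(w, h, box):
    """Clamp a corner-format box to a w x h image.

    Returns [x1, y1, x2, y2] as floats, or None if the clamped box is empty.
    """
    x1, y1, x2, y2 = [float(v) for v in box]
    # Reorder the corners in case they were written in the wrong order
    x1, x2 = sorted((x1, x2))
    y1, y2 = sorted((y1, y2))
    # Clamp every coordinate into the image
    x1 = min(max(x1, 0.0), float(w))
    x2 = min(max(x2, 0.0), float(w))
    y1 = min(max(y1, 0.0), float(h))
    y2 = min(max(y2, 0.0), float(h))
    # A box with zero width or height carries no information
    if x2 <= x1 or y2 <= y1:
        return None
    return [x1, y1, x2, y2]

# Made-up boxes on a made-up 400 x 500 image.
print(clip_box(400, 500, ['-10', '50', '420', '250']))  # [0.0, 50.0, 400.0, 250.0]
print(clip_box(400, 500, [450, 10, 480, 20]))           # None
```

In the conversion loop, you could call this helper just before `voc_to_yolo` and skip the line when it returns `None`.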

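6. Verifying the Conversion

Once the conversion is done, we strongly recommend checking that the result is correct. You can use the following simple visualization code: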

import cv2
import random

def visualize_yolo(img_path, label_path):
    img = cv2.imread(img_path)
    if img is None:  # cv2.imread returns None when the image cannot be read
        raise FileNotFoundError(f"Could not read image: {img_path}")
    h, w = img.shape[:2]

    with open(label_path) as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            class_idx, xc, yc, bw, bh = map(float, line.split())
            # Convert back to pixel coordinates
            xc *= w
            yc *= h
            bw *= w
            bh *= h
            x1 = int(xc - bw/2)
            y1 = int(yc - bh/2)
            x2 = int(xc + bw/2)
            y2 = int(yc + bh/2)
            # Random color for each box
            color = (random.randint(0,255), random.randint(0,255), random.randint(0,255))
            cv2.rectangle(img, (x1,y1), (x2,y2), color, 2)
            cv2.putText(img, str(int(class_idx)), (x1,y1-5),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 1)

    cv2.imshow('YOLO Visualization', img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

# Usage example
visualize_yolo('path/to/image.jpg', 'path/to/label.txt')

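This visualization script displays the image together with its annotation boxes, so you can confirm at a glance whether the conversion is correct. If a box is in the wrong place or has the wrong class, go back and check the corresponding step of the conversion.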
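Visual inspection does not scale to thousands of files, so an automated sanity check can complement it. The sketch below validates a single YOLO label line. The function name and the wording of the messages are my own, and the rules it checks (five fields, an integer class index within range, values between 0 and 1) follow the format described in Section 2.

```python
def check_yolo_line(line, num_classes):
    """Return a list of problems found in one YOLO label line (empty if fine)."""
    parts = line.split()
    if len(parts) != 5:
        return [f"expected 5 fields, got {len(parts)}"]
    try:
        class_idx = int(parts[0])
        values = [float(p) for p in parts[1:]]
    except ValueError:
        return ["non-numeric field"]

    problems = []
    if not 0 <= class_idx < num_classes:
        problems.append(f"class index out of range: {class_idx}")
    names = ("x_center", "y_center", "width", "height")
    for name, value in zip(names, values):
        if not 0.0 <= value <= 1.0:
            problems.append(f"{name} out of range: {value}")
    return problems

# Made-up label lines, checked against the 5 OPIXray classes.
print(check_yolo_line("0 0.5 0.3 0.5 0.4", 5))  # []
print(check_yolo_line("0 0.5 1.3 0.5 0.4", 5))  # ['y_center out of range: 1.3']
print(check_yolo_line("0 0.5 0.3", 5))          # ['expected 5 fields, got 3']
```

Running this over every line in the new_labels folder and printing only the non-empty results gives a quick list of files worth opening in the visualizer.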