After training produces best.pt, the val.py script is used to evaluate the model on the test set, reporting metrics such as precision (P), recall (R), and average precision (AP). Before running it, the parameters were modified as follows:
--data ROOT / 'data/VOC_RoadDamage.yaml' --weights ROOT / 'runs/train/exp/weights/best.pt' --batch-size 64 --conf-thres 0.1 --iou-thres 0.65 --task test --save-txt --save-hybrid --save-conf
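For reference, the same settings can also be passed programmatically: recent YOLOv5 versions expose a run() function in val.py. The snippet below is only a sketch of that call, assuming the script is launched from the repository root; ROOT and the dataset/weights paths are taken from the listing above:

from pathlib import Path

import val  # val.py from the YOLOv5 repository root

ROOT = Path('.')  # assumption: the current directory is the YOLOv5 repository root

# Mirrors the argument listing above (including --save-hybrid).
val.run(
    data=ROOT / 'data/VOC_RoadDamage.yaml',
    weights=ROOT / 'runs/train/exp/weights/best.pt',
    batch_size=64,
    conf_thres=0.1,
    iou_thres=0.65,
    task='test',
    save_txt=True,
    save_hybrid=True,
    save_conf=True,
)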
Running the script produces the following output:
val: data=data\VOC_RoadDamage.yaml, weights=runs\train\exp\weights\best.pt, batch_size=64, imgsz=640, conf_thres=0.1, iou_thres=0.65, task=test, device=, workers=8, single_cls=False, augment=False, verbose=True, save_txt=True, save_hybrid=True, save_conf=True, save_json=False, project=runs\val, name=exp, exist_ok=False, half=False, dnn=False
WARNING: confidence threshold 0.1 > 0.001 produces invalid results
YOLOv5 2022-8-17 Python-3.9.12 torch-1.11.0 CUDA:0 (NVIDIA GeForce RTX 3080, 10240MiB)
Fusing layers...
YOLOv5s_RoadDamage summary: 213 layers, 7023610 parameters, 0 gradients, 15.8 GFLOPs
test: Scanning 'C:\Users\**\**\datasets\RDD2018\CRDDC2022_China Motorbike_train\China\labels\test.cache' images and labels... 1977 found, 0 missing, 43 empty, 0 corrupt: 100%|██████████| 1977/1977 [00:00<?, ?it/s]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|██████████| 31/31 [00:37<00:00, 1.20s/it]
all 1977 4650 1 1 0.995 0.995
longitudinal crack 1977 2678 1 1 0.995 0.995
lateral crack 1977 1096 1 1 0.995 0.995
alligator crack 1977 641 1 1 0.995 0.995
pothole 1977 235 1 1 0.995 0.995
Speed: 0.3ms pre-process, 2.1ms inference, 2.3ms NMS per image at shape (64, 3, 640, 640)
Results saved to runs\val\exp14
1977 labels saved to runs\val\exp14\labels
However, the validation run at the end of training returned:
Validating runs\train\exp\weights\best.pt...
Fusing layers...
YOLOv5s_RoadDamage summary: 213 layers, 7023610 parameters, 0 gradients, 15.8 GFLOPs
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|██████████| 15/15 [00:19<00:00, 1.29s/it]
all 1880 3287 0.622 0.585 0.593 0.325
longitudinal crack 1880 1410 0.683 0.772 0.769 0.413
lateral crack 1880 313 0.463 0.328 0.338 0.134
alligator crack 1880 538 0.739 0.737 0.745 0.415
pothole 1880 89 0.512 0.236 0.274 0.123
traffic line blur 1880 937 0.715 0.854 0.838 0.542
Results saved to runs\train\exp
Even allowing for differences between the test set and the validation set in damage distribution and labeling, the model's performance should not leap forward like this, so the first suspicion was a bug in val.py or a problem with the parameter settings.
Prompted by the parameter explanations in the blog post 【YOLOV5-5.x 源码解读】val.py_满船清梦压星河HK的博客, the --save-txt, --save-hybrid and --save-conf options stood out as suspicious. Searching the YOLOv5 GitHub repository turned up a similar report, Issue 5508, where the evaluation metrics computed by val.py are abnormal, which eventually led to Issue 1563 and Issue 1646 and the solution.
# These imports appear near the top of val.py and are needed by the snippet below.
import argparse

from utils.general import check_file


def parse_opt():
    """
    Explanation of the opt arguments
    data: path to the dataset config file, which contains the dataset paths, number of classes, class names, download URL, etc.
    weights: path to the model weights, e.g. weights/yolov5s.pt
    batch_size: batch size for the forward pass, default 32
    imgsz: input image resolution, default 640
    conf-thres: object confidence threshold, default 0.001
    iou-thres: IoU threshold used during NMS, default 0.6
    task: which evaluation to run: train, val, test, speed or study, default val
    device: device to run on
    single-cls: whether the dataset contains only a single class, default False
    augment: whether to use TTA (Test Time Augmentation), default False
    verbose: whether to print the mAP of every class, default False
    The next three arguments relate to auto-labelling (somewhat like teacher forcing in RNNs); see https://github.com/ultralytics/yolov5/issues/1563. The descriptions below are the author's own words:
    save-txt: traditional auto-labelling
    save-hybrid: save hybrid autolabels, combining existing labels with new predictions before NMS (existing predictions given confidence=1.0 before NMS)
    save-conf: add confidences to any of the above commands
    save-json: whether to save predictions in COCO JSON format and evaluate them with cocoapi (requires labels in the same COCO JSON format), default False
    project: directory where test results are saved, default runs/test
    name: name of the results subdirectory, default exp, i.e. results go to runs/test/exp
    exist-ok: whether an existing project/name directory is acceptable, default False, so normally a new folder is created for each run
    half: whether to use FP16 half-precision inference, default False
    """
    parser = argparse.ArgumentParser(prog='val.py')
    parser.add_argument('--data', type=str, default='data/coco128.yaml', help='dataset.yaml path')
    parser.add_argument('--weights', nargs='+', type=str, default='weights/yolov5s.pt', help='model.pt path(s)')
    parser.add_argument('--batch-size', type=int, default=4, help='batch size')
    parser.add_argument('--imgsz', '--img', '--img-size', type=int, default=640, help='inference size (pixels)')
    parser.add_argument('--conf-thres', type=float, default=0.001, help='confidence threshold')
    parser.add_argument('--iou-thres', type=float, default=0.6, help='NMS IoU threshold')
    parser.add_argument('--task', default='val', help='train, val, test, speed or study')
    parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--single-cls', action='store_true', help='treat as single-class dataset')
    parser.add_argument('--augment', action='store_true', help='augmented inference')
    parser.add_argument('--verbose', action='store_true', help='report mAP by class')
    parser.add_argument('--save-txt', default=True, action='store_true', help='save results to *.txt')
    parser.add_argument('--save-hybrid', action='store_true', help='save label+prediction hybrid results to *.txt')
    parser.add_argument('--save-conf', default=True, action='store_true', help='save confidences in --save-txt labels')
    parser.add_argument('--save-json', action='store_true', help='save a cocoapi-compatible JSON results file')
    parser.add_argument('--project', default='runs/test', help='save to project/name')
    parser.add_argument('--name', default='exp', help='save to project/name')
    parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
    parser.add_argument('--half', action='store_true', help='use FP16 half-precision inference')
    opt = parser.parse_args()  # parse the arguments defined above
    opt.save_json |= opt.data.endswith('coco.yaml')  # |= : if either operand is True, the left-hand variable becomes True
    opt.save_txt |= opt.save_hybrid  # --save-hybrid implies --save-txt
    opt.data = check_file(opt.data)  # check file
    return opt
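Two details in this version of parse_opt() are easy to overlook: --save-txt and --save-conf are declared with default=True, and the line opt.save_txt |= opt.save_hybrid forces --save-txt on whenever --save-hybrid is used. The standalone snippet below (purely illustrative, not part of val.py) shows how all three flags end up True:

import argparse

# Standalone illustration of how the save-* flags above interact.
parser = argparse.ArgumentParser()
parser.add_argument('--save-txt', default=True, action='store_true')
parser.add_argument('--save-hybrid', action='store_true')
parser.add_argument('--save-conf', default=True, action='store_true')

# Simulate running `python val.py --save-hybrid`
opt = parser.parse_args(['--save-hybrid'])
opt.save_txt |= opt.save_hybrid  # same line as in parse_opt(): hybrid implies txt
print(opt.save_txt, opt.save_hybrid, opt.save_conf)  # True True True

# With default=True combined with action='store_true', --save-txt and --save-conf
# can never be switched off from the command line; only the defaults in the
# script control them.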
The original YOLOv5 author explains these three options as follows:
This PR introduces hybrid autolabelling support in test.py. The auto-labelling options are now:
- python test.py --save-txt: traditional auto-labelling
- python test.py --save-hybrid: save hybrid autolabels, combining existing labels with new predictions before NMS (existing predictions given confidence=1.0 before NMS.
- python test.py --save-conf: add confidences to any of the above commands
Regardless of any of the above settings, be aware that auto-labelling works best at very high confidence thresholds, i.e. 0.90 confidence, whereas mAP computation relies on very low confidence threshold, i.e. 0.001, to properly evaluate the area under the PR curve. The two activities are thus essentially mutually exclusive, there is no reason I know of to combine the two into a single test run.
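In other words, with --save-hybrid the ground-truth boxes are mixed into the raw predictions with confidence 1.0 before NMS runs, so the evaluation then scores the model largely against its own labels. The toy snippet below only illustrates that mechanism; it is a simplified imitation of what the upstream code does, not the actual val.py / non_max_suppression implementation:

import torch

# Toy imitation of hybrid autolabelling: ground-truth boxes are appended to the
# raw predictions with confidence 1.0 before NMS.
nc = 4  # number of classes (illustrative)

pred = torch.rand(3, nc + 5)                   # 3 raw predictions: [x, y, w, h, conf, class scores...]
gt = torch.tensor([[2.0, 100, 120, 40, 30]])   # 1 ground-truth box: [cls, x, y, w, h]

v = torch.zeros((len(gt), nc + 5))
v[:, :4] = gt[:, 1:5]                          # copy the ground-truth box coordinates
v[:, 4] = 1.0                                  # confidence forced to 1.0
v[range(len(gt)), gt[:, 0].long() + 5] = 1.0   # one-hot class score
hybrid = torch.cat((pred, v), 0)               # the ground truth now sits among the "predictions"
print(hybrid.shape)                            # torch.Size([4, 9])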
The --save-hybrid flag mixes the ground-truth labels into the predictions before the model is evaluated (and writes the combined boxes into the *.txt label files), which is why P, R and mAP@.5 come out absurdly high. The fix is simply to drop --save-hybrid and set the parameters as follows:
--data ROOT / 'data/VOC_RoadDamage.yaml' --weights ROOT / 'runs/train/exp/weights/best.pt' --batch-size 64 --conf-thres 0.1 --iou-thres 0.65 --task test --save-txt --save-conf
Running the script with these settings gives:
val: data=data\VOC_RoadDamage.yaml, weights=runs\train\exp\weights\best.pt, batch_size=64, imgsz=640, conf_thres=0.1, iou_thres=0.65, task=test, device=, workers=8, single_cls=False, augment=False, verbose=True, save_txt=True, save_hybrid=False, save_conf=True, save_json=False, project=runs\val, name=exp, exist_ok=False, half=False, dnn=False
WARNING: confidence threshold 0.1 > 0.001 produces invalid results
YOLOv5 2022-8-17 Python-3.9.12 torch-1.11.0 CUDA:0 (NVIDIA GeForce RTX 3080, 10240MiB)
Fusing layers...
YOLOv5s_RoadDamage summary: 213 layers, 7023610 parameters, 0 gradients, 15.8 GFLOPs
test: Scanning 'C:\Users\**\**\datasets\RDD2018\CRDDC2022_China Motorbike_train\China\labels\test.cache' images and labels... 1977 found, 0 missing, 43 empty, 0 corrupt: 100%|██████████| 1977/1977 [00:00<?, ?it/s]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|██████████| 31/31 [00:27<00:00, 1.13it/s]
all 1977 4650 0.257 0.192 0.203 0.0823
longitudinal crack 1977 2678 0.279 0.353 0.292 0.108
lateral crack 1977 1096 0.147 0.189 0.115 0.0317
alligator crack 1977 641 0.601 0.228 0.407 0.19
pothole 1977 235 0 0 0 0
Speed: 0.4ms pre-process, 2.0ms inference, 1.9ms NMS per image at shape (64, 3, 640, 640)
Results saved to runs\val\exp15
1912 labels saved to runs\val\exp15\labels
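One last caveat: both runs above still print "WARNING: confidence threshold 0.1 > 0.001 produces invalid results". As the YOLOv5 author notes in the quote above, mAP should be computed at a very low confidence threshold (the default 0.001), while high thresholds are meant for auto-labelling. A purely metrics-oriented run would therefore look roughly like this (same assumptions about ROOT and the paths as in the earlier sketch):

from pathlib import Path

import val  # val.py from the YOLOv5 repository root

ROOT = Path('.')  # assumption: the current directory is the YOLOv5 repository root

# Metrics-only evaluation: keep the default confidence threshold, skip label saving.
val.run(
    data=ROOT / 'data/VOC_RoadDamage.yaml',
    weights=ROOT / 'runs/train/exp/weights/best.pt',
    batch_size=64,
    conf_thres=0.001,  # default; needed for a meaningful PR curve and mAP
    iou_thres=0.65,
    task='test',
    verbose=True,
)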