Infrared Image Object Detection Method Based on DCS-YOLOv8 Model

Shen Lingyun (沈凌云); Lang Baihe (郎百和); Song Zhengxun (宋正勋); Wen Zhitao (温智滔)

Abstract

To address the limited ability of the YOLOv8 model to detect occluded targets and dim, small targets in infrared images under low signal-to-noise ratios and in complex task scenarios, an improved object detection method, DCS-YOLOv8 (DCN_C2f-CA-SIoU-YOLOv8), is proposed. Building on the YOLOv8 framework, the backbone network incorporates a lightweight DCN_C2f module based on deformable convolution (Deformable Convolution Network, DCN), which adaptively adjusts the network's receptive field and strengthens the multi-scale feature representation of targets. The feature fusion network introduces a module based on the coordinate attention (CA) mechanism, which captures spatial position dependencies among multiple targets and thereby improves localization accuracy. The position regression loss function is modified on the basis of SIoU (Scylla IoU) so that the direction of the relative displacement between the predicted box and the ground-truth box is matched, which speeds up model convergence and improves detection and localization accuracy. Experimental results show that, compared with the YOLOv8 n/s/m/l/x series of models, DCS-YOLOv8 improves the mean average precision (mAP@0.5) on the FLIR, OTCBVS, and VEDAI test sets by an average of 6.8%, 0.6%, and 4.0%, reaching 86.5%, 99.0%, and 75.6%, respectively. The model's inference speed also meets the real-time requirements of infrared object detection tasks.
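The abstract names SIoU as the basis of the regression loss but does not give its formula. As a rough illustration only, the sketch below assumes the commonly cited SIoU formulation (an angle cost that modulates a distance cost, plus a shape cost, added to 1 - IoU); it is not the authors' modified variant, whose details are not stated here. The function name `siou_loss`, the `(x1, y1, x2, y2)` box format, and the default `theta=4.0` are assumptions made for this example.

```python
import math


def siou_loss(pred, target, theta=4.0, eps=1e-9):
    """Illustrative SIoU loss for two boxes given as (x1, y1, x2, y2).

    Assumes the commonly cited formulation: 1 - IoU + (distance + shape) / 2.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Plain IoU of the two boxes.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    iou = inter / (pw * ph + tw * th - inter + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Offset between box centers and its length.
    dx = (tx1 + tx2) / 2 - (px1 + px2) / 2
    dy = (ty1 + ty2) / 2 - (py1 + py2) / 2
    sigma_sq = dx * dx + dy * dy + eps

    # Angle cost sin(2 * alpha), written as 2 * sin(alpha) * cos(alpha).
    # It is 0 when the centers are axis-aligned and 1 at 45 degrees.
    angle_cost = 2.0 * abs(dx) * abs(dy) / sigma_sq

    # Distance cost: normalized center offsets, weighted by the angle cost.
    gamma = 2.0 - angle_cost
    rho_x = (dx / (cw + eps)) ** 2
    rho_y = (dy / (ch + eps)) ** 2
    distance_cost = (1 - math.exp(-gamma * rho_x)) + (1 - math.exp(-gamma * rho_y))

    # Shape cost: relative mismatch in width and height.
    omega_w = abs(pw - tw) / (max(pw, tw) + eps)
    omega_h = abs(ph - th) / (max(ph, th) + eps)
    shape_cost = (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta

    return 1.0 - iou + (distance_cost + shape_cost) / 2.0
```

In this sketch, identical boxes give a loss close to zero, while disjoint boxes give a loss above 1, because the distance and shape terms are added on top of 1 - IoU. A training implementation would normally operate on batched tensors rather than Python tuples.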
