A small target detection model for UAV aerial photography based on YOLOv5

Xiangbin SHI; Ruitong ZHAO

doi:10.3969/j.issn.2095-1248.2024.02.005

Journal of Shenyang Aerospace University

2024, Vol. 41, Issue 2: 37-46


Information Science and Engineering


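College of Computer Science, Shenyang Aerospace University, Shenyang 110136, China

Xiangbin SHI (1963-), male, native of Shenyang, Liaoning; professor, Ph.D.; main research interest: computer vision. E-mail: sxb@sau.edu.cn.

Received: 2023-12-20

Published online: 2024-05-29

Funding

National Natural Science Foundation of China (61170185)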



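Abstract

To address the high missed-detection rate and low detection success rate in UAV small target detection, a small target detection algorithm based on YOLOv5 is proposed. First, swin transformer modules are integrated into both the backbone and the neck, improving detection accuracy while reducing computational cost, so that the model is better suited to small targets in UAV aerial images. Second, the convolutional block attention module (CBAM) is introduced to strengthen the network's attention to small-target features. Finally, the original CIoU loss function is replaced with the SIoU loss function, which emphasizes the weights of high-quality samples to accelerate convergence and improve regression accuracy. Experimental results show that the optimized model reaches a detection accuracy of 35.3% on the Visdrone2019 dataset, 5.2% higher than YOLOv5. Compared with other classical and state-of-the-art algorithms, the SWCBSI-YOLO algorithm performs well and meets the requirements of small target detection in UAV aerial photography.

Key words: UAV aerial photography images; small target detection; YOLOv5; transformer; attention mechanism; loss function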
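The abstract names the SIoU loss but does not state it. As a rough illustration only, the sketch below follows the commonly published SIoU formulation (an IoU term plus angle-aware distance and shape costs) for a single pair of axis-aligned boxes. It is not the authors' implementation: the function name `siou_loss`, the `(x1, y1, x2, y2)` box layout, and the shape exponent `theta=4.0` are assumptions made here, and both boxes are assumed to have positive width and height.

```python
import math


def siou_loss(pred, target, theta=4.0):
    """SIoU-style loss for two boxes given as (x1, y1, x2, y2).

    Illustrative sketch; theta is an assumed shape-cost exponent.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # IoU term: overlap area divided by union area.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / union

    # Offsets between box centers and size of the smallest enclosing box.
    dx = (tx1 + tx2) / 2 - (px1 + px2) / 2
    dy = (ty1 + ty2) / 2 - (py1 + py2) / 2
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Angle cost: 1 - 2*sin^2(alpha - pi/4), which equals sin(2*alpha),
    # where alpha is the angle between the center line and the x-axis.
    sigma = math.hypot(dx, dy)
    sin_alpha = min(1.0, abs(dy) / sigma) if sigma > 0 else 0.0
    angle = 1 - 2 * math.sin(math.asin(sin_alpha) - math.pi / 4) ** 2

    # Distance cost, modulated by the angle cost through gamma.
    gamma = 2 - angle
    dist = (1 - math.exp(-gamma * (dx / cw) ** 2)) + (
        1 - math.exp(-gamma * (dy / ch) ** 2)
    )

    # Shape cost: penalizes relative width and height mismatch.
    ow = abs(pw - tw) / max(pw, tw)
    oh = abs(ph - th) / max(ph, th)
    shape = (1 - math.exp(-ow)) ** theta + (1 - math.exp(-oh)) ** theta

    return 1 - iou + (dist + shape) / 2


if __name__ == "__main__":
    box = (0.0, 0.0, 2.0, 2.0)
    print(siou_loss(box, box))                    # identical boxes
    print(siou_loss(box, (1.0, 0.0, 3.0, 2.0)))   # partially overlapping
    print(siou_loss(box, (3.0, 0.0, 5.0, 2.0)))   # no overlap
```

Under these assumptions, identical boxes give a loss of approximately zero, and the loss grows as the predicted box moves away from the target, including when the boxes no longer overlap, where a plain IoU loss would stay constant.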

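Cite this article

Xiangbin SHI, Ruitong ZHAO. A small target detection model for UAV aerial photography based on YOLOv5[J]. Journal of Shenyang Aerospace University, 2024, 41(2): 37-46. DOI: 10.3969/j.issn.2095-1248.2024.02.005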



