Research on Automatic Image Annotation Based on Dual-branch Attention Mechanism

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:: 34
Issue:: 2024, No. 09

Pages:: 167-173

Column:: Artificial Intelligence

Publication date:: 2024-09-10

Article Info

Title:: Research on Automatic Image Annotation Based on Dual-branch Attention Mechanism

Article ID:: 1673-629X(2024)09-0167-07

Author(s):: ZHANG Guo-you (张国有); CUI Yong-qiang (崔永强); School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, Shanxi, China

Keywords:: automatic image annotation; convolutional neural network; multi-scale feature; attention mechanism; feature fusion

Classification number:: TP311.51

DOI:: 10.20165/j.cnki.ISSN1673-629X.2024.0172

Abstract:: Automatic image annotation converts the low-level visual features of an image into high-level semantic information that humans can understand, which improves the comprehensibility and searchability of images and is valuable for image retrieval and image classification. Current annotation methods based on convolutional neural networks still have three problems: shallow networks cannot capture enough feature information, the relationships between labels are easily ignored, and the number of labels is hard to determine at annotation time. This paper proposes an automatic image annotation model based on a dual-branch attention mechanism. First, a dual-branch attention network strengthens the correlation between image features and labels and learns the correlations among labels. Second, a multi-scale feature extraction module is added to the spatial attention branch to extract multi-scale image features, addressing the insufficient feature extraction of shallow networks. Third, a fusion module combines the outputs of the two branches to further enhance the image features. Finally, a label quantity prediction module predicts the number of labels for the image to be annotated, further improving annotation accuracy. The model was evaluated on three benchmark datasets, Corel 5K, ESP Game, and IAPR-TC-12, and the experimental results show that it effectively addresses the above problems and improves the effectiveness and accuracy of annotation.
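
The last step named in the abstract, using a predicted label count to decide how many labels to emit, can be illustrated with a minimal sketch. It assumes the model yields one score per candidate label plus a separately predicted count; the abstract does not give the actual decision rule, so plain top-k selection is an assumption here, and the function name `select_labels`, the label vocabulary, and the scores are hypothetical.

```python
from typing import Dict, List


def select_labels(scores: Dict[str, float], predicted_count: int) -> List[str]:
    """Return the `predicted_count` highest-scoring labels.

    Hypothetical helper: top-k selection is an assumed rule, not the
    paper's documented method.
    """
    # Clamp k to the valid range so an out-of-range count never raises.
    k = max(0, min(predicted_count, len(scores)))
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [label for label, _ in ranked[:k]]


# Illustrative scores only (not results from the paper).
example_scores = {"sky": 0.91, "water": 0.78, "tree": 0.40, "car": 0.05}
print(select_labels(example_scores, 2))
```

Compared with a fixed number of labels per image, a per-image count lets a sparse scene receive fewer labels and a cluttered one receive more, which is the motivation the abstract gives for the label quantity prediction module.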
