[1] YAN Ya-tao, BU Peng-hui, WANG Hang, et al. Monocular Depth Estimation Algorithm Based on Global Attention Mechanism[J]. Computer Technology and Development, 2025, (06): 34-41. [doi:10.20165/j.cnki.ISSN1673-629X.2025.0015]

Monocular Depth Estimation Algorithm Based on Global Attention Mechanism

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
Issue:
2025, Issue 06
Pages:
34-41
Column:
Media Computing
Publication date:
2025-06-10

Article Info

Title:
Monocular Depth Estimation Algorithm Based on Global Attention Mechanism
Article ID:
1673-629X(2025)06-0034-08
Author(s):
YAN Ya-tao (1), BU Peng-hui (1,2), WANG Hang (1,2), TIAN Long-tao (1)
1. School of Mechanical Engineering, Xi'an Shiyou University, Xi'an 710065, Shaanxi, China;
2. Shaanxi Provincial Petroleum Drilling Equipment Engineering Technology Research Center, Xi'an 710065, Shaanxi, China
Keywords:
monocular depth estimation; channel attention mechanism; spatial attention mechanism; self-supervised learning; multi-scale feature
CLC number:
TP391
DOI:
10.20165/j.cnki.ISSN1673-629X.2025.0015
Abstract:
Methods for perceiving the three-dimensional structure of a scene based on deep convolutional neural networks (DCNNs) exploit the semantic information in images to significantly improve the accuracy of structure perception, and have attracted growing attention from researchers. Typical DCNN architectures cannot effectively capture global semantic information, which leads to blurred boundaries in the depth maps produced by monocular depth estimation. This paper proposes a network structure that fuses multiple attention mechanisms to improve the structure preservation of the depth map. The network adopts an encoder-decoder structure and introduces a structure perception module and a detail emphasis module. In the structure perception module, a criss-cross attention mechanism captures long-range dependencies and improves the network's ability to obtain global information. In the detail emphasis module, channel attention and spatial attention mechanisms selectively emphasize multi-channel feature information to preserve the structure of different objects, and assign different weights to boundary pixels to improve the boundary preservation of the depth map. Experimental results on the KITTI dataset show that, compared with the earlier Monodepth2 model, the absolute error of the depth values is reduced by 0.8% and the accuracy at a threshold of 1.25 is improved by 1.8%, and that the method outperforms most existing self-supervised algorithms.
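
The two figures quoted in the abstract appear to correspond to metrics commonly reported for depth estimation on KITTI: an absolute relative error and the fraction of pixels whose prediction-to-ground-truth ratio falls below 1.25. The sketch below shows how such metrics are typically computed, on the assumption that these are the definitions meant. The function names `abs_rel_error` and `threshold_accuracy` are hypothetical and are not taken from the paper.

```python
import numpy as np


def abs_rel_error(pred, gt):
    """Mean of |pred - gt| / gt over pixels with valid ground truth (gt > 0)."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    valid = gt > 0  # assumption: zero marks pixels without ground-truth depth
    return float(np.mean(np.abs(pred[valid] - gt[valid]) / gt[valid]))


def threshold_accuracy(pred, gt, threshold=1.25):
    """Fraction of valid pixels with max(pred/gt, gt/pred) below the threshold.

    Assumes predicted depths are strictly positive.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    valid = gt > 0
    ratio = np.maximum(pred[valid] / gt[valid], gt[valid] / pred[valid])
    return float(np.mean(ratio < threshold))


# Toy example: four pixels, the last one has no ground truth and is ignored.
gt_depth = np.array([1.0, 2.0, 4.0, 0.0])
pred_depth = np.array([1.0, 2.2, 8.0, 3.0])
print(abs_rel_error(pred_depth, gt_depth))
print(threshold_accuracy(pred_depth, gt_depth))
```

A lower absolute relative error and a higher threshold accuracy both indicate better depth predictions, which matches the direction of the improvements reported above.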


Last Update: 2025-06-10