CN110287826B - Video target detection method based on attention mechanism - Google Patents
Video target detection method based on attention mechanism
- Publication number
- CN110287826B (application number CN201910499786.9A)
- Authority
- CN
- China
- Prior art keywords
- feature
- detected
- frame
- candidate
- video
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Multimedia (AREA)
- Software Systems (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to a video target detection method based on an attention mechanism, in the field of computer vision. The method comprises the following steps: step S1, extract the candidate feature map of the current time frame; step S2, set a fusion window over a past time period, compute the Laplacian variance of each frame in the window, normalize the variances to obtain the weight of each frame in the window, take the weighted sum of the candidate feature maps of all frames in the window to obtain a temporal feature, and connect the candidate feature of the current time frame with the temporal feature to obtain the feature map to be detected; step S3, use convolutional layers to extract feature maps at additional scales from the feature map to be detected; step S4, use convolutional layers on the feature maps of different scales to predict target categories and positions. The feature fusion method of the invention assigns different weights to frame features of different quality within the past time period, so that temporal information is fused more fully and the performance of the detection model is improved.
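Step S2 can be illustrated with a minimal NumPy sketch. It is not the patent's implementation, and it rests on several assumptions the abstract does not state: the Laplacian is the 4-neighbour discrete kernel applied to a grayscale frame, "normalizing" means dividing each variance by the sum over the window, the connection step is a concatenation along the channel axis, and a window with zero total variance falls back to uniform weights. The function names `laplacian_variance` and `fuse_window` are hypothetical.

```python
import numpy as np


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian response of a 2-D grayscale frame.

    Sharper frames give larger values; blurred or flat frames give values near 0.
    """
    g = np.asarray(gray, dtype=np.float64)
    # Discrete Laplacian on interior pixels: up + down + left + right - 4 * centre.
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())


def fuse_window(frames, feature_maps, current_feature, eps=1e-12):
    """Weight past-frame features by frame sharpness and fuse with the current one.

    frames          : list of T grayscale frames (2-D arrays) in the fusion window
    feature_maps    : list of T candidate feature maps, each of shape (C, H, W)
    current_feature : candidate feature map of the current frame, shape (C, H, W)
    Returns (weights, fused) where fused has shape (2C, H, W).
    """
    variances = np.array([laplacian_variance(f) for f in frames])
    total = variances.sum()
    if total < eps:
        # Assumption: uniform weights when every frame is flat.
        weights = np.full(len(frames), 1.0 / len(frames))
    else:
        # Assumption: sum-to-one normalization of the variances.
        weights = variances / total
    # Weighted sum over the window axis gives the temporal feature (C, H, W).
    temporal = np.tensordot(weights, np.stack(feature_maps), axes=1)
    # Assumption: "connecting" is channel-wise concatenation.
    fused = np.concatenate([current_feature, temporal], axis=0)
    return weights, fused
```

With a window holding one textured frame and one constant frame, the constant frame has zero Laplacian variance, so under these assumptions its feature map contributes nothing to the temporal feature.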
Description
Claims (4)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910499786.9A CN110287826B (en) | 2019-06-11 | 2019-06-11 | Video target detection method based on attention mechanism |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910499786.9A CN110287826B (en) | 2019-06-11 | 2019-06-11 | Video target detection method based on attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110287826A (en) | 2019-09-27 |
CN110287826B (en) | 2021-09-17 |
Family
ID=68003699
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910499786.9A Active CN110287826B (en) | 2019-06-11 | 2019-06-11 | Video target detection method based on attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110287826B (en) |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110674886B (en) * | 2019-10-08 | 2022-11-25 | 中兴飞流信息科技有限公司 | Video target detection method fusing multi-level features |
CN110751646A (en) * | 2019-10-28 | 2020-02-04 | 支付宝(杭州)信息技术有限公司 | Method and device for identifying damage by using multiple image frames in vehicle video |
CN111310609B (en) * | 2020-01-22 | 2023-04-07 | 西安电子科技大学 | Video target detection method based on time sequence information and local feature similarity |
CN113393491B (en) * | 2020-03-12 | 2025-02-21 | 优酷文化科技(北京)有限公司 | Method, device and electronic device for detecting target object from video |
WO2022036567A1 (en) * | 2020-08-18 | 2022-02-24 | 深圳市大疆创新科技有限公司 | Target detection method and device, and vehicle-mounted radar |
CN112016472B (en) * | 2020-08-31 | 2023-08-22 | 山东大学 | Driver attention area prediction method and system based on target dynamic information |
CN112434607B (en) * | 2020-11-24 | 2023-05-26 | 北京奇艺世纪科技有限公司 | Feature processing method, device, electronic equipment and computer readable storage medium |
CN112686913B (en) * | 2021-01-11 | 2022-06-10 | 天津大学 | Object Boundary Detection and Object Segmentation Models Based on Boundary Attention Consistency |
CN112561001A (en) * | 2021-02-22 | 2021-03-26 | 南京智莲森信息技术有限公司 | Video target detection method based on space-time feature deformable convolution fusion |
CN113688801B (en) * | 2021-10-22 | 2022-02-15 | 南京智谱科技有限公司 | Chemical gas leakage detection method and system based on spectrum video |
CN114594770B (en) * | 2022-03-04 | 2024-04-26 | 深圳市千乘机器人有限公司 | Inspection method for inspection robot without stopping |
CN115131710B (en) * | 2022-07-05 | 2024-09-03 | 福州大学 | Real-time action detection method based on multi-scale feature fusion attention |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102393958A (en) * | 2011-07-16 | 2012-03-28 | 西安电子科技大学 | Multi-focus image fusion method based on compressive sensing |
CN105913404A (en) * | 2016-07-01 | 2016-08-31 | 湖南源信光电科技有限公司 | Low-illumination imaging method based on frame accumulation |
CN107481238A (en) * | 2017-09-20 | 2017-12-15 | 众安信息技术服务有限公司 | Image quality measure method and device |
CN108921803A (en) * | 2018-06-29 | 2018-11-30 | 华中科技大学 | A kind of defogging method based on millimeter wave and visual image fusion |
CN109104568A (en) * | 2018-07-24 | 2018-12-28 | 苏州佳世达光电有限公司 | The intelligent cleaning driving method and drive system of monitoring camera |
CN109684912A (en) * | 2018-11-09 | 2019-04-26 | 中国科学院计算技术研究所 | A kind of video presentation method and system based on information loss function |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103152513B (en) * | 2011-12-06 | 2016-05-25 | 瑞昱半导体股份有限公司 | Image processing method and relevant image processing apparatus |
CN103702032B (en) * | 2013-12-31 | 2017-04-12 | 华为技术有限公司 | Image processing method, device and terminal equipment |
US10395118B2 (en) * | 2015-10-29 | 2019-08-27 | Baidu Usa Llc | Systems and methods for video paragraph captioning using hierarchical recurrent neural networks |
US10169656B2 (en) * | 2016-08-29 | 2019-01-01 | Nec Corporation | Video system using dual stage attention based recurrent neural network for future event prediction |
CN109829398B (en) * | 2019-01-16 | 2020-03-31 | 北京航空航天大学 | A method for object detection in video based on 3D convolutional network |
Non-Patent Citations (2)
Title |
---|
Infrared dim target detection based on visual attention;Xin Wang;《Infrared Physics & Technology》;20121130;513-521 * |
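基于提升小波变换的图像清晰度评价算法 [Image sharpness evaluation algorithm based on the lifting wavelet transform];王昕 (Wang Xin);《万方数据知识服务平台》 (Wanfang Data Knowledge Service Platform);20100322;52-57 * |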
Also Published As
Publication number | Publication date |
---|---|
CN110287826A (en) | 2019-09-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110287826B (en) | | Video target detection method based on attention mechanism |
CN111126472B (en) | | An Improved Target Detection Method Based on SSD |
CN110276316B (en) | | A human keypoint detection method based on deep learning |
CN108830285B (en) | | Target detection method for reinforcement learning based on fast-RCNN |
CN110111366B (en) | | End-to-end optical flow estimation method based on multistage loss |
CN108399362B (en) | | Rapid pedestrian detection method and device |
CN109284670B (en) | | A pedestrian detection method and device based on multi-scale attention mechanism |
WO2019192397A1 (en) | | End-to-end recognition method for scene text in any shape |
CN112884742B (en) | | A multi-target real-time detection, recognition and tracking method based on multi-algorithm fusion |
CN111723798B (en) | | Multi-instance natural scene text detection method based on relevance hierarchy residual errors |
US20180114071A1 (en) | | Method for analysing media content |
CN113591968A (en) | | Infrared weak and small target detection method based on asymmetric attention feature fusion |
CN108256562A (en) | | Well-marked target detection method and system based on Weakly supervised space-time cascade neural network |
CN116645592B (en) | | A crack detection method and storage medium based on image processing |
CN110705412A (en) | | Video target detection method based on motion history image |
CN113139896A (en) | | Target detection system and method based on super-resolution reconstruction |
CN108256462A (en) | | A kind of demographic method in market monitor video |
CN111310609B (en) | | Video target detection method based on time sequence information and local feature similarity |
Li et al. | | Learning to holistically detect bridges from large-size VHR remote sensing imagery |
CN113610024B (en) | | A multi-strategy deep learning remote sensing image small target detection method |
WO2022219402A1 (en) | | Semantically accurate super-resolution generative adversarial networks |
Midwinter et al. | | Unsupervised defect segmentation with pose priors |
US12190535B2 (en) | | Generating depth images for image data |
CN118865178B (en) | | A flood extraction and location method based on deep learning and spatial information fusion |
WO2023093086A1 (en) | | Target tracking method and apparatus, training method and apparatus for model related thereto, and device, medium and computer program product |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 2024-12-11. Address after: 518000 1002, Building A, Zhiyun Industrial Park, No. 13, Huaxing Road, Henglang Community, Longhua District, Shenzhen, Guangdong Province. Patentee after: Shenzhen Wanzhida Technology Co.,Ltd. Country or region after: China. Address before: 100124 No. 100 Chaoyang District Ping Tian Park, Beijing. Patentee before: Beijing University of Technology. Country or region before: China. |
TR01 | Transfer of patent right | Effective date of registration: 2025-05-21. Address after: Room 1-103, 1st Floor, Building 3, No. 5 Guangmao Street, Daxing Economic Development Zone, Daxing District, Beijing, 102600. Patentee after: Kuaima (Beijing) Electronic Technology Co.,Ltd. Country or region after: China. Address before: 518000 1002, Building A, Zhiyun Industrial Park, No. 13, Huaxing Road, Henglang Community, Longhua District, Shenzhen, Guangdong Province. Patentee before: Shenzhen Wanzhida Technology Co.,Ltd. Country or region before: China. |