CN112560732B - Feature extraction method of multi-scale feature extraction network
- Publication number
- CN112560732B (application CN202011530198.6A)
- Authority
- CN
- China
- Prior art keywords
- scale
- layer
- map
- feature
- feature extraction
- Prior art date
- 2020-12-22
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Multimedia (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Health & Medical Sciences (AREA)
- Mathematical Physics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Image Analysis (AREA)
Abstract
Description
Technical Field
The present invention relates to a feature extraction neural network, and in particular to a multi-scale feature extraction network and a feature extraction method of that network.
Background Art
Multi-scale object detection has long been a hot and difficult topic in computer vision. To raise its accuracy, network structures such as FPN, PA-Net, NAS-FPN and BiFPN have been proposed one after another, but because these structures are relatively complex, the gain in detection accuracy comes with a heavy computational load that delays inference, making multi-scale object detection hard to apply and promote in industry.
Summary of the Invention
To overcome the deficiencies of the prior art, the present invention provides a feature extraction method of a multi-scale feature extraction network that improves object detection accuracy.
The technical solution adopted by the present invention to solve its technical problem is as follows.
A feature extraction method of a multi-scale feature extraction network, where the multi-scale feature extraction network comprises a dimensionality-reduction convolution layer, a scale feature extraction layer, a merging layer and a feature fusion layer connected in sequence, and the scale feature extraction layer comprises a large-object detection branch, an original-feature detection branch and a small-object detection branch.
The large-object detection branch comprises a down-sampling feature layer, a first dilated convolution layer and an up-sampling restoration layer connected in sequence; the original-feature detection branch comprises a second dilated convolution layer; the small-object detection branch comprises an up-sampling feature layer, a third dilated convolution layer and a down-sampling restoration layer connected in sequence.
The first dilated convolution layer comprises three dilated convolutions, each with a 3×3 kernel, whose dilation rates are 1, 2 and 3 respectively; the second and third dilated convolution layers have the same structure as the first.
The convolution kernels of both the dimensionality-reduction convolution layer and the feature fusion layer are 1×1.
The feature extraction method of the multi-scale feature extraction network comprises the following steps:
(1) The feature map to be extracted is input into the dimensionality-reduction convolution layer for a 1×1 convolution, which fuses its features and reduces their dimensionality to form the original feature map.
(2) The down-sampling feature layer down-samples the original feature map to form a down-sampled feature map; the up-sampling feature layer up-samples the original feature map to form an up-sampled feature map.
(3) The first to third dilated convolution layers apply three 3×3 dilated convolutions to the down-sampled feature map, the original feature map and the up-sampled feature map respectively, generating the down-sampled first, second and third scale maps, the original first, second and third scale maps, and the up-sampled first, second and third scale maps.
(4) The up-sampling restoration layer up-samples the down-sampled first, second and third scale maps, and the down-sampling restoration layer down-samples the up-sampled first, second and third scale maps.
(5) The merging layer concatenates the three scale maps obtained by up-sampling the down-sampled first, second and third scale maps, the original first, second and third scale maps, and the three scale maps obtained by down-sampling the up-sampled first, second and third scale maps; the merged result is then fused by the feature fusion layer with a 1×1 convolution to form the scale feature extraction map.
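The patent does not state the channel width of each scale map explicitly; the short sketch below records one consistent reading of the arithmetic implied by steps (1)–(5) and by the later statement that the output keeps the input's feature depth. The example value of D and the per-map widths are assumptions, not claimed figures.

```python
# Channel bookkeeping for steps (1)-(5), under one consistent (but assumed) reading:
# the 1x1 reduction cuts the depth to 1/3, each dilated kernel then acts on 1/3 of a
# branch's channels, and the nine resulting scale maps concatenate back to the input depth.
D = 576                       # example depth of the feature map to be extracted
reduced = D // 3              # step (1): original feature map -> 192 channels
per_scale_map = reduced // 3  # step (3): one dilated kernel per third of a branch -> 64 channels
merged = 9 * per_scale_map    # step (5): nine scale maps concatenated -> 576 channels
assert merged == D            # so the 1x1 fusion can keep the input's feature depth
```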
The beneficial effects of the present invention are as follows. The invention extracts features of different scales and then fuses them through the feature fusion layer, so that the multi-scale feature extraction network has multi-scale feature extraction capability at low computational complexity and can be applied immediately whenever multi-scale feature extraction is needed, improving object detection accuracy. The feature extraction method of the multi-scale feature extraction network performs feature dimensionality reduction, per-scale object detection and the core multi-scale feature extraction on the input feature map, quickly obtains the scale feature extraction map, and offers high object detection accuracy with little computation.
Brief Description of the Drawings
The present invention is further described below with reference to the accompanying drawings and embodiments.
Fig. 1 is a schematic diagram of the network structure of the present invention;
Fig. 2 is a flowchart of the feature extraction method of the present invention.
Embodiments
Referring to Fig. 1, in a feature extraction method of a multi-scale feature extraction network, the multi-scale feature extraction network comprises a dimensionality-reduction convolution layer 1, a scale feature extraction layer, a merging layer 2 and a feature fusion layer 3 connected in sequence. The scale feature extraction layer comprises a large-object detection branch, an original-feature detection branch and a small-object detection branch, which extract features of different scales; these features are then fused by the feature fusion layer 3, so that the multi-scale feature extraction network has multi-scale feature extraction capability at low computational complexity, and whenever a neural network needs multi-scale feature extraction this network can be applied immediately to improve object detection accuracy.
The large-object detection branch comprises a down-sampling feature layer 4, a first dilated convolution layer 5 and an up-sampling restoration layer 6 connected in sequence; the original-feature detection branch comprises a second dilated convolution layer 7; the small-object detection branch comprises an up-sampling feature layer 8, a third dilated convolution layer 9 and a down-sampling restoration layer 10 connected in sequence. In this embodiment, up-sampling is performed with UP-CONV and down-sampling with MAXPOOL.
The first dilated convolution layer 5 comprises three dilated convolutions, each with a 3×3 kernel, whose dilation rates are 1, 2 and 3 respectively. The second dilated convolution layer 7 and the third dilated convolution layer 9 have the same structure as the first dilated convolution layer 5, i.e. three 3×3 dilated convolutions with dilation rates 1, 2 and 3, each kernel acting on only 1/3 of that dilated convolution layer's channels, so that the large-object detection branch, the original-feature detection branch and the small-object detection branch all have multi-scale feature extraction capability.
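A minimal sketch of one such dilated convolution layer follows, assuming PyTorch (the patent names no framework) and assuming that the three 1/3-channel groups are processed independently; the class name DilatedConvGroup and the per-group output width are illustrative.

```python
import torch
import torch.nn as nn

class DilatedConvGroup(nn.Module):
    """Three 3x3 dilated convolutions (rates 1, 2, 3), each acting on 1/3 of the channels."""

    def __init__(self, channels: int):
        super().__init__()
        assert channels % 3 == 0, "channel count must split into three equal groups"
        group = channels // 3
        # padding equal to the dilation rate keeps the spatial size of a 3x3 convolution
        self.branches = nn.ModuleList(
            nn.Conv2d(group, group, kernel_size=3, padding=rate, dilation=rate)
            for rate in (1, 2, 3)
        )

    def forward(self, x: torch.Tensor) -> list[torch.Tensor]:
        chunks = torch.chunk(x, 3, dim=1)  # split the channels into thirds
        return [conv(c) for conv, c in zip(self.branches, chunks)]  # three scale maps
```

Because the padding matches the dilation rate, all three scale maps keep the spatial size of their input, which is what later allows the restoration layers to align them by resampling alone.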
The convolution kernels of both the dimensionality-reduction convolution layer 1 and the feature fusion layer 3 are 1×1, giving them feature dimensionality-reduction and feature fusion capability while saving computation time.
Referring to Fig. 1 and Fig. 2, the feature extraction method of the multi-scale feature extraction network comprises the following steps:
(1) The feature map to be extracted is input into the dimensionality-reduction convolution layer 1 for a 1×1 convolution, which fuses its features and reduces their dimensionality to form the original feature map. The dimensionality-reduction convolution layer thereby fuses and reduces the features of the input feature map, saving computation time; the feature depth of the original feature map is reduced to 1/3 of the original depth of the feature map to be extracted.
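As a hedged illustration of step (1), assuming PyTorch; the input depth of 384 channels and the 52×52 spatial size are example values, not figures from the patent.

```python
import torch
import torch.nn as nn

in_channels = 384                                                  # example depth of the feature map to be extracted
reduce = nn.Conv2d(in_channels, in_channels // 3, kernel_size=1)   # dimensionality-reduction convolution layer 1

x = torch.randn(1, in_channels, 52, 52)                            # N x C x H x W
original_feature_map = reduce(x)                                   # (1, 128, 52, 52): depth cut to 1/3, size unchanged
```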
(2) The down-sampling feature layer 4 down-samples the original feature map to form a down-sampled feature map; its depth is unchanged from the original feature map, while its width and height become 1/2 of the original. The purpose of the down-sampling is to detect large objects and reduce the amount of computation.
The up-sampling feature layer 8 up-samples the original feature map to form an up-sampled feature map; its depth is likewise unchanged, while its width and height are doubled. The purpose of the up-sampling is to detect small objects.
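A sketch of step (2) under the embodiment's stated choices, UP-CONV and MAXPOOL, here taken to mean a stride-2 transposed convolution and a 2×2 max pooling in PyTorch; the exact operators and tensor sizes are assumptions for illustration.

```python
import torch
import torch.nn as nn

channels = 128
up_conv = nn.ConvTranspose2d(channels, channels, kernel_size=2, stride=2)  # UP-CONV: doubles width and height
max_pool = nn.MaxPool2d(kernel_size=2, stride=2)                           # MAXPOOL: halves width and height

original_feature_map = torch.randn(1, channels, 52, 52)
upsampled_feature_map = up_conv(original_feature_map)     # (1, 128, 104, 104), depth unchanged
downsampled_feature_map = max_pool(original_feature_map)  # (1, 128, 26, 26), depth unchanged
```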
(3) The first dilated convolution layer 5 to the third dilated convolution layer 9 apply three 3×3 dilated convolutions to the down-sampled feature map, the original feature map and the up-sampled feature map respectively; that is, each of the three feature maps undergoes three 3×3 dilated convolutions with dilation rates 1 (i.e. a standard 3×3 convolution), 2 and 3, each kernel acting on only 1/3 of that layer's channels. This generates the down-sampled first scale map 11, down-sampled second scale map 12, down-sampled third scale map 13, original first scale map 14, original second scale map 15, original third scale map 16, up-sampled first scale map 17, up-sampled second scale map 18 and up-sampled third scale map 19, so that the down-sampled, original and up-sampled feature maps receive convolutions with different receptive fields, achieving multi-scale feature extraction.
(4) The up-sampling restoration layer 6 up-samples the down-sampled first scale map 11, down-sampled second scale map 12 and down-sampled third scale map 13, and the down-sampling restoration layer 10 down-samples the up-sampled first scale map 17, up-sampled second scale map 18 and up-sampled third scale map 19, so that the width and height of these six scale maps match those of the original first scale map 14, original second scale map 15 and original third scale map 16, which facilitates the merging and feature fusion of step (5).
(5) The merging layer 2 concatenates the three scale maps obtained by up-sampling the down-sampled first scale map 11, down-sampled second scale map 12 and down-sampled third scale map 13, the original first scale map 14, original second scale map 15 and original third scale map 16, and the three scale maps obtained by down-sampling the up-sampled first scale map 17, up-sampled second scale map 18 and up-sampled third scale map 19; the merged result is then fused by the feature fusion layer 3 with a 1×1 convolution to form the scale feature extraction map, completing the multi-scale feature extraction while keeping the same feature depth as the feature map to be extracted.
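Putting the five steps together, the following self-contained sketch is one possible end-to-end reading of the network in Fig. 1, assuming PyTorch. The class name, the use of F.interpolate as a stand-in for the up-sampling restoration layer 6 and down-sampling restoration layer 10, and the per-scale-map channel width of one ninth of the input depth are all assumptions; the patent itself fixes only the 1×1 and 3×3 kernels, the dilation rates 1, 2 and 3, the 2× scaling factors, and the requirement that the output depth match the input.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleFeatureExtraction(nn.Module):
    def __init__(self, in_channels: int):
        super().__init__()
        assert in_channels % 9 == 0, "illustrative constraint so the channel thirds stay integral"
        mid = in_channels // 3                                            # depth after the 1x1 reduction
        group = mid // 3                                                  # channels seen by each dilated kernel
        self.reduce = nn.Conv2d(in_channels, mid, kernel_size=1)          # dimensionality-reduction layer 1
        self.down = nn.MaxPool2d(2)                                       # down-sampling feature layer 4 (MAXPOOL)
        self.up = nn.ConvTranspose2d(mid, mid, kernel_size=2, stride=2)   # up-sampling feature layer 8 (UP-CONV)
        # dilated convolution layers 5, 7, 9: one 3x3 kernel per (branch, rate), each on 1/3 of the branch channels
        self.dilated = nn.ModuleDict({
            branch: nn.ModuleList(
                nn.Conv2d(group, group, kernel_size=3, padding=r, dilation=r) for r in (1, 2, 3)
            )
            for branch in ("large", "original", "small")
        })
        self.fuse = nn.Conv2d(9 * group, in_channels, kernel_size=1)      # feature fusion layer 3

    def _scale_maps(self, branch: str, x: torch.Tensor) -> list[torch.Tensor]:
        # step (3): split the branch channels into thirds and convolve each with a different dilation rate
        return [conv(c) for conv, c in zip(self.dilated[branch], torch.chunk(x, 3, dim=1))]

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[-2:]
        original = self.reduce(x)                                         # step (1)
        large = self.down(original)                                       # step (2): half size, for large objects
        small = self.up(original)                                         # step (2): double size, for small objects
        maps = (
            self._scale_maps("large", large)
            + self._scale_maps("original", original)
            + self._scale_maps("small", small)
        )
        # step (4): restore every scale map to the original spatial size (stand-in for layers 6 and 10)
        maps = [m if m.shape[-2:] == (h, w) else F.interpolate(m, size=(h, w)) for m in maps]
        return self.fuse(torch.cat(maps, dim=1))                          # step (5): merge, then 1x1 fusion

# illustrative usage: the output keeps the input's depth and spatial size
x = torch.randn(1, 576, 52, 52)
y = MultiScaleFeatureExtraction(576)(x)
assert y.shape == x.shape
```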
The above embodiments do not limit the scope of protection of the present invention; equivalent modifications and variations made by those skilled in the art without departing from the overall concept of the present invention still fall within its scope.
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011530198.6A CN112560732B (en) | 2020-12-22 | 2020-12-22 | Feature extraction method of multi-scale feature extraction network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011530198.6A CN112560732B (en) | 2020-12-22 | 2020-12-22 | Feature extraction method of multi-scale feature extraction network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112560732A CN112560732A (en) | 2021-03-26 |
CN112560732B true CN112560732B (en) | 2023-07-04 |
Family
ID=75031388
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011530198.6A Active CN112560732B (en) | 2020-12-22 | 2020-12-22 | Feature extraction method of multi-scale feature extraction network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112560732B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115187820A (en) * | 2021-04-06 | 2022-10-14 | 中国科学院深圳先进技术研究院 | Light-weight target detection method, device, equipment and storage medium |
CN113313668B (en) * | 2021-04-19 | 2022-09-27 | 石家庄铁道大学 | A method for extracting surface disease features of subway tunnels |
CN113378786B (en) * | 2021-07-05 | 2023-09-19 | 广东省机场集团物流有限公司 | Ultra-light target detection network and method |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11651206B2 (en) * | 2018-06-27 | 2023-05-16 | International Business Machines Corporation | Multiscale feature representations for object recognition and detection |
- 2020-12-22 CN CN202011530198.6A patent/CN112560732B/en active Active
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108460403A (en) * | 2018-01-23 | 2018-08-28 | 上海交通大学 | The object detection method and system of multi-scale feature fusion in a kind of image |
CN111242036A (en) * | 2020-01-14 | 2020-06-05 | 西安建筑科技大学 | A Crowd Counting Method Based on Encoder-Decoder Structure Multi-scale Convolutional Neural Networks |
CN111462127A (en) * | 2020-04-20 | 2020-07-28 | 武汉大学 | Real-time semantic segmentation method and system for automatic driving |
CN111899191A (en) * | 2020-07-21 | 2020-11-06 | 武汉工程大学 | A text image restoration method, device and storage medium |
CN111898539A (en) * | 2020-07-30 | 2020-11-06 | 国汽(北京)智能网联汽车研究院有限公司 | Multi-target detection method, device, system, equipment and readable storage medium |
CN111860693A (en) * | 2020-07-31 | 2020-10-30 | 元神科技(杭州)有限公司 | Lightweight visual target detection method and system |
CN111967524A (en) * | 2020-08-20 | 2020-11-20 | 中国石油大学(华东) | Multi-scale fusion feature enhancement algorithm based on Gaussian filter feedback and cavity convolution |
Non-Patent Citations (4)
Title |
---|
AtICNet: semantic segmentation with atrous spatial pyramid pooling in image cascade network; Jin Chen et al.; EURASIP Journal on Wireless Communications and Networking; 1-7 *
Multi-object detection method for remote sensing images based on FD-SSD; Zhu Minchao et al.; Computer Applications and Software; Vol. 36, No. 1; 232-238 *
Indoor crowd detection network based on multi-level features and a hybrid attention mechanism; Shen Wenxiang et al.; Journal of Computer Applications; Vol. 39, No. 12; 3496-3502 *
Research on environment perception and control methods for driverless vehicles based on deep learning; Li Jianming; China Master's Theses Full-text Database, Engineering Science and Technology II; No. 01; C035-469 *
Also Published As
Publication number | Publication date |
---|---|
CN112560732A (en) | 2021-03-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112560732B (en) | Feature extraction method of multi-scale feature extraction network | |
CN111915660B (en) | Binocular disparity matching method and system based on shared features and attention up-sampling | |
CN108710830A (en) | A kind of intensive human body 3D posture estimation methods for connecting attention pyramid residual error network and equidistantly limiting of combination | |
WO2022213395A1 (en) | Light-weighted target detection method and device, and storage medium | |
CN110033410A (en) | Image reconstruction model training method, image super-resolution rebuilding method and device | |
CN111784582B (en) | A low-light image super-resolution reconstruction method based on DEC_SE | |
CN111161146B (en) | Coarse-to-fine single-image super-resolution reconstruction method | |
CN115035295B (en) | Remote sensing image semantic segmentation method based on shared convolution kernel and boundary loss function | |
CN112270366B (en) | Micro target detection method based on self-adaptive multi-feature fusion | |
CN114359297A (en) | Attention pyramid-based multi-resolution semantic segmentation method and device | |
CN111932480A (en) | Deblurred video recovery method and device, terminal equipment and storage medium | |
CN117173412A (en) | Medical image segmentation method based on CNN and Transformer fusion network | |
CN114972746B (en) | A medical image segmentation method based on multi-resolution overlapping attention mechanism | |
CN110866938A (en) | Full-automatic video moving object segmentation method | |
CN115661462A (en) | Medical image segmentation method based on convolution and deformable self-attention mechanism | |
CN115424290A (en) | Human body posture estimation method, device, terminal and computer readable storage medium | |
CN111783862A (en) | Stereo salient object detection technology based on multi-attention-directed neural network | |
CN113313162A (en) | Method and system for detecting multi-scale feature fusion target | |
CN115496909A (en) | Semantic segmentation method for three-branch adaptive weight feature fusion | |
CN115272677A (en) | Multi-scale feature fusion semantic segmentation method, equipment and storage medium | |
CN111738921B (en) | Deep super-resolution method based on progressive fusion of multiple information based on deep neural network | |
CN116681978A (en) | Attention mechanism and multi-scale feature fusion-based saliency target detection method | |
CN110084750B (en) | Single Image Super-resolution Method Based on Multilayer Ridge Regression | |
CN117788296B (en) | Infrared remote sensing image super-resolution reconstruction method based on heterogeneous combined depth network | |
CN112634153B (en) | Image deblurring method based on edge enhancement |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right |
Effective date of registration: 2025-05-20
Address after: Area A, First Floor, No. 28 Yushan Road, Chengdong Community, Zhuangji Development Zone, Zhongshan City, Guangdong Province, 528400
Patentee after: Zhongshan Lanqi Technology Co., Ltd.
Country or region after: China
Address before: No. 1 Xueyuan Road, Shiqu District, Zhongshan, Guangdong, 528400
Patentee before: University of Electronic Science and Technology of China, Zhongshan Institute
Country or region before: China |