CN112560732B - Feature extraction method of multi-scale feature extraction network - Google Patents

Feature extraction method of multi-scale feature extraction network

Publication number
CN112560732B
CN112560732B CN202011530198.6A CN202011530198A CN112560732B CN 112560732 B CN112560732 B CN 112560732B CN 202011530198 A CN202011530198 A CN 202011530198A CN 112560732 B CN112560732 B CN 112560732B
Authority
CN
China
Prior art keywords
scale
layer
map
feature
sampling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011530198.6A
Other languages
Chinese (zh)
Other versions
CN112560732A (en)
Inventor
潘新建
张崇富
邓春健
杨亮
吴洁滢
李奇
李志莉
徐世祥
王婷瑶
温贺平
高庆国
刘凯
迟锋
刘黎明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China Zhongshan Institute
Original Assignee
University of Electronic Science and Technology of China Zhongshan Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China Zhongshan Institute
Priority to CN202011530198.6A
Publication of CN112560732A
Application granted
Publication of CN112560732B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G06V20/41 - Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/25 - Fusion techniques
    • G06F18/253 - Fusion techniques of extracted features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G06V20/46 - Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a feature extraction method of a multi-scale feature extraction network. The network comprises a dimension reduction convolution layer, a scale feature extraction layer, a merging layer and a feature fusion layer which are sequentially connected, and the scale feature extraction layer comprises a large target detection branch, an original feature detection branch and a small target detection branch. The branches extract features at different scales, and the feature fusion layer then fuses them, so the network has multi-scale feature extraction capability and low computational complexity and can be applied directly wherever multi-scale feature extraction is needed, improving target detection precision. The method performs feature dimension reduction, target detection and the core multi-scale feature extraction on the input feature map to be extracted, obtains the scale feature extraction map quickly, and has the advantages of high target detection precision and a small amount of computation.

Description

Feature extraction method of multi-scale feature extraction network
Technical Field
The present invention relates to a neural network for feature extraction, and more particularly, to a multi-scale feature extraction network and a feature extraction method thereof.
Background
Multi-scale target detection has long been both a research hotspot and a difficulty in the field of computer vision. To improve its precision, network structures such as FPN, PA-Net, NAS-FPN and BiFPN have been proposed one after another. Because these structures are relatively complex, however, the gain in detection precision comes with a large amount of computation, which lengthens inference time and makes multi-scale target detection difficult to apply and popularize in industry.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a feature extraction method of a multi-scale feature extraction network for improving the target detection precision.
The technical scheme adopted for solving the technical problems is as follows:
In the feature extraction method of the multi-scale feature extraction network, the network comprises a dimension reduction convolution layer, a scale feature extraction layer, a merging layer and a feature fusion layer which are sequentially connected, wherein the scale feature extraction layer comprises a large target detection branch, an original feature detection branch and a small target detection branch.
The large target detection branch comprises a downsampling feature layer, a first dilated convolution layer and an upsampling recovery layer which are connected in sequence; the original feature detection branch comprises a second dilated convolution layer; the small target detection branch comprises an upsampling feature layer, a third dilated convolution layer and a downsampling recovery layer which are connected in sequence.
The first dilated convolution layer comprises three dilated (atrous) convolutions with 3×3 convolution kernels, whose dilation rates are 1, 2 and 3 respectively; the second dilated convolution layer and the third dilated convolution layer have the same structure as the first dilated convolution layer.
The convolution kernels of the dimension reduction convolution layer and the feature fusion layer are 1×1.
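To make the role of the 1×1 kernels concrete: a 1×1 convolution mixes channels independently at every pixel, so it can change the feature depth without changing width or height. The following is a minimal pure-Python sketch of that operation. The function name `conv1x1`, the nested-list layout `[channel][row][col]` and the example weights are illustrative assumptions and do not come from the patent.

```python
def conv1x1(feature_map, weights):
    """Apply a 1x1 convolution (per-pixel channel mixing).

    feature_map: nested list indexed [channel][row][col] (assumed layout).
    weights: nested list indexed [out_channel][in_channel].
    Returns a nested list indexed [out_channel][row][col].
    """
    in_channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    output = []
    for kernel in weights:
        if len(kernel) != in_channels:
            raise ValueError("each kernel needs one weight per input channel")
        # Every output pixel is a weighted sum over the input channels
        # at the same spatial position.
        plane = [
            [
                sum(kernel[c] * feature_map[c][y][x] for c in range(in_channels))
                for x in range(width)
            ]
            for y in range(height)
        ]
        output.append(plane)
    return output


# Hypothetical 3-channel, 2x2 input reduced to a single channel (depth 3 -> 1).
example_input = [
    [[1, 2], [3, 4]],
    [[10, 20], [30, 40]],
    [[100, 200], [300, 400]],
]
reduced = conv1x1(example_input, [[1, 1, 1]])
```

With learned weights instead of the all-ones kernel used here, the same operation serves both for dimension reduction (fewer output kernels than input channels) and for feature fusion after merging.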
The feature extraction method of the multi-scale feature extraction network comprises the following steps:
(1) The feature map to be extracted is input into the dimension reduction convolution layer, which applies a 1×1 convolution to perform feature fusion and dimension reduction and form an original feature map;
(2) The downsampling feature layer downsamples the original feature map to form a downsampled feature map; the upsampling feature layer upsamples the original feature map to form an upsampled feature map;
(3) The first to third dilated convolution layers respectively apply three 3×3 dilated convolutions to the downsampled feature map, the original feature map and the upsampled feature map to generate a downsampled first scale map, a downsampled second scale map, a downsampled third scale map, an original first scale map, an original second scale map, an original third scale map, an upsampled first scale map, an upsampled second scale map and an upsampled third scale map;
(4) The upsampling recovery layer upsamples the downsampled first scale map, the downsampled second scale map and the downsampled third scale map respectively, and the downsampling recovery layer downsamples the upsampled first scale map, the upsampled second scale map and the upsampled third scale map respectively;
(5) The merging layer merges the three scale maps obtained by upsampling the downsampled first, second and third scale maps, the original first, second and third scale maps, and the three scale maps obtained by downsampling the upsampled first, second and third scale maps; the feature fusion layer then performs 1×1 convolution fusion to form a scale feature extraction map.
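The five steps can be followed as a trace of tensor shapes. The sketch below uses the factors given in the embodiment (depth reduced to one third, width and height halved or doubled, output depth equal to the input depth). Two points are assumptions of this sketch rather than statements of the patent: each dilated convolution keeps the channel count of the one-third channel group it acts on, and padding keeps width and height unchanged. The function name `trace_shapes` is hypothetical.

```python
def trace_shapes(channels, height, width):
    """Return (stage name, (C, H, W)) pairs for the five steps.

    Assumes the depth is divisible by 9 and the spatial size is even,
    purely to keep the integer arithmetic exact in this sketch.
    """
    if channels % 9 != 0:
        raise ValueError("this sketch assumes the depth is divisible by 9")
    if height % 2 or width % 2:
        raise ValueError("this sketch assumes even height and width")

    reduced = channels // 3   # step 1: 1x1 dimension reduction to 1/3 depth
    group = reduced // 3      # each dilated conv acts on 1/3 of a branch

    stages = [
        ("feature map to be extracted", (channels, height, width)),
        ("original feature map", (reduced, height, width)),
        # step 2: the two resampled copies of the original feature map
        ("downsampled feature map", (reduced, height // 2, width // 2)),
        ("upsampled feature map", (reduced, height * 2, width * 2)),
    ]

    # step 3: three dilated convolutions per branch give nine scale maps
    branch_sizes = {
        "downsampled": (height // 2, width // 2),
        "original": (height, width),
        "upsampled": (height * 2, width * 2),
    }
    for branch, (h, w) in branch_sizes.items():
        for index in (1, 2, 3):
            stages.append((f"{branch} scale map {index}", (group, h, w)))

    # step 4 restores every scale map to H x W; step 5 merges all nine
    stages.append(("merged scale maps", (9 * group, height, width)))
    # step 5: 1x1 fusion keeps the depth of the input feature map
    stages.append(("scale feature extraction map", (channels, height, width)))
    return stages


for name, shape in trace_shapes(18, 8, 8):
    print(f"{name}: {shape}")
```

Under these assumptions the merged depth already equals the input depth, so the final 1×1 convolution fuses the nine scale maps without changing the shape.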
The beneficial effects of the invention are as follows: the three branches extract features at different scales, and the feature fusion layer then fuses them, so the multi-scale feature extraction network has multi-scale feature extraction capability and low computational complexity and can be applied directly wherever multi-scale feature extraction is required, improving target detection precision. The feature extraction method performs feature dimension reduction, target detection and the core multi-scale feature extraction on the input feature map to be extracted, obtains the scale feature extraction map quickly, and has the advantages of high target detection precision and a small amount of computation.
Drawings
The invention will be further described with reference to the drawings and examples.
FIG. 1 is a schematic diagram of the network architecture of the present invention;
FIG. 2 is a flow chart of the feature extraction method of the present invention.
Description of the embodiments
Referring to FIG. 1, in a feature extraction method of a multi-scale feature extraction network, the network includes a dimension reduction convolution layer 1, a scale feature extraction layer, a merging layer 2 and a feature fusion layer 3 which are sequentially connected, wherein the scale feature extraction layer includes a large target detection branch, an original feature detection branch and a small target detection branch. The branches extract features at different scales, and the feature fusion layer 3 fuses them, so the network has multi-scale feature extraction capability and low computational complexity. It can be applied directly whenever a neural network needs multi-scale feature extraction, thereby improving target detection accuracy.
The large target detection branch comprises a downsampling feature layer 4, a first dilated convolution layer 5 and an upsampling recovery layer 6 which are connected in sequence; the original feature detection branch comprises a second dilated convolution layer 7; the small target detection branch comprises an upsampling feature layer 8, a third dilated convolution layer 9 and a downsampling recovery layer 10 which are connected in sequence. In this embodiment, the upsampling mode is UP-CONV and the downsampling mode is MAXPOOL.
The first dilated convolution layer 5 comprises three dilated (atrous) convolutions with 3×3 convolution kernels, whose dilation rates are 1, 2 and 3 respectively. The second dilated convolution layer 7 and the third dilated convolution layer 9 have the same structure as the first dilated convolution layer 5. Each convolution kernel acts on only 1/3 of the channels of its dilated convolution layer, so the large target detection branch, the original feature detection branch and the small target detection branch all have multi-scale feature extraction capability.
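The three rates give each branch three different receptive fields from the same 3×3 kernel shape. As a small worked example, the spatial extent of a dilated kernel along one axis is k + (k - 1)(d - 1) for kernel size k and rate d. The helper names below are hypothetical, and the "same" padding rule is a common convention assumed here; the patent does not state which padding is used.

```python
def effective_kernel_size(kernel_size, dilation):
    """Spatial extent covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def same_padding(kernel_size, dilation):
    """Zero padding per side that preserves width and height at stride 1.

    Valid for odd kernel sizes; assumed here, not specified in the patent.
    """
    return dilation * (kernel_size - 1) // 2


# The three 3x3 dilated convolutions with rates 1, 2 and 3.
extents = {rate: effective_kernel_size(3, rate) for rate in (1, 2, 3)}
paddings = {rate: same_padding(3, rate) for rate in (1, 2, 3)}

for rate in (1, 2, 3):
    print(f"rate {rate}: covers {extents[rate]}x{extents[rate]}, "
          f"padding {paddings[rate]} per side")
```

Each rate still uses nine weights per kernel, so the wider coverage at rates 2 and 3 comes at no extra parameter cost.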
The convolution kernels of the dimension reduction convolution layer 1 and the feature fusion layer 3 are 1×1, which gives these two layers feature dimension reduction and feature fusion capabilities and saves calculation time.
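A rough multiply-accumulate (MAC) count shows why reducing depth with a 1×1 convolution before the 3×3 convolutions saves computation. This is only an illustration of the principle, using the one-third depth reduction of the embodiment. It is not a cost model of the full network, and the sizes H, W and C below are hypothetical values chosen for the example.

```python
def conv_macs(height, width, in_channels, out_channels, kernel_size):
    """MACs of a dense stride-1 convolution that preserves spatial size."""
    return height * width * in_channels * out_channels * kernel_size ** 2


# Hypothetical feature map size and depth (not taken from the patent).
H, W, C = 32, 32, 90

# A 3x3 convolution applied directly at full depth.
direct = conv_macs(H, W, C, C, 3)

# A 1x1 reduction to one third of the depth, then a 3x3 convolution there.
reduction = conv_macs(H, W, C, C // 3, 1)
reduced_conv = conv_macs(H, W, C // 3, C // 3, 3)
total = reduction + reduced_conv

print(f"direct 3x3: {direct} MACs")
print(f"1x1 reduction + 3x3 at 1/3 depth: {total} MACs")
print(f"ratio: {direct / total:.2f}")
```

The saving grows with depth because the 3×3 cost scales with the product of input and output channels, while the added 1×1 layer costs only one weight per channel pair.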
Referring to FIG. 1 and FIG. 2, the feature extraction method of the multi-scale feature extraction network includes the following steps:
(1) The feature map to be extracted is input into the dimension reduction convolution layer 1, which applies a 1×1 convolution to perform feature fusion and dimension reduction and form an original feature map. Performing feature fusion and dimension reduction on the input feature map in this layer saves calculation time; the feature depth of the original feature map is reduced to 1/3 of the depth of the feature map to be extracted.
(2) The downsampling feature layer 4 downsamples the original feature map to form a downsampled feature map. The depth of the downsampled feature map is the same as that of the original feature map, and its width and height are 1/2 of those of the original feature map. The purpose of downsampling is to detect large targets and to reduce the amount of computation.
The upsampling feature layer 8 upsamples the original feature map to form an upsampled feature map. The depth of the upsampled feature map is the same as that of the original feature map, and its width and height are 2 times those of the original feature map. The purpose of upsampling is to detect small targets.
(3) The first dilated convolution layer 5, the second dilated convolution layer 7 and the third dilated convolution layer 9 respectively apply three 3×3 dilated convolutions to the downsampled feature map, the original feature map and the upsampled feature map. The dilation rates of the three 3×3 dilated convolutions are 1 (that is, a standard 3×3 convolution), 2 and 3, and each convolution kernel acts on only 1/3 of the channels of its layer. This generates a downsampled first scale map 11, a downsampled second scale map 12, a downsampled third scale map 13, an original first scale map 14, an original second scale map 15, an original third scale map 16, an upsampled first scale map 17, an upsampled second scale map 18 and an upsampled third scale map 19. The downsampled feature map, the original feature map and the upsampled feature map are thus each convolved with different receptive fields, which realizes the extraction of multi-scale features.
(4) The upsampling recovery layer 6 upsamples the downsampled first scale map 11, the downsampled second scale map 12 and the downsampled third scale map 13 respectively, and the downsampling recovery layer 10 downsamples the upsampled first scale map 17, the upsampled second scale map 18 and the upsampled third scale map 19 respectively. After recovery, the widths and heights of these six scale maps are consistent with those of the original first scale map 14, the original second scale map 15 and the original third scale map 16, which facilitates the merging and feature fusion of step (5).
(5) The merging layer 2 merges the three scale maps obtained by upsampling the downsampled first scale map 11, the downsampled second scale map 12 and the downsampled third scale map 13, the original first scale map 14, the original second scale map 15 and the original third scale map 16, and the three scale maps obtained by downsampling the upsampled first scale map 17, the upsampled second scale map 18 and the upsampled third scale map 19. The feature fusion layer 3 then performs 1×1 convolution fusion to form a scale feature extraction map, which completes the multi-scale feature extraction and keeps the same feature depth as the feature map to be extracted.
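Steps (2) to (4) can be sketched on a single channel in pure Python. Several simplifications are assumptions of this sketch and not the patent's implementation: nearest-neighbour upsampling stands in for the learned UP-CONV, the same channel is passed through all three dilation rates (whereas in the patent each kernel acts on a different third of the channels), zero padding preserves the spatial size, and all function names are hypothetical.

```python
def max_pool_2x2(plane):
    """2x2 max pooling with stride 2 on a [row][col] plane (even sizes)."""
    return [
        [max(plane[y][x], plane[y][x + 1],
             plane[y + 1][x], plane[y + 1][x + 1])
         for x in range(0, len(plane[0]), 2)]
        for y in range(0, len(plane), 2)
    ]


def upsample_nearest_2x(plane):
    """Nearest-neighbour 2x upsampling, a stand-in for the learned UP-CONV."""
    out = []
    for row in plane:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def dilated_conv3x3(plane, kernel, dilation):
    """3x3 dilated convolution, stride 1, zero padding keeps the size."""
    height, width = len(plane), len(plane[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0
            for ky in range(3):
                for kx in range(3):
                    # Kernel taps are spaced `dilation` pixels apart.
                    sy = y + (ky - 1) * dilation
                    sx = x + (kx - 1) * dilation
                    if 0 <= sy < height and 0 <= sx < width:
                        acc += kernel[ky][kx] * plane[sy][sx]
            out[y][x] = acc
    return out


def nine_scale_maps(plane, kernel):
    """Run the three branches and restore every scale map to the input size."""
    rates = (1, 2, 3)
    large = [upsample_nearest_2x(dilated_conv3x3(max_pool_2x2(plane), kernel, d))
             for d in rates]
    original = [dilated_conv3x3(plane, kernel, d) for d in rates]
    small = [max_pool_2x2(dilated_conv3x3(upsample_nearest_2x(plane), kernel, d))
             for d in rates]
    return large + original + small


# Hypothetical 8x8 input with a single non-zero pixel and an all-ones kernel.
impulse = [[0] * 8 for _ in range(8)]
impulse[4][4] = 1
ones_kernel = [[1] * 3 for _ in range(3)]
scale_maps = nine_scale_maps(impulse, ones_kernel)
```

All nine maps come back at the input resolution, so they can be concatenated along the channel axis and fused by the 1×1 convolution of step (5).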
The above embodiments do not limit the protection scope of the invention, and those skilled in the art can make equivalent modifications and variations without departing from the whole inventive concept, and they still fall within the scope of the invention.

Claims (1)

1. A feature extraction method of a multi-scale feature extraction network, characterized in that the multi-scale feature extraction network comprises a dimension reduction convolution layer (1), a scale feature extraction layer, a merging layer (2) and a feature fusion layer (3) which are connected in sequence, wherein the scale feature extraction layer comprises a large target detection branch, an original feature detection branch and a small target detection branch;
the large target detection branch comprises a downsampling feature layer (4), a first dilated convolution layer (5) and an upsampling recovery layer (6) which are connected in sequence; the original feature detection branch comprises a second dilated convolution layer (7); the small target detection branch comprises an upsampling feature layer (8), a third dilated convolution layer (9) and a downsampling recovery layer (10) which are connected in sequence; the first dilated convolution layer (5) comprises three dilated convolutions with 3×3 convolution kernels, whose dilation rates are 1, 2 and 3 respectively; the second dilated convolution layer (7) and the third dilated convolution layer (9) have the same structure as the first dilated convolution layer (5);
the convolution kernels of the dimension reduction convolution layer (1) and the feature fusion layer (3) are 1×1;
the feature extraction method of the multi-scale feature extraction network comprises the following steps:
firstly, inputting a feature map to be extracted into the dimension reduction convolution layer to carry out 1×1 convolution, and carrying out feature fusion and dimension reduction on the feature map to be extracted to form an original feature map;
secondly, the downsampling feature layer downsamples the original feature map to form a downsampled feature map; the upsampling feature layer upsamples the original feature map to form an upsampled feature map;
thirdly, the first to third dilated convolution layers respectively apply three 3×3 dilated convolutions to the downsampled feature map, the original feature map and the upsampled feature map to generate a downsampled first scale map, a downsampled second scale map, a downsampled third scale map, an original first scale map, an original second scale map, an original third scale map, an upsampled first scale map, an upsampled second scale map and an upsampled third scale map;
fourthly, the upsampling recovery layer upsamples the downsampled first scale map, the downsampled second scale map and the downsampled third scale map respectively, and the downsampling recovery layer downsamples the upsampled first scale map, the upsampled second scale map and the upsampled third scale map respectively;
and fifthly, the merging layer merges the three scale maps obtained by respectively upsampling the downsampled first scale map, the downsampled second scale map and the downsampled third scale map, the original first scale map, the original second scale map and the original third scale map, and the three scale maps obtained by respectively downsampling the upsampled first scale map, the upsampled second scale map and the upsampled third scale map, and the feature fusion layer then performs 1×1 convolution fusion to form a scale feature extraction map.
CN202011530198.6A 2020-12-22 2020-12-22 Feature extraction method of multi-scale feature extraction network Active CN112560732B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011530198.6A CN112560732B (en) 2020-12-22 2020-12-22 Feature extraction method of multi-scale feature extraction network

Publications (2)

Publication Number Publication Date
CN112560732A (en) 2021-03-26
CN112560732B (en) 2023-07-04

Family

ID=75031388

Country Status (1)

Country Link
CN (1) CN112560732B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115187820A (en) * 2021-04-06 2022-10-14 中国科学院深圳先进技术研究院 Light-weight target detection method, device, equipment and storage medium
CN113313668B (en) * 2021-04-19 2022-09-27 石家庄铁道大学 Subway tunnel surface disease feature extraction method
CN113378786B (en) * 2021-07-05 2023-09-19 广东省机场集团物流有限公司 Ultra-light target detection network and method

Citations (7)

Publication number Priority date Publication date Assignee Title
CN108460403A (en) * 2018-01-23 2018-08-28 上海交通大学 The object detection method and system of multi-scale feature fusion in a kind of image
CN111242036A (en) * 2020-01-14 2020-06-05 西安建筑科技大学 Crowd counting method based on encoding-decoding structure multi-scale convolutional neural network
CN111462127A (en) * 2020-04-20 2020-07-28 武汉大学 Real-time semantic segmentation method and system for automatic driving
CN111860693A (en) * 2020-07-31 2020-10-30 元神科技(杭州)有限公司 Lightweight visual target detection method and system
CN111898539A (en) * 2020-07-30 2020-11-06 国汽(北京)智能网联汽车研究院有限公司 Multi-target detection method, device, system, equipment and readable storage medium
CN111899191A (en) * 2020-07-21 2020-11-06 武汉工程大学 Text image restoration method and device and storage medium
CN111967524A (en) * 2020-08-20 2020-11-20 中国石油大学(华东) Multi-scale fusion feature enhancement algorithm based on Gaussian filter feedback and cavity convolution

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US11651206B2 (en) * 2018-06-27 2023-05-16 International Business Machines Corporation Multiscale feature representations for object recognition and detection

Non-Patent Citations (4)

Title
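AtICNet: semantic segmentation with atrous spatial pyramid pooling in image cascade network; Jin Chen et al.; EURASIP Journal on Wireless Communications and Networking; 1-7 *
Multi-target detection method for remote sensing images based on FD-SSD; 朱敏超 et al.; 《计算机应用与软件》; Vol. 36 (No. 1); 232-238 *
Indoor crowd detection network based on multi-level features and hybrid attention mechanism; 沈文祥 et al.; 《计算机应用》; Vol. 39 (No. 12); 3496-3502 *
Research on environment perception and control methods for driverless vehicles based on deep learning; 李健明; 《中国优秀硕士学位论文全文数据库 工程科技II辑》 (No. 01); C035-469 *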

Also Published As

Publication number Publication date
CN112560732A (en) 2021-03-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant