CN111738237A - Target detection method of multi-core iteration RPN based on heterogeneous convolution - Google Patents

Target detection method of multi-core iteration RPN based on heterogeneous convolution

Info

Publication number
CN111738237A
CN111738237A (application CN202010817648.3A)
Authority
CN
China
Prior art keywords
network
convolution
target
image data
heterogeneous
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010817648.3A
Other languages
Chinese (zh)
Other versions
CN111738237B (en)
Inventor
刘晋
尚圣杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Maritime University
Original Assignee
Shanghai Maritime University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Maritime University filed Critical Shanghai Maritime University
Publication of CN111738237A publication Critical patent/CN111738237A/en
Application granted granted Critical
Publication of CN111738237B publication Critical patent/CN111738237B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G — PHYSICS
      • G06 — COMPUTING; CALCULATING OR COUNTING
        • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
          • G06V 20/00 — Scenes; Scene-specific elements
          • G06V 2201/00 — Indexing scheme relating to image or video recognition or understanding
            • G06V 2201/07 — Target detection
        • G06F — ELECTRIC DIGITAL DATA PROCESSING
          • G06F 18/00 — Pattern recognition
            • G06F 18/20 — Analysing
              • G06F 18/24 — Classification techniques
                • G06F 18/241 — Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
                  • G06F 18/2415 — based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
        • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N 3/00 — Computing arrangements based on biological models
            • G06N 3/02 — Neural networks
              • G06N 3/04 — Architecture, e.g. interconnection topology
                • G06N 3/045 — Combinations of networks
                • G06N 3/047 — Probabilistic or stochastic networks
              • G06N 3/08 — Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Probability & Statistics with Applications (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a target detection method of a multi-core iterative RPN based on heterogeneous convolution, which comprises the following steps: receiving image data to be detected; carrying out graying, local binarization and data enhancement processing on the image data to obtain processed image data; inputting the processed image data into a heterogeneous convolution network for feature extraction to obtain a feature map; inputting the feature map into a multi-scale feature extraction network to extract features at different scales and obtain a target feature map; inputting the target feature map into an RIR target detection network to obtain a plurality of region candidate frames; obtaining a target score corresponding to each region candidate frame according to the non-maximum suppression function and obtaining region proposal windows according to a preset score threshold; and classifying the region proposal windows according to the fully convolutional network layer and the normalized exponential function (softmax) classifier to obtain a classification result, namely the image category and a confidence score. The embodiments of the invention effectively solve problems such as slow running speed and poor detection of small targets.

Description

Target detection method of multi-core iteration RPN based on heterogeneous convolution
Technical Field
The invention relates to the technical field of computer vision and image processing, and in particular to a target detection method using a multi-core iterative RPN based on heterogeneous convolution.
Background
Target detection is one of the most challenging tasks in computer vision and is widely applied in fields such as autonomous driving and security systems. In natural scenes, however, target detection is affected by factors such as illumination, the orientation of objects and object occlusion. Given the growing practical demand, improving target detection performance in natural scenes has become an urgent need.
Currently, target detection methods fall into two main categories: two-stage and one-stage detection networks. Two-stage target detection proceeds in two stages: (1) a region proposal network (RPN) first generates candidate regions for the image, and (2) a deep learning network then classifies the generated candidate regions. A one-stage network has only one stage, in which a deep learning network directly produces the category probabilities and position information of the objects. Consequently, two-stage networks are characterized by high detection accuracy but low detection speed, whereas one-stage networks are fast but less accurate.
The traditional two-stage network has a good detection effect on common targets and mainly comprises the following steps: (1) extract features of the target image with a feature extraction network such as a residual network (ResNet) or a convolutional neural network (CNN); (2) perform a preliminary detection on the target image with a region proposal network (RPN), which simply separates the foreground from the irrelevant background and generates candidate region frames for the targets; (3) classify the image targets with an image classification network according to the candidate region frames generated by the RPN, and output the final position and category of each target.
Aiming at the problems that target detection is insensitive to targets of different sizes, performs poorly on small targets and is time-consuming, a multi-core iterative RPN target detection network based on heterogeneous convolution is designed. In feature extraction, 1 × 1 convolution kernels are used in place of 3 × 3 convolution kernels, which reduces the computation and the network parameters while maintaining accuracy, greatly shortens the computation time of feature extraction and improves the detection speed. Following the Inception idea proposed by Google, a multi-scale feature extraction network is proposed: convolution kernels of different sizes attend to targets of different sizes in the image, so the multi-scale extraction network improves the detection accuracy for targets of different sizes. Based on the existing RPN mechanism, an iterative RPN-in-RPN (RIR) scheme is designed: on the basis of the region candidate frames generated by the first RPN layer, the second RPN layer screens the generated candidate frames more finely. This screening not only further improves the accuracy of the classification network but also further enhances the detection precision for small targets, thereby addressing the incomplete detection and long detection time of other methods.
The region proposal network (RPN) is a fully convolutional network that can detect targets at every position of the feature map and assign them target scores, generating high-quality region candidate frames. It is the auxiliary detection network introduced in the Faster R-CNN framework proposed by Ross B. Girshick et al. in 2016.
Inception, also known as GoogLeNet, is a CNN classification network model proposed by Google in 2014. Through convolution kernels of different sizes, the Inception network adapts to images of different scales, and the network is widened rather than deepened, which greatly reduces the number of parameters and the amount of computation.
Disclosure of Invention
The invention aims to provide a target detection method using a multi-core iterative RPN based on heterogeneous convolution, which overcomes the defects of the prior art and effectively solves problems in target detection such as insensitivity to targets of different sizes, slow running speed and poor detection of small targets.
In order to achieve the above object, the present invention provides a target detection method for multi-core iteration RPN based on heterogeneous convolution, including:
receiving image data to be detected;
carrying out graying, local binarization and data enhancement processing on the image data to obtain processed image data;
inputting the processed image data into a heterogeneous convolution network for feature extraction to obtain a feature map;
inputting the feature map into a pre-constructed multi-scale feature extraction network to realize feature extraction of different scales and obtain a target feature map;
inputting the target feature map into an RIR network to obtain a plurality of region candidate frames;
obtaining a target score corresponding to each region candidate frame according to the non-maximum suppression function; and screening the plurality of region candidate frames according to a preset score threshold to obtain region proposal windows;
and classifying the region proposal windows according to the fully convolutional network layer and the normalized exponential function (softmax) classifier to obtain a classification result, namely the image category and a confidence score.
Preferably, the step of carrying out graying, local binarization and data enhancement processing on the image data to obtain processed image data includes:
carrying out graying processing on the received image data;
carrying out local binarization processing on the image subjected to the graying processing to obtain a binarized image;
and carrying out noise addition, rotation and flipping on the binarized image by adopting a data enhancement algorithm to obtain the processed image data.
Preferably, the step of inputting the processed image data into a heterogeneous convolutional network for feature extraction to obtain a feature map includes:
constructing a heterogeneous convolutional network, wherein the heterogeneous convolutional network is formed by arranging and combining convolution kernels of sizes 3 × 3 and 1 × 1 in a heterogeneous-kernel pattern;
inputting the processed image data into the constructed heterogeneous convolution network, and extracting image features;
and carrying out convolution operation on the obtained image feature map through a convolution kernel of 1 × 1 to output the feature map with reduced dimensionality.
In one implementation manner, the step of inputting the feature map into a pre-constructed multi-scale feature extraction network to implement feature extraction of different scales and obtain a target feature map includes:
and inputting the feature map into a multi-scale feature extraction network, convolving the targets with different proportions in the image by adopting convolution kernels with three different sizes in the multi-scale feature extraction network, and generating corresponding target feature maps according to different sensitivity degrees of the convolution kernels with each size to the targets with different sizes.
Inputting the target feature map into a RIR network, and acquiring a plurality of region candidate frames, wherein the step comprises the following steps:
constructing an RIR network structure, wherein the RIR network structure is as follows: the two RPN layers form a network structure in a full connection mode;
inputting the target feature map into the RIR network, and generating a preset number n of region candidate frames with the first RPN layer according to the targets in the feature map;
and screening the generated n region candidate frames through the second layer RPN.
Unlike traditional approaches such as morphological processing for target detection, the heterogeneous-convolution-based multi-core iterative RPN target detection method provided by the embodiments of the invention uses 1 × 1 convolution kernels in place of 3 × 3 convolution kernels during feature extraction, reducing the computation and the network parameters while maintaining accuracy, greatly shortening the feature-extraction time and improving the detection speed. Following the Inception idea proposed by Google, a multi-scale feature extraction network is proposed in which convolution kernels of different sizes attend to targets of different sizes in the image, so the multi-scale extraction network improves the detection accuracy of the network. An iterative RPN-in-RPN (RIR) scheme is designed based on the existing RPN mechanism: on the basis of the region candidate frames generated by the first RPN layer, the second RPN layer screens the generated candidate frames more finely, which further improves the accuracy of the classification network and further enhances the detection precision for small targets. The method thus effectively addresses the poor detection quality and long running time of other methods as well as the insensitivity to targets of different sizes, slow running speed and poor small-target detection in target detection, and it has a wide application range and strong robustness.
Drawings
Fig. 1 is a schematic flowchart of a target detection method of multi-core iterative RPN based on heterogeneous convolution according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a heterogeneous convolutional network according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of a multi-scale feature extraction network according to an embodiment of the present invention.
Fig. 4 is a schematic diagram comparing the detection effect of the different multi-scale branches according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of a multi-core iterative RPN network architecture based on heterogeneous convolution according to an embodiment of the present invention.
Fig. 6 is a sample picture of a real-time example of a target detection network according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention.
Please refer to fig. 1-6. It should be noted that the drawings provided in the present embodiment are only for illustrating the basic idea of the present invention, and the components related to the present invention are only shown in the drawings rather than drawn according to the number, shape and size of the components in actual implementation, and the type, quantity and proportion of the components in actual implementation may be changed freely, and the layout of the components may be more complicated.
As shown in fig. 1, the present invention provides a target detection method for multi-core iterative RPN based on heterogeneous convolution, where the method includes:
s110, receiving image data to be detected;
s120, carrying out graying and local binarization data enhancement processing on the image data to obtain processed image data;
It can be understood that the received image data is converted to grayscale by traversing the pixels one by one and expressing each as a value of 0-255; the grayed image is then locally binarized, and transformations such as noise addition, rotation and flipping are applied through a data enhancement algorithm to enrich the original image data; finally, the processed images are resized to the input size required by the network.
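For illustration, a minimal preprocessing sketch (assuming a Python implementation with OpenCV and NumPy; the function name, output size and noise level below are illustrative assumptions and are not specified by the invention) may look as follows:

```python
import cv2
import numpy as np

def preprocess(image_bgr, out_size=(600, 600), noise_sigma=8.0):
    # Graying: map every pixel to a 0-255 intensity value.
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)

    # Local binarization: threshold each pixel against its neighborhood mean.
    binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY, 15, 5)

    # Data enhancement: noise addition, rotation and flipping enrich the data.
    noisy = np.clip(binary.astype(np.float32)
                    + np.random.normal(0, noise_sigma, binary.shape),
                    0, 255).astype(np.uint8)
    rotated = cv2.rotate(binary, cv2.ROTATE_90_CLOCKWISE)
    flipped = cv2.flip(binary, 1)  # horizontal flip

    # Resize every variant to the input size required by the network.
    return [cv2.resize(img, out_size) for img in (noisy, rotated, flipped)]
```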
S130, inputting the processed image data into a heterogeneous convolution network for feature extraction to obtain a feature map;
It should be noted that the heterogeneous convolutional network is constructed as follows: homogeneous convolution means that every layer of a convolution network, from the first layer to the last, uses convolution kernels of the same size; heterogeneous convolution instead combines convolution kernels of sizes 1 × 1 and 3 × 3 in a certain arrangement, where P is the number of kernels of different types in the convolution network and M is the depth of the network input. Fig. 2 shows a heterogeneous convolutional neural network with P = 4 and M = 16.
Further, the image data after graying and local binarization is fed into the constructed heterogeneous network; a single layer of 3 × 3 convolution kernels reduces the size of the image, and three subsequent layers of 1 × 1 convolution kernels learn only the image features, so the feature map is not shrunk further while the computational complexity is reduced. The output image matrix can be formally expressed as follows:
h_o = (h_i − h_k + 2p)/s + 1,  w_o = (w_i − w_k + 2p)/s + 1   (1)

where h_o, h_i and h_k are the heights of the convolved output image matrix, the input convolution-network image matrix and the convolution kernel, respectively, and w_o, w_i and w_k are the corresponding widths; p is the padding, which equals 0 in the heterogeneous convolution, and s is the stride with which the convolution kernel moves over the image, which is set to 2.
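As a sketch only, the heterogeneous stem described above could be written in PyTorch as follows (the channel widths and the input size are illustrative assumptions, not values fixed by the invention):

```python
import torch
import torch.nn as nn

class HeteroConvStem(nn.Module):
    """One 3x3 stride-2 convolution that shrinks the image, followed by three
    1x1 convolutions that learn features without changing the spatial size."""
    def __init__(self, in_ch=1, ch=64):
        super().__init__()
        self.reduce = nn.Conv2d(in_ch, ch, kernel_size=3, stride=2, padding=0)
        self.refine = nn.Sequential(
            nn.Conv2d(ch, ch, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, kernel_size=1), nn.ReLU(inplace=True))

    def forward(self, x):
        # With p = 0 and s = 2, the 3x3 layer maps h_i to (h_i - 3)/2 + 1 as in
        # formula (1); the 1x1 layers keep the spatial size unchanged.
        return self.refine(self.reduce(x))

# For example, a 600 x 600 single-channel input yields a 299 x 299 feature map:
feat = HeteroConvStem()(torch.randn(1, 1, 600, 600))  # shape [1, 64, 299, 299]
```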
S140, inputting the feature map into a pre-constructed multi-scale feature extraction network to realize feature extraction of different scales and obtain a target feature map;
It can be understood that the multi-scale feature extraction network is constructed with convolution kernels of sizes 1 × 1, 3 × 3 and 5 × 5: feature extraction is first performed on the image by heterogeneous convolution, the feature map of the image is output, and its dimensionality is reduced by a 1 × 1 convolution kernel. The 3 × 3 convolution kernels are replaced by two layers of 3 × 1 and 1 × 3 kernels, and similarly the 5 × 5 convolution kernels are replaced by 5 × 1 and 1 × 5 kernels. The constructed multi-scale feature extraction network is shown in fig. 3.
Further, feature extraction is performed on the dimension-reduced feature map by the constructed multi-scale feature extraction network. As shown in fig. 4, (a) is the original input picture, (b) is the final result when all three convolution branch sizes of the multi-scale feature extraction network are retained, and (c), (d) and (e) are the final results when the 1 × 1, 3 × 3 and 5 × 5 convolution branches are removed, respectively. The comparison shows that the 1 × 1 convolution kernel is sensitive to large, medium and small targets, the 3 × 3 kernel is sensitive to large and medium targets, and the 5 × 5 kernel is sensitive only to large targets and cannot detect smaller ones. The multi-scale feature extraction network can therefore extract features from image targets at multiple scales and obtain more accurate information.
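A minimal sketch of such an Inception-style multi-scale block (assuming PyTorch; the branch channel width is an illustrative assumption) is:

```python
import torch
import torch.nn as nn

class MultiScaleBlock(nn.Module):
    def __init__(self, in_ch, branch_ch=64):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, branch_ch, 1)        # 1x1 branch
        self.b3 = nn.Sequential(                        # 3x3 factorized as 3x1 + 1x3
            nn.Conv2d(in_ch, branch_ch, 1),
            nn.Conv2d(branch_ch, branch_ch, (3, 1), padding=(1, 0)),
            nn.Conv2d(branch_ch, branch_ch, (1, 3), padding=(0, 1)))
        self.b5 = nn.Sequential(                        # 5x5 factorized as 5x1 + 1x5
            nn.Conv2d(in_ch, branch_ch, 1),
            nn.Conv2d(branch_ch, branch_ch, (5, 1), padding=(2, 0)),
            nn.Conv2d(branch_ch, branch_ch, (1, 5), padding=(0, 2)))

    def forward(self, x):
        # Each branch is sensitive to a different object scale; concatenating
        # them yields the multi-scale target feature map.
        return torch.cat([self.b1(x), self.b3(x), self.b5(x)], dim=1)
```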
S150, inputting the target feature map into the RIR network to obtain a plurality of region candidate frames;
It should be noted that two RPN network layers are connected in sequence to form an RPN-in-RPN (RIR) network layer; its specific structure is shown in fig. 5.
Further, the feature maps of targets of different sizes extracted by the multi-scale network are fed into the constructed RIR network. The feature map is convolved by a 3 × 3 sliding window to obtain a feature map with 256 channels whose height H and width W are the same as those of the input feature map; it can approximately be regarded as H × W vectors, each 256-dimensional. Two fully connected operations are applied to each feature vector to obtain outputs of sizes 2 × H × W and 4 × H × W, representing respectively the foreground and background scores and the four coordinate values of the foreground. During the sliding-window convolution of the RPN, K region candidate frames of different sizes are generated at every pixel. Experiments show that the best results are achieved when the candidate frame sizes are set to 128 × 128, 256 × 256 and 512 × 512 with aspect ratios 1:1, 2:1 and 1:2, i.e., K = 9.
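The sliding-window stage of one RPN layer can be sketched as follows (assuming PyTorch; this follows the standard RPN head, producing for each of the K anchors two objectness scores and four box offsets at every position, which is an interpretation of the 2 × H × W and 4 × H × W outputs described above):

```python
import torch.nn as nn

class RPNHead(nn.Module):
    def __init__(self, in_ch, mid_ch=256, num_anchors=9):
        super().__init__()
        # 3x3 sliding-window convolution; padding 1 keeps the H x W size.
        self.slide = nn.Conv2d(in_ch, mid_ch, kernel_size=3, padding=1)
        # Per-position predictions over the 256-d vectors: 2 scores and
        # 4 coordinates for each of the K anchors.
        self.cls = nn.Conv2d(mid_ch, 2 * num_anchors, kernel_size=1)
        self.reg = nn.Conv2d(mid_ch, 4 * num_anchors, kernel_size=1)

    def forward(self, feat):
        h = self.slide(feat).relu()
        return self.cls(h), self.reg(h)  # (2K, H, W) scores, (4K, H, W) offsets
```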
S160, obtaining a target score corresponding to each region candidate frame according to the non-maximum suppression function; and screening the plurality of region candidate frames according to a preset score threshold to obtain region proposal windows;
It should be noted that the region candidate frames and scores generated by the RIR network are screened by the non-maximum suppression function. The scores of all candidate region frames are first ranked from high to low and the frame with the highest score is selected. The intersection-over-union (IoU) between the highest-scoring frame and every other frame is then computed; any frame whose IoU with it exceeds the set threshold is suppressed and only the highest-scoring frame is kept, while frames whose IoU is below the threshold are all retained. This is repeated until all region candidate frames have been compared.
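A plain-NumPy sketch of this non-maximum suppression step (the 0.7 IoU threshold is an illustrative value, not one fixed by the invention):

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.7):
    """boxes: (N, 4) as [x1, y1, x2, y2]; returns indices of kept boxes."""
    order = scores.argsort()[::-1]          # rank candidate frames by score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)                      # keep the current highest-scoring frame
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_o = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                 (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_o - inter)
        order = order[1:][iou <= iou_thresh]  # drop frames that overlap too much
    return keep
```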
Specifically, region candidate frames that meet the set threshold are selected by the non-maximum suppression function. After the first RPN layer, candidate frames and scores that roughly match the actual positions of targets in the image are selected; the feature maps inside these first-layer candidate frames are then fed into the next RPN layer, which performs a more accurate detection on each candidate region and assigns a corresponding score. Regions that better match the label positions are thus found, which reduces the influence of irrelevant or weakly relevant regions on detection and classification. The best-performing candidates are retained by comparison with the first-generated region candidate frames. The loss function used by the RIR network formed by the two RPN layers can be formally expressed as follows:
L({p_i}, {t_i}) = (1/N_cls) Σ_i L_cls(p_i, p_i*) + λ (1/N_reg) Σ_i p_i* L_reg(t_i, t_i*)   (2)

t_x = (x − x_box)/w_box,  t_y = (y − y_box)/h_box,  t_w = log(w/w_box),  t_h = log(h/h_box)   (3)

t_x* = (x* − x_box)/w_box,  t_y* = (y* − y_box)/h_box,  t_w* = log(w*/w_box),  t_h* = log(h*/h_box)   (4)

where x, y, w, h denote the center coordinates and the width and height of the region candidate frame detected in each RPN layer; x_box, y_box, w_box, h_box are the center coordinates and the width and height of the 9 region candidate frames (anchors) generated by the RPN; and x*, y*, w*, h* are the center coordinates and the width and height of the image label. N_reg and N_cls are, respectively, the normalization over the number of region candidate frames generated by the RPN network and the normalization over the dimension of the feature-map vector. The label p_i* = 1 when an anchor is marked as foreground and p_i* = 0 when it is marked as background, and p_i is the probability that the i-th region candidate frame contains an image target. λ is a balancing parameter; experiments show that the loss function gives the greatest positive-feedback effect when λ = 10.
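A sketch of the two-term loss in formulas (2)-(4) (assuming PyTorch; using cross-entropy and smooth-L1 for L_cls and L_reg and taking N_reg as the number of foreground anchors are common simplifications, not requirements of the invention):

```python
import torch
import torch.nn.functional as F

def rpn_loss(cls_logits, box_deltas, labels, target_deltas, lam=10.0):
    """cls_logits: (N, 2); box_deltas, target_deltas: (N, 4) in the t-parameterization;
    labels: (N,) long tensor, 1 for foreground anchors and 0 for background."""
    n_cls = labels.numel()
    n_reg = max(int(labels.sum().item()), 1)
    # Classification term: objectness cross-entropy averaged over all anchors.
    cls_term = F.cross_entropy(cls_logits, labels, reduction="sum") / n_cls
    # Regression term: smooth-L1 on box offsets, only for foreground (p_i* = 1).
    fg = labels == 1
    reg_term = F.smooth_l1_loss(box_deltas[fg], target_deltas[fg],
                                reduction="sum") / n_reg
    return cls_term + lam * reg_term
```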
S170, classifying the region proposal windows according to the fully convolutional network layer and the normalized exponential function (softmax) classifier to obtain the classification result, namely the image category and a confidence score.
It can be understood that the region proposal windows obtained on the image are each fed into a normalized exponential function (softmax) classifier, which classifies the target regions according to the set object categories and the learned features. The detection network receives forward feedback through the Focal Loss function. Focal Loss is an improvement of the cross-entropy function: by reducing the weight of easily classified samples and of the far more numerous background samples, the model concentrates on samples that are hard to classify. The loss function can be formally expressed as:
L(p_i) = −β_i (1 − p_i)^γ log(p_i)   (5)
where β_i and γ are preset loss parameters; in the experiments, the model trained with γ = 2 and β_i = 0.25 performed best, and p_i is the probability that the i-th target is detected as a given class. The accuracy of the detection and classification network is continuously improved through the Focal Loss mechanism, and the final image category and confidence score are output. Fig. 5 is a schematic diagram of the multi-core iterative RPN structure based on heterogeneous convolution; the network finally obtains the category and confidence score of the objects in the image. Fig. 6 shows a sample result of an embodiment of the present invention.
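A minimal sketch of the focal loss in formula (5) (assuming PyTorch; p holds the predicted probability of the ground-truth class for each proposal):

```python
import torch

def focal_loss(p, gamma=2.0, beta=0.25):
    # (1 - p)^gamma down-weights easily classified samples so that training
    # concentrates on hard examples; beta further down-weights background.
    return (-beta * (1.0 - p).pow(gamma) * torch.log(p.clamp_min(1e-8))).mean()
```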
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art can modify or change the above-mentioned embodiments without departing from the spirit and scope of the present invention. Accordingly, all equivalent modifications or changes made by those skilled in the art without departing from the spirit and technical ideas of the present invention shall be covered by the claims of the present invention.

Claims (5)

1. A target detection method of multi-core iteration RPN based on heterogeneous convolution is characterized by comprising the following steps:
receiving image data to be detected;
carrying out graying, local binarization and data enhancement processing on the image data to obtain processed image data;
inputting the processed image data into a heterogeneous convolution network for feature extraction to obtain a feature map;
inputting the feature map into a pre-constructed multi-scale feature extraction network to realize feature extraction of different scales and obtain a target feature map;
inputting the target feature map into an RIR network to obtain a plurality of region candidate frames;
obtaining a target score corresponding to each region candidate frame according to the non-maximum suppression function; and screening the plurality of region candidate frames according to a preset score threshold to obtain region proposal windows;
and classifying the region proposal windows according to the fully convolutional network layer and the normalized exponential function (softmax) classifier to obtain a classification result, namely the image category and a confidence score.
2. The target detection method of multi-core iterative RPN based on heterogeneous convolution according to claim 1, wherein the step of carrying out graying, local binarization and data enhancement processing on the image data to obtain processed image data includes:
carrying out graying processing on the received image data;
carrying out local binarization processing on the image subjected to the graying processing to obtain a binarized image;
and carrying out noise addition, rotation and flipping on the binarized image by adopting a data enhancement algorithm to obtain the processed image data.
3. The target detection method of multi-core iterative RPN based on heterogeneous convolution according to claim 1, wherein the step of inputting the processed image data into a heterogeneous convolution network for feature extraction to obtain a feature map includes:
constructing a heterogeneous convolutional network, wherein the heterogeneous convolutional network is formed by arranging and combining convolution kernels of sizes 3 × 3 and 1 × 1 in a heterogeneous-kernel pattern;
inputting the processed image data into the constructed heterogeneous convolution network, and extracting image features;
and carrying out convolution operation on the obtained image feature map through a convolution kernel with the size of 1 × 1 to output the feature map with reduced dimensionality.
4. The target detection method of multi-core iterative RPN based on heterogeneous convolution according to claim 1, wherein the step of transmitting the feature map into a pre-constructed multi-scale feature extraction network to realize feature extraction of different scales and obtain a target feature map includes:
and inputting the feature map into a multi-scale feature extraction network, convolving the targets with different proportions in the image by adopting convolution kernels with three different sizes in the multi-scale feature extraction network, and generating corresponding target feature maps according to different sensitivity degrees of the convolution kernels with each size to the targets with different sizes.
5. The target detection method of multi-core iterative RPN based on heterogeneous convolution according to claim 1, wherein the step of inputting the target feature map into an RIR network to obtain a plurality of region candidate boxes includes:
constructing an RIR network structure, wherein the RIR network structure is as follows: the two RPN layers form a network structure in a full connection mode;
inputting the target feature map into the RIR network, and generating a preset number n of region candidate frames with the first RPN layer according to the targets in the feature map;
and screening the generated n region candidate frames through the second layer RPN.
CN202010817648.3A 2020-04-29 2020-08-14 Heterogeneous convolution-based target detection method for multi-core iteration RPN Active CN111738237B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010357545.3A CN111563440A (en) 2020-04-29 2020-04-29 Target detection method of multi-core iteration RPN based on heterogeneous convolution
CN2020103575453 2020-04-29

Publications (2)

Publication Number Publication Date
CN111738237A true CN111738237A (en) 2020-10-02
CN111738237B CN111738237B (en) 2024-03-15

Family

ID=72071825

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202010357545.3A Pending CN111563440A (en) 2020-04-29 2020-04-29 Target detection method of multi-core iteration RPN based on heterogeneous convolution
CN202010817648.3A Active CN111738237B (en) 2020-04-29 2020-08-14 Heterogeneous convolution-based target detection method for multi-core iteration RPN

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202010357545.3A Pending CN111563440A (en) 2020-04-29 2020-04-29 Target detection method of multi-core iteration RPN based on heterogeneous convolution

Country Status (1)

Country Link
CN (2) CN111563440A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112949498A (en) * 2021-03-04 2021-06-11 北京联合大学 Target key point detection method based on heterogeneous convolutional neural network

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113780254A (en) * 2021-11-12 2021-12-10 阿里巴巴达摩院(杭州)科技有限公司 Picture processing method and device, electronic equipment and computer storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109584248A (en) * 2018-11-20 2019-04-05 西安电子科技大学 Infrared surface object instance dividing method based on Fusion Features and dense connection network
US10282834B1 (en) * 2018-06-22 2019-05-07 Caterpillar Inc. Measurement platform that automatically determines wear of machine components based on images
WO2019144575A1 (en) * 2018-01-24 2019-08-01 中山大学 Fast pedestrian detection method and device
CN110210463A (en) * 2019-07-03 2019-09-06 中国人民解放军海军航空大学 Radar target image detecting method based on Precise ROI-Faster R-CNN

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019144575A1 (en) * 2018-01-24 2019-08-01 中山大学 Fast pedestrian detection method and device
US10282834B1 (en) * 2018-06-22 2019-05-07 Caterpillar Inc. Measurement platform that automatically determines wear of machine components based on images
CN109584248A (en) * 2018-11-20 2019-04-05 西安电子科技大学 Infrared surface object instance dividing method based on Fusion Features and dense connection network
CN110210463A (en) * 2019-07-03 2019-09-06 中国人民解放军海军航空大学 Radar target image detecting method based on Precise ROI-Faster R-CNN

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
杨娟; 曹浩宇; 汪荣贵; 薛丽霞; 胡敏: "Fine-grained vehicle type recognition with a region proposal network" (区域建议网络的细粒度车型识别), Journal of Image and Graphics (中国图象图形学报), no. 06
瑚敏君; 冯德俊; 李强: "Automatic building extraction based on an instance segmentation model" (基于实例分割模型的建筑物自动提取), Bulletin of Surveying and Mapping (测绘通报), no. 04

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112949498A (en) * 2021-03-04 2021-06-11 北京联合大学 Target key point detection method based on heterogeneous convolutional neural network
CN112949498B (en) * 2021-03-04 2023-11-14 北京联合大学 Target key point detection method based on heterogeneous convolutional neural network

Also Published As

Publication number Publication date
CN111738237B (en) 2024-03-15
CN111563440A (en) 2020-08-21

Similar Documents

Publication Publication Date Title
CN108304873B (en) Target detection method and system based on high-resolution optical satellite remote sensing image
Khodabandeh et al. A robust learning approach to domain adaptive object detection
CN109584248B (en) Infrared target instance segmentation method based on feature fusion and dense connection network
KR102516360B1 (en) A method and apparatus for detecting a target
CN109684922B (en) Multi-model finished dish identification method based on convolutional neural network
Bascón et al. An optimization on pictogram identification for the road-sign recognition task using SVMs
US8989442B2 (en) Robust feature fusion for multi-view object tracking
Xu et al. Learning-based shadow recognition and removal from monochromatic natural images
US8433101B2 (en) System and method for waving detection based on object trajectory
Zhang et al. Long-range terrain perception using convolutional neural networks
Hong et al. Tracking using multilevel quantizations
Kang et al. Deep learning-based weather image recognition
CN110322445B (en) Semantic segmentation method based on maximum prediction and inter-label correlation loss function
CN110569782A (en) Target detection method based on deep learning
CN109242032B (en) Target detection method based on deep learning
CN112733614B (en) Pest image detection method with similar size enhanced identification
CN114037674B (en) Industrial defect image segmentation detection method and device based on semantic context
CN111738237B (en) Heterogeneous convolution-based target detection method for multi-core iteration RPN
Huo et al. Semisupervised learning based on a novel iterative optimization model for saliency detection
CN115661777A (en) Semantic-combined foggy road target detection algorithm
Zhang et al. Weighted smallest deformation similarity for NN-based template matching
Pham et al. Biseg: Simultaneous instance segmentation and semantic segmentation with fully convolutional networks
Lee et al. License plate detection via information maximization
Zhang et al. Spatial contextual superpixel model for natural roadside vegetation classification
Maldonado-Ramírez et al. Robotic visual tracking of relevant cues in underwater environments with poor visibility conditions

Legal Events

Code — Description
PB01 — Publication
SE01 — Entry into force of request for substantive examination
GR01 — Patent grant