CN111931767B - Multi-model target detection method, device and system based on picture informativeness and storage medium - Google Patents


Publication number
CN111931767B
CN111931767B (application CN202010776488.2A)
Authority
CN
China
Prior art keywords
network
target detection
networks
informativeness
information degree
Prior art date
Legal status
Active
Application number
CN202010776488.2A
Other languages
Chinese (zh)
Other versions
CN111931767A (en)
Inventor
孙亚杰
张颖
吴雨瑶
吴爱国
Current Assignee
Shenzhen Graduate School Harbin Institute of Technology
Original Assignee
Shenzhen Graduate School Harbin Institute of Technology
Priority date
Filing date
Publication date
Application filed by Shenzhen Graduate School Harbin Institute of Technology filed Critical Shenzhen Graduate School Harbin Institute of Technology
Priority to CN202010776488.2A
Publication of CN111931767A
Application granted
Publication of CN111931767B
Legal status: Active
Anticipated expiration


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks


Abstract

The invention provides a multi-model target detection method, device and system based on picture informativeness, and a storage medium. The multi-model target detection method comprises the following steps. First step: selecting a plurality of target detection networks as candidate target detection networks, and designing an informativeness additional network for each candidate detection network according to the number of layers of the target detection network. Second step: jointly training the target detection networks and the informativeness additional networks, and designing the loss function of the informativeness additional network and the training strategy of the whole network, wherein the loss function of the target detection network is determined by the target detection method. Third step: performing scale normalization on the output values of the informativeness additional networks, and selecting a target detection network according to the outputs of the informativeness additional networks. The beneficial effect of the invention is that the multi-model target detection method can combine the strengths of different detection models on different picture characteristics to comprehensively improve detection accuracy.

Description

Multi-model target detection method, device and system based on picture informativeness and storage medium
Technical Field
The invention relates to the field of computer vision and industrial detection, in particular to a multi-model target detection method, device and system based on picture informativeness and a storage medium.
Background
Target detection is a classical task in computer vision: given a picture, find the position of each object and frame it with a minimum bounding rectangle; some tasks also output the class of the object. For example, in workpiece defect detection, the defective part must be framed by a rectangular box and the type of defect must be specified.
Before deep learning was successfully applied to target detection, conventional image processing relied on edge detection, such as line detection with the Laplacian and gradient methods, and detection of standard shapes with the Hough transform. After the edge features are obtained, the leftmost, rightmost, uppermost and lowermost points of the feature can be found from the shape of the edge, and the minimum bounding rectangle can be derived from the coordinates of these four points. The local picture features within the rectangle are then sent to a classifier. The classifier and the minimum-bounding-rectangle search can be treated as two independent tasks: the classifier's task is to correctly classify the picture according to the extracted features, where the object classes form a finite set encoded in One-hot form.
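As a minimal illustration of this classical pipeline, the minimum axis-aligned bounding rectangle can be recovered from detected edge points by taking the extreme coordinates (a NumPy sketch; the edge detector is assumed to have already run, and the sample points are hypothetical):

```python
import numpy as np

def min_bounding_rect(edge_points: np.ndarray):
    """Axis-aligned minimum bounding rectangle from (N, 2) edge points.

    Returns (x_min, y_min, width, height), the region a downstream
    classifier crop would use.
    """
    xs, ys = edge_points[:, 0], edge_points[:, 1]
    x_min, x_max = xs.min(), xs.max()   # leftmost / rightmost points
    y_min, y_max = ys.min(), ys.max()   # uppermost / lowermost points
    return int(x_min), int(y_min), int(x_max - x_min), int(y_max - y_min)

# Edge points of a hypothetical defect region
pts = np.array([[12, 30], [45, 18], [40, 52], [20, 47]])
print(min_bounding_rect(pts))  # (12, 18, 33, 34)
```

The crop defined by this rectangle is what the classical pipeline would hand to the classifier as an independent second stage.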
After deep learning was successfully applied to target detection, the advent of convolutional neural networks removed the burden of hand-crafted feature extraction: the network learns to extract the desired features by adjusting a large number of parameters. In the field of target detection, the most common approach operates on the feature map extracted by a convolutional neural network. For example, in the method adopted by Faster R-CNN, a region proposal network finds sub-regions of the feature map that may contain an object, each sub-region is sent into the network as an image to be classified, and the image is classified (an SVM classifier in the earlier R-CNN formulation).
Existing deep-learning-based target detection methods fall into two types, one-stage and two-stage. Two-stage detection methods are mainly Fast R-CNN and its improved versions, which split detection into two parts: a region-proposal extraction network and a regression network. One-stage methods are detection methods in the style of YOLO and CornerNet, which complete the classification and regression tasks in a single network and therefore have very high detection speed.
Each detection method performs well on certain object sizes and categories and poorly on others.
Disclosure of Invention
The invention provides a multi-model target detection method based on picture informativeness, which comprises the following steps:
First step: selecting a plurality of target detection networks as candidate target detection networks, and designing an informativeness additional network for each candidate detection network according to the number of layers of the target detection network;
Second step: jointly training the target detection networks and the informativeness additional networks, and designing the loss function of the informativeness additional network and the training strategy of the whole network, wherein the loss function of the target detection network is determined by the target detection method;
Third step: performing scale normalization on the output values of the informativeness additional networks, and selecting a target detection network according to the outputs of the informativeness additional networks.
As a further improvement of the invention, the first step specifically comprises the following steps:
Step 1: selecting the five target detection networks Faster R-CNN, CenterNet, CornerNet-V2, YOLO V3 and SSD as candidate target detection networks;
Step 2: designing an informativeness additional network for each of the five networks in step 1, wherein Faster R-CNN selects 4 feature layers to construct its informativeness additional network, CenterNet, CornerNet-V2 and SSD each select 3 feature layers, and YOLO V3 selects 5 feature layers;
Step 3: for each feature layer extracted in step 2, performing a convolution operation and then global average pooling to convert the feature map into a feature vector; the vectors obtained from all feature layers are spliced into one feature vector, which is passed through two fully connected layers and mapped to an informativeness value.
As a further improvement of the present invention, the second step specifically further comprises:
Each candidate target detection model and its informativeness additional network need to be jointly trained on the same data set, or pretrained on a large data set and then fine-tuned on a small data set. The loss function used for training is:

$$\mathrm{Loss}=\sum_{b=1}^{B}\mathrm{Loss}_{detection}+\lambda\sum_{(i,j)}L_{1}$$

wherein B is the batch size, $\mathrm{Loss}_{detection}$ refers to the loss of the target detection network, the second sum runs over the picture pairs (i, j) in the batch, and $L_{1}$ has the following form:

$$L_{1}=\max\left(0,\;-\operatorname{sign}\left(l_{i}-l_{j}\right)\left(I_{i}-I_{j}\right)+\xi\right)$$

wherein $l_{i}$ and $l_{j}$ are the target detection network losses of the two pictures i and j that make up a picture pair, $I_{i}$ and $I_{j}$ are the informativeness values of pictures i and j, $\lambda$ is the weight of the informativeness additional network's loss function in the total loss and is set to 0.1 during training, and $\xi$ is the threshold of the informativeness additional network, set to 0.5 during training. When training reaches half of the total epochs, the weights of the informativeness additional network are fixed and only the target detection network is trained.
As a further improvement of the present invention, the third step further comprises:
Unifying the picture informativeness scale: the output of the informativeness additional network is scale-normalized, with the following normalization formula:

$$\tilde{I}=\frac{I-I_{min}}{I_{max}-I_{min}}$$

wherein I is the informativeness value output by the informativeness additional network of a detection network, and $I_{min}$ and $I_{max}$ are the minimum and maximum informativeness values of that detection network; the converted informativeness value lies between 0 and 1.
The invention also discloses a multi-model target detection device based on picture informativeness, which comprises: an informativeness additional network construction unit, for selecting a plurality of target detection networks as candidate target detection networks and designing an informativeness additional network for each candidate detection network according to the number of layers of the target detection network; a training unit, for jointly training the target detection networks and the informativeness additional networks, and designing the loss function of the informativeness additional network and the training strategy of the whole network, wherein the loss function of the target detection network is determined by the target detection method;
a unifying unit, for performing scale normalization on the output values of the informativeness additional networks and selecting a target detection network according to the outputs of the informativeness additional networks.
As a further improvement of the present invention, the informativeness additional network construction unit specifically further includes:
a candidate target detection module, for selecting the five target detection networks Faster R-CNN, CenterNet, CornerNet-V2, YOLO V3 and SSD as candidate target detection networks;
an informativeness additional network design module, for designing an informativeness additional network for each of the five networks of the candidate target detection module, wherein Faster R-CNN selects 4 feature layers to construct its informativeness additional network, CenterNet, CornerNet-V2 and SSD each select 3 feature layers, and YOLO V3 selects 5 feature layers; and a feature layer processing module, for extracting from the original target detection network the feature layers used to construct the informativeness network, performing a convolution operation and then global average pooling to convert the feature maps into feature vectors, splicing the vectors obtained from all feature layers into one feature vector, and mapping it through two fully connected layers to an informativeness value.
As a further improvement of the present invention, the training unit specifically further comprises:
Each candidate target detection model and its informativeness additional network should be jointly trained on the same data set, or pretrained on a large data set, such as ImageNet, and then fine-tuned on a small data set. The loss function used for training is:

$$\mathrm{Loss}=\sum_{b=1}^{B}\mathrm{Loss}_{detection}+\lambda\sum_{(i,j)}L_{1}$$

wherein B is the batch size, $\mathrm{Loss}_{detection}$ refers to the loss of the target detection network, the second sum runs over the picture pairs (i, j) in the batch, and $L_{1}$ has the following form:

$$L_{1}=\max\left(0,\;-\operatorname{sign}\left(l_{i}-l_{j}\right)\left(I_{i}-I_{j}\right)+\xi\right)$$

wherein $l_{i}$ and $l_{j}$ are the target detection network losses of the two pictures i and j that make up a picture pair, $I_{i}$ and $I_{j}$ are the informativeness values of pictures i and j, $\lambda$ is the weight of the informativeness additional network's loss function in the total loss and is set to 0.1 during training, and $\xi$ is the threshold of the informativeness additional network, set to 0.5 during training. When training reaches half of the total epochs, the weights of the informativeness additional network are fixed and only the target detection network is trained.
As a further improvement of the present invention, the unifying unit further comprises:
a picture informativeness scale unification module, wherein the formula for scale normalization of the output of the informativeness additional network is:

$$\tilde{I}=\frac{I-I_{min}}{I_{max}-I_{min}}$$

wherein I is the informativeness value output by the informativeness additional network of a detection network, and $I_{min}$ and $I_{max}$ are the minimum and maximum informativeness values of that detection network; the converted informativeness value lies between 0 and 1.
The invention also discloses a multi-model target detection system based on picture informativeness, comprising: a memory, a processor and a computer program stored on the memory, the computer program being configured to implement the steps of the multi-model target detection method of the invention when called by the processor.
The invention also discloses a computer readable storage medium storing a computer program configured to implement the steps of the multi-model object detection method of the invention when invoked by a processor.
The beneficial effects of the invention are as follows: 1. the multi-model target detection method can combine the strengths of different detection models on different picture characteristics to comprehensively improve detection accuracy; 2. the multi-model target detection method can be applied to industrial defect detection and intelligent security, offering high accuracy with only a small sacrifice in speed.
Drawings
FIG. 1 is a basic flow chart of the multi-model object detection method of the present invention;
fig. 2 is a diagram of an information degree additional network structure of the present invention.
Detailed Description
As shown in fig. 1, the invention discloses a multi-model target detection method based on picture informativeness, which comprises the following steps:
First step: selecting a plurality of target detection networks as candidate target detection networks, and designing an informativeness additional network for each candidate detection network according to the number of layers of the target detection network;
Second step: jointly training the target detection networks and the informativeness additional networks, and designing the loss function of the informativeness additional network and the training strategy of the whole network, wherein the loss function of the target detection network is determined by the target detection method;
Third step: performing scale normalization on the output values of the informativeness additional networks, and selecting a target detection network according to the outputs of the informativeness additional networks. The optimal detection network is selected according to the picture informativeness value output by each detection network: the smaller the informativeness value, the more familiar the detection network is with the picture, and the better the detection effect.
In the first step, the method specifically comprises the following steps:
Step 1: selecting the five target detection networks Faster R-CNN, CenterNet, CornerNet-V2, YOLO V3 and SSD as candidate target detection networks;
Step 2: designing an informativeness additional network for each of the five networks in step 1, wherein Faster R-CNN selects 4 feature layers to construct its informativeness additional network, CenterNet, CornerNet-V2 and SSD each select 3 feature layers, and YOLO V3 selects 5 feature layers;
Step 3: for each feature layer extracted in step 2, performing a convolution operation and then global average pooling to convert the feature map into a feature vector; the vectors obtained from all feature layers are spliced into one feature vector, which is passed through two fully connected layers and mapped to an informativeness value.
As shown in fig. 2, feature layers are taken from the original detection network; the number of layers taken is related to the total depth of the detection network, and the deeper the network, the more feature layers can be taken to form the informativeness additional network. For ResNet-18 (ResNet is the residual network), 3 layers are taken to form the informativeness additional network; for ResNet-50, 5 layers are taken. To ensure that enough features are acquired, no fewer than 3 feature layers are extracted, and to keep the extra computational cost small, no more than 8 are extracted. Each extracted feature layer undergoes two convolution operations, with a BN (batch normalization) layer between the two convolutions, followed by global pooling to convert it into a one-dimensional vector. The one-dimensional vectors are spliced together and passed through a linear activation unit and then two fully connected layers, with dropout between them; the last fully connected layer maps the feature vector to an informativeness value.
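The structure just described can be sketched in NumPy as follows. This is a simplified illustration, not the patent's trained network: the two convolution stages and BN are collapsed into a 1x1 channel-mixing step, dropout is omitted, and all weights are random placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)

def informativeness_head(feature_maps, hidden=16):
    """Map a list of feature maps (C, H, W) to a scalar informativeness value.

    Per layer: 1x1 channel mix (stand-in for the two conv + BN stages),
    ReLU, then global average pooling to a vector; the vectors from all
    layers are spliced and passed through two fully connected layers.
    """
    vectors = []
    for fmap in feature_maps:
        c = fmap.shape[0]
        w = rng.standard_normal((c, c)) * 0.1        # 1x1 conv as channel mixing
        mixed = np.einsum('oc,chw->ohw', w, fmap)
        mixed = np.maximum(mixed, 0.0)               # ReLU activation
        vectors.append(mixed.mean(axis=(1, 2)))      # global average pooling
    v = np.concatenate(vectors)                      # splice into one feature vector
    w1 = rng.standard_normal((hidden, v.size)) * 0.1 # first fully connected layer
    w2 = rng.standard_normal(hidden) * 0.1           # second FC layer -> scalar
    return float(w2 @ np.maximum(w1 @ v, 0.0))

# Three feature layers of decreasing resolution, as for CenterNet / CornerNet-V2 / SSD
maps = [rng.standard_normal((8, 32, 32)),
        rng.standard_normal((16, 16, 16)),
        rng.standard_normal((32, 8, 8))]
print(informativeness_head(maps))  # a single scalar informativeness value
```

The same head shape works for any number of tapped layers (3 to 8 in the text), since the concatenated vector length simply grows with the number of layers.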
In the second step, the method specifically further includes:
Each candidate target detection model and its informativeness additional network should be jointly trained on the same data set, or pretrained on a large data set, such as ImageNet, and then fine-tuned on a small data set. The loss function used for training is:

$$\mathrm{Loss}=\sum_{b=1}^{B}\mathrm{Loss}_{detection}+\lambda\sum_{(i,j)}L_{1}$$

wherein B is the batch size (the number of pictures in a batch), $\mathrm{Loss}_{detection}$ refers to the loss of the target detection network, the second sum runs over the picture pairs (i, j) in the batch, and $L_{1}$ has the following form:

$$L_{1}=\max\left(0,\;-\operatorname{sign}\left(l_{i}-l_{j}\right)\left(I_{i}-I_{j}\right)+\xi\right)$$

wherein $l_{i}$ and $l_{j}$ are the target detection network losses of the two pictures i and j that make up a picture pair, $I_{i}$ and $I_{j}$ are the informativeness values of pictures i and j, $\lambda$ is the weight of the informativeness additional network's loss function in the total loss and is set to 0.1 during training, $\xi$ is the threshold of the informativeness additional network, set to 0.5 during training, and epochs is the number of training rounds. When training reaches half of the total epochs, the weights of the informativeness additional network are fixed and only the target detection network is trained.
The loss function is designed for the informativeness additional network and trained together with the loss function of the target detection network. The loss function of the informativeness additional network uses the loss value of the target detection network and is designed according to that loss. The target detection loss and the informativeness additional network loss are added together for joint training, with the weight of the informativeness loss set to 0.1.
During training, the original detection network and the informativeness additional network are trained together. To prevent the predicted informativeness values from becoming almost identical because the additional network grows too robust, the number of epochs for the informativeness additional network is kept lower than that of the original detection network, generally half, so that the informativeness values of individual pictures remain effectively distinguishable. The batch size should not be set too large, and should be an even number so that the batch can be divided into picture pairs.
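Assuming the pairwise form described in the text (an even batch split into picture pairs, threshold ξ = 0.5, weight λ = 0.1), the joint loss can be illustrated as follows; the detection losses and informativeness values below are placeholder numbers, not outputs of a real detector:

```python
def pair_loss(l_i, l_j, I_i, I_j, xi=0.5):
    """Margin ranking loss for one picture pair: the picture with the larger
    detection loss should also receive the larger informativeness value."""
    sign = 1.0 if l_i > l_j else -1.0
    return max(0.0, -sign * (I_i - I_j) + xi)

def joint_loss(det_losses, info_values, lam=0.1, xi=0.5):
    """Total loss over an even batch: detection loss plus weighted pair losses."""
    assert len(det_losses) % 2 == 0, "batch size must be even to form pairs"
    total = sum(det_losses)
    for k in range(0, len(det_losses), 2):       # consecutive pictures form a pair
        total += lam * pair_loss(det_losses[k], det_losses[k + 1],
                                 info_values[k], info_values[k + 1], xi)
    return total

det = [2.0, 1.0, 0.5, 3.0]    # placeholder detection losses for a batch of 4
info = [1.5, 0.2, 0.1, 2.0]   # placeholder informativeness predictions
print(joint_loss(det, info))  # 6.5 (both pairs already ranked correctly)
```

When a pair is ranked correctly by a margin of at least ξ, its contribution is zero; only misordered pairs push gradient into the informativeness additional network.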
In the third step, the method further comprises the following steps:
Unifying the picture informativeness scale: because different target detection networks adopt different loss function forms, the informativeness values of the same picture under different detection networks are not directly comparable, so the output scales of all the informativeness additional networks are unified. The normalization formula is:

$$\tilde{I}=\frac{I-I_{min}}{I_{max}-I_{min}}$$

wherein I is the informativeness value output by the informativeness additional network of a detection network, and $I_{min}$ and $I_{max}$ are the minimum and maximum informativeness values of that detection network; the converted informativeness value lies between 0 and 1.
Selection of the detection result: a plurality of detection networks are selected as candidate target detection networks, and an informativeness additional network is designed independently for each, so that each detection network and its informativeness additional network are trained on the same data set with the same number of epochs. When a picture is detected, each informativeness additional network outputs an informativeness value for it; these values are compared across the detection networks, and the result of the network that understands the picture content most deeply, i.e. the one with the smallest informativeness value, is selected as the final detection result.
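The min-max scale unification and the final selection step can be combined in a short sketch; the per-network (I_min, I_max) ranges and raw informativeness outputs below are illustrative placeholders, not measured values:

```python
def normalize(I, I_min, I_max):
    """Min-max scale an informativeness value into [0, 1]."""
    return (I - I_min) / (I_max - I_min)

def select_network(raw_info, ranges):
    """Pick the detection network with the smallest normalized informativeness.

    raw_info: {name: informativeness output for the current picture}
    ranges:   {name: (I_min, I_max) observed for that network}
    """
    scores = {name: normalize(v, *ranges[name]) for name, v in raw_info.items()}
    return min(scores, key=scores.get), scores

ranges = {'Faster R-CNN': (0.0, 10.0), 'YOLO V3': (2.0, 6.0), 'SSD': (1.0, 5.0)}
raw = {'Faster R-CNN': 4.0, 'YOLO V3': 3.0, 'SSD': 4.0}
best, scores = select_network(raw, ranges)
print(best)  # YOLO V3
```

Note that without normalization the raw values 4.0, 3.0 and 4.0 would still have favored YOLO V3 here, but in general the per-network loss scales differ, which is exactly why the unification step precedes the comparison.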
The invention also discloses a multi-model target detection device based on picture informativeness, which comprises: an informativeness additional network construction unit, for selecting a plurality of target detection networks as candidate target detection networks and designing an informativeness additional network for each candidate detection network according to the number of layers of the target detection network; a training unit, for jointly training the target detection networks and the informativeness additional networks, and designing the loss function of the informativeness additional network and the training strategy of the whole network, wherein the loss function of the target detection network is determined by the target detection method;
a unifying unit, for performing scale normalization on the output values of the informativeness additional networks and selecting a target detection network according to the outputs of the informativeness additional networks.
The informativeness additional network construction unit further specifically comprises:
a candidate target detection module, for selecting the five target detection networks Faster R-CNN, CenterNet, CornerNet-V2, YOLO V3 and SSD as candidate target detection networks;
an informativeness additional network design module, for designing an informativeness additional network for each of the five networks of the candidate target detection module, wherein Faster R-CNN selects 4 feature layers to construct its informativeness additional network, CenterNet, CornerNet-V2 and SSD each select 3 feature layers, and YOLO V3 selects 5 feature layers; and a feature layer processing module, for extracting from the original target detection network the feature layers used to construct the informativeness network, performing a convolution operation and then global average pooling to convert the feature maps into feature vectors, splicing the vectors obtained from all feature layers into one feature vector, and mapping it through two fully connected layers to an informativeness value.
The training unit specifically further comprises:
Each candidate target detection model and its informativeness additional network should be jointly trained on the same data set, or pretrained on a large data set, such as ImageNet, and then fine-tuned on a small data set. The loss function used for training is:

$$\mathrm{Loss}=\sum_{b=1}^{B}\mathrm{Loss}_{detection}+\lambda\sum_{(i,j)}L_{1}$$

wherein B is the batch size, $\mathrm{Loss}_{detection}$ refers to the loss of the target detection network, the second sum runs over the picture pairs (i, j) in the batch, and $L_{1}$ has the following form:

$$L_{1}=\max\left(0,\;-\operatorname{sign}\left(l_{i}-l_{j}\right)\left(I_{i}-I_{j}\right)+\xi\right)$$

wherein $l_{i}$ and $l_{j}$ are the target detection network losses of the two pictures i and j that make up a picture pair, $I_{i}$ and $I_{j}$ are the informativeness values of pictures i and j, $\lambda$ is the weight of the informativeness additional network's loss function in the total loss and is set to 0.1 during training, and $\xi$ is the threshold of the informativeness additional network, set to 0.5 during training. When training reaches half of the total epochs, the weights of the informativeness additional network are fixed and only the target detection network is trained.
The unifying unit further comprises:
a picture informativeness scale unification module, wherein the formula for scale normalization of the output of the informativeness additional network is:

$$\tilde{I}=\frac{I-I_{min}}{I_{max}-I_{min}}$$

wherein I is the informativeness value output by the informativeness additional network of a detection network, and $I_{min}$ and $I_{max}$ are the minimum and maximum informativeness values of that detection network; the converted informativeness value lies between 0 and 1. Each detection network's informativeness values need to be limited to the range 0 to 1 in order to facilitate comparing informativeness values across different detection networks.
The invention also discloses a multi-model target detection system based on picture informativeness, comprising a memory, a processor and a computer program stored on the memory, the computer program being configured to implement the steps of the multi-model target detection method when called by the processor.
The invention also discloses a computer readable storage medium storing a computer program configured to implement the steps of the multi-model object detection method of the invention when invoked by a processor.
The beneficial effects of the invention are as follows: 1. the multi-model target detection method can combine the strengths of different detection models on different picture characteristics to comprehensively improve detection accuracy; 2. the multi-model target detection method can be applied to industrial defect detection and intelligent security, offering high accuracy with only a small sacrifice in speed.
The foregoing is a further detailed description of the invention in connection with the preferred embodiments, and it is not intended that the invention be limited to the specific embodiments described. It will be apparent to those skilled in the art that several simple deductions or substitutions may be made without departing from the spirit of the invention, and these should be considered to be within the scope of the invention.

Claims (8)

1. A multi-model target detection method based on picture informativeness, characterized by comprising the following steps:
First step: selecting a plurality of target detection networks as candidate target detection networks, and designing an informativeness additional network for each candidate detection network according to the number of layers of the target detection network;
Second step: jointly training the target detection networks and the informativeness additional networks, and designing the loss function of the informativeness additional network and the training strategy of the whole network, wherein the loss function of the target detection network is determined by the target detection method;
Third step: performing scale normalization on the output values of the informativeness additional networks, and selecting a target detection network according to the outputs of the informativeness additional networks;
wherein the second step specifically further comprises:
each candidate target detection model and its informativeness additional network need to be jointly trained on the same data set, or pretrained on a large data set and then fine-tuned on a small data set, with the following training loss function:

$$\mathrm{Loss}=\sum_{b=1}^{B}\mathrm{Loss}_{detection}+\lambda\sum_{(i,j)}L_{1}$$

wherein B is the batch size, $\mathrm{Loss}_{detection}$ refers to the loss of the target detection network, the second sum runs over the picture pairs (i, j) in the batch, and $L_{1}$ has the following form:

$$L_{1}=\max\left(0,\;-\operatorname{sign}\left(l_{i}-l_{j}\right)\left(I_{i}-I_{j}\right)+\xi\right)$$

wherein $l_{i}$ and $l_{j}$ are the target detection network losses of the two pictures i and j that make up a picture pair, $I_{i}$ and $I_{j}$ are the informativeness values of pictures i and j, $\lambda$ is the weight of the informativeness additional network's loss function in the total loss and is set to 0.1 during training, and $\xi$ is the threshold of the informativeness additional network, set to 0.5 during training; when training reaches half of the total epochs, the weights of the informativeness additional network are fixed and only the target detection network is trained.
2. The multi-model target detection method according to claim 1, wherein the first step further comprises:
step 1: selecting the five target detection networks Faster R-CNN, CenterNet, CornerNet-V2, YOLO V3 and SSD as candidate target detection networks;
step 2: designing an informativeness additional network for each of the five networks in step 1, wherein Faster R-CNN selects 4 feature layers to construct its informativeness additional network, CenterNet, CornerNet-V2 and SSD each select 3 feature layers, and YOLO V3 selects 5 feature layers;
step 3: for each feature layer extracted in step 2, performing a convolution operation followed by global average pooling to convert the feature map into a feature vector, concatenating the vectors obtained from all feature layers into a single feature vector, and mapping it to an informativeness value through two fully connected layers.
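The head described in step 3 (global average pooling per feature layer, concatenation, then two fully connected layers producing a scalar) can be sketched as follows; the channel counts, spatial sizes, and hidden width are illustrative assumptions, and the weights are random placeholders rather than trained values:

```python
import numpy as np

rng = np.random.default_rng(0)

def global_avg_pool(feature_map):
    """Collapse a (C, H, W) feature map to a length-C vector."""
    return feature_map.mean(axis=(1, 2))

def informativeness_head(feature_maps, hidden=64):
    """GAP each (already convolved) feature layer, concatenate, then map
    through two fully connected layers to a scalar informativeness value."""
    pooled = np.concatenate([global_avg_pool(f) for f in feature_maps])
    W1 = rng.standard_normal((hidden, pooled.size)) * 0.01
    W2 = rng.standard_normal((1, hidden)) * 0.01
    h = np.maximum(W1 @ pooled, 0.0)  # ReLU between the two FC layers
    return float((W2 @ h)[0])

# Three feature layers of different spatial sizes, as for CenterNet or SSD
feats = [rng.standard_normal((c, s, s)) for c, s in [(64, 32), (128, 16), (256, 8)]]
score = informativeness_head(feats)
```

Pooling before concatenation is what makes the head independent of each layer's spatial size, so the same construction applies whether a detector contributes 3, 4, or 5 feature layers.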
3. The multi-model target detection method according to claim 1, wherein the third step further comprises:
unifying the picture informativeness scale: the output of each informativeness additional network is scale-normalized as follows:

I' = (I - I_min) / (I_max - I_min)

wherein I is the informativeness value output by the informativeness additional network of a detection network, and I_min and I_max are the minimum and maximum informativeness values of that detection network; the converted value I' lies between 0 and 1.
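The scale unification is ordinary min-max normalization, which makes scores from different detectors comparable. A minimal sketch (the network names, output ranges, and raw scores are hypothetical):

```python
def normalize_informativeness(value, i_min, i_max):
    """Min-max normalize one network's informativeness output to [0, 1]."""
    return (value - i_min) / (i_max - i_min)

# Hypothetical per-network output ranges and raw scores for one picture
ranges = {"faster_rcnn": (0.2, 1.8), "yolov3": (-1.0, 3.0)}
raw = {"faster_rcnn": 1.0, "yolov3": 0.0}
# After normalization the scores live on a common [0, 1] scale, so a
# target detection network can be selected by comparing them directly.
normalized = {k: normalize_informativeness(raw[k], *ranges[k]) for k in raw}
```

Without this step the raw outputs of the different informativeness additional networks would not be directly comparable, since each detector's head has its own output range.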
4. A multi-model object detection device based on picture informativeness, characterized by comprising:
informativeness additional network construction unit: for selecting a plurality of target detection networks as candidate target detection networks, and designing an informativeness additional network for each candidate detection network according to the number of layers of that target detection network;
training unit: for jointly training each target detection network and its informativeness additional network, and designing the loss function of the informativeness additional network and the training strategy of the whole network, wherein the loss function of the target detection network is determined by its target detection method;
unifying unit: for performing scale normalization on the output values of the informativeness additional networks, and selecting a target detection network according to the outputs of the informativeness additional networks;
the training unit specifically further comprises:
each candidate target detection model and its informativeness additional network need to be jointly trained on the same data set, or trained first on a large data set and then fine-tuned on a small data set, with the following training loss:

Loss = Σ over picture pairs (i, j) in batch B of [ loss_detection(i) + loss_detection(j) + λ · L_1(i, j) ]

wherein B is the batch of picture pairs, loss_detection is the loss of the target detection network, λ is the weight of the informativeness additional network's loss function in the total network, and L_1 takes the following form:

L_1(i, j) = max(0, -sign(l_i - l_j) · (I_i - I_j) + ξ)

wherein l_i and l_j are the target detection network losses of the two pictures i and j that form a picture pair, I_i and I_j are the informativeness values of pictures i and j, and ξ is the margin threshold of the informativeness additional network; during training λ is set to 0.1 and ξ is set to 0.5, and after half of the total epochs the weights of the informativeness additional network are fixed and only the target detection network is trained.
5. The multi-model object detection apparatus according to claim 4, wherein the informativeness additional network construction unit further comprises:
candidate target detection module: for selecting the five target detection networks Faster R-CNN, CenterNet, CornerNet-V2, YOLO V3 and SSD as candidate target detection networks;
informativeness additional network design module: for designing an informativeness additional network for each of the five networks of the candidate target detection module, wherein Faster R-CNN selects 4 feature layers to construct its informativeness additional network, CenterNet, CornerNet-V2 and SSD each select 3 feature layers, and YOLO V3 selects 5 feature layers;
feature layer processing module: for extracting from the original target detection network the feature layers used to construct the informativeness network, performing a convolution operation followed by global average pooling to convert each feature map into a feature vector, concatenating the vectors obtained from all feature layers into a single feature vector, and mapping it to an informativeness value through two fully connected layers.
6. The multi-model object detection apparatus according to claim 4, wherein the unifying unit further comprises:
picture informativeness scale unification module: for scale-normalizing the output of each informativeness additional network as follows:

I' = (I - I_min) / (I_max - I_min)

wherein I is the informativeness value output by the informativeness additional network of a detection network, and I_min and I_max are the minimum and maximum informativeness values of that detection network; the converted value I' lies between 0 and 1.
7. A multi-model object detection system based on picture informativeness, comprising: a memory, a processor, and a computer program stored on the memory, the computer program being configured to implement, when invoked by the processor, the steps of the multi-model target detection method of any of claims 1-3.
8. A computer-readable storage medium, characterized in that: the computer-readable storage medium stores a computer program configured to implement, when invoked by a processor, the steps of the multi-model target detection method of any of claims 1-3.
CN202010776488.2A 2020-08-05 2020-08-05 Multi-model target detection method, device and system based on picture informativeness and storage medium Active CN111931767B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010776488.2A CN111931767B (en) 2020-08-05 2020-08-05 Multi-model target detection method, device and system based on picture informativeness and storage medium


Publications (2)

Publication Number Publication Date
CN111931767A CN111931767A (en) 2020-11-13
CN111931767B true CN111931767B (en) 2023-09-15

Family

ID=73307622

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010776488.2A Active CN111931767B (en) 2020-08-05 2020-08-05 Multi-model target detection method, device and system based on picture informativeness and storage medium

Country Status (1)

Country Link
CN (1) CN111931767B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113392857B (en) * 2021-08-17 2022-03-11 深圳市爱深盈通信息技术有限公司 Target detection method, device and equipment terminal based on yolo network

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108416394A (en) * 2018-03-22 2018-08-17 河南工业大学 Multi-target detection model building method based on convolutional neural networks
CN109345575A (en) * 2018-09-17 2019-02-15 中国科学院深圳先进技术研究院 A kind of method for registering images and device based on deep learning
CN110322423A (en) * 2019-04-29 2019-10-11 天津大学 A kind of multi-modality images object detection method based on image co-registration
CN110942000A (en) * 2019-11-13 2020-03-31 南京理工大学 Unmanned vehicle target detection method based on deep learning
CN111444821A (en) * 2020-03-24 2020-07-24 西北工业大学 Automatic identification method for urban road signs




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant