CN115346125A - Target detection method based on deep learning - Google Patents


Publication number
CN115346125A
CN115346125A
Authority
CN
China
Prior art keywords
training
target detection
layer
data set
target
Prior art date
Legal status
Granted
Application number
CN202211270276.2A
Other languages
Chinese (zh)
Other versions
CN115346125B (en)
Inventor
韩德红
杜益龙
圣道翠
Current Assignee
Nanjing Jinhantu Technology Co ltd
Original Assignee
Nanjing Jinhantu Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Nanjing Jinhantu Technology Co ltd
Priority to CN202211270276.2A
Publication of CN115346125A
Application granted
Publication of CN115346125B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/30Noise filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Abstract

The invention discloses a target detection method based on deep learning. The method preprocesses an image to be detected and establishes a data set comprising a target sample data set and a non-target sample data set; performs loss training and regression training on the data set in sequence to expand the target samples; extracts the feature vectors of all target samples and, taking the target samples as nodes, determines an adjacency matrix from the feature vectors and builds a weighted undirected graph; constructs an initial deep-learning target detection model and trains it a first time on the weighted undirected graph and the training set drawn from the non-target sample data set; when the number of training iterations reaches a preset count, performs a second training with an optimization module; and when the loss function converges after the second training, stops training to obtain the target detection model, which is then used to detect targets in input images. The target detection model built by the method has better robustness.

Description

Target detection method based on deep learning
Technical Field
The invention relates to the technical field of target detection, in particular to a target detection method based on deep learning.
Background
Traditional target detection algorithms can be divided into three steps: region selection, feature extraction, and classifier-based classification. Feature extraction relies on manually selected image features, which are limited in variety and poor in robustness. Neural networks changed this situation: they do not depend on hand-crafted feature extraction, and after major breakthroughs in the field of image classification they have developed rapidly, so deep-learning-based algorithms have become the mainstream of target detection research.
As the field has developed, target detection models have multiplied, and manually optimizing the pipeline has become very inefficient given the huge number of models. Automated approaches, in turn, require substantial computing power to select a qualified model and are time-consuming.
Disclosure of Invention
The present invention has been made in view of the above-mentioned problems with the prior art.
Therefore, the invention provides a target detection method based on deep learning, which addresses the low detection speed of existing detection models and the low detection precision caused by small sample sizes.
In order to solve the above technical problems, the present invention provides the following technical solution: preprocessing an image to be detected and establishing a data set, wherein the data set comprises a target sample data set and a non-target sample data set; performing loss training and regression training on the data set in sequence to expand the target samples; extracting the feature vectors of all target samples, determining an adjacency matrix from the feature vectors with the target samples as nodes, and establishing a weighted undirected graph; constructing an initial target detection model based on deep learning, and performing a first training of the initial target detection model on the weighted undirected graph and the training set in the non-target sample data set; if the number of training iterations reaches a preset count, performing a second training with the optimization module; and if the loss function converges after the second training, stopping training to obtain the target detection model and using it to perform target detection on input images.
As a preferable aspect of the deep-learning-based target detection method of the present invention, wherein: the preprocessing comprises noise reduction, binarization and normalization; noise interference in the image to be detected is suppressed by a filtering module, wherein the filtering module comprises a plurality of two-dimensional filter matrices, a light-shielded storage unit and at least one CMOS sensor array; at least N raster rows of the image to be detected are exposed through the CMOS sensor array, and the charges generated by exposure are transferred to the light-shielded storage unit at time t, so as to capture the pixel points of M pixel regions in the image to be detected; finally, the pixel values of the pixel points are convolved with a two-dimensional filter matrix to complete the noise reduction; the denoised image is binarized to obtain a binary image, which is segmented by its connected domains to divide target samples from non-target samples and obtain the non-target sample data set, wherein a target sample contains at least one object to be detected and a non-target sample contains none; target samples containing objects to be detected of different sizes are sampled at the same proportion to obtain data blocks, which are normalized to obtain the target sample data set; wherein 70% of the data set is used as the training set and 30% as the test set.
As a preferable aspect of the deep learning-based target detection method of the present invention, wherein: the loss training and the regression training include: performing loss training on the data set by adopting a cross entropy loss function; and performing regression training on the target sample data set by adopting a Smooth L1 loss function.
As a preferable scheme of the target detection method based on deep learning of the present invention, wherein: extracting the feature vectors comprises: extracting multi-layer semantic features of all target samples, and successively down-sampling and up-sampling the multi-layer semantic features at a preset sampling rate to obtain a first feature vector; and fusing the first feature vector with the multi-layer semantic features to obtain a second feature vector, then successively convolving and down-sampling the second feature vector to obtain the feature vector.
As a preferable aspect of the deep-learning-based target detection method of the present invention, wherein: establishing the weighted undirected graph comprises the following steps: determining the adjacency matrix elements K_{n,m} from the feature vectors:

K_{n,m} = μ(D_n, D_m)

wherein μ is a distance function, D_n is the feature vector of node n, and D_m is the feature vector of node m;

and establishing the weighted undirected graph G from the adjacency matrix elements K_{n,m}:

G = (K, R)

wherein K is the adjacency matrix and R is the set of edge weights between nodes.
As a preferable aspect of the deep-learning-based target detection method of the present invention, wherein: constructing the initial target detection model comprises: the initial target detection model consists of convolution layers, an LSTM layer, a residual network layer, a fully connected layer and an output layer; the convolution stage comprises a first convolution layer with a 3 × 3 kernel, the LSTM layer, a second convolution layer with a 1 × 1 kernel and a third convolution layer with a 5 × 5 kernel; the residual network layer comprises a fourth convolution layer with a 1 × 1 kernel, a spatial pyramid pooling layer with a 2 × 2 pooling window, a fifth convolution layer with a 3 × 3 kernel and a sixth convolution layer with a 1 × 1 kernel; the first convolution layer serves as the input layer of the initial target detection model; the LSTM layer connects the output of the first convolution layer to the input of the second convolution layer; the second convolution layer feeds the third through a Leaky ReLU activation function, the third feeds the fourth through a ReLU activation function, and the fourth convolution layer, spatial pyramid pooling layer, fifth convolution layer and sixth convolution layer are output through a MaxOut activation function; the features extracted by the sixth convolution layer are combined nonlinearly by the fully connected layer and passed to the output layer, which outputs the detection result through a softmax function.
As a preferable scheme of the target detection method based on deep learning of the present invention, wherein: the first training comprises: inputting the weighted undirected graph and 30% of the training set from the non-target sample data set into the initial target detection model and performing the first training by stochastic gradient descent; when the number of training iterations reaches 20, freezing the first convolution layer, the LSTM layer and the second convolution layer, and training the remaining layers with the weighted undirected graph and the remaining 70% of the training set; and when the number of iterations reaches the preset count, performing the second training.
As a preferable aspect of the deep learning-based target detection method of the present invention, wherein: the optimization module comprises: setting a learning rate, and constructing a Loss function Loss based on the model weight N obtained after the first training:
Loss = λ · L_pos + (1 − λ) · L_neg

wherein y is the output value of the initial target detection model, ŷ is the predicted value of the initial target detection model, λ is the balance factor, γ is the learning rate, L_pos is the loss value over the target samples, and L_neg is the loss value over the non-target samples;
and performing the second training with the WOA algorithm, stopping training when the loss function value reaches its minimum, i.e. converges, to obtain the optimal model weights.
As a preferable aspect of the deep-learning-based target detection method of the present invention, wherein: the second training comprises: taking the balance factor and the model weights as a whale individual, and initializing the number of whale individuals, the maximum number of iterations T and the number of neurons of the initial target detection model; randomly generating the positions of the whale individuals and calculating their fitness; updating the positions of the whale individuals, recalculating their fitness, and selecting the optimal individual by fitness; and stopping training when the loss function value reaches its minimum and the model precision meets the requirement, obtaining the optimal model weights; wherein the whale individual fitness is:

F = 1 / Loss

wherein F is the fitness of the whale individual.
As a preferable aspect of the deep-learning-based target detection method of the present invention, wherein: the model precision evaluation comprises: inputting the test set into the model after the first training for testing, and evaluating the precision with the accuracy ACC.
The invention has the following beneficial effects: the method builds a target detection model based on deep learning and optimizes it through a two-stage training mechanism, so that the model has better robustness.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise. Wherein:
fig. 1 is a schematic flowchart of a deep learning-based target detection method according to a first embodiment of the present invention;
fig. 2 is a schematic structural diagram of a target detection model of the target detection method based on deep learning according to the first embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, specific embodiments accompanied with figures are described in detail below, and it is apparent that the described embodiments are a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, shall fall within the protection scope of the present invention.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced otherwise than as specifically described herein, and it will be appreciated by those skilled in the art that the present invention may be practiced without departing from the spirit and scope of the present invention and that the present invention is not limited by the specific embodiments disclosed below.
Furthermore, reference herein to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one implementation of the invention. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments.
The present invention will be described in detail below with reference to the drawings. For convenience of illustration, the drawings are not drawn to scale and are merely exemplary; they should not be construed as limiting the scope of the present invention. In addition, the actual three-dimensional dimensions of length, width and depth should be taken into account in practice.
Also in the description of the present invention, it should be noted that the terms "upper, lower, inner and outer" and the like indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, which are only for convenience of description and simplification of description, but do not indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention. Furthermore, the terms first, second, or third are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
The terms "mounted, connected and connected" in the present invention are to be understood broadly, unless otherwise explicitly specified or limited, for example: can be fixedly connected, detachably connected or integrally connected; they may be mechanically, electrically, or directly connected, or indirectly connected through intervening media, or may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
Example 1
Referring to fig. 1 to 2, a first embodiment of the present invention provides a deep learning-based target detection method, including:
s1: preprocessing an image to be detected, and establishing a data set, wherein the data set comprises a target sample data set and a non-target sample data set.
In order to better detect the target, the embodiment first performs preprocessing, i.e., denoising, binarization and normalization, on the image to be detected, specifically:
(1) Noise reduction processing
Noise interference in the image to be detected is suppressed by a filtering module, which comprises a plurality of two-dimensional filter matrices, a light-shielded storage unit and at least one CMOS sensor array; at least N raster rows of the image to be detected are exposed through the CMOS sensor array, and the charges generated by exposure are transferred to the light-shielded storage unit at time t, so as to capture the pixel points of M pixel regions in the image to be detected; finally, the pixel values of the pixel points are convolved with a two-dimensional filter matrix to complete the noise reduction;
(2) Binarization method
Carrying out binarization processing on the denoised image to obtain a binary image, and segmenting according to a connected domain of the binary image to divide a target sample and a non-target sample to obtain a non-target sample data set, wherein the target sample at least comprises an object to be detected, and the non-target sample does not comprise the object to be detected; the object to be detected can be, for example, a vehicle, a radar, a crowd, a ship, etc.
(3) Normalization
Sampling target samples containing objects to be detected with different sizes according to the same proportion to obtain data blocks, and performing normalization operation to obtain a target sample data set; wherein 70% of the data set is used as a training set and 30% of the data set is used as a testing set.
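The preprocessing pipeline of S1 (convolving pixel values with a two-dimensional filter matrix, binarization, normalization, and the 70%/30% split) can be sketched in plain Python. The mean-filter kernel, the threshold of 128 and the fixed shuffle seed are illustrative assumptions, not values given in the patent:

```python
import random

def convolve2d(img, kernel):
    """Denoise by convolving pixel values with a 2-D filter matrix (edge pixels clamped)."""
    h, w = len(img), len(img[0])
    kh, kw = len(kernel), len(kernel[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(kh):
                for kx in range(kw):
                    yy = min(max(y + ky - kh // 2, 0), h - 1)
                    xx = min(max(x + kx - kw // 2, 0), w - 1)
                    acc += img[yy][xx] * kernel[ky][kx]
            out[y][x] = acc
    return out

def binarize(img, threshold=128):
    """Binarize the denoised image: 1 for pixels at/above the threshold, else 0."""
    return [[1 if px >= threshold else 0 for px in row] for row in img]

def normalize(block):
    """Scale a sampled data block into [0, 1]."""
    lo = min(min(row) for row in block)
    hi = max(max(row) for row in block)
    span = (hi - lo) or 1.0
    return [[(px - lo) / span for px in row] for row in block]

def split_dataset(samples, train_frac=0.7, seed=0):
    """70% training set / 30% test set, as in the patent."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]
```

For example, smoothing with a 3 × 3 mean filter uses the kernel `[[1/9] * 3 for _ in range(3)]`.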
S2: Carrying out loss training and regression training on the data set in sequence to expand the target samples.
In order to improve the detection precision, a large number of samples are required to train the model, but the acquisition difficulty of the target sample is large, and the embodiment expands the acquired target sample through loss training and regression training, and specifically comprises the following steps:
(1) Performing loss training on the data set with a cross-entropy loss function (Cross Entropy Loss);
(2) Performing regression training on the target sample data set with a Smooth L1 loss function.
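The two loss functions named above have standard forms; a minimal pure-Python sketch, assuming binary cross-entropy and the common Smooth L1 with β = 1:

```python
import math

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Binary cross-entropy, averaged over samples (the loss-training objective)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

def smooth_l1(y_true, y_pred, beta=1.0):
    """Smooth L1 (the regression-training objective): quadratic for |x| < beta,
    linear beyond, so it is less sensitive to outliers than plain L2."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        x = abs(t - p)
        total += 0.5 * x * x / beta if x < beta else x - 0.5 * beta
    return total / len(y_true)
```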
S3: Extracting the feature vectors of all target samples, determining an adjacency matrix from the feature vectors with the target samples as nodes, and establishing a weighted undirected graph.
(1) Extracting multilayer semantic features of all target samples, and successively performing down-sampling and up-sampling on the multilayer semantic features according to a preset sampling rate to obtain a first feature vector;
(2) Fusing the first feature vector with the multi-layer semantic features to obtain a second feature vector, and successively convolving and down-sampling the second feature vector to obtain the feature vector.
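Steps (1) and (2) can be sketched in one dimension; the factor-2 sampling rate, average-pool down-sampling, nearest-neighbour up-sampling and additive fusion are all illustrative assumptions:

```python
def downsample(feat, rate=2):
    """Average-pool down-sampling by the given rate (1-D sketch)."""
    return [sum(feat[i:i + rate]) / rate for i in range(0, len(feat) - rate + 1, rate)]

def upsample(feat, rate=2):
    """Nearest-neighbour up-sampling by the given rate."""
    return [v for v in feat for _ in range(rate)]

def fuse(a, b):
    """Fuse two feature vectors of equal length by element-wise addition."""
    return [x + y for x, y in zip(a, b)]

# First feature vector: down-sample then up-sample the semantic features;
# second feature vector: fuse the result back with the semantic features.
semantic = [1.0, 3.0, 2.0, 4.0]
first = upsample(downsample(semantic))   # [2.0, 2.0, 3.0, 3.0]
second = fuse(first, semantic)           # [3.0, 5.0, 5.0, 7.0]
```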
Further, the adjacency matrix is determined from the feature vectors, and the weighted undirected graph is established:
(1) Determining the adjacency matrix elements K_{n,m} from the feature vectors:

K_{n,m} = μ(D_n, D_m)

wherein μ is a distance function, D_n is the feature vector of node n, and D_m is the feature vector of node m;
(2) Establishing the weighted undirected graph G from the adjacency matrix elements K_{n,m}:

G = (K, R)

wherein K is the adjacency matrix and R is the set of edge weights between nodes.
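Assuming μ is the Euclidean distance (the patent does not fix the distance function here), the adjacency matrix and the weighted undirected graph G = (K, R) can be built as follows:

```python
import math

def mu(d_n, d_m):
    """Assumed distance function: Euclidean distance between feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(d_n, d_m)))

def build_weighted_graph(features):
    """Nodes are target samples; K[n][m] = mu(D_n, D_m); each edge carries that weight."""
    n = len(features)
    K = [[mu(features[i], features[j]) for j in range(n)] for i in range(n)]
    # Undirected: keep each unordered node pair once, weighted by its K entry.
    R = {(i, j): K[i][j] for i in range(n) for j in range(i + 1, n)}
    return K, R

K, R = build_weighted_graph([[0.0, 0.0], [3.0, 4.0], [0.0, 1.0]])
```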
S4: Constructing an initial target detection model based on deep learning, and performing the first training of the initial target detection model on the weighted undirected graph and the training set in the non-target sample data set.
Referring to fig. 2, the initial target detection model consists of convolution layers, an LSTM layer, a residual network layer, a fully connected layer and an output layer. The convolution stage comprises a first convolution layer with a 3 × 3 kernel, the LSTM layer, a second convolution layer with a 1 × 1 kernel and a third convolution layer with a 5 × 5 kernel, which effectively enlarges the receptive field; the residual network layer comprises a fourth convolution layer with a 1 × 1 kernel, a spatial pyramid pooling layer with a 2 × 2 pooling window, a fifth convolution layer with a 3 × 3 kernel and a sixth convolution layer with a 1 × 1 kernel. Preferably, the residual network is combined with the pooling layer, which improves the detection of dense targets.
The first convolution layer serves as the input layer of the initial target detection model; the LSTM layer connects the output of the first convolution layer to the input of the second convolution layer; the second convolution layer feeds the third through a Leaky ReLU activation function, the third feeds the fourth through a ReLU activation function, and the fourth convolution layer, spatial pyramid pooling layer, fifth convolution layer and sixth convolution layer are output through a MaxOut activation function; the features extracted by the sixth convolution layer are combined nonlinearly by the fully connected layer and passed to the output layer, which outputs the detection result through a softmax function.
Preferably, the 1 × 1 fourth convolution layer first reduces the dimensionality to cut the amount of computation; the spatial pyramid pooling layer then converts the fourth layer's convolution features to a common dimension, so that the fifth convolution layer can process images of any size; and the sixth convolution layer restores the dimensionality. This maintains precision while reducing computation and greatly accelerating detection.
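The computation saving from the 1 × 1 reduce/restore pattern around the residual block can be checked with a quick multiply-accumulate count. The feature-map size and channel counts below are illustrative assumptions, not values from the patent:

```python
def conv_macs(h, w, c_in, c_out, k):
    """Multiply-accumulates for a k x k convolution over an h x w feature map."""
    return h * w * c_in * c_out * k * k

H = W = 32
# One 3x3 convolution at full channel width:
direct = conv_macs(H, W, 256, 256, 3)

# Bottleneck: 1x1 reduce to 64 channels, 3x3 convolution, 1x1 restore to 256.
bottleneck = (conv_macs(H, W, 256, 64, 1)
              + conv_macs(H, W, 64, 64, 3)
              + conv_macs(H, W, 64, 256, 1))

ratio = direct / bottleneck  # the bottleneck needs several times fewer MACs
```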
Further, the initial target detection model is trained for the first time on the weighted undirected graph and the training set from the non-target sample data set:
The weighted undirected graph and 30% of the training set from the non-target sample data set are input into the initial target detection model, and the first training is performed by stochastic gradient descent; when the number of training iterations reaches 20, the first convolution layer, the LSTM layer and the second convolution layer are frozen, and the remaining layers are trained with the weighted undirected graph and the remaining 70% of the training set; when the number of iterations reaches the preset count, the second training is performed.
S5: If the number of training iterations reaches the preset count, performing the second training with the optimization module; and if the loss function converges after the second training, stopping training to obtain the target detection model, which is used to perform target detection on input images.
The optimization module comprises:
(1) Setting a learning rate, and constructing a Loss function Loss based on the model weight N obtained after the first training:
Loss = λ · L_pos + (1 − λ) · L_neg

wherein y is the output value of the initial target detection model, ŷ is the predicted value of the initial target detection model, λ is the balance factor, γ is the learning rate, L_pos is the loss value over the target samples, and L_neg is the loss value over the non-target samples;
the embodiment sets the learning rate to 0.01, and sets dropout to 0.7 to prevent over-fitting while speeding up training when the learning rate is too low, which may cause over-fitting.
(2) The second training is performed with the WOA (Whale Optimization Algorithm), and training stops when the loss function value reaches its minimum, i.e. converges, yielding the optimal model weights.
a) Taking the balance factors and the model weight as whale individuals, and initializing the number of the whale individuals, the maximum iteration times T and the number of initial target detection model neurons;
b) Randomly generating the position of the whale individual, and calculating the fitness of the whale individual;
the fitness of individual whale includes:
F=1/ Loss
wherein F is the fitness of whale individuals.
c) Updating the position of the individual whale, calculating the fitness of the individual whale at the moment, and selecting the optimal individual according to the fitness;
the location of individual whales was updated according to the following formula:
X=X q -aD
wherein X is the updated position of the whale individual, a is a random number of (0, 1), and D is the distance between the whale and the prey.
And c) comparing the fitness with the fitness in the step b), and selecting the individual with higher fitness as the optimal individual.
d) Stopping training when the loss function value reaches its minimum and the model precision meets the requirement, obtaining the optimal model weights; the precision is evaluated by inputting the test set into the model after the first training and computing the accuracy ACC.
Preferably, the method is combined with a random gradient descent method and a WOA algorithm to train the target detection model, so that the optimal model weight is obtained, and the robustness is enhanced.
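Steps a) to d) amount to a stripped-down Whale Optimization Algorithm. The sketch below uses the stated encircling update X = X_q - a·D and fitness F = 1/Loss; the search bounds, population size and toy quadratic loss standing in for the network loss are illustrative assumptions:

```python
import random

def woa_minimize(loss, dim, n_whales=10, max_iter=50, seed=1):
    """Minimal WOA sketch: whales contract toward the best individual (the prey)."""
    rng = random.Random(seed)
    whales = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_whales)]
    best = min(whales, key=loss)[:]            # copy: current prey position X_q
    for _ in range(max_iter):
        for w in whales:
            a = rng.random()                   # random number in (0, 1)
            for d in range(dim):
                D = abs(best[d] - w[d])        # distance between whale and prey
                w[d] = best[d] - a * D         # encircling update X = X_q - a*D
        cand = min(whales, key=loss)
        if loss(cand) < loss(best):            # keep the fitter individual
            best = cand[:]
    return best

def fitness(loss_value):
    """Whale individual fitness F = 1 / Loss."""
    return 1.0 / loss_value

# Toy stand-in for the network loss (an assumption): minimum at (2, -1).
toy_loss = lambda x: (x[0] - 2) ** 2 + (x[1] + 1) ** 2
best = woa_minimize(toy_loss, dim=2)
```

Because the best individual is only replaced when a fitter candidate appears, the loss of the returned solution never exceeds that of the best initial whale.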
Example 2
In order to verify and explain the technical effects of the method, this embodiment selects a CNN target detection algorithm and an RPN-network-based target detection method for a comparison test against the method, and verifies its real effect through scientific comparison of the test results.
The CNN target detection algorithm achieves high detection accuracy, but its computation cost is correspondingly large, so its time cost is too high; the RPN-network-based target detection method can only be trained on small batches of data, so its detection performance is poor.
In order to verify that the method achieves higher detection precision and speed than these two, this embodiment applies the existing schemes (the CNN target detection algorithm and the RPN-network-based method) and the proposed method to detect targets in 500 vehicle images to be detected, using accuracy (ACC), recall (TPR), precision (PRE) and F1-score (F1) as the measurement indices for each method; the results are shown in Table 1.
ACC = (TP + TN) / (TP + TN + FP + FN)

TPR = TP / (TP + FN)

PRE = TP / (TP + FP)

F1 = 2 × PRE × TPR / (PRE + TPR)

In these formulas, TP (true positives) is the number of target samples correctly detected; FP (false positives) is the number of non-target samples wrongly detected as targets; FN (false negatives) is the number of target samples missed; and TN (true negatives) is the number of non-target samples correctly rejected.
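The four measurement indices reduce to simple ratios over the confusion-matrix counts; a small sketch (the counts below are illustrative, not the patent's results):

```python
def detection_metrics(tp, fp, fn, tn):
    """ACC, TPR (recall), PRE (precision) and F1 from confusion-matrix counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    tpr = tp / (tp + fn)
    pre = tp / (tp + fp)
    f1 = 2 * pre * tpr / (pre + tpr)
    return acc, tpr, pre, f1

# Illustrative counts for a 500-image test run (assumed, not from Table 1).
acc, tpr, pre, f1 = detection_metrics(tp=430, fp=20, fn=25, tn=25)
```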
Table 1: target detection performance of different approaches.
As is clear from the data in Table 1, the method has good performance in target detection, and each performance index is superior to a CNN target detection algorithm and a target detection method based on an RPN network.
It should be recognized that embodiments of the present invention can be realized and implemented by computer hardware, a combination of hardware and software, or by computer instructions stored in a non-transitory computer readable memory. The methods may be implemented in a computer program using standard programming techniques, including a non-transitory computer-readable storage medium configured with the computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner according to the methods and figures described in the detailed description. Each program may be implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the program(s) can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language. Furthermore, the program can be run on a programmed application specific integrated circuit for this purpose.
Further, the operations of processes described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The processes described herein (or variations and/or combinations thereof) may be performed under the control of one or more computer systems configured with executable instructions, and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more applications) collectively executed on one or more processors, by hardware, or combinations thereof. The computer program includes a plurality of instructions executable by one or more processors.
Further, the method may be implemented in any type of computing platform operatively connected to a suitable interface, including but not limited to a personal computer, mini computer, mainframe, workstation, networked or distributed computing environment, separate or integrated computer platform, or in communication with a charged particle tool or other imaging device, and the like. Aspects of the invention may be embodied in machine-readable code stored on a non-transitory storage medium or device, whether removable or integrated into a computing platform, such as a hard disk, optically read and/or write storage medium, RAM, ROM, or the like, such that it may be read by a programmable computer, which when read by the storage medium or device, is operative to configure and operate the computer to perform the procedures described herein. Further, the machine-readable code, or portions thereof, may be transmitted over a wired or wireless network. The invention described herein includes these and other different types of non-transitory computer-readable storage media when such media include instructions or programs that implement the steps described above in conjunction with a microprocessor or other data processor. The invention also includes the computer itself when programmed according to the methods and techniques described herein. A computer program can be applied to input data to perform the functions described herein to transform the input data to generate output data that is stored to non-volatile memory. The output information may also be applied to one or more output devices, such as a display. In a preferred embodiment of the invention, the transformed data represents physical and tangible objects, including particular visual depictions of physical and tangible objects produced on a display.
As used in this application, the terms "component," "module," "system," and the like are intended to refer to a computer-related entity, either hardware, firmware, a combination of hardware and software, or software in execution. For example, a component may be, but is not limited to being: a process running on a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of example, both an application running on a computing device and the computing device can be a component. One or more components can reside within a process and/or thread of execution and a component can be localized on one computer and/or distributed between two or more computers. In addition, these components can execute from various computer readable media having various data structures thereon. The components may communicate by way of local and/or remote processes such as in accordance with a signal having one or more data packets (e.g., data from one component interacting with another component in a local system, distributed system, and/or across a network such as the internet with other systems by way of the signal).
It should be noted that the above-mentioned embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention, which should be covered by the claims of the present invention.

Claims (10)

1. A target detection method based on deep learning is characterized by comprising the following steps:
preprocessing an image to be detected, and establishing a data set, wherein the data set comprises a target sample data set and a non-target sample data set;
carrying out loss training and regression training on the data set in sequence to expand a target sample;
extracting the characteristic vectors of all target samples, determining an adjacent matrix according to the characteristic vectors by taking the target samples as nodes, and establishing a weighted undirected graph;
constructing an initial target detection model based on deep learning, and performing first training on the initial target detection model through a weighted undirected graph and a training set in a non-target sample data set;
if the training times reach the preset training times, performing second training by using the optimization module;
and if the loss function is converged after the second training, stopping the training to obtain a target detection model, and performing target detection on the input image by using the target detection model.
2. The deep learning-based target detection method according to claim 1, wherein the preprocessing includes noise reduction processing, binarization, and normalization;
the method comprises the steps that noise interference in an image to be detected is suppressed through a filtering module, wherein the filtering module comprises a plurality of two-dimensional filtering matrixes, a light-shielding storage unit and at least one CMOS sensor array; exposing at least N rows of gratings of an image to be detected through a CMOS sensor array, and transferring charges generated by exposure to an insulating and shielding storage unit at t so as to capture pixel points of M pixel regions in the image to be detected; finally, convolution is carried out on the pixel values of the pixel points through a two-dimensional filter matrix to complete noise reduction processing;
performing binarization processing on the denoised image to obtain a binary image, and segmenting according to a connected domain of the binary image to divide a target sample and a non-target sample to obtain a non-target sample data set, wherein the target sample at least comprises an object to be detected, and the non-target sample does not comprise the object to be detected;
sampling target samples containing objects to be detected with different sizes according to the same proportion to obtain data blocks, and performing normalization operation to obtain a target sample data set;
wherein 70% of the data set is used as the training set and 30% of the data set is used as the test set.
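A minimal sketch of the binarization and 70%/30% split described in claim 2, under assumptions not fixed by the claim: a grayscale input, an illustrative threshold of 128, and a fixed random seed.

```python
import numpy as np

def binarize(image, threshold=128):
    """Return a 0/1 binary image from a grayscale array.
    The threshold value 128 is illustrative, not from the patent."""
    return (image >= threshold).astype(np.uint8)

def split_dataset(samples, train_frac=0.7, seed=0):
    """Shuffle and split samples 70% / 30% into training and test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(samples))
    cut = int(train_frac * len(samples))
    return [samples[i] for i in idx[:cut]], [samples[i] for i in idx[cut:]]
```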
3. The deep learning-based target detection method of claim 1, wherein the loss training and the regression training comprise:
performing loss training on the data set by adopting a cross entropy loss function;
and performing regression training on the target sample data set by adopting a Smooth L1 loss function.
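The two loss functions named in claim 3 can be sketched as follows; these are NumPy implementations of the standard cross-entropy and Smooth L1 definitions, and the transition point `beta` is an assumed parameter.

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Binary cross-entropy loss for the loss-training (classification) step."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

def smooth_l1(residuals, beta=1.0):
    """Smooth L1 loss for the regression-training step:
    quadratic near zero, linear for large residuals."""
    ax = np.abs(residuals)
    return np.where(ax < beta, 0.5 * ax ** 2 / beta, ax - 0.5 * beta).mean()
```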
4. The deep learning-based object detection method according to claim 2 or 3, wherein extracting the feature vectors comprises:
extracting multilayer semantic features of all target samples, and successively performing down-sampling and up-sampling on the multilayer semantic features according to a preset sampling rate to obtain a first feature vector;
and fusing the first feature vector and the multilayer semantic features to obtain a second feature vector, and successively performing convolution and downsampling on the second feature vector to obtain the feature vector.
5. The deep learning-based target detection method of claim 4, wherein building a weighted undirected graph comprises:
determining an adjacency matrix element K_{n,m} from the feature vectors:

K_{n,m} = μ(D_n, D_m)

wherein μ is a distance function, D_n is the feature vector of node n, and D_m is the feature vector of node m;

establishing the weighted undirected graph G according to the adjacency matrix elements K_{n,m}:

G = (K, R)

wherein K is the adjacency matrix and R is the edge weight between nodes.
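Under the illustrative assumption that the distance function μ is the Euclidean distance (claim 5 does not fix a particular metric), the adjacency matrix K can be built from the sample feature vectors as:

```python
import numpy as np

def adjacency_matrix(features):
    """features: (n_samples, dim) array of node feature vectors D_n.
    Returns the symmetric (n, n) matrix K with K[n, m] = ||D_n - D_m||,
    i.e. mu taken as Euclidean distance (an assumed choice)."""
    diff = features[:, None, :] - features[None, :, :]
    return np.linalg.norm(diff, axis=-1)
```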
6. The deep learning-based target detection method of claim 5, wherein constructing an initial target detection model comprises:
the initial target detection model consists of a convolution layer, an LSTM layer, a residual error network layer, a full connection layer and an output layer; the convolution layers comprise a first convolution layer with convolution kernel of 3 × 3, an LSTM layer, a second convolution layer with convolution kernel of 1 × 1 and a third convolution layer with convolution kernel of 5 × 5; the residual network layer comprises a fourth convolution layer with convolution kernel 1 x 1, a spatial pyramid pooling layer with pooling window 2 x 2, a fifth convolution layer with convolution kernel 3 x 3 and a sixth convolution layer with convolution kernel 1 x 1;
the first convolution layer is used as an input layer of the initial target detection model, the LSTM layer is respectively connected with the output of the first convolution layer and the input of the second convolution layer, the second convolution layer is output to the third convolution layer through a Leaky ReLU activation function, the third convolution layer is output to the fourth convolution layer through a ReLU activation function, the fourth convolution layer, the spatial pyramid pooling layer, the fifth convolution layer and the sixth convolution layer are output through a MaxOut activation function, the characteristics extracted by the sixth convolution layer are subjected to nonlinear combination through the full-connection layer and are output to the output layer, and the output layer outputs a detection result through a softmax function.
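The three activation functions named in claim 6 (Leaky ReLU, ReLU and MaxOut) can be sketched in NumPy as follows; the negative slope of Leaky ReLU and the MaxOut group size are assumed values, not specified in the patent.

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: pass positives through, scale negatives by alpha (assumed)."""
    return np.where(x > 0, x, alpha * x)

def relu(x):
    """Standard ReLU: clamp negatives to zero."""
    return np.maximum(x, 0.0)

def maxout(x, pieces=2):
    """MaxOut over groups of `pieces` consecutive channels (last axis)."""
    x = x.reshape(*x.shape[:-1], x.shape[-1] // pieces, pieces)
    return x.max(axis=-1)
```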
7. The deep learning-based target detection method of claim 6, wherein the first training comprises:
inputting the weighted undirected graph and 30% of the training set of the non-target sample data set into the initial target detection model, and performing the first training by a stochastic gradient descent method; when the number of training iterations reaches 20, freezing the first convolution layer, the LSTM layer and the second convolution layer, and training the remaining layers with the weighted undirected graph and the remaining 70% of the training set of the non-target sample data set; when the number of training iterations reaches the preset number, performing the second training.
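The freeze-then-train schedule of claim 7 amounts to masking out gradient updates for the frozen layers; a toy sketch with illustrative layer names and a plain SGD update (the patent uses stochastic gradient descent but does not give this form):

```python
import numpy as np

def sgd_step(weights, grads, frozen, lr=0.1):
    """One SGD step over a dict of per-layer weights, skipping any layer
    whose name appears in `frozen` (e.g. the first conv, LSTM and second
    conv layers after 20 iterations)."""
    return {name: w if name in frozen else w - lr * grads[name]
            for name, w in weights.items()}
```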
8. The deep learning-based target detection method of claim 7, wherein the optimization module comprises:
setting a learning rate, and constructing a Loss function Loss based on the model weight N obtained after the first training, wherein y is the output value of the initial target detection model, ŷ is the predicted value of the initial target detection model, λ is the balance factor, γ is the learning rate, L_pos is the loss value of the target samples, and L_neg is the loss value of the non-target samples;
and performing the second training by using the WOA (whale optimization algorithm); when the loss function value reaches its minimum, i.e., converges, stopping the training to obtain the optimal model weight.
9. The deep learning-based target detection method of claim 8, wherein the second training comprises:
taking the balance factors and the model weight as whale individuals, and initializing the number of the whale individuals, the maximum iteration times T and the number of initial target detection model neurons;
randomly generating the position of the whale individual, and calculating the fitness of the whale individual;
updating the positions of the whale individuals, calculating the fitness of the whale individuals at the moment, and selecting the optimal individuals according to the fitness;
stopping training when the loss function value reaches the minimum and the model precision meets the requirement, and obtaining the optimal model weight;
wherein F denotes the fitness of the whale individual.
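The WOA iteration of claim 9 (random initialization, fitness evaluation, position update, best-individual selection) can be sketched on a toy minimization problem. In this sketch the whale positions stand in for the balance factor and model weights, and the fitness is simply the loss to be minimized; the population size, iteration count, search bounds and sphere loss are all illustrative.

```python
import numpy as np

def woa_minimize(loss, dim, n_whales=20, max_iter=200, seed=0):
    """Minimal Whale Optimization Algorithm sketch: returns the best
    position found for minimizing `loss` over R^dim."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-5, 5, (n_whales, dim))        # random whale positions
    best = min(X, key=loss).copy()                 # initial best individual
    for t in range(max_iter):
        a = 2 - 2 * t / max_iter                   # a decreases from 2 to 0
        for i in range(n_whales):
            r, p = rng.random(dim), rng.random()
            A, C = 2 * a * r - a, 2 * rng.random(dim)
            if p < 0.5:
                if np.all(np.abs(A) < 1):          # encircle the best whale
                    X[i] = best - A * np.abs(C * best - X[i])
                else:                              # explore a random whale
                    Xr = X[rng.integers(n_whales)]
                    X[i] = Xr - A * np.abs(C * Xr - X[i])
            else:                                  # spiral update around best
                l = rng.uniform(-1, 1)
                D = np.abs(best - X[i])
                X[i] = D * np.exp(l) * np.cos(2 * np.pi * l) + best
            if loss(X[i]) < loss(best):            # keep the best individual
                best = X[i].copy()
    return best
```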
10. The deep learning-based target detection method of claim 9, wherein the model accuracy comprises:
and inputting the test set into the model obtained after the first training for testing, and performing the precision evaluation by adopting the accuracy (ACC).
CN202211270276.2A 2022-10-18 2022-10-18 Target detection method based on deep learning Active CN115346125B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211270276.2A CN115346125B (en) 2022-10-18 2022-10-18 Target detection method based on deep learning


Publications (2)

Publication Number Publication Date
CN115346125A true CN115346125A (en) 2022-11-15
CN115346125B CN115346125B (en) 2023-03-24

Family

ID=83957722

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211270276.2A Active CN115346125B (en) 2022-10-18 2022-10-18 Target detection method based on deep learning

Country Status (1)

Country Link
CN (1) CN115346125B (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112215795A (en) * 2020-09-02 2021-01-12 苏州超集信息科技有限公司 Intelligent server component detection method based on deep learning
CN112417099A (en) * 2020-11-20 2021-02-26 南京邮电大学 Method for constructing fraud user detection model based on graph attention network
CN112668440A (en) * 2020-12-24 2021-04-16 西安电子科技大学 SAR ship target detection method based on regression loss of balance sample
CN113468803A (en) * 2021-06-09 2021-10-01 淮阴工学院 Improved WOA-GRU-based flood flow prediction method and system
CN114021935A (en) * 2021-10-29 2022-02-08 陕西科技大学 Aquatic product safety early warning method based on improved convolutional neural network model
CN114694144A (en) * 2022-06-01 2022-07-01 南京航空航天大学 Intelligent identification and rating method for non-metallic inclusions in steel based on deep learning


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
REHAM BARHAM等: ""Link Prediction Based on Whale Optimization Algorithm"", 《2017 INTERNATIONAL CONFERENCE ON NEW TRENDS IN COMPUTING SCIENCES》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116405310A (en) * 2023-04-28 2023-07-07 北京宏博知微科技有限公司 Network data security monitoring method and system
CN116405310B (en) * 2023-04-28 2024-03-15 北京宏博知微科技有限公司 Network data security monitoring method and system

Also Published As

Publication number Publication date
CN115346125B (en) 2023-03-24

Similar Documents

Publication Publication Date Title
CN107704857B (en) End-to-end lightweight license plate recognition method and device
CN109685152B (en) Image target detection method based on DC-SPP-YOLO
CN109190537B (en) Mask perception depth reinforcement learning-based multi-person attitude estimation method
CN107220618B (en) Face detection method and device, computer readable storage medium and equipment
CN112052787B (en) Target detection method and device based on artificial intelligence and electronic equipment
CN111401516B (en) Searching method for neural network channel parameters and related equipment
CN107529650B (en) Closed loop detection method and device and computer equipment
JP7233807B2 (en) Computer-implemented method, computer system, and computer program for simulating uncertainty in artificial neural networks
CN111414987B (en) Training method and training device of neural network and electronic equipment
CN104866868A (en) Metal coin identification method based on deep neural network and apparatus thereof
CN111160217B (en) Method and system for generating countermeasure sample of pedestrian re-recognition system
CN112036381B (en) Visual tracking method, video monitoring method and terminal equipment
CN114140683A (en) Aerial image target detection method, equipment and medium
CN111105017A (en) Neural network quantization method and device and electronic equipment
CN111626379B (en) X-ray image detection method for pneumonia
CN115346125B (en) Target detection method based on deep learning
CN112364974B (en) YOLOv3 algorithm based on activation function improvement
EP3822864A1 (en) Method and apparatus with deep neural network model fusing
CN111428566B (en) Deformation target tracking system and method
CN110490058B (en) Training method, device and system of pedestrian detection model and computer readable medium
CN112420125A (en) Molecular attribute prediction method and device, intelligent equipment and terminal
CN115439708A (en) Image data processing method and device
WO2022100607A1 (en) Method for determining neural network structure and apparatus thereof
CN112634174B (en) Image representation learning method and system
CN113407820A (en) Model training method, related system and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant