CN111260060B - Object detection neural network hybrid training method and system based on dynamic intensity - Google Patents

Object detection neural network hybrid training method and system based on dynamic intensity

Info

Publication number
CN111260060B
CN111260060B (application CN202010104069.4A)
Authority
CN
China
Prior art keywords
data
loss
mixed
neural network
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010104069.4A
Other languages
Chinese (zh)
Other versions
CN111260060A (en)
Inventor
何发智
全权
李博文
邓杰希
舒凌轩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN202010104069.4A priority Critical patent/CN111260060B/en
Publication of CN111260060A publication Critical patent/CN111260060A/en
Application granted granted Critical
Publication of CN111260060B publication Critical patent/CN111260060B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Abstract

The invention discloses a dynamic intensity-based object detection neural network hybrid training method, which alleviates the data memorization problem of a detection network and improves the generalization of the detection network on the test set. Constructing mixed data expands the data set to a certain degree and additionally introduces linear relations between data of different classes, improving the expressiveness of the model. Through the setting of the dynamic mixing parameter, the difficulty that the hybrid training method imposes on the detection network is reduced and the training process is smoothed, so that an optimal model is obtained more easily. Compared with other training methods, the trained model generalizes better.

Description

Object detection neural network hybrid training method and system based on dynamic intensity
Technical Field
The invention relates to the technical field of computer applications and computer vision, in particular to neural network optimization methods, and specifically to a dynamic intensity-based object detection neural network hybrid training method and system.
Background
Neural networks are a common approach to computer vision problems. Components such as convolutional layers, pooling layers, fully-connected layers and activation functions are combined into a network in a certain way, and the network is then trained with sufficient task-related data. When the trained network can correctly process data other than the training data, training is considered complete. The resulting network can then be used as a black-box function to solve the problem at hand.
The training method of a neural network refers to the training strategy adopted during its training; with different training strategies, models of different quality are ultimately obtained. A training method generally comprises several parts: data preprocessing, an auxiliary network for training, and a loss function. Modifying any of these parts affects the training process and the final result of the neural network.
The inventors of the present application found, in the process of implementing the present invention, that the prior-art methods have at least the following technical problem:
the existing training methods depend excessively on the original data set, and the training difficulty is high.
Disclosure of Invention
In view of this, the present invention provides a dynamic intensity-based object detection neural network hybrid training method and system, so as to solve, or at least partially solve, the technical problem that the existing training methods of the prior art rely too heavily on the original data set and are difficult to train.
In order to solve the above technical problem, a first aspect of the present invention provides a dynamic intensity-based object detection neural network hybrid training method, including:
s1: constructing an object detection neural network, and initializing all parameters of the object detection neural network;
s2: acquiring original training data and preprocessing the original training data, wherein the original training data comprise image data and label data, and the label data comprise category information and position information of all marked objects in an image;
s3: acquiring a dynamic mixing parameter, randomly selecting first data and second data from the preprocessed training data, mixing the selected first data and second data by using the dynamic mixing parameter as a mixing intensity adjusting parameter to obtain a mixed data pair, wherein the mixed data pair comprises mixed image data, label data corresponding to the first data and label data corresponding to the second data;
s4: obtaining an expanded data set based on the mixed data pair;
s5: setting a target mixed loss function according to the mixed data in the expanded data set, the label data corresponding to the first data and the label data corresponding to the second data, wherein the target mixed loss function comprises classification loss and position loss;
s6: and taking the expanded data set as training data, and training the object detection neural network by combining the mixed loss function and adopting a stochastic gradient descent method, to obtain the trained object detection neural network.
In one embodiment, S1 includes:
s1.1: constructing a neural network from a convolutional layer, a fully-connected layer, a pooling layer and an activation layer, or using an existing neural network, as the object detection neural network;
s1.2: and initializing all parameters of the object detection neural network by adopting a random parameter initialization method.
In one embodiment, the preprocessing the original training data in S2 specifically includes:
s2.1: performing pixel value normalization on image data in original training data;
s2.2: cropping the picture;
s2.3: flipping the picture.
In one embodiment, S3 specifically includes:
s3.1: and generating a dynamic mixing parameter lambda according to a formula, wherein the specific formula is as follows:
Figure BDA0002387888370000021
wherein the content of the first and second substances,
Figure BDA0002387888370000022
is a preset maximum lambda value, alpha is a change rate adjusting parameter, n is a turning iteration number, and epoch is a current iteration number in the training process;
s3.2: randomly selecting first data and second data from the preprocessed training data as mixed original data, and linearly mixing the two data according to the proportion of a dynamic mixing parameter lambda, wherein the specific operation is as follows:
x=λx1+(1-λ)x2
the resulting mixed data pair was (x, y)1,y2) Where x is the blended image data and y1Is the first data x1Corresponding tag data, y2Is the second data x2The corresponding tag data.
In one embodiment, S4 includes integrating the mixed data pairs into an expanded data set D̃:
D̃ = {(x̃i, ypi, yqi)}, where x̃i = λxpi + (1-λ)xqi and (xpi, ypi), (xqi, yqi) ∈ D
Here λ is the dynamic mixing parameter, xp and xq denote the two data to be mixed, yp denotes the label data of xp, yq denotes the label data of xq, and D is the original training data set.
In one embodiment, S5 specifically includes:
s5.1: using the mixed image data as input image data, computing the classification loss against the category information of the two label data ypi, yqi respectively:
loss_cls(θ) = λ·loss_i(θ) + (1-λ)·loss_j(θ)
loss_i(θ) = (1/m) Σ(i=1..m) L_CE(fθ(x̃i), ypi)
loss_j(θ) = (1/m) Σ(i=1..m) L_CE(fθ(x̃i), yqi)
where (x̃i, ypi, yqi) ∈ D̃; D̃ is the expanded data set, m is the size of the expanded data set, θ denotes the parameters to be optimized in the neural network, L_CE is the cross-entropy function, λ is the dynamic mixing parameter, x̃i is the mixed image data, ypi and yqi are the label data of the first data and of the second data respectively, fθ is the neural network model, fθ(x̃i) is the network's prediction for x̃i, loss_i(θ) is the classification loss of the first data, loss_j(θ) is the classification loss of the second data, and loss_cls(θ) is the mixed classification loss function;
s5.2: using the mixed data as input image data, computing the position loss against the position information of the two label data ypi, yqi respectively:
loss_loc(θ) = λ·loss_i(θ)′ + (1-λ)·loss_j(θ)′
loss_i(θ)′ = (1/m) Σ(i=1..m) L_SM(fθ(x̃i), ypi)
loss_j(θ)′ = (1/m) Σ(i=1..m) L_SM(fθ(x̃i), yqi)
where loss_i(θ)′ is the position loss of the first data, loss_j(θ)′ is the position loss of the second data, loss_loc(θ) is the mixed position loss function, θ denotes the parameters to be optimized in the neural network, L_SM is the smooth L1 loss function, (x̃i, ypi, yqi) are the mixed image data and its label data, fθ is the neural network model, and fθ(x̃i) is the network's prediction for x̃i;
s5.3: combining the mixed classification loss function with the mixed position loss function to obtain the target mixed loss function:
Loss(θ) = loss_cls(θ) + γ·loss_loc(θ)
where Loss(θ) is the target mixed loss function and γ is a weighting parameter balancing the classification loss against the position loss.
In one embodiment, S6 includes:
s6.1: randomly selecting a series of data from the expanded data set;
s6.2: inputting the obtained series of data into the object detection neural network to obtain the corresponding prediction results, and computing the loss value between the prediction results and the actual results through the target mixed loss function;
s6.3: parameters of the entire network are updated using back propagation.
Based on the same inventive concept, the second aspect of the present invention provides a dynamic intensity-based object detection neural network hybrid training system, comprising:
the neural network construction module is used for constructing an object detection neural network and initializing all parameters of the object detection neural network;
the data preprocessing module is used for acquiring original training data and preprocessing the original training data, wherein the original training data comprise image data and label data, and the label data comprise category information and position information of all marked objects in an image;
the data mixing module is used for acquiring dynamic mixing parameters, randomly selecting first data and second data from the preprocessed training data, mixing the selected first data and second data by using the dynamic mixing parameters as mixing intensity adjusting parameters to obtain a mixed data pair, wherein the mixed data pair comprises mixed image data, label data corresponding to the first data and label data corresponding to the second data;
an extended data set obtaining module for obtaining an extended data set based on the mixed data pair;
a loss function setting module, configured to set a target mixed loss function according to the mixed data in the expanded data set, the tag data corresponding to the first data, and the tag data corresponding to the second data, where the target mixed loss function includes a classification loss and a position loss;
and the training module is used for training the object detection neural network by using the expanded data set as training data, combining the mixed loss function and adopting a stochastic gradient descent method, to obtain the trained object detection neural network.
Based on the same inventive concept, a third aspect of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed, performs the method of the first aspect.
Based on the same inventive concept, a fourth aspect of the present invention provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method according to the first aspect when executing the program.
One or more technical solutions in the embodiments of the present application at least have one or more of the following technical effects:
the invention provides a dynamic-intensity-based object detection neural network hybrid training method, which comprises the steps of firstly constructing an object detection neural network, then preprocessing original training data, then acquiring dynamic hybrid parameters, and mixing selected first data and second data by taking the dynamic hybrid parameters as hybrid intensity adjusting parameters; then obtaining an expanded data set based on the mixed data pair; then, setting a target mixing loss function according to the mixed data in the expanded data set, the label data corresponding to the first data and the label data corresponding to the second data; and finally, taking the expanded data set as training data, combining a mixed loss function, and training the object detection neural network by adopting a random gradient descent method to obtain the trained object detection neural network. The selected first data and the selected second data are mixed by taking the dynamic mixing parameters as the mixing intensity adjusting parameters, so that a mixed data pair can be constructed, a data set is expanded to a certain degree, the training difficulty of the mixed training method on a detection network is reduced, the training process is smoothed, and the optimal model is obtained more easily. Compared with other training methods, the trained model has better generalization, and the technical problem that the existing training method in the prior art excessively depends on an original data set and is difficult to train is solved.
Furthermore, the two selected data are linearly mixed in the proportion given by the dynamic mixing parameter λ; that is, linear relations between data of different classes are additionally introduced, improving the expressiveness of the model.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a schematic flow chart illustrating an implementation of a dynamic intensity-based object detection neural network hybrid training method according to an embodiment;
FIG. 2 is a schematic diagram of the calculation of the target mixed loss function;
FIG. 3 is a block diagram of a dynamic intensity-based object detection neural network hybrid training system according to an embodiment of the present invention;
FIG. 4 is a block diagram of a computer-readable storage medium according to an embodiment of the present invention;
fig. 5 is a block diagram of a computer device in an embodiment of the present invention.
Detailed Description
The invention provides a novel training method for an object detection neural network, aiming to improve on existing training methods: through dynamic parameter adjustment and hybrid training, the generalization and robustness of the neural network are improved, and the performance of the model on test data is improved overall.
The general inventive concept of the present invention is as follows:
firstly, an object detection neural network is constructed; the original training data are then preprocessed; a dynamic mixing parameter is acquired and used as the mixing-intensity adjustment parameter to mix the selected first data and second data; an expanded data set is then obtained from the mixed data pairs; a target mixed loss function is set according to the mixed data in the expanded data set, the label data corresponding to the first data and the label data corresponding to the second data; finally, with the expanded data set as training data and in combination with the mixed loss function, the object detection neural network is trained by stochastic gradient descent to obtain the trained object detection neural network.
According to the method, constructing mixed data expands the data set to a certain degree and additionally introduces linear relations between data of different classes, improving the expressiveness of the model. Through the setting of the dynamic mixing parameter, the difficulty that the hybrid training method imposes on the detection network is reduced and the training process is smoothed, so that an optimal model is obtained more easily. Compared with other training methods, the trained model generalizes better.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example one
This embodiment provides a dynamic intensity-based object detection neural network hybrid training method; referring to FIG. 1, the method includes:
s1: and constructing an object detection neural network, and initializing all parameters of the object detection neural network.
Specifically, the object detection neural network may be constructed by using a theory related to the neural network, or an existing neural network architecture may be used as the object detection neural network.
S2: the method comprises the steps of obtaining original training data and preprocessing the original training data, wherein the original training data comprise image data and label data, and the label data comprise category information and position information of all marked objects in an image.
Specifically, the raw training data may be obtained from an existing data set. For example, let D = {(x, y)} denote the object detection data used for network training, where x denotes image data and y denotes label data; the label data include the category information c and the position information l of all labeled objects within the image. Before neural network training, the image data in the training data require preprocessing such as pixel normalization and picture cropping.
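For concreteness, one way to represent such a training sample in Python is sketched below; the class name and field names are illustrative choices, not part of the patent.

```python
from dataclasses import dataclass
from typing import List, Tuple
import numpy as np

@dataclass
class DetectionSample:
    image: np.ndarray                                    # x: H x W x C pixel array
    classes: List[int]                                   # c: category of each labeled object
    boxes: List[Tuple[float, float, float, float]]       # l: position of each labeled object
```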
S3: and acquiring dynamic mixing parameters, randomly selecting first data and second data from the preprocessed training data, mixing the selected first data and second data by using the dynamic mixing parameters as mixing intensity adjusting parameters, and acquiring a mixed data pair, wherein the mixed data pair comprises mixed image data, label data corresponding to the first data and label data corresponding to the second data.
Specifically, the selected first data and second data are mixed with the dynamic mixing parameter as the mixing-intensity adjustment parameter; the magnitude of the dynamic mixing parameter indicates the mixing intensity. The label data corresponding to the first data and the label data corresponding to the second data both include category information and position information.
S4: an expanded data set is obtained based on the mixed data pair.
S5: and setting a target mixed loss function according to the mixed data in the expanded data set, the label data corresponding to the first data and the label data corresponding to the second data, wherein the target mixed loss function comprises classification loss and position loss.
Specifically, the classification loss in the target mixture loss function is calculated from the category information in the tag data, and the position loss is calculated from the position information in the tag data.
S6: and taking the expanded data set as training data, and training the object detection neural network by combining the mixed loss function and adopting a stochastic gradient descent method, to obtain the trained object detection neural network.
Specifically, a neural-network-based classification and detection task uses the neural network as a tool: object detection data are input into the network, which feeds back the corresponding object detection results. In the training stage, a large amount of detection data serves as training data; once the network can correctly detect the existing training data, it is essentially trained. The trained network is finally used to process the object detection data that actually need to be processed.
After the expanded data set and the target mixed loss function are obtained, the object detection neural network in S1 may be trained by using a stochastic gradient descent method, so as to obtain the trained object detection neural network.
In one embodiment, S1 includes:
s1.1: constructing a neural network from a convolutional layer, a fully-connected layer, a pooling layer and an activation layer, or using an existing neural network, as the object detection neural network;
s1.2: and initializing all parameters of the object detection neural network by adopting a random parameter initialization method.
In one embodiment, the preprocessing the original training data in S2 specifically includes:
s2.1: performing pixel value normalization on image data in original training data;
s2.2: cropping the picture;
s2.3: flipping the picture.
Specifically, pixel-value normalization may be implemented as follows: first divide all pixel values by 255, limiting them to [0, 1]; then compute the mean over all pixel values of all images and subtract it from every pixel value, so that the data are zero-centered. Picture cropping may proceed as follows: the selected picture is first padded at the edges with zeros (for example, with a padding width of one quarter or one half of the side length); an area of the same size as the original image is then randomly selected from the padded picture and cropped out as new data. Picture flipping may proceed as follows: the image is flipped horizontally about its vertical center line with a certain probability and left unchanged otherwise; for example, the flip probability may be set to 0.5.
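As a concrete illustration, the following Python/NumPy sketch implements these three preprocessing steps for H x W x C images. The function names and the pad_fraction default are illustrative choices; the box-coordinate adjustment that a detector would also need for cropping and flipping is omitted for brevity.

```python
import numpy as np

def normalize(images):
    """Scale pixel values to [0, 1], then subtract the mean of the passed array (zero-centering)."""
    images = images.astype(np.float32) / 255.0
    return images - images.mean()

def random_pad_crop(image, pad_fraction=0.25, rng=np.random):
    """Zero-pad the borders, then crop a random region of the original size."""
    h, w = image.shape[:2]
    ph, pw = int(h * pad_fraction), int(w * pad_fraction)
    padded = np.pad(image, ((ph, ph), (pw, pw), (0, 0)), mode="constant")
    top = rng.randint(0, 2 * ph + 1)    # random crop origin inside the padded image
    left = rng.randint(0, 2 * pw + 1)
    return padded[top:top + h, left:left + w]

def random_hflip(image, p=0.5, rng=np.random):
    """Flip horizontally about the vertical center line with probability p."""
    return image[:, ::-1].copy() if rng.rand() < p else image
```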
In one embodiment, S3 specifically includes:
s3.1: and generating a dynamic mixing parameter lambda according to a formula, wherein the specific formula is as follows:
Figure BDA0002387888370000081
wherein the content of the first and second substances,
Figure BDA0002387888370000082
is a preset maximum lambda value, alpha is a change rate adjusting parameter, n is a turning iteration number, and epoch is a current iteration number in the training process;
s3.2: randomly selecting first data and second data from the preprocessed training data as mixed original data, and linearly mixing the two data according to the proportion of a dynamic mixing parameter lambda, wherein the specific operation is as follows:
x=λx1+(1-λ)x2
the resulting mixed data pair was (x, y)1,y2) Where x is the blended image data and y1Is the first data x1Corresponding tag data, y2As second data x2Corresponding labelAnd (4) data.
Specifically, the dynamic mixing parameter λ mainly adjusts the degree to which the image data are subsequently mixed: the smaller the value of λ, the weaker the mixing intensity. This embodiment adopts a linear mixing operation.
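For illustration, the following sketch pairs one possible λ schedule with the linear mixing step. The patent's exact λ formula survives only as an image, so the sigmoid schedule below is an assumption chosen to match the stated roles of λ_max (ceiling), α (change rate) and n (turning iteration); only the mixing rule x = λx1 + (1-λ)x2 is taken directly from the text.

```python
import numpy as np

def dynamic_lambda(epoch, lambda_max=0.5, alpha=0.1, n=30):
    # ASSUMED schedule: the patent gives the formula only as an image. This
    # sigmoid rises from near 0 toward lambda_max, with its turning point at
    # epoch = n and its steepness controlled by alpha.
    return lambda_max / (1.0 + np.exp(-alpha * (epoch - n)))

def mix_pair(x1, y1, x2, y2, lam):
    """Linear mixing from the patent: x = lam*x1 + (1-lam)*x2; both labels are kept."""
    x = lam * x1 + (1.0 - lam) * x2
    return x, y1, y2   # the mixed data pair (x, y1, y2)
```

Under this assumed schedule the network first sees almost unmixed images and only gradually sees strongly mixed ones, which is one way to realize the smoothed training process described above.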
In one embodiment, S4 includes integrating the mixed data pairs into an expanded data set D̃:
D̃ = {(x̃i, ypi, yqi)}, where x̃i = λxpi + (1-λ)xqi and (xpi, ypi), (xqi, yqi) ∈ D
Here λ is the dynamic mixing parameter, xp and xq denote the two data to be mixed, yp denotes the label data of xp, yq denotes the label data of xq, and D is the original training data set.
Specifically, different mixed data pairs are obtained after data mixing; merging these mixed data pairs together yields the expanded data set.
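A minimal sketch of this merging step, assuming the original data set is a list of (image, label) pairs; the helper name and the sampling-with-replacement choice are illustrative rather than prescribed by the patent:

```python
import numpy as np

def build_expanded_dataset(dataset, lam, size, rng=np.random):
    """Randomly pair samples from the original data set and mix each pair."""
    expanded = []
    for _ in range(size):
        i, j = rng.randint(len(dataset)), rng.randint(len(dataset))
        (x1, y1), (x2, y2) = dataset[i], dataset[j]
        x = lam * x1 + (1.0 - lam) * x2      # linear mixing with ratio lambda
        expanded.append((x, y1, y2))         # keep both source labels
    return expanded
```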
In one embodiment, S5 specifically includes:
s5.1: using the mixed image data as input image data, computing the classification loss against the category information of the two label data ypi, yqi respectively:
loss_cls(θ) = λ·loss_i(θ) + (1-λ)·loss_j(θ)
loss_i(θ) = (1/m) Σ(i=1..m) L_CE(fθ(x̃i), ypi)
loss_j(θ) = (1/m) Σ(i=1..m) L_CE(fθ(x̃i), yqi)
where (x̃i, ypi, yqi) ∈ D̃; D̃ is the expanded data set, m is the size of the expanded data set, θ denotes the parameters to be optimized in the neural network, L_CE is the cross-entropy function, λ is the dynamic mixing parameter, x̃i is the mixed image data, ypi and yqi are the label data of the first data and of the second data respectively, fθ is the neural network model, fθ(x̃i) is the network's prediction for x̃i, loss_i(θ) is the classification loss of the first data, loss_j(θ) is the classification loss of the second data, and loss_cls(θ) is the mixed classification loss function;
s5.2: using the mixed data as input image data, computing the position loss against the position information of the two label data ypi, yqi respectively:
loss_loc(θ) = λ·loss_i(θ)′ + (1-λ)·loss_j(θ)′
loss_i(θ)′ = (1/m) Σ(i=1..m) L_SM(fθ(x̃i), ypi)
loss_j(θ)′ = (1/m) Σ(i=1..m) L_SM(fθ(x̃i), yqi)
where loss_i(θ)′ is the position loss of the first data, loss_j(θ)′ is the position loss of the second data, loss_loc(θ) is the mixed position loss function, θ denotes the parameters to be optimized in the neural network, L_SM is the smooth L1 loss function, (x̃i, ypi, yqi) are the mixed image data and its label data, fθ is the neural network model, and fθ(x̃i) is the network's prediction for x̃i;
s5.3: combining the mixed classification loss function with the mixed position loss function to obtain the target mixed loss function:
Loss(θ) = loss_cls(θ) + γ·loss_loc(θ)
where Loss(θ) is the target mixed loss function and γ is a weighting parameter balancing the classification loss against the position loss.
Specifically, the loss function is the function the neural network uses to represent the difference between the predicted result and the actual result. The mixed loss function is divided into two parts: a classification loss function and a position-detection loss function. The classification loss function represents the difference between the network's prediction of the object categories in the image content and the object categories in the actual image; this difference is quantified using the cross entropy. The classification loss function can be expressed as:
loss_cls(θ) = (1/m) Σ(i=1..m) L_CE(fθ(xi), yi)
where loss_cls(θ) is the classification loss function, θ denotes the parameters to be optimized in the neural network, L_CE is the cross-entropy function, xi is the input image data, yi is the label data corresponding to xi, fθ is the neural network model, and fθ(xi) is the network's prediction for xi. The invention adopts a mixed classification loss function: with the mixed data x as input image data, the losses loss_i(θ) and loss_j(θ) are computed against the category information of the two label data ypi, yqi respectively, and the two losses are then summed with weights given by the dynamic mixing parameter.
Similarly, the position loss function represents the difference between the network's prediction of the object positions in the image content and the object positions in the actual image; the difference between the predicted position and the actual position is usually quantified with the smooth L1 loss (Smooth L1 Loss):
L_SM(z) = 0.5z², if |z| < 1; |z| - 0.5, otherwise
The mixed position loss function is analogous to the mixed classification loss function: with the mixed data x as input image data, the losses loss_i(θ)′ and loss_j(θ)′ are computed against the position information l of the two label data ypi, yqi respectively, and the two losses are then summed with weights given by the dynamic mixing parameter.
After the mixed classification loss function and the mixed position loss function are obtained respectively, the two are combined into the final loss function. The calculation principle of the final loss function is shown in FIG. 2, where classification loss 1 denotes the classification loss of the first data (picture 1), classification loss 2 the classification loss of the second data (picture 2), regression loss 1 the position loss of the first data, and regression loss 2 the position loss of the second data.
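The following PyTorch sketch shows the shape of this computation under a deliberately simplified interface (one class-logit vector and one box per image); a real detector would evaluate both terms over matched anchors or proposals. The model output format and the 'cls'/'box' dictionary keys are assumptions for illustration, not part of the patent.

```python
import torch
import torch.nn.functional as F

def mixed_loss(model, x, y1, y2, lam, gamma=1.0):
    """Target mixed loss Loss = loss_cls + gamma * loss_loc on a mixed batch x.

    y1 and y2 hold the class indices and box targets of the first and second
    source images; the single-box-per-image interface is a simplification.
    """
    cls_logits, box_pred = model(x)   # assumed model output format
    # Mixed classification loss: cross entropy against both source labels.
    loss_cls = lam * F.cross_entropy(cls_logits, y1["cls"]) \
             + (1.0 - lam) * F.cross_entropy(cls_logits, y2["cls"])
    # Mixed position loss: smooth L1 against both source box targets.
    loss_loc = lam * F.smooth_l1_loss(box_pred, y1["box"]) \
             + (1.0 - lam) * F.smooth_l1_loss(box_pred, y2["box"])
    return loss_cls + gamma * loss_loc
```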
In one embodiment, S6 includes:
s6.1: randomly selecting a series of data from the expanded data set;
s6.2: inputting the obtained series of data into the object detection neural network to obtain the corresponding prediction results, and computing the loss value between the prediction results and the actual results through the target mixed loss function;
s6.3: parameters of the entire network are updated using back propagation.
Specifically, the objective is to minimize the target loss function. Computing the partial derivative of the loss with respect to each parameter yields the gradient of the current round, and the parameters are then updated in the direction opposite to the gradient. By iterating this update continually, a near-optimal solution for the parameters, i.e. the optimal model parameters, can be obtained.
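A sketch of one training epoch under these update rules, reusing the mixed_loss helper from the previous sketch; the data loader is assumed to yield (x, y1, y2) triples from the expanded data set:

```python
import torch

def train_one_epoch(model, loader, optimizer, lam, gamma=1.0):
    """One pass of stochastic gradient descent over the expanded data set."""
    model.train()
    for x, y1, y2 in loader:
        optimizer.zero_grad()
        loss = mixed_loss(model, x, y1, y2, lam, gamma)  # target mixed loss
        loss.backward()    # back-propagate to get gradients w.r.t. all parameters
        optimizer.step()   # step opposite to the gradient
    return model

# Usage (illustrative): optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
```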
In general, the dynamic-parameter-based object detection network hybrid training method provided by the invention alleviates the data memorization problem of the detection network and improves its generalization on the test set. Constructing mixed data expands the data set to a certain degree and additionally introduces linear relations between data of different classes, improving the expressiveness of the model. Through the setting of the dynamic mixing parameter, the difficulty that the hybrid training method imposes on the detection network is reduced and the training process is smoothed, so that an optimal model is obtained more easily. Compared with other training methods, the trained model generalizes better.
Example two
Based on the same inventive concept, the present embodiment provides a dynamic intensity-based object detection neural network hybrid training system, please refer to fig. 3, which includes:
a neural network construction module 201, configured to construct an object detection neural network, and initialize all parameters of the object detection neural network;
the data preprocessing module 202 is configured to acquire original training data and preprocess the original training data, where the original training data includes image data and tag data, and the tag data includes category information and position information of all tagged objects in an image;
the data mixing module 203 is configured to obtain a dynamic mixing parameter, randomly select first data and second data from the preprocessed training data, mix the selected first data and second data with the dynamic mixing parameter as a mixing intensity adjustment parameter, and obtain a mixed data pair, where the mixed data pair includes mixed image data, tag data corresponding to the first data, and tag data corresponding to the second data;
an extended data set obtaining module 204, configured to obtain an extended data set based on the mixed data pair;
a loss function setting module 205, configured to set a target mixed loss function according to the mixed data in the expanded data set, the tag data corresponding to the first data, and the tag data corresponding to the second data, where the target mixed loss function includes a classification loss and a position loss;
and the training module 206 is configured to train the object detection neural network by using the expanded data set as training data, in combination with the mixed loss function and using a stochastic gradient descent method, so as to obtain the trained object detection neural network.
Since the system described in the second embodiment of the present invention is the system adopted for implementing the dynamic intensity-based object detection neural network hybrid training method of the first embodiment, a person skilled in the art can understand the specific structure and variants of the system based on the method described in the first embodiment, and details are therefore not repeated here. All systems adopted by the method of the first embodiment of the present invention fall within the intended protection scope of the present invention.
EXAMPLE III
Referring to fig. 4, based on the same inventive concept, the present application further provides a computer readable storage medium 300, on which a computer program 311 is stored, which when executed, implements the method as described in the first embodiment.
Since the computer-readable storage medium introduced in the third embodiment of the present invention is the computer-readable storage medium used for implementing the dynamic intensity-based object detection neural network hybrid training method of the first embodiment, a person skilled in the art can understand its specific structure and variants based on the method introduced in the first embodiment, and details are therefore not repeated here. Any computer-readable storage medium used in the method of the first embodiment of the present invention falls within the scope of the present invention.
Example four
Based on the same inventive concept, the present application further provides a computer device, please refer to fig. 5, which includes a storage 401, a processor 402, and a computer program 403 stored in the storage and running on the processor, and when the processor 402 executes the above program, the method in the first embodiment is implemented.
Since the computer device introduced in the fourth embodiment of the present invention is the computer device used for implementing the dynamic intensity-based object detection neural network hybrid training method of the first embodiment, a person skilled in the art can understand its specific structure and variants based on the method introduced in the first embodiment, and details are therefore not repeated here. All computer devices used in the method of the first embodiment of the present invention fall within the scope of the present invention.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made in the embodiments of the present invention without departing from the spirit or scope of the embodiments of the invention. Thus, if such modifications and variations of the embodiments of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to encompass such modifications and variations.

Claims (8)

1. A dynamic intensity-based object detection neural network hybrid training method is characterized by comprising the following steps:
s1: constructing an object detection neural network, and initializing all parameters of the object detection neural network;
s2: acquiring original training data and preprocessing the original training data, wherein the original training data comprises image data and label data, and the label data comprises category information and position information of all marked objects in an image;
s3: acquiring a dynamic mixing parameter, randomly selecting first data and second data from the preprocessed training data, mixing the selected first data and second data by using the dynamic mixing parameter as a mixing intensity adjusting parameter to obtain a mixed data pair, wherein the mixed data pair comprises mixed image data, label data corresponding to the first data and label data corresponding to the second data;
s4: obtaining an expanded data set based on the mixed data pair;
s5: setting a target mixed loss function according to the mixed data in the expanded data set, the label data corresponding to the first data and the label data corresponding to the second data, wherein the target mixed loss function comprises classification loss and position loss;
s6: training the object detection neural network by using the expanded data set as training data, combining the mixed loss function and adopting a stochastic gradient descent method, to obtain a trained object detection neural network;
wherein, preprocessing the original training data in S2 specifically includes:
s2.1: performing pixel value normalization on image data in original training data;
s2.2: cropping the picture;
s2.3: flipping the picture;
s3 specifically includes:
s3.1: and generating a dynamic mixing parameter lambda according to a formula, wherein the specific formula is as follows:
Figure FDA0003609450090000011
wherein the content of the first and second substances,
Figure FDA0003609450090000012
is a preset maximum lambda value, alpha is a change rate adjusting parameter, n is a turning iteration number, and epoch is a current iteration number in the training process;
s3.2: randomly selecting first data and second data from the preprocessed training data as mixed original data, and linearly mixing the two data according to the proportion of a dynamic mixing parameter lambda, wherein the specific operation is as follows:
x=λx1+(1-λ)x2
the resulting mixed data pair was (x, y)1,y2) Where x is the blended image data and y1Is the first data x1Corresponding tag data, y2Is the second data x2The corresponding tag data.
2. The method of claim 1, wherein S1 includes:
s1.1: constructing a neural network from a convolutional layer, a fully-connected layer, a pooling layer and an activation layer, or using an existing neural network, as the object detection neural network;
s1.2: and initializing all parameters of the object detection neural network by adopting a random parameter initialization method.
3. The method of claim 1, wherein S4 includes integrating the mixed data into an expanded data set D̃:
D̃ = {(x̃i, ypi, yqi)}, where x̃i = λxpi + (1-λ)xqi and (xpi, ypi), (xqi, yqi) ∈ D
Here λ is the dynamic mixing parameter, xpi and xqi denote the two data to be mixed, x̃i denotes the mixed image data, ypi and yqi are the label data of the first data and of the second data respectively, and D is the original training data set.
4. The method according to claim 3, wherein S5 specifically includes:
s5.1: the mixed image data is used as input image data, and two label data y related to the input image data are respectivelypi,yqiThe loss is calculated from the class information of (c),
losscls(θ)=λlossi(θ)+(1-λ)lossj(θ)
Figure FDA0003609450090000026
Figure FDA0003609450090000027
wherein
Figure FDA0003609450090000028
Figure FDA0003609450090000029
For the expanded data set, m is the size of the expanded data, θ is the parameter to be optimized in the neural network, LCEAs a cross entropy function, λ is a dynamic mixing parameter,
Figure FDA00036094500900000210
for the mixed image data, ypi,yqiTag data of the first data and tag data of the second data, respectively, fθIn order to be a neural network model,
Figure FDA00036094500900000211
as a neural network pair
Figure FDA00036094500900000212
Predicted value of (1), lossi(θ) is the loss of classification of the first data, lossj(θ) is the loss of classification of the second data, losscls(θ) is a mixed classification loss function;
s5.2: using the mixed data as input image data, two label data y respectively associated with the input image datapi,yqiThe loss calculation of the position information specifically includes:
lossloc(θ)=λlossi(θ)′+(1-λ)lossj(θ)′
Figure FDA0003609450090000031
Figure FDA0003609450090000032
therein, lossi(θ)' is the loss of position, loss, of the first dataj(θ)' is the loss of position, loss, of the second dataloc(theta) is the hybrid position loss function, theta is the parameter to be optimized in the neural network, LsMIn order to smooth the L1 loss function,
Figure FDA0003609450090000033
for the mixed image data and its label data, fθIn order to be a neural network model,
Figure FDA0003609450090000034
as a neural network pair
Figure FDA0003609450090000035
The predicted value of (2);
s5.3: combining the mixed classification loss function with the mixed position loss function to obtain a target mixed loss function:
Loss(θ)=losscls(θ)+γlossloc(θ)
where Loss (θ) is the target mixture Loss function and γ is a specific gravity parameter of classification Loss and position Loss.
5. The method of claim 1, wherein S6 includes:
s6.1: randomly selecting a series of data from the expanded data set;
s6.2: inputting the obtained series of data into the object detection neural network to obtain the corresponding prediction results, and computing the loss value between the prediction results and the actual results through the target mixed loss function;
s6.3: parameters of the entire network are updated using back propagation.
6. An object detection neural network hybrid training system based on dynamic intensity, comprising:
the neural network construction module is used for constructing an object detection neural network and initializing all parameters of the object detection neural network;
the data preprocessing module is used for acquiring original training data and preprocessing the original training data, wherein the original training data comprise image data and label data, and the label data comprise category information and position information of all marked objects in an image;
the data mixing module is used for acquiring dynamic mixing parameters, randomly selecting first data and second data from the preprocessed training data, mixing the selected first data and second data by using the dynamic mixing parameters as mixing intensity adjusting parameters to obtain a mixed data pair, wherein the mixed data pair comprises mixed image data, label data corresponding to the first data and label data corresponding to the second data;
an extended data set obtaining module for obtaining an extended data set based on the mixed data pair;
a loss function setting module, configured to set a target mixed loss function according to the mixed data in the expanded data set, the tag data corresponding to the first data, and the tag data corresponding to the second data, where the target mixed loss function includes a classification loss and a position loss;
the training module is used for training the object detection neural network by using the expanded data set as training data, combining the mixed loss function and adopting a stochastic gradient descent method, to obtain a trained object detection neural network;
wherein, the data preprocessing module is specifically configured to:
performing pixel value normalization on image data in original training data;
cropping the picture;
flipping the picture;
the data mixing module is specifically configured to:
generating a dynamic mixing parameter λ according to a formula of the form
λ = f(λ_max, α, n, epoch) [the formula is reproduced only as an image in the original]
where λ_max is a preset maximum value of λ, α is a change-rate adjustment parameter, n is the turning-point iteration number, and epoch is the current iteration number of the training process;
randomly selecting first data and second data from the preprocessed training data as the raw data to be mixed, and linearly mixing the two in the proportion given by the dynamic mixing parameter λ:
x = λx1 + (1-λ)x2
the resulting mixed data pair being (x, y1, y2), where x is the mixed image data, y1 is the label data corresponding to the first data x1, and y2 is the label data corresponding to the second data x2.
7. A computer-readable storage medium, on which a computer program is stored, characterized in that the program, when executed, implements the method of any one of claims 1 to 5.
8. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1 to 5 when executing the program.
CN202010104069.4A 2020-02-20 2020-02-20 Object detection neural network hybrid training method and system based on dynamic intensity Active CN111260060B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010104069.4A CN111260060B (en) 2020-02-20 2020-02-20 Object detection neural network hybrid training method and system based on dynamic intensity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010104069.4A CN111260060B (en) 2020-02-20 2020-02-20 Object detection neural network hybrid training method and system based on dynamic intensity

Publications (2)

Publication Number Publication Date
CN111260060A CN111260060A (en) 2020-06-09
CN111260060B (en) 2022-06-14

Family

ID=70951283

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010104069.4A Active CN111260060B (en) 2020-02-20 2020-02-20 Object detection neural network hybrid training method and system based on dynamic intensity

Country Status (1)

Country Link
CN (1) CN111260060B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108647742A (en) * 2018-05-19 2018-10-12 南京理工大学 Fast target detection method based on lightweight neural network
CN109508746A (en) * 2018-11-16 2019-03-22 西安电子科技大学 Pulsar candidate's body recognition methods based on convolutional neural networks
CN110765873A (en) * 2019-09-19 2020-02-07 华中师范大学 Facial expression recognition method and device based on expression intensity label distribution

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10330787B2 (en) * 2016-09-19 2019-06-25 Nec Corporation Advanced driver-assistance system
US10579897B2 (en) * 2017-10-02 2020-03-03 Xnor.ai Inc. Image based object detection
CN107886073B (en) * 2017-11-10 2021-07-27 重庆邮电大学 Fine-grained vehicle multi-attribute identification method based on convolutional neural network
CN108898188A (en) * 2018-07-06 2018-11-27 四川奇迹云科技有限公司 A kind of image data set aid mark system and method
CN109065030B (en) * 2018-08-01 2020-06-30 上海大学 Convolutional neural network-based environmental sound identification method and system
US10304193B1 (en) * 2018-08-17 2019-05-28 12 Sigma Technologies Image segmentation and object detection using fully convolutional neural network

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108647742A (en) * 2018-05-19 2018-10-12 南京理工大学 Fast target detection method based on lightweight neural network
CN109508746A (en) * 2018-11-16 2019-03-22 西安电子科技大学 Pulsar candidate's body recognition methods based on convolutional neural networks
CN110765873A (en) * 2019-09-19 2020-02-07 华中师范大学 Facial expression recognition method and device based on expression intensity label distribution

Also Published As

Publication number Publication date
CN111260060A (en) 2020-06-09

Similar Documents

Publication Publication Date Title
US10296827B2 (en) Data category identification method and apparatus based on deep neural network
CN107330956B (en) Cartoon hand drawing unsupervised coloring method and device
CN106803067B (en) Method and device for evaluating quality of face image
DE112019005750T5 (en) Learning to generate synthetic data sets for training neural networks
DE102018111905A1 (en) Domain-specific language for generating recurrent neural network architectures
CN112308095A (en) Picture preprocessing and model training method and device, server and storage medium
Kolesnikov et al. PixelCNN models with auxiliary variables for natural image modeling
DE112016004535T5 (en) Universal Compliance Network
DE112016005006T5 (en) AUTOMATIC VIDEO EXECUTIVE SUMMARY
CN111696196B (en) Three-dimensional face model reconstruction method and device
CN103400368B (en) Based on graph theory and the parallel rapid SAR image segmentation method of super-pixel
CN112712546A (en) Target tracking method based on twin neural network
CN111192277A (en) Instance partitioning method and device
DE102019122402A1 (en) CLASSIFYING TIME SERIES IMAGE DATA
CN113689436A (en) Image semantic segmentation method, device, equipment and storage medium
CN110599453A (en) Panel defect detection method and device based on image fusion and equipment terminal
CN111325766A (en) Three-dimensional edge detection method and device, storage medium and computer equipment
DE102018206108A1 (en) Generate validation data with generative contradictory networks
CN103839244A (en) Real-time image fusion method and device
CN111260060B (en) Object detection neural network hybrid training method and system based on dynamic intensity
CN109829857B (en) Method and device for correcting inclined image based on generation countermeasure network
DE102023101265A1 (en) Object detection in image stream processing using optical flow with dynamic regions of interest
CN110175622A (en) The vehicle part recognition methods of convolutional neural networks based on symbiosis and system
CN114820755A (en) Depth map estimation method and system
CN114897884A (en) No-reference screen content image quality evaluation method based on multi-scale edge feature fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant