CN111260060A - Object detection neural network hybrid training method and system based on dynamic intensity - Google Patents
- Publication number
- CN111260060A (application number CN202010104069.4A)
- Authority
- CN
- China
- Prior art keywords
- data
- loss
- mixed
- neural network
- training
- Prior art date
- Legal status: Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a dynamic intensity-based object detection neural network hybrid training method, which alleviates the data memorization problem of the detection network and improves its generalization on the test set. By constructing mixed data, the data set is expanded to a certain degree, and a linear relation among different types of data is additionally introduced, improving the expressiveness of the model. Through the setting of the dynamic mixing parameter, the training difficulty that the hybrid training method imposes on the detection network is reduced and the training process is smoothed, so that the optimal model is obtained more easily. Compared with other training methods, the trained model has better generalization.
Description
Technical Field
The invention relates to the technical field of computer application and computer vision, in particular to a neural network optimization method, and specifically relates to a dynamic intensity-based object detection neural network hybrid training method and system.
Background
Neural networks are a common approach to computer vision problems. Components such as convolutional layers, pooling layers, fully connected layers and activation functions are combined into a network in a certain manner, and the network is trained with sufficient task-related data. When the network can correctly process data beyond the training data, training is considered complete. The resulting network can then be used as a black-box function to solve the target problem.
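As a minimal illustration of how such components compose, the following sketch (a hypothetical toy stack, not the network of this disclosure) tracks the spatial size of a feature map through alternating convolution and pooling layers using the standard output-size formula:

```python
def conv_out_size(in_size, kernel, stride=1, padding=0):
    # Standard formula for the spatial size after a convolution or pooling layer:
    # out = floor((in + 2*padding - kernel) / stride) + 1
    return (in_size + 2 * padding - kernel) // stride + 1

# Toy stack on a 32x32 input: conv 3x3 (pad 1) -> pool 2x2 (stride 2), twice
size = 32
for kernel, stride, padding in [(3, 1, 1), (2, 2, 0), (3, 1, 1), (2, 2, 0)]:
    size = conv_out_size(size, kernel, stride, padding)
print(size)  # 8
```

Tracking these sizes is what determines how the final feature map is flattened into the fully connected layers of a detection head.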
The training method of a neural network refers to the training strategy adopted during training; different training strategies ultimately yield models with different performance. A training method generally comprises several parts: data preprocessing, an auxiliary network for training, and a loss function. A modification to any of these parts affects the training process and the final result of the neural network.
The inventor of the present application finds that the method of the prior art has at least the following technical problems in the process of implementing the present invention:
the existing training method is excessively dependent on an original data set, and the training difficulty is high.
Disclosure of Invention
In view of this, the present invention provides a dynamic intensity-based object detection neural network hybrid training method and system, so as to solve, or at least partially solve, the technical problem that the existing training method relies too heavily on the original data set and is difficult to train.
In order to solve the above technical problem, a first aspect of the present invention provides a dynamic intensity-based object detection neural network hybrid training method, including:
s1: constructing an object detection neural network, and initializing all parameters of the object detection neural network;
s2: acquiring original training data and preprocessing the original training data, wherein the original training data comprises image data and label data, and the label data comprises category information and position information of all marked objects in an image;
s3: acquiring a dynamic mixing parameter, randomly selecting first data and second data from the preprocessed training data, mixing the selected first data and second data by using the dynamic mixing parameter as a mixing intensity adjusting parameter to obtain a mixed data pair, wherein the mixed data pair comprises mixed image data, label data corresponding to the first data and label data corresponding to the second data;
s4: obtaining an expanded data set based on the mixed data pair;
s5: setting a target mixed loss function according to the mixed data in the expanded data set, the label data corresponding to the first data and the label data corresponding to the second data, wherein the target mixed loss function comprises classification loss and position loss;
s6: taking the expanded data set as training data and, in combination with the mixed loss function, training the object detection neural network by a stochastic gradient descent method to obtain the trained object detection neural network.
In one embodiment, S1 includes:
s1.1: constructing a neural network from convolutional layers, fully connected layers, pooling layers and activation layers, or using an existing neural network, as the object detection neural network;
s1.2: and initializing all parameters of the object detection neural network by adopting a random parameter initialization method.
In one embodiment, the preprocessing the original training data in S2 specifically includes:
s2.1: performing pixel value normalization on the image data in the original training data;
s2.2: cropping the picture;
s2.3: flipping the picture.
In one embodiment, S3 specifically includes:
s3.1: generating a dynamic mixing parameter λ according to a preset formula,
where λmax is a preset maximum value of λ, α is a rate-of-change adjustment parameter, n is the turning iteration number, and epoch is the current iteration number in the training process;
s3.2: randomly selecting first data and second data from the preprocessed training data as mixed original data, and linearly mixing the two data according to the proportion of a dynamic mixing parameter lambda, wherein the specific operation is as follows:
x = λ·x1 + (1 − λ)·x2
The resulting mixed data pair is (x, y1, y2), where x is the mixed image data, y1 is the label data corresponding to the first data x1, and y2 is the label data corresponding to the second data x2.
Here λ is the dynamic mixing parameter; xp and xq denote the two data to be mixed, yp and yq denote their respective label data, and the pairs (xp, yp) and (xq, yq) are drawn from the original training data set.
In one embodiment, S5 specifically includes:
s5.1: using the mixed image data as the input image data, the classification loss is calculated from the category information of the two associated label data ypi and yqi:
loss_cls(θ) = λ·loss_i(θ) + (1 − λ)·loss_j(θ)
where the expanded data set has size m, θ is the parameter to be optimized in the neural network, L_CE is the cross-entropy function, λ is the dynamic mixing parameter, ypi and yqi are the label data of the first data and the second data respectively, fθ is the neural network model that maps the mixed image data to a predicted value, loss_i(θ) is the classification loss of the first data, loss_j(θ) is the classification loss of the second data, and loss_cls(θ) is the mixed classification loss function;
s5.2: using the mixed data as the input image data, the position loss is calculated from the position information of the two associated label data ypi and yqi:
loss_loc(θ) = λ·loss_i(θ)′ + (1 − λ)·loss_j(θ)′
where loss_i(θ)′ is the position loss of the first data, loss_j(θ)′ is the position loss of the second data, loss_loc(θ) is the mixed position loss function, θ is the parameter to be optimized in the neural network, L_SM is the smooth L1 loss function, and fθ is the neural network model that maps the mixed image data to a predicted value;
s5.3: combining the mixed classification loss function with the mixed position loss function to obtain a target mixed loss function:
Loss(θ) = loss_cls(θ) + γ·loss_loc(θ)
where Loss(θ) is the target mixed loss function and γ is a weighting parameter balancing the classification loss against the position loss.
In one embodiment, S6 includes:
s6.1: randomly selecting a batch of data from the expanded data set;
s6.2: inputting the selected batch into the object detection neural network to obtain the corresponding prediction results, and computing the loss value between the predictions and the actual results via the target mixed loss function;
s6.3: parameters of the entire network are updated using back propagation.
Based on the same inventive concept, the second aspect of the present invention provides a dynamic intensity-based object detection neural network hybrid training system, comprising:
the neural network construction module is used for constructing an object detection neural network and initializing all parameters of the object detection neural network;
the data preprocessing module is used for acquiring original training data and preprocessing the original training data, wherein the original training data comprise image data and label data, and the label data comprise category information and position information of all marked objects in an image;
the data mixing module is used for acquiring dynamic mixing parameters, randomly selecting first data and second data from the preprocessed training data, mixing the selected first data and second data by using the dynamic mixing parameters as mixing intensity adjusting parameters to obtain a mixed data pair, wherein the mixed data pair comprises mixed image data, label data corresponding to the first data and label data corresponding to the second data;
an extended data set obtaining module for obtaining an extended data set based on the mixed data pair;
a loss function setting module, configured to set a target mixed loss function according to the mixed data in the expanded data set, the tag data corresponding to the first data, and the tag data corresponding to the second data, where the target mixed loss function includes a classification loss and a position loss;
and the training module is used for training the object detection neural network, with the expanded data set as training data and in combination with the mixed loss function, by a stochastic gradient descent method, to obtain the trained object detection neural network.
Based on the same inventive concept, a third aspect of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed, performs the method of the first aspect.
Based on the same inventive concept, a fourth aspect of the present invention provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method according to the first aspect when executing the program.
One or more technical solutions in the embodiments of the present application have at least one or more of the following technical effects:
the invention provides a dynamic strength-based object detection neural network hybrid training method, which comprises the steps of firstly constructing an object detection neural network, then preprocessing original training data, then acquiring dynamic hybrid parameters, and mixing selected first data and second data by taking the dynamic hybrid parameters as hybrid strength adjusting parameters; then obtaining an expanded data set based on the mixed data pair; then, setting a target mixing loss function according to the mixed data in the expanded data set, the label data corresponding to the first data and the label data corresponding to the second data; and finally, taking the expanded data set as training data, combining a mixed loss function, and training the object detection neural network by adopting a random gradient descent method to obtain the trained object detection neural network. The selected first data and the selected second data are mixed by taking the dynamic mixing parameters as the mixing intensity adjusting parameters, so that a mixed data pair can be constructed, a data set is expanded to a certain degree, the training difficulty of the mixed training method on a detection network is reduced, the training process is smoothed, and the optimal model is obtained more easily. Compared with other training methods, the trained model has better generalization, and the technical problem that the existing training method in the prior art excessively depends on an original data set and is difficult to train is solved.
Furthermore, linearly mixing the two selected data in the proportion given by the dynamic mixing parameter λ additionally introduces a linear relation among different types of data, which improves the expressiveness of the model.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a schematic flow chart illustrating an implementation of a dynamic intensity-based object detection neural network hybrid training method according to an embodiment;
FIG. 2 is a schematic diagram of the calculation of a target mixing loss function;
FIG. 3 is a block diagram of a dynamic intensity-based object detection neural network hybrid training system according to an embodiment of the present invention;
FIG. 4 is a block diagram of a computer-readable storage medium according to an embodiment of the present invention;
fig. 5 is a block diagram of a computer device in an embodiment of the present invention.
Detailed Description
The invention provides a novel training method for an object detection neural network, aiming to improve upon existing training methods: by adopting dynamic parameter adjustment and hybrid training, the generalization and robustness of the neural network are increased, and the performance of the model on test data is generally enhanced.
The general inventive concept of the present invention is as follows:
An object detection neural network is first constructed; the original training data are then preprocessed; a dynamic mixing parameter is acquired and, used as the mixing intensity adjustment parameter, mixes the selected first data and second data; an expanded data set is then obtained from the mixed data pairs; a target mixed loss function is set according to the mixed data in the expanded data set, the label data corresponding to the first data and the label data corresponding to the second data; finally, with the expanded data set as training data and in combination with the mixed loss function, the object detection neural network is trained by stochastic gradient descent to obtain the trained network.
According to the method, constructing mixed data expands the data set to a certain degree and additionally introduces a linear relation among different types of data, improving the expressiveness of the model. Through the setting of the dynamic mixing parameter, the training difficulty that the hybrid training method imposes on the detection network is reduced and the training process is smoothed, so that the optimal model is obtained more easily. Compared with other training methods, the trained model has better generalization.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example one
The embodiment provides a dynamic intensity-based object detection neural network hybrid training method, please refer to fig. 1, which includes:
s1: and constructing an object detection neural network, and initializing all parameters of the object detection neural network.
Specifically, the object detection neural network may be constructed by using a theory related to the neural network, or an existing neural network architecture may be used as the object detection neural network.
S2: the method comprises the steps of obtaining original training data and preprocessing the original training data, wherein the original training data comprise image data and label data, and the label data comprise category information and position information of all marked objects in an image.
In particular, the raw training data may be obtained from an existing data set. The object detection data used for network training can be written as pairs (x, y), where x denotes image data and y denotes label data; the label data comprises the category information c and position information l of all tagged objects within the image. Before neural network training, the image data in the training data must be preprocessed, e.g. by pixel normalization and picture cropping.
S3: and acquiring dynamic mixing parameters, randomly selecting first data and second data from the preprocessed training data, mixing the selected first data and second data by using the dynamic mixing parameters as mixing intensity adjusting parameters, and acquiring a mixed data pair, wherein the mixed data pair comprises mixed image data, label data corresponding to the first data and label data corresponding to the second data.
Specifically, the selected first data and the second data are mixed using the dynamic mixing parameter as the mixing intensity adjustment parameter, and the magnitude of the dynamic mixing parameter indicates the mixing intensity. Similarly, the tag data corresponding to the first data and the tag data corresponding to the second data both include category information and position information.
S4: an expanded data set is obtained based on the mixed data pair.
S5: and setting a target mixing loss function according to the mixed data in the expanded data set, the label data corresponding to the first data and the label data corresponding to the second data, wherein the target mixing loss function comprises classification loss and position loss.
Specifically, the classification loss in the target mixture loss function is calculated from the category information in the tag data, and the position loss is calculated from the position information in the tag data.
S6: and taking the expanded data set as training data, and training the object detection neural network by combining a mixed loss function and adopting a random gradient descent method to obtain the trained object detection neural network.
Specifically, a classification and detection task based on a neural network uses the network as a tool: object detection data are input into the neural network, which returns the corresponding detection results. In the training stage, a large amount of detection data is used as training data; once the network can correctly detect the existing training data, it is essentially trained. The trained network is then used to process the object detection data that actually need to be processed.
After the expanded data set and the target mixed loss function are obtained, the object detection neural network in S1 may be trained by using a random gradient descent method, so as to obtain a trained object detection neural network.
In one embodiment, S1 includes:
s1.1: constructing a neural network from convolutional layers, fully connected layers, pooling layers and activation layers, or using an existing neural network, as the object detection neural network;
s1.2: and initializing all parameters of the object detection neural network by adopting a random parameter initialization method.
In one embodiment, the preprocessing the original training data in S2 specifically includes:
s2.1: performing pixel value normalization on the image data in the original training data;
s2.2: cropping the picture;
s2.3: flipping the picture.
Specifically, pixel value normalization may be implemented as follows: first divide all pixel values by 255, limiting them to [0, 1]; then compute the average over all pixel values of all images and subtract this average from every pixel value, so that the data are zero-centered. Picture cropping may proceed as follows: the selected picture is first padded at the edges with zero values, for example with a padding width of one quarter or one half of the total side length; then a region of the same size as the original image is randomly selected from the padded picture and cropped out as new data. Picture flipping may proceed as follows: with a certain probability the image is flipped horizontally about its center line, and otherwise left unchanged; for example, the flip probability may be set to 0.5.
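The three preprocessing steps above can be sketched in plain Python on a single-channel image represented as a 2-D list of pixel values (a toy stand-in for the tensor operations a real pipeline would use; the dataset mean is assumed precomputed):

```python
import random

def normalize(img, mean):
    # Scale pixel values to [0, 1] and zero-center with a precomputed dataset mean
    return [[p / 255.0 - mean for p in row] for row in img]

def random_crop(img, pad):
    # Zero-pad each border by `pad`, then cut out a random region of the original size
    h, w = len(img), len(img[0])
    padded = [[0.0] * (w + 2 * pad) for _ in range(pad)]
    padded += [[0.0] * pad + list(row) + [0.0] * pad for row in img]
    padded += [[0.0] * (w + 2 * pad) for _ in range(pad)]
    top, left = random.randint(0, 2 * pad), random.randint(0, 2 * pad)
    return [r[left:left + w] for r in padded[top:top + h]]

def random_flip(img, p=0.5):
    # Horizontal flip about the vertical center line with probability p
    return [row[::-1] for row in img] if random.random() < p else img
```

In an object-detection setting the crop and flip would also have to transform the box coordinates in the label data accordingly, which is omitted here for brevity.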
In one embodiment, S3 specifically includes:
s3.1: generating a dynamic mixing parameter λ according to a preset formula,
where λmax is a preset maximum value of λ, α is a rate-of-change adjustment parameter, n is the turning iteration number, and epoch is the current iteration number in the training process;
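The schedule formula itself is not reproduced in this text, but the named quantities (a preset maximum λmax, a rate-of-change parameter α, a turning iteration number n, and the current epoch) are consistent with a sigmoid-shaped ramp. The sketch below is therefore an assumption about the schedule's shape, not the disclosed formula: λ rises smoothly toward λmax, crossing λmax/2 at the turning iteration n, with α controlling how fast it changes.

```python
import math

def dynamic_lambda(epoch, lam_max=0.5, alpha=0.1, n=50):
    # Hypothetical sigmoid schedule: lambda ramps from ~0 toward lam_max,
    # reaching lam_max/2 at the turning iteration n; alpha sets the rate of change
    return lam_max / (1.0 + math.exp(-alpha * (epoch - n)))
```

Under this assumed shape, λ is small early in training (weak mixing, easy examples) and grows as training proceeds, which matches the document's claim that the dynamic parameter smooths the training process; a decreasing schedule would be implemented analogously.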
s3.2: randomly selecting first data and second data from the preprocessed training data as mixed original data, and linearly mixing the two data according to the proportion of a dynamic mixing parameter lambda, wherein the specific operation is as follows:
x = λ·x1 + (1 − λ)·x2
The resulting mixed data pair is (x, y1, y2), where x is the mixed image data, y1 is the label data corresponding to the first data x1, and y2 is the label data corresponding to the second data x2.
Specifically, the dynamic mixing parameter λ is mainly used to adjust the mixing degree of the subsequent image data, and the smaller the value of λ, the smaller the mixing intensity, and the linear mixing operation is adopted in the present embodiment.
Here λ is the dynamic mixing parameter; xp and xq denote the two data to be mixed, yp and yq denote their respective label data, and the pairs (xp, yp) and (xq, yq) are drawn from the original training data set.
Specifically, after data mixing, different mixed data pairs can be obtained, and the mixed data pairs are merged together to obtain an expanded data set.
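Steps S3.2 and S4 can be sketched as follows, with each image represented as a flat list of numbers (an illustrative simplification; any numeric representation mixes the same way):

```python
import random

def mix_pair(x1, x2, lam):
    # Element-wise linear blend: x = lam*x1 + (1 - lam)*x2
    return [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]

def build_expanded_dataset(data, lam, m):
    # data: list of (image, label) pairs; returns m mixed triples (x, y1, y2)
    # that keep BOTH source labels, as the mixed loss function requires
    mixed = []
    for _ in range(m):
        (x1, y1), (x2, y2) = random.sample(data, 2)
        mixed.append((mix_pair(x1, x2, lam), y1, y2))
    return mixed
```

Merging the resulting triples gives the expanded data set used in the subsequent training steps.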
In one embodiment, S5 specifically includes:
s5.1: using the mixed image data as the input image data, the classification loss is calculated from the category information of the two associated label data ypi and yqi:
loss_cls(θ) = λ·loss_i(θ) + (1 − λ)·loss_j(θ)
where the expanded data set has size m, θ is the parameter to be optimized in the neural network, L_CE is the cross-entropy function, λ is the dynamic mixing parameter, ypi and yqi are the label data of the first data and the second data respectively, fθ is the neural network model that maps the mixed image data to a predicted value, loss_i(θ) is the classification loss of the first data, loss_j(θ) is the classification loss of the second data, and loss_cls(θ) is the mixed classification loss function;
s5.2: using the mixed data as the input image data, the position loss is calculated from the position information of the two associated label data ypi and yqi:
loss_loc(θ) = λ·loss_i(θ)′ + (1 − λ)·loss_j(θ)′
where loss_i(θ)′ is the position loss of the first data, loss_j(θ)′ is the position loss of the second data, loss_loc(θ) is the mixed position loss function, θ is the parameter to be optimized in the neural network, L_SM is the smooth L1 loss function, and fθ is the neural network model that maps the mixed image data to a predicted value;
s5.3: combining the mixed classification loss function with the mixed position loss function to obtain a target mixed loss function:
Loss(θ) = loss_cls(θ) + γ·loss_loc(θ)
where Loss(θ) is the target mixed loss function and γ is a weighting parameter balancing the classification loss against the position loss.
Specifically, the loss function is the function a neural network uses to represent the difference between predicted and actual results. The mixed loss function is divided into two parts: a classification loss function and a position detection loss function. The classification loss function represents the difference between the network's prediction of the object classes in the image content and the object classes in the actual image; this difference is quantified using cross entropy.
Here loss_cls(θ) is the classification loss function, θ is the parameter to be optimized in the neural network, L_CE is the cross-entropy function, xi is the input image data, yi is the label data corresponding to xi, fθ is the neural network model, and fθ(xi) is the prediction of the network for xi. The invention adopts a mixed classification loss function: with the mixed data x as input image data, the losses loss_i(θ) and loss_j(θ) are calculated from the category information of the two associated label data ypi and yqi, and the two losses are then summed with weights given by the dynamic mixing parameter.
Similarly, the position loss function represents the difference between the network's prediction of the object positions in the image content and the object positions in the actual image; the difference between predicted and actual positions is usually quantified using the smooth L1 loss (Smooth L1 Loss).
The mixed position loss function is constructed analogously to the mixed classification loss function: with the mixed data x as input image data, the losses loss_i(θ)′ and loss_j(θ)′ are calculated from the position information of the two associated label data ypi and yqi, and the two losses are then summed with weights given by the dynamic mixing parameter.
After the mixed classification loss function and the mixed position loss function are obtained respectively, the two are combined to obtain a final loss function. The calculation principle of the final loss function is shown in fig. 2, where classification loss 1 represents the classification loss of the first data (picture 1), classification loss 2 represents the classification loss of the second data (picture 2), regression loss 1 represents the position loss of the first data, and regression loss 2 represents the position loss of the second data.
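The two loss terms and their λ-weighted combination can be sketched as follows; the per-example cross-entropy and smooth L1 implementations and the (class_index, box) label layout are illustrative simplifications of the batch-averaged losses described above:

```python
import math

def cross_entropy(pred_probs, class_index):
    # L_CE for one example: negative log-probability of the true class
    return -math.log(max(pred_probs[class_index], 1e-12))

def smooth_l1(pred_box, true_box):
    # L_SM (Smooth L1): quadratic near zero, linear for large errors
    total = 0.0
    for p, t in zip(pred_box, true_box):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total

def target_mixed_loss(pred_probs, pred_box, y1, y2, lam, gamma=1.0):
    # y1, y2: (class_index, box) labels of the two source images;
    # Loss = loss_cls + gamma * loss_loc, each term lambda-weighted over both labels
    loss_cls = lam * cross_entropy(pred_probs, y1[0]) + (1 - lam) * cross_entropy(pred_probs, y2[0])
    loss_loc = lam * smooth_l1(pred_box, y1[1]) + (1 - lam) * smooth_l1(pred_box, y2[1])
    return loss_cls + gamma * loss_loc
```

This mirrors fig. 2: two classification losses and two regression losses are computed against the two source labels and then merged with the weights λ and 1 − λ.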
In one embodiment, S6 includes:
s6.1: randomly selecting a batch of data from the expanded data set;
s6.2: inputting the selected batch into the object detection neural network to obtain the corresponding prediction results, and computing the loss value between the predictions and the actual results via the target mixed loss function;
s6.3: parameters of the entire network are updated using back propagation.
Specifically, the objective is to minimize the target loss function: the partial derivative of the loss with respect to each parameter is calculated to obtain the gradient of the current round, the parameters are then updated in the direction opposite to the gradient, and by iterating this update the optimal model parameters are obtained.
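The update rule just described can be sketched on a toy one-parameter problem; the quadratic objective and learning rate are illustrative stand-ins for the network's target mixed loss and its hyperparameters:

```python
def sgd_minimize(grad_fn, theta, lr=0.1, steps=100):
    # Repeatedly step each parameter opposite its gradient (gradient descent update)
    for _ in range(steps):
        theta = [t - lr * g for t, g in zip(theta, grad_fn(theta))]
    return theta

# Toy objective f(theta) = (theta - 3)^2, whose gradient is 2*(theta - 3);
# iterating the update drives theta toward the minimizer 3
theta = sgd_minimize(lambda th: [2.0 * (t - 3.0) for t in th], [0.0])
```

In the actual method the gradient would come from backpropagating the target mixed loss over a randomly selected batch, which is what makes the descent stochastic.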
Generally, the dynamic parameter-based object detection network hybrid training method provided by the invention alleviates the data memorization problem of the detection network and improves its generalization on the test set. By constructing mixed data, the data set is expanded to a certain degree, and a linear relation among different types of data is additionally introduced, improving the expressiveness of the model. Through the setting of the dynamic mixing parameter, the training difficulty that the hybrid training method imposes on the detection network is reduced and the training process is smoothed, so that the optimal model is obtained more easily. Compared with other training methods, the trained model has better generalization.
Example two
Based on the same inventive concept, the present embodiment provides a dynamic intensity-based object detection neural network hybrid training system, please refer to fig. 3, which includes:
a neural network construction module 201, configured to construct an object detection neural network, and initialize all parameters of the object detection neural network;
the data preprocessing module 202 is configured to acquire original training data and preprocess the original training data, where the original training data includes image data and tag data, and the tag data includes category information and position information of all tagged objects in an image;
the data mixing module 203 is configured to obtain a dynamic mixing parameter, randomly select first data and second data from the preprocessed training data, mix the selected first data and second data with the dynamic mixing parameter as a mixing intensity adjustment parameter, and obtain a mixed data pair, where the mixed data pair includes mixed image data, tag data corresponding to the first data, and tag data corresponding to the second data;
an extended data set obtaining module 204, configured to obtain an extended data set based on the mixed data pair;
a loss function setting module 205, configured to set a target mixed loss function according to the mixed data in the expanded data set, the tag data corresponding to the first data, and the tag data corresponding to the second data, where the target mixed loss function includes a classification loss and a position loss;
and the training module 206 is configured to train the object detection neural network using the extended data set as training data, in combination with the mixed loss function and a stochastic gradient descent method, to obtain the trained object detection neural network.
Since the system described in the second embodiment of the present invention is the system adopted for implementing the dynamic intensity-based object detection neural network hybrid training method of the first embodiment, a person skilled in the art can understand the specific structure and variations of the system from the method described in the first embodiment, and details are therefore not repeated here. All systems adopted in the method of the first embodiment of the present invention are within the intended protection scope of the present invention.
Example three
Referring to fig. 4, based on the same inventive concept, the present application further provides a computer-readable storage medium 300, on which a computer program 311 is stored, which when executed implements the method according to the first embodiment.
Since the computer-readable storage medium introduced in the third embodiment of the present invention is the storage medium used for implementing the dynamic intensity-based object detection neural network hybrid training method of the first embodiment, persons skilled in the art can understand its specific structure and variations from the method introduced in the first embodiment, and details are therefore not repeated here. Any computer-readable storage medium used in the method of the first embodiment of the present invention is within the scope of the present invention.
Example four
Based on the same inventive concept, the present application further provides a computer device, please refer to fig. 5, which includes a memory 401, a processor 402, and a computer program 403 stored in the memory and executable on the processor; when the processor 402 executes the program, the method of the first embodiment is implemented.
Since the computer device introduced in the fourth embodiment of the present invention is the device used for implementing the dynamic intensity-based object detection neural network hybrid training method of the first embodiment, those skilled in the art can understand its specific structure and variations from the method introduced in the first embodiment, and details are therefore not repeated here. All computer devices used in the method of the first embodiment of the present invention are within the scope of the present invention.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made in the embodiments of the present invention without departing from the spirit or scope of the embodiments of the invention. Thus, if such modifications and variations of the embodiments of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to encompass such modifications and variations.
Claims (10)
1. A dynamic intensity-based object detection neural network hybrid training method is characterized by comprising the following steps:
s1: constructing an object detection neural network, and initializing all parameters of the object detection neural network;
s2: acquiring original training data and preprocessing the original training data, wherein the original training data comprises image data and label data, and the label data comprises category information and position information of all marked objects in an image;
s3: acquiring a dynamic mixing parameter, randomly selecting first data and second data from the preprocessed training data, mixing the selected first data and second data by using the dynamic mixing parameter as a mixing intensity adjusting parameter to obtain a mixed data pair, wherein the mixed data pair comprises mixed image data, label data corresponding to the first data and label data corresponding to the second data;
s4: obtaining an expanded data set based on the mixed data pair;
s5: setting a target mixed loss function according to the mixed data in the expanded data set, the label data corresponding to the first data and the label data corresponding to the second data, wherein the target mixed loss function comprises classification loss and position loss;
s6: taking the expanded data set as training data, and training the object detection neural network by combining the mixed loss function and adopting a stochastic gradient descent method, to obtain the trained object detection neural network.
2. The method of claim 1, wherein S1 includes:
s1.1: constructing a neural network from convolutional layers, fully-connected layers, pooling layers and activation layers, or using an existing neural network, as the object detection neural network;
s1.2: and initializing all parameters of the object detection neural network by adopting a random parameter initialization method.
3. The method of claim 1, wherein preprocessing the raw training data in S2 specifically includes:
s2.1: performing pixel value normalization on image data in original training data;
s2.2: cropping the picture;
s2.3: flipping the picture.
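The preprocessing steps of claim 3 can be sketched as follows; the crop size, the flip probability, and the use of a small grayscale pixel grid are illustrative assumptions, not values from the patent.

```python
import random

def preprocess(img, crop=2, rng=None):
    """Sketch of S2.1-S2.3 on a 2D grayscale pixel grid:
    normalize pixel values to [0, 1], randomly crop `crop` pixels
    total per axis, and randomly flip horizontally."""
    rng = rng or random.Random()
    h, w = len(img), len(img[0])
    x = [[p / 255.0 for p in row] for row in img]            # S2.1 normalization
    top, left = rng.randint(0, crop), rng.randint(0, crop)   # S2.2 random crop offsets
    x = [row[left:w - crop + left] for row in x[top:h - crop + top]]
    if rng.random() < 0.5:                                   # S2.3 random horizontal flip
        x = [row[::-1] for row in x]
    return x

img = [[(r * 8 + c) % 256 for c in range(8)] for r in range(8)]
out = preprocess(img, crop=2, rng=random.Random(0))
```

In a real detection pipeline the bounding-box labels must be adjusted to match the crop and flip; that bookkeeping is omitted here.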
4. The method according to claim 3, wherein S3 specifically comprises:
s3.1: and generating a dynamic mixing parameter lambda according to a formula, wherein the specific formula is as follows:
wherein λmax is a preset maximum value of λ, α is a rate-of-change adjustment parameter, n is the turning iteration number, and epoch is the current iteration number in the training process;
s3.2: randomly selecting first data and second data from the preprocessed training data as mixed original data, and linearly mixing the two data according to the proportion of a dynamic mixing parameter lambda, wherein the specific operation is as follows:
x = λx1 + (1-λ)x2
the resulting mixed data pair is (x, y1, y2), where x is the mixed image data, y1 is the label data corresponding to the first data x1, and y2 is the label data corresponding to the second data x2.
6. The method of claim 1, wherein S5 specifically comprises:
s5.1: using the mixed image data as input image data, calculating the classification loss against the category information of the two label data ypi and yqi:
lossi(θ) = (1/m) Σi=1..m LCE(fθ(x̃i), ypi)
lossj(θ) = (1/m) Σi=1..m LCE(fθ(x̃i), yqi)
losscls(θ) = λ·lossi(θ) + (1-λ)·lossj(θ)
wherein {(x̃i, ypi, yqi)}i=1..m is the expanded data set, m is the size of the expanded data set, θ is the parameter to be optimized in the neural network, LCE is the cross-entropy function, λ is the dynamic mixing parameter, x̃i is the mixed image data, ypi and yqi are respectively the label data of the first data and of the second data, fθ is the neural network model, fθ(x̃i) is the prediction of the neural network for x̃i, lossi(θ) is the classification loss against the first data, lossj(θ) is the classification loss against the second data, and losscls(θ) is the mixed classification loss function;
s5.2: using the mixed image data as input image data, calculating the position loss against the position information of the two label data ypi and yqi:
lossi(θ)′ = (1/m) Σi=1..m LSM(fθ(x̃i), ypi)
lossj(θ)′ = (1/m) Σi=1..m LSM(fθ(x̃i), yqi)
lossloc(θ) = λ·lossi(θ)′ + (1-λ)·lossj(θ)′
wherein lossi(θ)′ is the position loss of the first data, lossj(θ)′ is the position loss of the second data, lossloc(θ) is the mixed position loss function, θ is the parameter to be optimized in the neural network, LSM is the smooth L1 loss function, (x̃i, ypi, yqi) is the mixed image data with its label data, fθ is the neural network model, and fθ(x̃i) is the prediction of the neural network for x̃i;
s5.3: combining the mixed classification loss function with the mixed position loss function to obtain a target mixed loss function:
Loss(θ)=losscls(θ)+γlossloc(θ)
where Loss(θ) is the target mixed loss function and γ is a weighting parameter that balances the classification loss against the position loss.
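For a single prediction, the target mixed loss of claim 6 can be sketched as below. The helper functions, the probability/box encodings, and the default γ are illustrative assumptions; in the actual detector the losses would be computed per anchor or per proposal and averaged over the expanded data set.

```python
import math

def cross_entropy(probs, label):
    """L_CE for one example: -log p(correct class); probs are assumed
    to be already-normalized class probabilities."""
    return -math.log(probs[label])

def smooth_l1(pred, target):
    """L_SM (smooth L1) summed over the box coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total

def target_mixed_loss(cls_probs, boxes, y1, y2, lam, gamma=1.0):
    """Loss(theta) = loss_cls + gamma * loss_loc, where each part blends
    the losses against the two label sets with weights lam and (1-lam).
    y1 and y2 are (class_index, box) tuples; gamma is illustrative."""
    loss_cls = lam * cross_entropy(cls_probs, y1[0]) \
        + (1.0 - lam) * cross_entropy(cls_probs, y2[0])
    loss_loc = lam * smooth_l1(boxes, y1[1]) \
        + (1.0 - lam) * smooth_l1(boxes, y2[1])
    return loss_cls + gamma * loss_loc

total = target_mixed_loss([0.7, 0.3], [0.0, 0.0, 1.0, 1.0],
                          y1=(0, [0.0, 0.0, 1.0, 1.0]),
                          y2=(1, [0.5, 0.0, 1.0, 1.0]),
                          lam=0.3)
```

At λ = 1 the mixed loss reduces to the ordinary loss against the first data's labels alone, which is consistent with the blend degenerating to plain single-image training at the schedule's extremes.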
7. The method of claim 1, wherein S6 includes:
s6.1: randomly selecting a batch of data from the expanded data set;
s6.2: inputting the selected batch into the object detection neural network to obtain corresponding prediction results, and computing the loss value between the predictions and the actual results through the target mixed loss function;
s6.3: parameters of the entire network are updated using back propagation.
8. An object detection neural network hybrid training system based on dynamic intensity, comprising:
the neural network construction module is used for constructing an object detection neural network and initializing all parameters of the object detection neural network;
the data preprocessing module is used for acquiring original training data and preprocessing the original training data, wherein the original training data comprise image data and label data, and the label data comprise category information and position information of all marked objects in an image;
the data mixing module is used for acquiring dynamic mixing parameters, randomly selecting first data and second data from the preprocessed training data, mixing the selected first data and second data by using the dynamic mixing parameters as mixing intensity adjusting parameters to obtain a mixed data pair, wherein the mixed data pair comprises mixed image data, label data corresponding to the first data and label data corresponding to the second data;
an extended data set obtaining module for obtaining an extended data set based on the mixed data pair;
a loss function setting module, configured to set a target mixed loss function according to the mixed data in the expanded data set, the tag data corresponding to the first data, and the tag data corresponding to the second data, where the target mixed loss function includes a classification loss and a position loss;
and the training module is used for training the object detection neural network by taking the expanded data set as training data, combining the mixed loss function and adopting a stochastic gradient descent method, to obtain the trained object detection neural network.
9. A computer-readable storage medium, on which a computer program is stored, characterized in that the program, when executed, implements the method of any one of claims 1 to 7.
10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1 to 7 when executing the program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010104069.4A CN111260060B (en) | 2020-02-20 | 2020-02-20 | Object detection neural network hybrid training method and system based on dynamic intensity |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111260060A true CN111260060A (en) | 2020-06-09 |
CN111260060B CN111260060B (en) | 2022-06-14 |
Family
ID=70951283
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113780556A (en) * | 2021-09-18 | 2021-12-10 | 深圳市商汤科技有限公司 | Neural network training and character recognition method, device, equipment and storage medium |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180082137A1 (en) * | 2016-09-19 | 2018-03-22 | Nec Laboratories America, Inc. | Advanced driver-assistance system |
CN107886073A (en) * | 2017-11-10 | 2018-04-06 | 重庆邮电大学 | A kind of more attribute recognition approaches of fine granularity vehicle based on convolutional neural networks |
CN108647742A (en) * | 2018-05-19 | 2018-10-12 | 南京理工大学 | Fast target detection method based on lightweight neural network |
CN108898188A (en) * | 2018-07-06 | 2018-11-27 | 四川奇迹云科技有限公司 | A kind of image data set aid mark system and method |
CN109065030A (en) * | 2018-08-01 | 2018-12-21 | 上海大学 | Ambient sound recognition methods and system based on convolutional neural networks |
CN109508746A (en) * | 2018-11-16 | 2019-03-22 | 西安电子科技大学 | Pulsar candidate's body recognition methods based on convolutional neural networks |
US20190102646A1 (en) * | 2017-10-02 | 2019-04-04 | Xnor.ai Inc. | Image based object detection |
US10304193B1 (en) * | 2018-08-17 | 2019-05-28 | 12 Sigma Technologies | Image segmentation and object detection using fully convolutional neural network |
CN110765873A (en) * | 2019-09-19 | 2020-02-07 | 华中师范大学 | Facial expression recognition method and device based on expression intensity label distribution |
Non-Patent Citations (2)
Title |
---|
HARDIK MEISHERI et al.: "Sentiment Extraction from Consumer-Generated Noisy Short Texts", 2017 IEEE International Conference on Data Mining Workshops (ICDMW) *
LIU Dong et al.: "A survey of deep learning and its application in image object classification and detection", Computer Science *
Also Published As
Publication number | Publication date |
---|---|
CN111260060B (en) | 2022-06-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||