CN112508080B - Vehicle model identification method, device, equipment and medium based on experience playback - Google Patents


Info

Publication number: CN112508080B (application CN202011394840.2A)
Authority: CN (China)
Prior art keywords: model, vehicle, network, training, image
Legal status: Active (the legal status is an assumption and is not a legal conclusion)
Other versions: CN112508080A (in Chinese)
Inventors: 彭凌西, 李泽轩, 邵楚越, 江卓飞, 徐泽峰, 林泉余
Original assignee / applicant: Guangzhou University
Current assignee: Shenzhen Wanzhida Enterprise Management Co., Ltd.
Application filed by Guangzhou University; priority to CN202011394840.2A
Publication of CN112508080A, followed by grant and publication of CN112508080B

Classifications

    • G06F18/2411 — Pattern recognition; classification techniques based on the proximity to a decision surface, e.g. support vector machines
    • G06F18/214 — Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06N3/045 — Neural network architectures; combinations of networks
    • G06V2201/08 — Detecting or categorising vehicles
    • Y02T10/40 — Engine management systems (climate change mitigation technologies related to transportation)


Abstract

The invention discloses a vehicle model identification method, device, equipment and medium based on experience playback. The method comprises the following steps: acquiring an original vehicle image, the original vehicle image comprising vehicle model information; performing data expansion on the original vehicle image through a GAN network to obtain model sample data; inputting the model sample data into an adversarial network with experience playback for training to obtain a target model; and identifying an acquired vehicle image to be identified according to the target model, thereby determining the vehicle model in that image. The invention improves vehicle model recognition capability: it designs an expansion method for vehicle appearance image training samples and, on that basis, uses experience playback to optimize the pictures generated by the GAN for those vehicle appearance samples whose generated results are not ideal. The method can be widely applied in the technical field of image processing.

Description

Vehicle model identification method, device, equipment and medium based on experience playback
Technical Field
The invention relates to the technical field of image processing, and in particular to a vehicle model identification method, device, equipment and medium based on experience playback.
Background
The automobile market is updated at an ever faster pace, and vehicles of different brands and models continually appear on the market in large numbers, yet the prior art lacks effective means of identifying vehicle brand and model information. Traditional vehicle recognition systems are limited to recognizing the characters and color of a license plate; the recognizable information is single and lacks variety, and the recognition algorithm cannot dynamically adjust itself or expand its recognition data set according to the actual application environment. As a result, in an environment where vehicle manufacturers frequently launch new models, the demand for vehicle information recognition cannot be met.
As part of the profile of a driving user, the vehicle model currently cannot serve as effective, valuable data in fields such as personalized services. If information such as vehicle model and license plate can be reasonably utilized, the user's parking experience, consumption experience and personalized customization services can be improved, which also benefits the development of customized products such as commercial promotion and security services in related industries.
Disclosure of Invention
In view of the above, embodiments of the invention provide a vehicle model identification method, device, equipment and medium based on experience playback, which improve identification accuracy.
The first aspect of the present invention provides a vehicle model identification method based on experience playback, comprising:
acquiring an original vehicle image; the original vehicle image comprises vehicle model information;
performing data expansion on the original vehicle image through a GAN network to obtain model sample data;
inputting the model sample data into an adversarial network with experience playback for training to obtain a target model;
and identifying the acquired vehicle image to be identified according to the target model, and determining the vehicle model in the vehicle image to be identified.
Optionally, acquiring the original vehicle image includes:
crawling original vehicle images of known vehicle model information through crawler technology;
carrying out grayscale processing, brightness normalization and contrast normalization on the original vehicle image to obtain a target image representing texture information;
inputting the target image into a pre-training network and extracting feature blocks;
inputting the feature blocks into an SVM classifier for training to obtain a target SVM classifier;
inputting the target image into the target SVM classifier and outputting probability labels for the various recognition results;
and calculating, according to the probability labels, the vehicle model recognition result as the vehicle model information in the original vehicle image.
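The last step combines per-feature-block probability labels into one recognition result. The patent does not fix the aggregation rule, so the following sketch assumes simple averaging of class probabilities; `aggregate_block_labels` is a hypothetical helper name, not part of the patented method:

```python
import numpy as np

def aggregate_block_labels(block_probs):
    """Combine per-feature-block probability labels into one vehicle-model
    prediction by averaging class probabilities over all blocks (an assumption)."""
    probs = np.asarray(block_probs, dtype=float)  # shape: (n_blocks, n_classes)
    mean_probs = probs.mean(axis=0)               # average over blocks
    return int(mean_probs.argmax()), mean_probs

# three feature blocks voting over three candidate vehicle models
blocks = [[0.7, 0.2, 0.1],
          [0.6, 0.3, 0.1],
          [0.2, 0.5, 0.3]]
label, probs = aggregate_block_labels(blocks)
# label == 0: the first model wins on the averaged probabilities
```

Other aggregation rules (majority vote over block argmaxes, weighted averaging) would slot into the same interface.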
Optionally, performing data expansion on the original vehicle image through the GAN network to obtain model sample data includes:
inputting noise data into the GAN network to obtain a test sample, and taking the original vehicle image as a training sample;
inputting the training sample and the test sample into an initial discriminator of the GAN network to obtain a discrimination result;
training the GAN network through the DQN network to obtain an ideal generator and an ideal discriminator;
generating a vehicle appearance image through the ideal generator, checking the generated vehicle appearance image through the ideal discriminator, and taking the checked vehicle appearance image as an expansion result of the original vehicle image.
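The check performed by the ideal discriminator can be sketched as a simple acceptance gate: a generated image is admitted to the expansion set only when the discriminator score sits near 0.5, i.e. the discriminator cannot tell it from a real image. The tolerance band is an illustrative assumption, not a value fixed by the patent:

```python
import numpy as np

def accept_generated(d_scores, target=0.5, tol=0.05):
    """Admit a generated sample only when the discriminator score D(G(z))
    lies close to 0.5, i.e. the discriminator can no longer reliably
    distinguish the generated image from a real one (tol is an assumption)."""
    d_scores = np.asarray(d_scores, dtype=float)
    return np.abs(d_scores - target) <= tol

scores = [0.51, 0.92, 0.48, 0.10]   # discriminator outputs for 4 generated images
mask = accept_generated(scores)
# mask: [True, False, True, False] -- only the near-0.5 samples pass
```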
Optionally, inputting the model sample data into the adversarial network with experience playback for training to obtain the target model includes:
taking the model sample data as current state data;
inputting the current state data into the adversarial network with experience playback for training, and updating the Q-value function until it converges to obtain a converged neural network model;
and inputting the test sample into the neural network model, and testing the neural network model to obtain a target model.
Optionally, training the current state data in the adversarial network with experience playback and updating the Q-value function until convergence to obtain a converged neural network model includes:
acquiring sample parameters of a vehicle model picture, and generating a Markov decision process four-tuple;
initializing a Q-Table in Prioritized Replay DQN;
the Q-Table is implemented in Prioritized Replay DQN.
Optionally, implementing the Q-Table in Prioritized Replay DQN includes:
adopting a deep neural network as the Q-Table, and presetting target parameters;
defining an objective function using the 2-norm of the Q-value error;
calculating the gradient of the target parameters with respect to the cost function;
obtaining an optimal Q value by stochastic gradient descent according to the gradient;
performing cyclic training on the deep neural network according to the optimal Q value;
acquiring an experience playback data set, and updating all parameters of the Q network by gradient back-propagation according to the objective function.
According to another aspect of the present invention, there is provided a vehicle model identification apparatus based on empirical playback, comprising:
the acquisition module is used for acquiring an original vehicle image; the original vehicle image comprises vehicle model information;
the data expansion module is used for carrying out data expansion on the original vehicle image through a GAN network to obtain model sample data;
the training module is used for inputting the model sample data into an adversarial network with experience playback for training to obtain a target model;
the identification module is used for identifying the acquired vehicle image to be identified according to the target model and determining the vehicle model in the vehicle image to be identified.
According to another aspect of the present invention, there is provided an electronic device including a processor and a memory;
the memory is used for storing programs;
the processor executes the program to implement the method as described above.
According to another aspect of the present invention, there is provided a computer-readable storage medium storing a program that is executed by a processor to implement a method as described above.
In the method, an original vehicle image is first acquired; data expansion is then performed on the original vehicle image through the GAN network to obtain model sample data; the model sample data is then input into an adversarial network with experience playback for training to obtain a target model; finally, the acquired vehicle image to be identified is identified according to the target model, and the vehicle model in that image is determined. The embodiment of the invention improves vehicle model recognition capability, designs an expansion method for vehicle appearance image training samples and, on that basis, uses experience playback to optimize the pictures generated by the GAN for those vehicle appearance samples whose generated results are not ideal.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application; a person skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a flowchart of method steps provided by an embodiment of the present invention;
FIG. 2 is a flow chart of a complete implementation of the present invention;
fig. 3 is a schematic structural diagram of a pre-training network DenseNet according to an embodiment of the present invention;
fig. 4 is a flowchart illustrating steps of an adversarial generation network according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
Aiming at the problems in the prior art, the invention provides a vehicle model identification method, device, equipment and medium based on experience playback. The invention uses a deep convolutional neural network to add vehicle model identification capability to a vehicle recognition system, designs a training-sample expansion method for vehicle appearance images based on a GAN (generative adversarial network), and on that basis uses experience playback to optimize the generated pictures for those vehicle appearance samples whose generated results are not ideal.
Referring to fig. 1, the method of the present invention comprises the steps of:
acquiring an original vehicle image; the original vehicle image comprises vehicle model information;
performing data expansion on the original vehicle image through a GAN network to obtain model sample data;
inputting the model sample data into an adversarial network with experience playback for training to obtain a target model;
and identifying the acquired vehicle image to be identified according to the target model, and determining the vehicle model in the vehicle image to be identified.
Specifically, acquiring the original vehicle image information includes:
crawling original vehicle images of known vehicle model information through crawler technology;
carrying out grayscale processing, brightness normalization and contrast normalization on the original vehicle image to obtain a target image representing texture information;
inputting the target image into a pre-training network and extracting feature blocks;
inputting the feature blocks into an SVM classifier for training to obtain a target SVM classifier;
inputting the target image into the target SVM classifier and outputting probability labels for the various recognition results;
and calculating, according to the probability labels, the vehicle model recognition result as the vehicle model information in the original vehicle image.
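The grayscale, brightness-normalization and contrast-normalization treatment above can be sketched in a few lines of numpy; the luminance weights and the per-image standardization used here are illustrative assumptions, since the patent does not specify the exact normalization:

```python
import numpy as np

def preprocess(img_rgb):
    """Grayscale conversion plus brightness and contrast normalization,
    keeping only the texture information the classifier relies on."""
    img = np.asarray(img_rgb, dtype=float)
    gray = img @ np.array([0.299, 0.587, 0.114])  # luminance grayscale (assumed weights)
    gray = gray - gray.mean()                     # brightness normalization
    std = gray.std()
    if std > 0:
        gray = gray / std                         # contrast normalization
    return gray

rng = np.random.default_rng(0)
out = preprocess(rng.random((8, 8, 3)))           # a dummy 8x8 RGB image
# out now has zero mean and unit variance
```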
Specifically, the embodiment of the invention uses crawler technology to crawl vehicle images and model information of definite models, performs label classification on the images to form a training sample set, and thereby prepares for training the GAN network and the SVM discriminant model. The process comprises the following steps 11-14:
and 11, preprocessing an image, namely only using texture information of the image to judge the model of the vehicle, and performing gray scale processing, brightness, contrast normalization and other processing on an image data set.
Step 12, the image is input into the pre-training network DenseNet for target detection, and feature blocks are extracted and taken as block-feature training samples for the SVM (support vector machine) classifier. In a DenseNet network, the input of each hidden layer is connected to the outputs of all preceding layers; with x_l denoting the output of the l-th layer and H_l(·) denoting a composite function comprising a series of BN-ReLU-Conv operations:
x_l = H_l([x_0, x_1, ..., x_{l-1}])
Step 13, after the vehicle image of the model to be identified has undergone steps 11-12, the obtained output is used as block-feature test samples, which are input into the SVM classifier for classification; probability labels of the feature-block recognition results are output, and the recognition result is obtained by computing over the block probability labels.
Step 14, samples supplemented by the GAN network enter the sample set of vehicle images with definite models, so the block-feature training samples of the SVM classifier need to be continuously updated; the SVM classifier is iteratively updated and trained together with the feature-block recognition-result probability labels so as to maintain good classification performance.
The pre-training network DenseNet of the embodiment of the invention is specifically as follows:
For the pre-training network DenseNet, within each processing module the feature information can be transmitted forward from lower layers to higher layers through direct channels, so that higher layers fully acquire features from the lower layers; this greatly reduces redundant layers, enhances feature reuse, and strengthens resistance to overfitting.
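The dense connectivity x_l = H_l([x_0, x_1, ..., x_{l-1}]) can be sketched in a few lines. The stand-in H (a ReLU of a linear map instead of the full BN-ReLU-Conv sequence) and the growth rate are illustrative assumptions, chosen only to show the concatenation pattern:

```python
import numpy as np

def H(x, w):
    """Stand-in for the composite H_l(.) operation: here simply a ReLU of a
    linear map instead of the full BN-ReLU-Conv sequence (an assumption)."""
    return np.maximum(0.0, x @ w)

def dense_block(x0, weights):
    """Dense connectivity: each layer receives the concatenation of the
    outputs of all preceding layers, x_l = H_l([x_0, ..., x_{l-1}])."""
    outputs = [x0]
    for w in weights:
        x_in = np.concatenate(outputs, axis=-1)   # feature reuse across layers
        outputs.append(H(x_in, w))
    return outputs[-1]

rng = np.random.default_rng(0)
growth = 4
x0 = rng.standard_normal(growth)
# layer i sees growth * (i + 1) input features after concatenation
weights = [rng.standard_normal((growth * (i + 1), growth)) for i in range(3)]
out = dense_block(x0, weights)
```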
Optionally, performing data expansion on the original vehicle image through the GAN network to obtain model sample data includes:
inputting noise data into the GAN network to obtain a test sample, and taking the original vehicle image as a training sample;
inputting the training sample and the test sample into an initial discriminator of the GAN network to obtain a discrimination result;
training the GAN network through the DQN network to obtain an ideal generator and an ideal discriminator;
generating a vehicle appearance image through the ideal generator, checking the generated vehicle appearance image through the ideal discriminator, and taking the checked vehicle appearance image as an expansion result of the original vehicle image.
Specifically, the GAN network augmenting the vehicle appearance image set includes the following steps 21-25:
step 21, inputting the noise z-P (z) to the GAN generator to obtain the output G (z) as the test sample.
Step 22, taking the real vehicle appearance image x in the training set as a training sample, and inputting the training sample and the test sample G (z) generated by the GAN generator into the GAN discriminator D to obtain a discrimination result D (G (z)).
Step 23, the DQN network trains the GAN; after multiple iterations, a relatively ideal generator is obtained, whose generated images the discriminator judges as fake with probability approaching 0.5, along with a relatively ideal discriminator, which judges generated images as real with probability likewise approaching 0.5.
Step 24, the trained generator and discriminator continue to learn; the generator generates vehicle appearance images, and when the discriminator's false-judgment probability holds at a stable level, the samples produced by the generator are permitted to join the vehicle appearance image set, completing the expansion.
Step 25, the states of the generator and the discriminator are continuously monitored; through the DQN network, returns are calculated to iterate the generator and train the discriminator, maintaining the ideal state.
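The monitoring in step 25 needs a notion of "the false-judgment probability holds at a stable level". One minimal way to sketch this is a windowed stability check over the discriminator's recent scores; the window size and tolerance here are illustrative assumptions:

```python
import numpy as np

def is_stable(d_history, window=5, tol=0.02):
    """The discriminator is considered settled when its last `window`
    false-judgment probabilities all lie within `tol` of their mean
    (window and tol are assumed values, not fixed by the patent)."""
    recent = np.asarray(d_history[-window:], dtype=float)
    return len(recent) == window and bool(np.all(np.abs(recent - recent.mean()) <= tol))

history = [0.90, 0.70, 0.55, 0.51, 0.50, 0.49, 0.50, 0.51]
stable = is_stable(history)       # the last five scores hover around 0.5
```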
Optionally, inputting the model sample data into the adversarial network with experience playback for training to obtain the target model includes:
taking the model sample data as current state data;
inputting the current state data into the adversarial network with experience playback for training, and updating the Q-value function until it converges to obtain a converged neural network model;
and inputting the test sample into the neural network model, and testing the neural network model to obtain a target model.
Specifically, the invention provides an adversarial network training method based on deep reinforcement learning with experience playback, comprising the following steps 31-34:
Step 31, the generator generates model samples and inputs them for training as the current state data;
Step 32, the parameters are input into the deep reinforcement learning with experience playback for training; the Q-value function is continuously updated until it converges, giving a converged neural network model;
Step 33, the dynamic parameters of the generated samples for testing are input into the obtained model;
Step 34, the GAN network is iterated so that it can output sample pictures close to real vehicle models.
Optionally, training the current state data in the adversarial network with experience playback and updating the Q-value function until convergence to obtain a converged neural network model includes:
acquiring sample parameters of a vehicle model picture, and generating a Markov decision process four-tuple;
initializing a Q-Table in Prioritized Replay DQN;
the Q-Table is implemented in Prioritized Replay DQN.
Specifically, the above step 32 includes:
obtaining sample parameters of a vehicle model picture and generating the Markov decision process four-tuple E = <S, A, P, R>, where S is the state set describing the probability output by the neural network for pictures generated from vehicle model samples, A is the sample picture generated by the adversarial generation network, P is the state transition function and R is the reward function;
training the data with Prioritized Replay DQN: the Q-Table is initialized with rows and columns S and A respectively, its values measuring the quality of taking action a in the current state s; during training, the Bellman equation is used to update the Q-Table:
Q(s, a) = r + γ·max(Q(s', a'))
where s is the current state, a the action taken, s' the next state and a' an action available in the next state; Q(s, a) is the Q value after taking action a in state s, r is the actual reward, γ is the decay rate, and max(Q(s', a')) is the maximum Q value of the next state;
in Prioritized Replay DQN, the Q-Table is realized by a neural network: the state s is input, and the Q values of the different actions a are output.
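The Bellman update above can be sketched in tabular form before the neural-network realization; the state and action counts below are illustrative assumptions:

```python
import numpy as np

def bellman_update(Q, s, a, r, s_next, gamma=0.9):
    """Tabular update Q(s,a) = r + gamma * max_a' Q(s',a'),
    following the Bellman equation above."""
    Q[s, a] = r + gamma * np.max(Q[s_next])
    return Q[s, a]

Q = np.zeros((3, 2))          # rows: states S, columns: actions A (assumed sizes)
Q[1] = [0.0, 2.0]             # the best action in the next state is worth 2
v = bellman_update(Q, s=0, a=0, r=1.0, s_next=1, gamma=0.5)
# v == 1.0 + 0.5 * 2.0 == 2.0
```

In Prioritized Replay DQN the table lookup `Q[s]` is replaced by a forward pass of the Q network, but the target on the right-hand side is formed the same way.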
Optionally, implementing the Q-Table in Prioritized Replay DQN includes:
adopting a deep neural network as the Q-Table, and presetting target parameters;
defining an objective function using the 2-norm of the Q-value error;
calculating the gradient of the target parameters with respect to the cost function;
obtaining an optimal Q value by stochastic gradient descent according to the gradient;
performing cyclic training on the deep neural network according to the optimal Q value;
acquiring an experience playback data set, and updating all parameters of the Q network by gradient back-propagation according to the objective function.
Specifically, the implementation of the Q-Table in the embodiment of the invention comprises the following steps (1)-(6):
(1) A deep neural network is adopted as the Q-Table, parameterized so that
Q(s, a, θ) ≈ Q^π(s, a)
(2) The objective function is defined using the 2-norm of the Q-value error:
L(θ) = ||r + γ·max Q(s', a', θ) − Q(s, a, θ)||²
(3) The gradient of the parameters θ with respect to the cost function is calculated;
(4) An end-to-end optimization target is realized by stochastic gradient descent: the gradient described above is computed from the deep neural network, and the parameters are updated by stochastic gradient descent so as to obtain the optimal Q value.
(5) The action a_t is selected at random with probability ε, or else the action with the maximum Q value output by the neural network is selected; the reward r_t obtained after executing a_t and the input of the next network state are recorded, and the neural network computes the network output at the next moment from the current values, and so the cycle continues.
The reward value in step (5) comprises the mean-square value between the probability P_1 output by the neural network and the actual probability P_2, plus the percentage K of generated vehicle-model samples among all vehicle models.
After several iterations of training, when the Q value representing the reward converges to its maximum, the allocation strategy is optimal.
(6) s_t, a_t, r_t, s_{t+1} and the termination indicator are stored in turn into the experience playback data set D. When the data reach a certain quantity, m samples are continuously sampled from D and the current target Q value is calculated; all parameters of the Q network are updated by gradient back-propagation. Meanwhile the current state is set to s = s_{t+1}; if s is a terminal state, or the number of iteration rounds T is reached, the current iteration is complete; otherwise, return to step (5) to continue iterating. The specific method is as follows:
In the continuous iterative updating of data, at each step t the five-tuple {s_t, a_t, r_t, s_{t+1}, done} consisting of s_t, a_t, r_t, s_{t+1} and the termination indicator done is stored into the experience playback set D. When the stored quantity reaches the playback-set capacity D, old data is rolled out to make room for new data, ensuring the validity of the samples in D. Once the number of samples reaches the minibatch size m, m samples (j = 1, 2, ..., m) are randomly sampled from D and the current target Q value y_j corresponding to each sample is calculated. The states s_t in the experience playback data are converted into labels and stored for business data analysis.
All parameters θ of the Q network are updated by gradient back-propagation of the neural network using the mean-square-error loss function L(θ).
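The rolling experience store and the target y_j described above can be sketched as follows; the capacity, minibatch size and reward values are illustrative assumptions:

```python
import random
from collections import deque

class ReplayBuffer:
    """Rolling experience store: once capacity D is reached, the oldest
    five-tuples overflow to make room for new data."""
    def __init__(self, capacity):
        self.data = deque(maxlen=capacity)  # deque drops the oldest entry on overflow

    def push(self, s, a, r, s_next, done):
        self.data.append((s, a, r, s_next, done))

    def sample(self, m):
        return random.sample(list(self.data), m)

def td_target(r, q_next_max, done, gamma=0.9):
    """Target y_j: just r_j for a terminal transition,
    else r_j + gamma * max_a' Q(s', a')."""
    return r if done else r + gamma * q_next_max

buf = ReplayBuffer(capacity=4)
for t in range(6):                 # 6 pushes: the 2 oldest transitions overflow
    buf.push(t, 0, 1.0, t + 1, t == 5)
batch = buf.sample(2)              # minibatch of m = 2 transitions
```

Prioritized Replay DQN additionally weights the sampling by TD error instead of drawing uniformly; the uniform `sample` above is the plain-DQN simplification.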
The following describes in detail the steps of the vehicle model identification method of the present invention with reference to fig. 2:
step 1, extracting and identifying appearance features of a vehicle:
the embodiment of the invention uses a crawler technology to crawl the vehicle images and model information with definite models, and carries out label classification on the images to form trainingSample ofSet, trainingGAN networkAnd an SVM discriminant model, comprising the following specific steps:
and 11, preprocessing an image, namely only using texture information of the image to judge the model of the vehicle, and performing gray scale processing, brightness, contrast normalization and other processing on an image data set.
And step 12, inputting the image into a pre-training network DenseNet, detecting a target, extracting a characteristic block, and taking the characteristic block as a block characteristic training sample of an SVM support vector machine classifier. For DenseNet networks, the inputs of each layer of the hidden layers are connected to the outputs of all the previous layers, in x l Representing the output of the layer I network, H l The (-) function represents a combined operation comprising a series of BN-ReLU-Conv operations
x l =H l ([x 0 ,x 1 ,…,x l-1 ])
Step 13, after the vehicle image of the model to be identified has undergone steps 11-12, the obtained output is used as block-feature test samples, which are input into the SVM classifier for classification; probability labels of the feature-block recognition results are output, and the recognition result is obtained by computing over the block probability labels.
Step 14, samples supplemented by the GAN network enter the sample set of vehicle images with definite models, so the block-feature training samples of the SVM classifier need to be continuously updated; the SVM classifier is iteratively updated and trained together with the feature-block recognition-result probability labels so as to maintain good classification performance.
The pre-training network DenseNet in step 12 is specifically as follows:
For the pre-training network DenseNet, whose structure is shown in fig. 3, within each processing module the feature information can be transmitted forward from lower layers to higher layers through direct channels, so that higher layers fully acquire features from the lower layers; this greatly reduces redundant layers, enhances feature reuse, and strengthens resistance to overfitting.
Step 2, the GAN network expands the vehicle appearance image set
Step 21, noise z ∼ P(z) is input to the GAN generator, and the output G(z) is obtained as the test sample.
Step 22, taking the real vehicle appearance image x in the training set as a training sample, and inputting the training sample and the test sample G (z) generated by the GAN generator into the GAN discriminator D to obtain a discrimination result D (G (z)).
Step 23, the DQN network trains the GAN; after multiple iterations, a relatively ideal generator is obtained, whose generated images the discriminator judges as fake with probability approaching 0.5, along with a relatively ideal discriminator, which judges generated images as real with probability likewise approaching 0.5.
Step 24, continuing to learn with the trained generator and discriminator: the generator produces vehicle appearance images, and once the discriminator's misjudgment probability is maintained at a stable level, the samples produced by the generator are permitted to join the vehicle appearance image set, completing the expansion.
Step 25, continuously monitoring the states of the generator and discriminator, computing rewards through the DQN network to iterate the generator and train the discriminator, and maintaining the ideal state.
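Steps 21-25 can be condensed into the following acceptance test. This is a minimal sketch: `generator` and `discriminator` are placeholder callables, the 0.5 target comes from the text above, and the tolerance `tol` is an assumed stability threshold, not a value from the patent.

```python
def admit_generated_samples(discriminator, generator, noise_batch, tol=0.05):
    """Generate vehicle-appearance samples from noise and admit them to
    the image set only once the discriminator's output on fakes has
    stabilised near 0.5, i.e. it can no longer tell real from generated."""
    fakes = [generator(z) for z in noise_batch]
    scores = [discriminator(x) for x in fakes]
    mean_score = sum(scores) / len(scores)
    admitted = fakes if abs(mean_score - 0.5) <= tol else []
    return admitted, mean_score
```

In the full method the DQN network would keep adjusting the generator and discriminator (step 25) until this acceptance condition holds.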
The method for training the adversarial network based on deep reinforcement learning with experience playback in step 3, as shown in fig. 4, specifically includes:
step 31, the generator produces a vehicle-model sample for training, which is input as the current state data;
step 32, inputting each parameter into the deep reinforcement learning network with experience playback for training; continuously updating the Q value function until it converges, obtaining a converged neural network model;
step 33, inputting the dynamic parameters of a generated sample for testing into the obtained model;
step 34, iterating the GAN network so that it can output sample pictures close to real vehicle models.
The step 32 specifically includes:
obtaining sample parameters of a vehicle model picture, and generating a Markov decision process quadruple E = <S, A, P, R>, wherein S is the state set describing the probabilities output by the neural network for generated vehicle-model pictures, A is the set of sample pictures generated by the adversarial generation network, P is the state transition function and R is the reward function;
training the data with Prioritized Replay DQN; initializing the Q-Table, whose rows and columns are S and A respectively, the values of the Q-Table measuring the quality of the action a taken in the current state s; the Bellman equation is used to update the Q-Table during training:
Q(s,a)=r+γ(max(Q(s',a')))
wherein s is the current state, a is the action, s' is the next state, a' is an action that can be taken in the next state, Q(s,a) is the Q value after taking action a in state s, r is the actual reward value, γ is the decay rate, and max(Q(s',a')) is the maximum Q value of the next state;
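The Bellman update above can be exercised directly on a dictionary-backed Q-Table. This is a minimal sketch: the state and action names are invented, and `alpha=1.0` reproduces the patent's hard update exactly (smaller `alpha` gives the usual smoothed Q-learning variant).

```python
def q_update(q_table, s, a, r, s_next, gamma=0.9, alpha=1.0):
    """One Bellman backup: Q(s,a) <- r + gamma * max_a' Q(s',a').

    q_table maps state -> {action: Q value}. With alpha=1.0 the new Q
    value is exactly the Bellman target from the equation above.
    """
    target = r + gamma * max(q_table[s_next].values())
    q_table[s][a] = (1 - alpha) * q_table[s][a] + alpha * target
    return q_table[s][a]
```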
in Prioritized Replay DQN, the Q-Table is realized through a neural network that takes the state s as input and outputs the Q values of the different actions a; the specific realization process is as follows:
(1) A deep neural network is adopted as the Q-Table, with parameters θ:
Q(s,a,θ) = Q_π(s,a)
(2) The objective function is defined using the 2-norm of the Q-value error:
L(θ) = ||r + γ·max Q(s',a',θ) - Q(s,a,θ)||²;
(3) Calculating the gradient of the cost function with respect to the parameter θ;
(4) Using a stochastic gradient descent method to realize the end-to-end optimization objective;
the gradient above is computed from the deep neural network, and the parameters are updated by stochastic gradient descent so as to obtain the optimal Q value;
(5) An action a_t is selected at random with probability ε, or the action a_t with the maximum Q value is selected according to the Q values output by the neural network; the reward r_t after executing a_t and the input of the next network state are obtained, and the neural network calculates the network output at the next moment from the current value, and so the cycle continues.
The reward value in step (5) combines the mean square of the difference between the probability P1 output by the neural network and the actual probability P2 with the percentage K of the generated sample's vehicle model among all vehicle models.
After several iterations of training, when the Q value representing the reward converges to its maximum, the allocation strategy is optimal.
(6) s_t, a_t, r_t, s_{t+1} and the termination flag are stored in sequence into the experience playback data set D; when the data reach a certain number, m samples are sampled from D, the current target Q value is calculated, and all parameters of the Q network are updated through gradient back-propagation; meanwhile the current state is set to s = s_{t+1}; if s is a termination state, or the number of iteration rounds T has been reached, the current iteration is complete, otherwise return to step (5) and continue iterating. The specific method comprises the following steps:
During the continual iterative update, at each step t the five-tuple {s_t, a_t, r_t, s_{t+1}, done} consisting of the state, action, reward, next state and the termination flag done is stored into the experience playback set D. When the stored quantity reaches the capacity of D, old data are rolled out to make room for new data, which keeps the samples in D valid. Once the number of samples reaches the minibatch size m, m samples (j = 1, 2, ..., m) are drawn from D, and the current target Q value y_j corresponding to each sample is calculated. The states s_t in the experience playback data are converted into labels and stored for business data analysis.
All parameters θ of the Q network are updated by gradient back-propagation through the neural network using the mean square error loss function L(θ).
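Steps (1)-(6) can be condensed into the following sketch. It is illustrative only: a linear Q-function stands in for the deep network, uniform sampling replaces prioritized sampling to keep the example short, and the class name and hyperparameters are invented.

```python
import random
from collections import deque

import numpy as np

class ReplayDQN:
    """Minimal DQN-with-experience-playback sketch of steps (1)-(6)."""

    def __init__(self, n_features, n_actions, capacity=1000,
                 gamma=0.9, lr=0.01, epsilon=0.1):
        self.theta = np.zeros((n_actions, n_features))  # Q(s,a,θ) = θ_a · s
        self.buffer = deque(maxlen=capacity)            # playback set D, rolling
        self.gamma, self.lr, self.epsilon = gamma, lr, epsilon

    def q_values(self, s):
        return self.theta @ np.asarray(s, dtype=float)

    def act(self, s):
        # step (5): explore with probability epsilon, otherwise greedy on Q
        if random.random() < self.epsilon:
            return random.randrange(self.theta.shape[0])
        return int(np.argmax(self.q_values(s)))

    def store(self, s, a, r, s_next, done):
        # step (6): five-tuple {s_t, a_t, r_t, s_{t+1}, done} into D
        self.buffer.append((s, a, r, s_next, done))

    def train_step(self, m=32):
        # sample m transitions from D and regress Q(s,a) toward the
        # target y = r + gamma * max_a' Q(s',a') with one SGD step each
        if len(self.buffer) < m:
            return
        for s, a, r, s_next, done in random.sample(list(self.buffer), m):
            s = np.asarray(s, dtype=float)
            y = r if done else r + self.gamma * float(np.max(self.q_values(s_next)))
            td_error = float(self.q_values(s)[a] - y)
            self.theta[a] -= self.lr * td_error * s      # gradient step on L(θ)
```

The `deque(maxlen=capacity)` reproduces the rolling overwrite of old data described above; a prioritized variant would replace `random.sample` with sampling weighted by TD error.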
In summary, the embodiment of the invention improves vehicle model recognition capability and designs an expansion method for vehicle appearance image training samples; on this basis, for vehicle appearance samples whose generated results are not ideal, experience playback is used to optimize the pictures generated by the GAN.
The embodiment of the invention also provides a vehicle model identification device based on experience playback, which comprises:
the acquisition module is used for acquiring an original vehicle image; the original vehicle image comprises vehicle model information;
the data expansion module is used for carrying out data expansion on the original vehicle image through a GAN network to obtain model sample data;
the training module is used for inputting the model sample data into an adversarial network with experience playback for training to obtain a target model;
the identification module is used for identifying the acquired vehicle image to be identified according to the target model and determining the vehicle model in the vehicle image to be identified.
The embodiment of the invention also provides electronic equipment, which comprises a processor and a memory;
the memory is used for storing programs;
the processor executes the program to implement the method as described above.
The embodiment of the invention also provides a computer readable storage medium storing a program, which is executed by a processor to implement the method as described above.
Embodiments of the present invention also disclose a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The computer instructions may be read from a computer-readable storage medium by a processor of a computer device, and executed by the processor, to cause the computer device to perform the method shown in fig. 1.
In some alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flowcharts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed, and in which sub-operations described as part of a larger operation are performed independently.
Furthermore, while the invention is described in the context of functional modules, it should be appreciated that, unless otherwise indicated, one or more of the described functions and/or features may be integrated in a single physical device and/or software module or one or more functions and/or features may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary to an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be apparent to those skilled in the art from consideration of their attributes, functions and internal relationships. Accordingly, one of ordinary skill in the art can implement the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative and are not intended to be limiting upon the scope of the invention, which is to be defined in the appended claims and their full scope of equivalents.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute them. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer readable medium may even be paper or other suitable medium on which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.
It is to be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, may be implemented using any one or combination of the following techniques, as is well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application specific integrated circuits having suitable combinational logic gates, programmable Gate Arrays (PGAs), field Programmable Gate Arrays (FPGAs), and the like.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.
While the preferred embodiment of the present invention has been described in detail, the present invention is not limited to the embodiments described above, and various equivalent modifications and substitutions can be made by those skilled in the art without departing from the spirit of the present invention, and these equivalent modifications and substitutions are intended to be included in the scope of the present invention as defined in the appended claims.

Claims (6)

1. A vehicle model identification method based on experience playback, comprising:
acquiring an original vehicle image; the original vehicle image comprises vehicle model information;
performing data expansion on the original vehicle image through a GAN network to obtain model sample data; inputting the model sample data into an adversarial network with experience playback for training to obtain a target model;
identifying the acquired vehicle image to be identified according to the target model, and determining the vehicle model in the vehicle image to be identified;
inputting the model sample data into an adversarial network with experience playback for training to obtain a target model, wherein the training comprises the following steps:
taking the model sample data as current state data;
inputting the current state data into an adversarial network with experience playback for training, and updating a Q value function until the Q value function converges to obtain a converged neural network model; inputting a test sample into the neural network model, and testing the neural network model to obtain a target model;
the inputting of the current state data into the adversarial network with experience playback for training, updating the Q value function until it converges, and obtaining a converged neural network model, comprises:
acquiring sample parameters of a vehicle model picture, and generating a Markov decision process quadruple;
initializing a Q-Table in Prioritized Replay DQN;
implementing the Q-Table in Prioritized Replay DQN;
the implementing of the Q-Table in Prioritized Replay DQN includes: adopting a deep neural network as a Q-Table, and presetting target parameters;
defining an objective function using a 2-norm in the Q value;
calculating the gradient of the target parameter with respect to a cost function;
according to the gradient, obtaining an optimal Q value by using a random gradient descent method;
performing cyclic training on the deep neural network according to the optimal Q value;
acquiring an experience playback data set, and updating all parameters of the Q network through gradient back-propagation according to the objective function;
the specific implementation process is as follows:
(1) A deep neural network is adopted as the Q-Table, with parameters θ:
Q(s,a,θ) = Q_π(s,a)
(2) The objective function is defined using the 2-norm of the Q-value error:
L(θ) = ||r + γ·max Q(s',a',θ) - Q(s,a,θ)||²;
(3) Calculating the gradient of the cost function with respect to the parameter θ;
(4) Using a stochastic gradient descent method to realize the end-to-end optimization objective;
the gradient above is computed from the deep neural network, and the parameters are updated by stochastic gradient descent so as to obtain the optimal Q value;
(5) Randomly selecting an action a_t with probability ε, or selecting the action a_t with the maximum Q value according to the Q values output by the neural network; obtaining the reward r_t after executing a_t and the input of the next network state, the neural network calculating the network output at the next moment from the current value, so that the cycle continues;
the reward value in step (5) combines the mean square of the difference between the probability P1 output by the neural network and the actual probability P2 with the percentage K of the generated sample's vehicle model among all vehicle models;
after several iterations of training, when the Q value representing the reward converges to its maximum, the allocation strategy is optimal;
(6) s_t, a_t, r_t, s_{t+1} and the termination flag are stored in sequence into the experience playback data set D; when the data reach a certain number, m samples are sampled from D, the current target Q value is calculated, and all parameters of the Q network are updated through gradient back-propagation; meanwhile the current state is set to s = s_{t+1}; if s is a termination state, or the number of iteration rounds T has been reached, the current iteration is complete, otherwise return to step (5) and continue iterating.
2. The method for identifying a vehicle model based on empirical playback according to claim 1, wherein said acquiring original vehicle image information comprises:
crawling an original vehicle image of known vehicle model information through a crawler technology;
carrying out grayscale processing, brightness normalization and contrast normalization on the original vehicle image to obtain a target image representing texture information;
inputting the target image into a pre-training network, and extracting a characteristic block;
inputting the characteristic blocks into an SVM classifier for training to obtain a target SVM classifier;
inputting the target image into the SVM classifier, and outputting probability labels of various recognition results; and according to the probability tag, calculating a recognition result of the vehicle model as vehicle model information in the original vehicle image.
3. The vehicle model identification method based on experience playback according to claim 2, wherein the data expansion of the original vehicle image through the GAN network to obtain model sample data comprises:
inputting noise data into the GAN network to obtain a test sample, and taking the original vehicle image as a training sample;
inputting the training sample and the test sample into an initial discriminator of the GAN network to obtain a discrimination result; specifically, in the calculation of the discrimination result, z ~ p_r(z) is the input noise data, G(z) is the test sample, D denotes the initial discriminator, and D(G(z)) is the discrimination result;
training the GAN network through the DQN network to obtain an ideal generator and an ideal discriminator;
generating a vehicle appearance image through the ideal generator, checking the generated vehicle appearance image through the ideal discriminator, and taking the checked vehicle appearance image as an expansion result of the original vehicle image.
4. An apparatus for implementing the experience-playback-based vehicle model identification method of any one of claims 1-3, characterized by comprising:
the acquisition module is used for acquiring an original vehicle image; the original vehicle image comprises vehicle model information;
the data expansion module is used for carrying out data expansion on the original vehicle image through a GAN network to obtain model sample data;
the training module is used for inputting the model sample data into an adversarial network with experience playback for training to obtain a target model;
the identification module is used for identifying the acquired vehicle image to be identified according to the target model and determining the vehicle model in the vehicle image to be identified.
5. An electronic device comprising a processor and a memory;
the memory is used for storing programs;
the processor executing the program to implement the method of any one of claims 1-3.
6. A computer readable storage medium, characterized in that the storage medium stores a program, which is executed by a processor to implement the method of any one of claims 1-3.
CN202011394840.2A 2020-12-03 2020-12-03 Vehicle model identification method, device, equipment and medium based on experience playback Active CN112508080B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011394840.2A CN112508080B (en) 2020-12-03 2020-12-03 Vehicle model identification method, device, equipment and medium based on experience playback


Publications (2)

Publication Number Publication Date
CN112508080A (en) 2021-03-16
CN112508080B (en) 2024-01-12

Family

ID=74969435


Country Status (1)

Country Link
CN (1) CN112508080B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113326941A (en) * 2021-06-25 2021-08-31 江苏大学 Knowledge distillation method, device and equipment based on multilayer multi-attention migration

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106611193A (en) * 2016-12-20 2017-05-03 太极计算机股份有限公司 Image content information analysis method based on characteristic variable algorithm
CN108805177A (en) * 2018-05-22 2018-11-13 同济大学 Vehicle type identifier method under complex environment background based on deep learning
US10176405B1 (en) * 2018-06-18 2019-01-08 Inception Institute Of Artificial Intelligence Vehicle re-identification techniques using neural networks for image analysis, viewpoint-aware pattern recognition, and generation of multi- view vehicle representations
CN110147709A (en) * 2018-11-02 2019-08-20 腾讯科技(深圳)有限公司 Training method, device, terminal and the storage medium of vehicle attribute model
CN110458120A (en) * 2019-08-15 2019-11-15 中国水利水电科学研究院 Different automobile types recognition methods and system under a kind of complex environment
CN110874578A (en) * 2019-11-15 2020-03-10 北京航空航天大学青岛研究院 Unmanned aerial vehicle visual angle vehicle identification and tracking method based on reinforcement learning
CN111079640A (en) * 2019-12-09 2020-04-28 合肥工业大学 Vehicle type identification method and system based on automatic amplification sample
CN111260072A (en) * 2020-01-08 2020-06-09 上海交通大学 Reinforced learning exploration method based on generation countermeasure network
CN111680640A (en) * 2020-06-11 2020-09-18 合肥工业大学 Vehicle type identification method and system based on domain migration

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020068868A1 (en) * 2018-09-24 2020-04-02 Chad Steelberg Object detection machine learning
US11554785B2 (en) * 2019-05-07 2023-01-17 Foresight Ai Inc. Driving scenario machine learning network and driving environment simulation


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Fanyu Zeng, "A Survey on Visual Navigation for Artificial Agents With Deep Reinforcement Learning", vol. 8, pp. 135426-135442 *
Anum Talpur, "Machine Learning for Security in Vehicular Networks: A Comprehensive Survey", Communications Surveys & Tutorials, vol. 24, no. 1, pp. 346-379 *

Also Published As

Publication number Publication date
CN112508080A (en) 2021-03-16

Similar Documents

Publication Publication Date Title
US20210150345A1 (en) Conditional Computation For Continual Learning
US20200134455A1 (en) Apparatus and method for training deep learning model
WO2021138083A1 (en) Neural architecture search based on synaptic connectivity graphs
WO2021138092A1 (en) Artificial neural network architectures based on synaptic connectivity graphs
WO2019053052A1 (en) A method for (re-)training a machine learning component
WO2021138091A1 (en) Reservoir computing neural networks based on synaptic connectivity graphs
CN114357594A (en) Bridge abnormity monitoring method, system, equipment and storage medium based on SCA-GRU
CN114842343A (en) ViT-based aerial image identification method
CN112508901A (en) Underwater structure disease identification method, system and device and storage medium
CN113780242A (en) Cross-scene underwater sound target classification method based on model transfer learning
CN112508080B (en) Vehicle model identification method, device, equipment and medium based on experience playback
CN110427978B (en) Variational self-encoder network model and device for small sample learning
CN113221758B (en) GRU-NIN model-based underwater sound target identification method
CN113592071B (en) Equipment fault recognition model training and recognition method, system, device and medium
CN111160526A (en) Online testing method and device for deep learning system based on MAPE-D annular structure
CN111914949B (en) Zero sample learning model training method and device based on reinforcement learning
KR20220094967A (en) Method and system for federated learning of artificial intelligence for diagnosis of depression
CN116189130A (en) Lane line segmentation method and device based on image annotation model
CN113239809B (en) Underwater sound target identification method based on multi-scale sparse SRU classification model
CN114722942A (en) Equipment fault diagnosis method and device, electronic equipment and storage medium
CN117523218A (en) Label generation, training of image classification model and image classification method and device
CN113488027A (en) Hierarchical classification generated audio tracing method, storage medium and computer equipment
CN114618167A (en) Anti-cheating detection model construction method and anti-cheating detection method
CN113887357A (en) Face representation attack detection method, system, device and medium
CN114444597B (en) Visual tracking method and device based on progressive fusion network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240507

Address after: 1003, Building A, Zhiyun Industrial Park, No. 13 Huaxing Road, Tongsheng Community, Dalang Street, Longhua District, Shenzhen City, Guangdong Province, 518000

Patentee after: Shenzhen Wanzhida Enterprise Management Co.,Ltd.

Country or region after: China

Address before: 510006 No. 230 West Ring Road, University of Guangdong, Guangzhou

Patentee before: Guangzhou University

Country or region before: China