CN111079851B - Vehicle type identification method based on reinforcement learning and bilinear convolution network - Google Patents
- Publication number
- CN111079851B (application number CN201911371980.5A)
- Authority
- CN
- China
- Prior art keywords
- network
- state
- fine
- reinforcement learning
- action
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/08—Detecting or categorising vehicles
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses a vehicle type identification method based on reinforcement learning and a bilinear convolutional network: build a deep network model, set the hyperparameters of the fine-grained classification network and initialize the network; establish a Markov decision model for optimizing salient features; perform scale transformations on the dataset; optimize the attention region: with the parameters of the fine-grained classification network fixed, input the dataset into the network, use a reinforcement learning algorithm to optimize the saliency region and select the optimal attention region; establish a loss function for updating the parameters of the fine-grained classification network; after fusing features, repeatedly train the network until the attention region no longer changes; input the vehicle images to be tested into the trained model to obtain the corresponding detection results. The invention uses a reinforcement learning network to extract low-level salient features and fuses the high-level semantic features with the low-level salient features by bilinear interpolation to improve recognition accuracy.
Description
Technical Field
The invention relates to a vehicle type identification method, and in particular to a vehicle type identification method based on reinforcement learning and a bilinear convolutional network.
Background
Vehicle type recognition can be regarded as an application branch of fine-grained classification, that is, classifying different subclasses of the same category that are very similar in appearance. Since vehicle images collected in daily scenarios are easily affected by factors such as pose, viewing angle and occlusion, the differences between models of different brands can be small, while the differences between models of the same brand can be large. How to identify vehicle types effectively is an urgent application problem in fine-grained classification.
The bilinear convolutional network is a model that has achieved fine-grained classification with high accuracy in recent years. It has the advantages of a simple structure and efficient training, but it only uses the features of its last layer as the input features for classification. When training with such features, much of the detail information is lost while most high-level features are retained. Since the objects of fine-grained classification often look similar yet differ in their details, the characterization of detail features has a great impact on the recognition rate. If the low-level features of the bilinear network were fused directly with the high-level features, dimensionality reduction would be required because the low-level features have a larger spatial scale. When the information lost in the reduced features is mainly detail information, the fusion not only fails to improve classification accuracy, but also prolongs training time and degrades final classification efficiency.
Reinforcement learning is a method for solving sequential decision problems: the problem to be solved is modeled as an MDP, and classical reinforcement learning methods such as temporal-difference learning, least-squares temporal-difference learning and actor-critic algorithms are used to find the optimal policy. Reinforcement learning is therefore well suited to extracting the saliency contained in low-level features.
Summary of the Invention
The purpose of the present invention is to provide a vehicle type identification method based on reinforcement learning and a bilinear convolutional network that improves recognition accuracy even when few vehicle images are available.
The technical scheme of the present invention is as follows. A vehicle type identification method based on reinforcement learning and a bilinear convolutional network comprises the following steps:
(1) Build the deep network model: build a fine-grained classification network based on reinforcement learning and a bilinear convolutional network for vehicle recognition;
(2) Set the hyperparameters of the fine-grained classification network: the hyperparameters include the learning rate, the number of iterations and the batch size;
(3) Initialize the network: initialize the weights and thresholds of the fine-grained classification network;
(4) Establish a Markov decision model for optimizing salient features;
(5) Preprocess the dataset: perform scale transformations on the dataset;
(6) Optimize the attention region: with the parameters of the fine-grained classification network fixed, input the dataset into the network, use a reinforcement learning algorithm to optimize the saliency region, and select the optimal attention region;
(7) Construct the loss function: establish a loss function for updating the parameters of the fine-grained classification network, defined as the sum of squared errors between the true labels and the predicted labels of the data;
(8) Fuse features: for each sample in the dataset, fuse the attention region optimized in step (6) with the features of the fifth convolutional layer to obtain the final fused features used for classification;
(9) Train the network: with the optimal attention region fixed, retrain the fine-grained classification network on the dataset by gradient descent until the training error is smaller than a preset threshold;
(10) Alternate training: repeat steps (6)-(9) until the attention region no longer changes;
(11) Input the vehicle images to be tested into the trained deep network model to obtain the corresponding detection results.
Further, the parallel feature extraction layers of the bilinear convolutional network in step (1) adopt the first to fifth convolutional layers of VGG16. The features output by these layers transition from detail features to high-level semantic features. After the fifth convolutional layer, a bilinear vector is obtained through an outer-product operation; finally a fully connected layer is attached and a softmax operation is applied to the output to recognize and classify the vehicle type.
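The outer-product and softmax head described above can be sketched as follows. This is a minimal NumPy illustration, not the patent's implementation: the feature shapes, the stand-in fully connected weights `W`, and the toy inputs are all assumptions made for demonstration.

```python
import numpy as np

def bilinear_vector(feat_a, feat_b):
    """Combine two (C, H, W) feature maps into one bilinear vector.

    At every spatial location the outer product of the two channel
    vectors is taken; the products are sum-pooled over all locations
    and flattened, as in a bilinear CNN head."""
    c_a, h, w = feat_a.shape
    c_b = feat_b.shape[0]
    a = feat_a.reshape(c_a, h * w)     # (C_a, HW)
    b = feat_b.reshape(c_b, h * w)     # (C_b, HW)
    pooled = a @ b.T                   # sum of per-location outer products
    return pooled.reshape(-1)          # bilinear vector of length C_a * C_b

def softmax(z):
    """Numerically stable softmax."""
    e = np.exp(z - z.max())
    return e / e.sum()

# Toy demonstration with small random feature maps.
rng = np.random.default_rng(0)
fa = rng.standard_normal((4, 3, 3))
fb = rng.standard_normal((4, 3, 3))
v = bilinear_vector(fa, fb)                  # length 4 * 4 = 16
W = rng.standard_normal((5, v.size)) * 0.01  # stand-in fully connected layer
probs = softmax(W @ v)                       # class probabilities
```

In a bilinear CNN the two streams may share weights; the sum-pooled outer product keeps pairwise feature interactions while discarding spatial layout.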
Further, establishing the Markov decision model for optimizing salient features in step (4) includes:
401) The state space X is the set of all sub-features of the feature map generated by the third convolutional layer whose scale equals that of the fifth convolutional layer, X = {x1, x2, …, xn};
402) The action space U is the set of up, down, left and right movements of a state within the state space;
403) The state transition function is f: X×U→X. For any state x∈X and any action u∈U, the next state is the state reached after action u is taken, which is again a sub-feature of the third convolutional layer's output at the scale of the fifth convolutional layer;
404) The reward function is r: X×U→R, the immediate reward obtained for any state x∈X and any action u∈U.
Preferably, the action space is U = {0, 1, 2, 3}, where 0 denotes moving the state up, 1 moving it left, 2 moving it down and 3 moving it right.
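Using the coordinate convention of the embodiment's transition-function modeling (up decreases y, left decreases x, down increases y, right increases x), the action space and transition function can be sketched as below. The grid bounds and the clipping behavior at the edges are assumptions; the patent does not state what happens when a move would leave the feature map.

```python
# Actions as in the preferred embodiment: 0 = up, 1 = left, 2 = down, 3 = right.
MOVES = {0: (0, -1), 1: (-1, 0), 2: (0, 1), 3: (1, 0)}

def transition(state, action, width, height):
    """Move the attention window's position one step on the grid of
    valid window placements; out-of-range positions are clipped to the
    border (the clipping rule is an assumption)."""
    x, y = state
    dx, dy = MOVES[action]
    nx = min(max(x + dx, 0), width - 1)
    ny = min(max(y + dy, 0), height - 1)
    return (nx, ny)
```

Each (x, y) indexes one sub-feature of the conv3 output at conv5 scale, so the state space is finite and the transition function is deterministic.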
Further, optimizing the attention region in step (6) includes the steps:
601) Set the parameter values: discount rate γ, decay factor λ, number of episodes E, maximum time step T per episode, learning rate α and exploration rate ε;
602) Initialize Q1(x,u) = 0 and Q2(x,u) = 0 for every state-action pair (x,u);
603) Check whether the number of episodes has reached the maximum E: if so, go to step 612); otherwise go to step 604);
604) Check whether the maximum time step has been reached: if so, go to step 603); otherwise go to step 605);
605) Initialize the current state x = x0;
606) Generate a random probability p in (0, 1) and check whether p < ε holds: if so, the action selected in the current state is u = argmax_u(Q1(x,u) + Q2(x,u)); otherwise select any action at random from the action set;
607) Execute the currently selected action u and obtain the corresponding next state x′;
608) Check whether the classification result of the output layer equals the true label: if so, the immediate reward is r = 1; otherwise r = 0;
609) Generate a random probability p in (0, 1) and check whether p < 0.5 holds: if so, update the Q value Q1(x,u) = r + γ·max_u Q1(x′,u); otherwise update the Q value Q2(x,u) = r + γ·max_u Q2(x′,u);
610) Update the current time step t = t + 1 and go to step 604);
611) Update the current episode e = e + 1;
612) Output the current optimal policy and the value functions Q1(x,u) and Q2(x,u).
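Steps 601)-612) amount to a tabular scheme with two Q tables, a probability-ε action-selection rule, and a coin flip choosing which table to update. A compact sketch under explicit assumptions: a toy chain environment and reward stand in for the classifier-based reward of step 608, the selection rule is kept exactly as written in step 606, and each table bootstraps from itself as written in step 609 (standard Double Q-learning would bootstrap each table from the other).

```python
import random

def optimize_attention(states, reward_fn, transition_fn,
                       episodes=200, max_steps=1000,
                       gamma=0.9, epsilon=0.1, seed=0):
    """Two-table Q iteration following steps 601)-612).

    reward_fn(x, u) -> 0 or 1 stands in for the classifier check of
    step 608; transition_fn(x, u) -> x' stands in for step 607."""
    rng = random.Random(seed)
    actions = [0, 1, 2, 3]
    q1 = {(x, u): 0.0 for x in states for u in actions}   # step 602)
    q2 = {(x, u): 0.0 for x in states for u in actions}
    for _ in range(episodes):                             # step 603)
        x = states[0]                                     # step 605)
        for _ in range(max_steps):                        # step 604)
            if rng.random() < epsilon:                    # step 606), as written
                u = max(actions, key=lambda a: q1[(x, a)] + q2[(x, a)])
            else:
                u = rng.choice(actions)
            x_next = transition_fn(x, u)                  # step 607)
            r = reward_fn(x, u)                           # step 608)
            if rng.random() < 0.5:                        # step 609)
                q1[(x, u)] = r + gamma * max(q1[(x_next, a)] for a in actions)
            else:
                q2[(x, u)] = r + gamma * max(q2[(x_next, a)] for a in actions)
            x = x_next
    # step 612): greedy policy w.r.t. the summed value functions
    policy = {x: max(actions, key=lambda a: q1[(x, a)] + q2[(x, a)])
              for x in states}
    return policy, q1, q2

# Toy demonstration on a 3-state chain (illustrative, not the patent's
# feature grid): action 3 moves right, action 1 moves left, others stay;
# reward 1 only for taking action 3 in state 1.
def _step(x, u):
    if u == 3:
        return min(x + 1, 2)
    if u == 1:
        return max(x - 1, 0)
    return x

policy, q1, q2 = optimize_attention(
    states=[0, 1, 2],
    reward_fn=lambda x, u: 1 if (x == 1 and u == 3) else 0,
    transition_fn=_step,
    episodes=30, max_steps=25)
```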
Further, the loss function in step (7) is
L = Σ(y − y′)²
where y denotes the vehicle classification result produced by the network and y′ denotes the true label of the vehicle image.
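The sum-of-squared-errors loss above can be computed as in this small sketch; representing the true label y′ as a one-hot vector over classes is an assumption about the label encoding, which the patent does not specify.

```python
import numpy as np

def sse_loss(y_pred, y_true):
    """Sum of squared errors between the network's class scores y
    and the true label vector y', both of shape (num_classes,)."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    return float(np.sum((y_pred - y_true) ** 2))

# A perfect softmax prediction gives zero loss; a spread-out one does not.
perfect = sse_loss([0.0, 1.0, 0.0], [0, 1, 0])
spread = sse_loss([0.25, 0.5, 0.25], [0, 1, 0])
```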
The beneficial effects of the technical solution provided by the present invention are as follows. A bilinear convolutional network is used as the basic deep network architecture; a reinforcement learning network extracts low-level salient features, and bilinear interpolation is used to fuse the high-level semantic features with the low-level salient features; finally, the fully connected layer and softmax operation of the bilinear convolutional network perform the actual vehicle classification, improving recognition accuracy. Combined with the reinforcement learning network, the salient features of vehicle images can be extracted well even when few images are available, making the method suitable for online vehicle recognition and applicable to online real-time recognition in the field of video surveillance.
Brief Description of the Drawings
Fig. 1 is a flow chart of the method of the present invention;
Fig. 2 is a network model diagram of the method of the present invention;
Fig. 3 is a detailed diagram of a single network of the bilinear model in the method of the present invention.
Detailed Description
As shown in Fig. 1, the vehicle type identification method based on reinforcement learning and a bilinear convolutional network in this example comprises the following steps:
(1) Build the deep network model: build a fine-grained classification network based on reinforcement learning and a bilinear convolutional network for vehicle recognition; its model is shown in Figs. 2 and 3. The parallel feature extraction layers of the bilinear convolutional network adopt the first to fifth convolutional layers of VGG16; the features output by these layers transition from detail features to high-level semantic features. After the fifth convolutional layer, a bilinear vector is obtained through an outer-product operation; finally a fully connected layer is attached and a softmax operation is applied to the output to recognize and classify the vehicle type.
(2) Set the hyperparameters of the network: the learning rate is 0.02, the number of iterations is 10,000, the batch size is 10 images, and the training threshold is 0.01;
(3) Initialize the network: set all weights and thresholds of the network to 0.00001;
(4) Build the MDP model: build the Markov decision model for optimizing salient features as follows:
401) State-space modeling: the state space consists of all features that can be extracted from the output feature map of the third convolutional layer (Conv3) at the scale of the fifth convolutional layer; it includes the four feature maps containing the four corners of the map;
402) Action-space modeling: the actions are movements up, left, down and right, characterized by the numbers 0, 1, 2 and 3 respectively;
403) Transition-function modeling: assume the position of the feature corresponding to the current state is (x, y); then:
if the up action is taken, the next state is at (x, y−1);
if the left action is taken, the next state is at (x−1, y);
if the down action is taken, the next state is at (x, y+1);
if the right action is taken, the next state is at (x+1, y).
404) Reward-function modeling: the reward function depends on the current output of the deep network, that is, the vehicle class obtained when a vehicle image is fed into the deep network using the current optimal attention region. If the predicted class equals the true class, the immediate reward is 1; otherwise the reward is 0.
(5) Preprocess the dataset: download the dataset and apply scale transformations, that is, operations such as translation and rotation, to augment the original dataset. The purpose of the augmentation is to increase the robustness of the network, giving it good recognition ability on noisy images while preventing overfitting during training. The Car-196 dataset can be downloaded from: https://ai.stanford.edu/~jkrause/cars/car_dataset.html.
To give the network better generalization ability, the bird dataset CUB-200 and the aircraft dataset FGVC-Aircraft are also used during training; they can be downloaded from http://www.vision.caltech.edu/visipedia/CUB-200.html and http://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/ respectively.
(6) Optimize the attention region: use the reinforcement learning algorithm to optimize the saliency region, train the network, and select the optimal attention region. The optimization proceeds as follows:
601) Set the parameter values: discount rate γ = 0.9, decay factor λ = 0.95, number of episodes E = 200, maximum time step per episode T = 1000, learning rate α = 0.5, exploration rate ε = 0.1;
602) Initialize Q1(x,u) = 0 and Q2(x,u) = 0 for every state-action pair, and check whether the number of episodes has reached the maximum E:
If so:
go to step 611)
Otherwise:
go to step 603);
603) Check whether the maximum time step has been reached:
If so:
go to step 602)
Otherwise:
go to step 604)
604) Randomly initialize the current state x = x0;
605) Generate a random probability p in (0, 1) and check whether p < ε holds:
If so:
the action selected in the current state is u = argmax_u(Q1(x,u) + Q2(x,u))
Otherwise:
select any one of the four actions at random from the action set;
606) Execute the currently selected action u and obtain the corresponding next state x′;
607) Check whether the classification result of the output layer equals the true label:
If so:
the immediate reward is r = 1
Otherwise:
the immediate reward is r = 0
608) Generate a random probability p in (0, 1) and check whether p < 0.5 holds:
If so:
update the Q value: Q1(x,u) = r + γ·max_u Q1(x′,u)
Otherwise:
update the Q value: Q2(x,u) = r + γ·max_u Q2(x′,u)
609) Update the current time step t = t + 1 and go to step 603);
610) Update the current episode e = e + 1;
611) Output the current optimal policy and the value functions Q1(x,u) and Q2(x,u).
(7) Construct the loss function: the loss function for network training is
L = Σ(y − y′)²
where y denotes the vehicle classification result produced by the network and y′ denotes the true label of the vehicle image.
(8) Fuse features: after the optimal attention feature region is obtained, it is fixed and fused by summation with the high-level features (the output of the fifth convolution module) to obtain the fused high-level features. The outputs of each layer's features and of the fused features are shown in Fig. 2;
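Since each attention sub-feature is extracted at the spatial scale of the fifth convolutional layer (see the state-space definition in step 401), the summation fusion above can be sketched as follows. The names, shapes and the assumption that the channel counts also match are illustrative; in VGG16 the conv3 and conv5 channel counts differ, so a real implementation would need a channel projection, which the sketch does not model.

```python
import numpy as np

def fuse(conv5_feat, attention_feat):
    """Fuse the high-level conv5 features with the selected low-level
    attention region by elementwise summation, as in step (8).

    Both inputs are assumed to be (C, H, W) arrays of matching shape,
    since the attention region is taken at the conv5 scale."""
    if conv5_feat.shape != attention_feat.shape:
        raise ValueError("attention region must match the conv5 feature shape")
    return conv5_feat + attention_feat

# Toy demonstration with random stand-in features.
rng = np.random.default_rng(1)
high = rng.standard_normal((8, 4, 4))   # stand-in conv5 output
low = rng.standard_normal((8, 4, 4))    # stand-in optimal attention region
fused = fuse(high, low)
```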
(9) Train the network: with the optimal attention region fixed, retrain the network on the dataset by gradient descent until the training error is smaller than the preset threshold;
(10) Alternate training: repeat steps (6)-(9) until the attention region no longer changes;
(11) Input the vehicle images to be tested into the deep network model to obtain the corresponding detection results.
The recognition accuracy of the vehicle type identification method of the present invention on each dataset is given in the table below:
Claims (4)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911371980.5A CN111079851B (en) | 2019-12-27 | 2019-12-27 | Vehicle type identification method based on reinforcement learning and bilinear convolution network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911371980.5A CN111079851B (en) | 2019-12-27 | 2019-12-27 | Vehicle type identification method based on reinforcement learning and bilinear convolution network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111079851A CN111079851A (en) | 2020-04-28 |
CN111079851B true CN111079851B (en) | 2020-09-18 |
Family
ID=70318777
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911371980.5A Active CN111079851B (en) | 2019-12-27 | 2019-12-27 | Vehicle type identification method based on reinforcement learning and bilinear convolution network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111079851B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112149720A (en) * | 2020-09-09 | 2020-12-29 | 南京信息工程大学 | A Fine-Grained Vehicle Type Recognition Method |
CN112183602B (en) * | 2020-09-22 | 2022-08-26 | 天津大学 | Multi-layer feature fusion fine-grained image classification method with parallel rolling blocks |
CN113191218A (en) * | 2021-04-13 | 2021-07-30 | 南京信息工程大学 | Vehicle type recognition method based on bilinear attention collection and convolution long-term and short-term memory |
CN113158980A (en) * | 2021-05-17 | 2021-07-23 | 四川农业大学 | Tea leaf classification method based on hyperspectral image and deep learning |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106096535A (en) * | 2016-06-07 | 2016-11-09 | 广东顺德中山大学卡内基梅隆大学国际联合研究院 | A kind of face verification method based on bilinearity associating CNN |
US9569736B1 (en) * | 2015-09-16 | 2017-02-14 | Siemens Healthcare Gmbh | Intelligent medical image landmark detection |
CN109086792A (en) * | 2018-06-26 | 2018-12-25 | 上海理工大学 | Based on the fine granularity image classification method for detecting and identifying the network architecture |
CN109359684A (en) * | 2018-10-17 | 2019-02-19 | 苏州大学 | A fine-grained vehicle identification method based on weakly supervised localization and subcategory similarity measure |
CN109858430A (en) * | 2019-01-28 | 2019-06-07 | 杭州电子科技大学 | A kind of more people's attitude detecting methods based on intensified learning optimization |
CN109902562A (en) * | 2019-01-16 | 2019-06-18 | 重庆邮电大学 | A driver abnormal posture monitoring method based on reinforcement learning |
CN110135231A (en) * | 2018-12-25 | 2019-08-16 | 杭州慧牧科技有限公司 | Animal face recognition methods, device, computer equipment and storage medium |
CN110334572A (en) * | 2019-04-04 | 2019-10-15 | 南京航空航天大学 | A fine-grained recognition method for car models from multiple angles |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8874498B2 (en) * | 2011-09-16 | 2014-10-28 | International Business Machines Corporation | Unsupervised, supervised, and reinforced learning via spiking computation |
CN108898060A (en) * | 2018-05-30 | 2018-11-27 | 珠海亿智电子科技有限公司 | Based on the model recognizing method of convolutional neural networks under vehicle environment |
CN109086672A (en) * | 2018-07-05 | 2018-12-25 | 襄阳矩子智能科技有限公司 | A kind of recognition methods again of the pedestrian based on reinforcement learning adaptive piecemeal |
- 2019-12-27: Application CN201911371980.5A filed in China; granted as CN111079851B (status: Active)
Non-Patent Citations (1)
Title |
---|
Bilinear CNN Model for Fine-Grained Classification Based on Subcategory-Similarity Measurement;Xinghua Dai et.al;《applied sciences》;20190116;第1-16页 * |
Also Published As
Publication number | Publication date |
---|---|
CN111079851A (en) | 2020-04-28 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right |
Effective date of registration: 2021-04-14
Address after: 215000 Building 1, Wujiang Taihu new city science and Technology Innovation Park, No.18, Suzhou River Road, Wujiang District, Suzhou City, Jiangsu Province
Patentee after: Jiangsu Yiyou Huiyun Software Co.,Ltd.
Address before: 215500 Changshou City South Three Ring Road No. 99, Suzhou, Jiangsu
Patentee before: CHANGSHU INSTITUTE OF TECHNOLOGY
TR01 | Transfer of patent right |