AU2018102037A4 - A method of recognition of vehicle type based on deep learning - Google Patents

A method of recognition of vehicle type based on deep learning

Info

Publication number
AU2018102037A4
Authority
AU
Australia
Prior art keywords
layer
neural network
training
image
recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU2018102037A
Inventor
Jiahao Ge
Yu Han
Sitong Jin
Huancheng Song
Mingchen Yang
Hao Zeng
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to AU2018102037A priority Critical patent/AU2018102037A4/en
Application granted granted Critical
Publication of AU2018102037A4 publication Critical patent/AU2018102037A4/en
Ceased legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V20/584 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads of vehicle lights or traffic lights
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/24 Aligning, centring, orientation detection or correction of the image

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

This patent describes an image recognition method based on deep learning. The image-based vehicle recognition system first captures vehicle images from a camera's video stream, and then applies pattern recognition and computer vision to extract the information needed for classification and recognition. The invention preprocesses images of various vehicles and divides them into a training set and a testing set. The training set images are used to train a Deep Convolutional Neural Network, during which the parameters between the layers of the network are adjusted so that they can be used on the testing set. After training is completed, the network used for testing is initialized with the trained parameters, so that it matches the network from the training part. The testing images are then fed into the Deep Neural Network to identify and classify the vehicle images. The invention does not require human participation or adjustment; it extracts and classifies features automatically, providing a reliable, high-performance image recognition technology based on deep learning.
(Figure 5: standard neural net, and the net after applying dropout)

Description

TITLE
A method of recognition of vehicle type based on deep learning
FIELD OF THE INVENTION
The invention belongs to the technical field of digital image processing, and in particular relates to a vehicle image recognition method based on deep learning.
BACKGROUND OF THE INVENTION
With the improvement of living standards, the car, as the most important means of transport in the world, has come into wide use. The popularity of cars has brought great convenience to people's travel; however, various traffic problems have become increasingly prominent, seriously endangering travel safety. Under these circumstances, automatic identification and tracking of escaping vehicles through intelligent transportation systems is urgently needed. The travel environment has become more complicated, and various types of traffic problems have come to public attention. Cases such as hit-and-run accidents, vehicle theft, and the use of fake license plates occur from time to time. Vehicle identification technology is an important tool in maintaining the stability of intelligent transportation systems. It is a comprehensive application of design analysis, visual development, artificial intelligence, software development and other technologies. Every year, crimes such as vehicle theft and robbery seriously disturb public security and cause serious economic losses. Because vehicle-related cases are generally concealed and the investigation workload is huge, the clear-up rate of such cases is not high. Facing these problems, the Chinese government has introduced corresponding policies to relieve the situation. Hence, an effective recognition technology is indispensable.
Due to the lack of advanced technology, researchers must observe and select the useful information by themselves, which also brings inconvenience to the public. In addition, they need to extract the features manually and then use an algorithm to recognize them. For these reasons, a method that improves precision is highly necessary. One way to do this is to use the recent research area called deep learning, whose advantage is that features are extracted automatically. The method also ensures the speed of recognition. In summary, this patent uses image recognition technology to identify vehicle type, further solving problems related to statistics, security and car tracking. Compared to other methods, this approach is not only efficient but also easy to control, which makes it worth exploring in depth.
2018102037 09 Dec 2018
SUMMARY OF THE INVENTION
In order to overcome the deficiencies of modern technology, we propose a vehicle image recognition method based on deep learning, which connects a multi-layer Convolutional Neural Network and a Fully Connected Network in series so that the automatic feature extraction of deep learning can be fully exploited, effectively solving problems such as the difficulty of feature extraction and of automatic, simultaneous identification.
The technical solution of the present invention is implemented as follows:
A vehicle image recognition method based on deep learning includes:
The construction of the vehicle model image dataset,
Parameter training and optimization of deep learning network,
The design of network structure of deep learning,
The real-time identification of deep learning network.
The construction of the vehicle model image dataset.
1. Using the crawler program to get enough pictures of different vehicle categories from a specific website and classifying them as car, bus, truck or moto.
2. Pre-processing the image data, through image filtering, rotation, compression, the tomat and load steps, etc., to convert image data of different sizes into image data of the same size, with the same number of color channels, as the input of the deep learning network. The data should be divided into training and testing sets in an appropriate proportion. In this invention, the uniform size is 32x32x1 and the ratio of training to testing data is 4:1.
Parameter training and optimization of deep learning network
1. Putting data into the network in batches
The advantage of feeding the training images in batches is that each batch is easy for the computer to train on separately, avoiding problems such as the system halting when all the data is trained on at once.
2. Optimizer
This invention applies Adam as the optimizer. Adam is a first-order optimization algorithm that can replace traditional stochastic gradient descent. It updates the weights of the neural network at each iteration and passes them on to the next one. The difference between Adam and stochastic gradient descent is that the latter maintains a single learning rate to update all weights, while the former sets a different but appropriate learning rate for each parameter by calculating first and second moment estimates of the gradients. Also, Adam uses less memory, and its calculation is efficient and easy to implement.
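The per-parameter step described above can be illustrated with a minimal NumPy sketch of one Adam update. The function name `adam_step`, the toy objective, and the values of `beta1`, `beta2` and `eps` are assumptions (commonly used defaults), not values stated in this document; only the learning rate 0.001 appears in the text.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; returns the new weights and both moment estimates."""
    m = beta1 * m + (1 - beta1) * grad        # first moment: running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # second moment: running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias correction for the early steps
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)  # each parameter gets its own effective step
    return w, m, v

# Toy objective (assumed for illustration): f(w) = sum(w**2), whose gradient is 2*w.
w0 = np.array([1.0, -2.0])
w1, m1, v1 = adam_step(w0, 2 * w0, np.zeros(2), np.zeros(2), t=1)

w, m, v = w0.copy(), np.zeros(2), np.zeros(2)
for t in range(1, 501):
    w, m, v = adam_step(w, 2 * w, m, v, t)
```

On the first step both parameters move by about the learning rate, regardless of the size of their gradients, which is the behaviour that distinguishes Adam from plain stochastic gradient descent.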
The design of network structure of deep learning
Figure 2 shows the Neural Network structure of the present invention.
This framework is divided into two parts:
Convolutional Neural Network and Fully Connected Neural Network.
1. Convolutional Neural Network:
Using structure like:
[[Convolution -> Relu]*M -> MaxPooling]*N
Through the different sliding window filters preset in the Convolutional Neural Network, different sets of feature points are obtained from each source image, and after pooling they form a feature set that is the input of the next layer of the Neural Network. In the pooling process, the combination function performs maximum pooling, strengthens the general feature points, and reduces the amount of data to facilitate the calculation of feature points in the next layer. In the Convolutional layer, there are three main nodes: Convolution, Relu, and MaxPooling.
Convolution is the convolution node. In this kind of node, the convolution operation on image data is performed by a preset number of N*N convolution kernels. Each kernel is multiplied element by element with the corresponding pixels of the M*M image and the products are summed; the different N*N kernels thereby capture different features of the image, which together form the feature value set of the image data. The values in the convolution kernels are obtained dynamically during training. After a convolution operation is completed, the M*M convolution result set is taken as the input of the Relu node. The Relu node is the excitation layer, which applies a nonlinear mapping to the convolution output: negative feature values in the convolution output are set to 0 in order to enhance the feature values, and a result set of the same size is output.
With Relu as the excitation function of the excitation layer, convergence is fast and the gradient is simple, which speeds up the calculation and reduces the computational difficulty.
The MaxPooling node is used as the pooling layer in a Convolutional Neural Network. The pooling layer is sandwiched between successive Convolutional layers to compress the amount of data and parameters and reduce overfitting. In short, if the input is an image, the primary role of the pooling layer is to compress the image and reduce the dimensionality of the image data. Since the image data is still large after the convolution operation, downsampling is performed to reduce the data dimension. This is possible because, even if a lot of data is discarded, the statistical properties of the features can still describe the image, and the reduced data dimension helps to avoid over-fitting. Data compression is performed with the MaxPooling maximum downsampling function while maintaining the data characteristics. MaxPooling can compress an input N*N data set into a Y*Y data set (N>Y).
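As a hedged illustration of one Convolution -> Relu -> MaxPooling pass, the following NumPy sketch works on a single-channel 4x4 image. The helper names and the 3x3 kernel values are hypothetical; in the invention the kernel values are learned during training, and the sliding-window product-and-sum shown here is the form commonly used in deep learning frameworks.

```python
import numpy as np

def conv2d_same(image, kernel):
    """Slide a k*k kernel over a 2-D image; zero padding keeps the output size."""
    k = kernel.shape[0]
    padded = np.pad(image, k // 2)            # one ring of zeros for a 3x3 kernel
    out = np.zeros(image.shape, dtype=float)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            # multiply the window with the kernel element by element, then sum
            out[r, c] = np.sum(padded[r:r + k, c:c + k] * kernel)
    return out

def relu(x):
    """Set every negative value to 0 and keep the rest."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Keep the largest value in each non-overlapping size*size window."""
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[0, 0, 0],
                   [0, 1, 0],
                   [0, 0, -1]], dtype=float)   # hypothetical kernel, for illustration only
features = max_pool(relu(conv2d_same(image, kernel)))
```

Here the 4x4 input keeps its size through the convolution and Relu steps and is compressed to 2x2 by pooling, matching the N*N to Y*Y reduction described above.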
2. Fully connected Neural Network:
Using structure like:
[Full Connected -> Relu]*P -> Full Connected -> SoftMax
A fully connected Neural Network is a plurality of neurons connected in accordance with certain rules. Each neuron contains an activation function. The neurons are laid out in layers. The leftmost layer is called the input layer and is responsible for receiving input data; the rightmost layer is called the output layer, from which the Neural Network output data can be obtained. The layer between the input layer and the output layer is called the hidden layer, and each neuron in the Nth layer is connected to all the neurons in the (N-1)th layer. The output of the (N-1)th layer neurons is the input of the Nth layer neurons. Each connection has a weight. The image data feature set produced by the Convolutional Neural Network is compressed and used as the input layer of the fully connected Neural Network, and the final output, which is the category indicating the specific type of the vehicle, is obtained step by step through the different weights of each layer and the activation functions of the neurons. The connection weights between different neurons are obtained through back propagation during model training. By training to improve the weights, the fully connected Neural Network model gains the ability to identify data from feature values. We also added an excitation layer to the fully connected Neural Network, using the Relu excitation function to activate the output of the fully connected layer, clarify the output classification, and simplify the output. The SoftMax function is added after the Relu activation layer. The SoftMax function produces the probability distribution over the classes for each data sample. The classification output of the Relu activation layer is used as the input of the SoftMax function, and the predicted label and predicted classification probability of each sample are output by calculation, refining the data classification results.
The real-time identification of deep learning network
After model training ends, the present invention inputs the picture data for testing into the trained network for identification and classification. The accuracy of each batch of test results is printed to the console, the average accuracy and standard deviation over all test batches are output, and the confusion matrix of the test results is plotted, showing for each of the four types of vehicles how many test images were identified correctly and how many were identified as other types; the accuracy of each type is also counted.
The above recognition results reflect the level of model training well. By comparing the accuracy of each type and the probabilities of being identified as other types, these results can effectively help us understand the shortcomings in data preparation and data optimization.
The final accuracy of this experiment is 91%, and the specific results analysis will be shown in detail below.
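The confusion matrix and per-type accuracy mentioned in this section could be computed along the following lines. This is a minimal sketch: the function names are hypothetical and the eight labels are made-up toy values chosen only to show the bookkeeping, not results of the experiment.

```python
import numpy as np

CLASSES = ["car", "bus", "truck", "moto"]   # numbered 0, 1, 2, 3

def confusion_matrix(true_labels, predicted_labels, num_classes=4):
    """Rows are the true type, columns are the predicted type."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(true_labels, predicted_labels):
        matrix[t, p] += 1
    return matrix

def per_class_accuracy(matrix):
    """Correctly identified images of each type divided by all images of that type."""
    return np.diag(matrix) / matrix.sum(axis=1)

# Toy labels (assumed, for illustration only).
true_labels = [0, 0, 1, 1, 2, 2, 3, 3]
predicted = [0, 1, 1, 1, 2, 0, 3, 3]

cm = confusion_matrix(true_labels, predicted)
class_acc = per_class_accuracy(cm)
overall_acc = np.trace(cm) / cm.sum()
```

Off-diagonal entries of `cm` count the images identified as other types, which is the information used above to judge shortcomings in data preparation.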
DESCRIPTION OF THE DRAWINGS
Figure 1 Flowchart of the program
Figure 2 Neural Network structure
Figure 3 Full Connected Network structure
Figure 4 Flowchart of SoftMax
Figure 5 Flowchart of dropout operation
Figure 6 Sketch map of depth model
Figure 7 The structure diagram of training and testing
Figures 8-19 show the results of the invention
Figure 8 The running results of the best parameters
Figure 9 Select the best number of iterations
Figure 10 Select the best number of iterations- line chart
Figure 11 Average accuracy under different iterations
Figure 12 Optimum decay rate parameter
Figure 13 Optimum decay rate parameter-line chart-1350 times
Figure 14 Optimum decay rate parameter-line chart-1400 times
Figure 15 Optimum decay rate parameter-line chart-1600 times
Figure 16 Optimum decay rate
Figure 17 Optimum decay rate-line chart-1350 times
Figure 18 Optimum decay rate-line chart-1400 times
Figure 19 Optimum decay rate-line chart-1600 times
DESCRIPTION OF PREFERRED EMBODIMENT
STEP1: Using the CrawPic.py crawler program, written in Python, to crawl four types of vehicle images (car, bus, truck, moto) from http://image.baidu.com, 5000 images of each type to ensure sufficient data volume.
STEP2: Using the PreprocessingPic.py program to perform data preprocessing on the images obtained from the network: select the images suitable for the experiment and delete the ambiguous ones, the unrelated ones, and those that show only a small part of the vehicle, leaving only clear pictures which are useful for the invention.
Because the sizes and resolutions of the downloaded images differ, we need to pre-process the images to unify their size. We use the Python OpenCV library, calling cv2.resize() to convert each image to the final required size, which is uniform and suitable as the input of the Neural Network structure.
Because the source images have different sizes, the compression process may seriously distort some pictures, so special attention must be paid to the process of downloading and filtering. When the number of images is not enough, we can use cv2.getRotationMatrix2D() to rotate the images to the left or right by 5-15 degrees to get a sufficient amount of data.
STEP3: Using the tomat.py program to label the image data according to its classification, and save it to the train.mat and test.mat files according to a certain ratio, so that later programs can read the data easily.
The four data types car, bus, truck and moto are numbered sequentially as 0, 1, 2 and 3. All the image data is stored in the X array and the label data in the y array.
Using sklearn.model_selection in the Python program, all image data and the corresponding labels are divided into 4 groups of training and test data at a ratio of 4:1 and stored in train.mat and test.mat.
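A 4:1 split that yields four groups (training images, test images, training labels, test labels) can be sketched as below. This uses plain NumPy as a stand-in for sklearn.model_selection, so the function `split_4_to_1`, the shuffling seed and the placeholder arrays are all assumptions made for illustration.

```python
import numpy as np

def split_4_to_1(X, y, seed=0):
    """Shuffle the samples and split images and labels at a ratio of 4:1."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_train = len(X) * 4 // 5
    train_idx, test_idx = order[:n_train], order[n_train:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Placeholder data: 20 blank 32x32 color images, 5 per class (labels 0..3).
X = np.zeros((20, 32, 32, 3), dtype=np.uint8)
y = np.repeat(np.arange(4), 5)

X_train, X_test, y_train, y_test = split_4_to_1(X, y)
```

Shuffling before splitting keeps the four vehicle types mixed in both sets; with the 20 placeholder samples the split gives 16 training and 4 test samples.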
STEP4: Using the load.py program to read the generated mat files and perform a series of operations on the acquired data to convert it into the format we need. In the reformat method, np.transpose().astype() is used to change the shape of the original data to the format (number of pictures, picture height, picture width, number of channels), and the labels are then converted into one-hot encoding. (The label corresponding to 0 is [1, 0, 0, 0].)
In the normalize method, the data is converted to grayscale, changing from three color channels to a single channel to save memory and speed up training.
Finally, each image data is used as an input to the deep learning network in the form of 32x32x1.
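The reformat and normalize methods might look roughly like the sketch below. Several details are assumptions that the text does not state: the stored layout (height, width, channels, count), the graying rule (a plain average of the three color channels), and the scaling of pixel values to the range 0 to 1.

```python
import numpy as np

def reformat(samples, labels, num_classes=4):
    """Reorder axes to (count, height, width, channels) and one-hot encode labels."""
    # Assumed stored layout: (height, width, channels, count).
    data = np.transpose(samples, (3, 0, 1, 2)).astype(np.float32)
    one_hot = np.eye(num_classes, dtype=np.float32)[np.asarray(labels).ravel()]
    return data, one_hot

def normalize(data):
    """Collapse three color channels into one and scale to [0, 1] (assumed rule)."""
    gray = data.mean(axis=3, keepdims=True)
    return gray / 255.0

# Placeholder input: five white 32x32 color images.
samples = np.full((32, 32, 3, 5), 255, dtype=np.uint8)
labels = np.array([0, 1, 2, 3, 0])

data, one_hot = reformat(samples, labels)
inputs = normalize(data)   # shape (5, 32, 32, 1), i.e. each image is 32x32x1
```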
STEP5: Define the optimization model, using Adam as the optimizer.
Set the training batch size and the test batch size to 64 and 500 respectively. That is, each training batch contains 64 samples and each test batch contains 500 samples.
STEP6: Extract image features, using a Convolutional Neural Network to prepare the input for the fully connected Neural Network. In this invention, the input is 32x32x1, that is, a 32x32 matrix, and the final output is 8x8x32.
The feature extraction layer includes the [Convolution -> Relu]*M step, where M=2 is set.
Set parameters:
Filters=32: Each Convolution operation extracts 32 features, generating an output depth of 32 layers.
Patch size=3: Set the size of the sliding window to 3x3.
Padding=1: Add 1 layer of zeros around the outermost layer of the matrix for each Convolution operation, so that the result keeps the same size.
Activation='Relu': Set the activation function to Relu, which sets every value less than 0 in each result matrix to 0 to eliminate unnecessary influencing factors.
The pooling layer applies the MaxPooling operation to reduce the complexity of the features.
Parameters:
Patch size=2: The MaxPooling window is 2x2, so each pooling operation halves the width and height of the matrix.
Set N=2, that is, perform a total of MxN=4 Convolution operations, with a MaxPooling operation after every two Convolution operations.
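The shape arithmetic implied by these parameters can be checked with a short sketch. The helper names are hypothetical and a stride of 1 is assumed, since the text does not state it.

```python
def conv_output(height, width, filters, patch=3, padding=1, stride=1):
    """Output shape of one convolution; stride 1 is an assumption."""
    out_h = (height + 2 * padding - patch) // stride + 1
    out_w = (width + 2 * padding - patch) // stride + 1
    return out_h, out_w, filters

def pool_output(height, width, depth, patch=2):
    """A 2x2 MaxPooling window halves the width and height, depth is unchanged."""
    return height // patch, width // patch, depth

M, N = 2, 2
shape = (32, 32, 1)
trace = [shape]
for _ in range(N):
    for _ in range(M):
        shape = conv_output(shape[0], shape[1], filters=32)
        trace.append(shape)
    shape = pool_output(*shape)
    trace.append(shape)

flattened = shape[0] * shape[1] * shape[2]
```

Under these assumptions the shapes run 32x32x1, 32x32x32, 32x32x32, 16x16x32, 16x16x32, 16x16x32, 8x8x32, so the feature extractor ends at 8x8x32, which flattens to 2048 values.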
STEP7: Use fully connected layers, with Relu as the activation function, to integrate the information from the Convolutional or pooling layers. The output of the last fully connected layer is passed to the output and classified using SoftMax.
Figure 3 shows the structure of our Full Connected Network.
In the present invention, we use 3 layers: one input layer, one hidden layer and one output layer. The number of input nodes is 8x8x32=2048. The final number of output nodes is 4, corresponding to the four categories of images. The value of P is set to 1, which means that one hidden layer is used, and the number of hidden layer nodes is set to 128. The 2048 input nodes are therefore reduced to 128 nodes in the hidden layer by the weights, the biases and the activation function. These 128 nodes are used as the input of the output layer. In the output layer, the 128 nodes are reduced to 4 nodes, SoftMax is then applied to these 4 nodes, and the output of the fully connected Neural Network is obtained.
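A forward pass through the 2048 -> 128 -> 4 structure can be sketched in NumPy as follows. The randomly initialised weights and the random input features are placeholders standing in for trained parameters and real convolution output; the function name `forward` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder parameters; in the invention these are learned by training.
W1, b1 = rng.normal(0.0, 0.01, size=(2048, 128)), np.zeros(128)
W2, b2 = rng.normal(0.0, 0.01, size=(128, 4)), np.zeros(4)

def forward(features):
    """[Full Connected -> Relu]*1 -> Full Connected -> SoftMax."""
    flat = features.reshape(features.shape[0], -1)        # (batch, 8*8*32) = (batch, 2048)
    hidden = np.maximum(flat @ W1 + b1, 0.0)              # 128 hidden nodes with Relu
    scores = hidden @ W2 + b2                             # 4 output nodes
    shifted = scores - scores.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)           # SoftMax probabilities

batch = rng.random((64, 8, 8, 32))    # one training batch of 64 feature maps
probs = forward(batch)
predicted = probs.argmax(axis=1)      # index of the most probable vehicle type
```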
So far, the design of the Neural Network has been completed. The accompanying figure shows in detail the parameters, inputs and outputs required for each layer in the Neural Network structure. Figure 7 shows the path of the image data from input to output in the Neural Network. The data is displayed in the form of cuboids: the area of a cuboid represents the product of the length and width of the matrix of nodes in the data, and the thickness represents the quantity of nodes in a set of data.
STEP8: After the training of the Neural Network model is completed and the results are obtained through testing, the parameters of the network need to be changed according to the feedback from the results. The parameters that need to be changed are:
Base learning rate, dropout rate, decay rate, and iteration steps.
Finally, we can get the best results by adjusting the above parameters.
Explanation
Relu
Relu is a piecewise function: if the input is smaller than or equal to zero, the output is zero, but if the input is greater than zero, the output equals the input. Formula: f(x) = max(0, x). The formula forces some of the data to be zero. This improves the sparsity of the trained network, which is more consistent with the principle of neuron excitation. It not only reduces the interdependence of parameters but also greatly alleviates the over-fitting problem, enhancing the capability of the Neural Network.
MaxPooling
Pooling is a basic operation used in Convolutional Neural Networks. Usually, a pooling operation follows the Convolution layer. The pooling layer reduces the dimension of the features produced by the filter layer before the final classification result is formed. There are two kinds of pooling: MaxPooling and Average Pooling. MaxPooling takes the feature values within a filter window, keeps the greatest one as the retained value, and discards the rest. Average Pooling averages these values. Nowadays MaxPooling is the mainstream choice in Convolutional Neural Networks: because of its feature selection, it gives a better classification and identification rate, and it provides nonlinearity.
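The difference between the two kinds of pooling can be seen on a small example. The 4x4 matrix below is made up for illustration, and the helper `pool` is a hypothetical name.

```python
import numpy as np

def pool(x, size, reduce):
    """Apply `reduce` (np.max or np.mean) to each non-overlapping size*size window."""
    h, w = x.shape[0] // size, x.shape[1] // size
    windows = x[:h * size, :w * size].reshape(h, size, w, size)
    return reduce(windows, axis=(1, 3))

x = np.array([[1, 3, 2, 0],
              [5, 7, 4, 6],
              [9, 8, 1, 1],
              [0, 2, 3, 3]], dtype=float)

max_pooled = pool(x, 2, np.max)    # keeps the greatest value of each window
avg_pooled = pool(x, 2, np.mean)   # averages the values of each window
```

For this input, MaxPooling gives [[7, 6], [9, 3]] while Average Pooling gives [[4, 3], [4.75, 2]]: the maximum preserves the strongest response in each window, whereas the average blends it with weaker ones.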
Full Connected Neural Network
A Full Connected Neural Network is a plurality of neurons connected according to certain rules. Each neuron contains an activation function, and the activation function is often chosen as a sigmoid or tanh function.
In a Full Connected Neural Network, neurons are laid out in layers. The leftmost layer is called the input layer and is responsible for receiving input data; the rightmost layer is called the output layer and is used to obtain the Neural Network output data. The layer between the input layer and the output layer is called a hidden layer, because it is invisible from the outside. In a Full Connected Neural Network, there is no connection between neurons in the same layer. Each neuron in the Nth layer is connected to all neurons in the (N-1)th layer. The output of the (N-1)th layer neurons is the input of the Nth layer neurons. The connections between neurons in different layers have different weights. Each neuron computes a function of the form f(x)=ax+b, where a represents the weight and b represents the offset. To calculate the output of the Neural Network from the input, the value of each element x of the input vector X is first assigned to the corresponding neuron of the input layer, and then the value of each neuron in each layer is calculated in sequence according to the activation function, until the values of all neurons in the output layer have been calculated. Finally, the values of the neurons in the output layer are strung together to obtain the target output vector.
SoftMax
SoftMax is the normalized exponential function, an extension of the logistic function. SoftMax is widely applied in machine learning and deep learning, especially for multi-class classification problems. The final output unit of the classifier needs the SoftMax function to process its values. The SoftMax function is in effect the normalization of a finite discrete probability distribution. It is therefore popular in probability-based multi-class classification methods such as multinomial logistic regression, multinomial linear discriminant analysis, naive Bayes classifiers, and artificial Neural Networks. In Figure 4, the part to the left of the equality sign shows how the Full Connected layer works. W is the parameter of the Full Connected layer, which we call the weight. X is the input of the Full Connected layer, which we call the feature. X is a feature vector of size N*1. This feature results from the processing by the multiple Convolutional and pooling layers in front of the Full Connected layer. Suppose that a Convolutional layer precedes the Full Connected layer and that its output has 32 channels, meaning 32 features, each of size 8*8. W is a matrix of size T*N. This N corresponds to the N of X, and T indicates the number of classes. For example, if there are 5 classes, then T is 5. Training the network therefore amounts to finding the most suitable W matrix for the Full Connected layer. The Full Connected layer performs W*X to get a vector of size T*1. As for the definition of the SoftMax function: the input of SoftMax is a vector of size T*1, and the output is also a vector of size T*1. Each value of the output vector represents the probability of the sample belonging to the corresponding class, and each value lies between 0 and 1.
The input of SoftMax is W*X. Assume that the input sample of the model is I and consider a 4-class classification problem (Category 1, 2, 3, and so on). If the true category of sample I is 3, then passing the sample through the network layers yields W*X before it reaches the SoftMax layer. That is to say, W*X is a vector of size 4*1, and a_j in the formula denotes the j-th value in this vector, so j ranges from 1 to T. Because a_k in the denominator runs over the 4 values of the 4*1 vector, there is a summation sign (the summation is over k from 1 to T, where T is the same T as in the figure above). Because e^x is always greater than 0, the numerator is always positive and the denominator is a sum of positive numbers, so each output value is positive and lies in the range (0, 1). When the model is being tested rather than trained, a sample passes through the SoftMax layer, which outputs a vector of size T*1, and the index of the largest number in this vector is taken as the predicted label of the sample.
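The SoftMax computation described above, e^(a_j) divided by the sum of e^(a_k) over all k, can be sketched with the Python standard library. The four scores are made-up values standing in for the W*X vector with T = 4, and subtracting the largest score before exponentiating is a common numerical safeguard that does not change the result.

```python
import math

def softmax(a):
    """Return e^(a_j) / sum_k e^(a_k) for every entry of the score vector a."""
    shift = max(a)                              # safeguard against overflow
    exps = [math.exp(value - shift) for value in a]
    total = sum(exps)
    return [e / total for e in exps]

def predict(a):
    """At test time the index of the largest probability is the predicted label."""
    probs = softmax(a)
    return probs.index(max(probs))

scores = [1.0, 2.0, 0.5, 3.0]    # hypothetical W*X output for a 4-class problem
probs = softmax(scores)
label = predict(scores)          # index 3 here, since 3.0 is the largest score
```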
Back Propagation
Iterative parameter training is performed for each batch of training data. To adjust the weights during training, the Back Propagation algorithm is used. The back propagation algorithm first calculates the error term δ of each node, based on the output value y of the node and the target value Y. After the error term is obtained, the connection weights of the node are updated: the new weight is calculated from the error term of the node, the learning rate, and the current weight. We iterate this calculation backwards through the network, from layer N to layer N-1.
$\delta_j = y_j (1 - y_j) \sum_{k \in \text{outputs}} w_{kj} \delta_k$ (2)
$w_{ji} \leftarrow w_{ji} + \eta \delta_j x_{ji}$ (3)
$w_{ji}$: The weight of the edge
$x_{ji}$: Input passed to j by i node
$\delta$: The error between the actual value of the node and the expected value
$y$: The output value of the node
$\eta$: Learning rate
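Equations (2) and (3) can be traced numerically with a small sketch. The function names and all numeric values are hypothetical; note that the factor y(1 - y) in equation (2) is the derivative of a sigmoid activation, so the sketch assumes a node of that kind.

```python
def hidden_delta(y_j, downstream_weights, downstream_deltas):
    """Equation (2): error term of node j from the error terms of the nodes it feeds."""
    back = sum(w_kj * d_k for w_kj, d_k in zip(downstream_weights, downstream_deltas))
    return y_j * (1.0 - y_j) * back

def update_weight(w_ji, eta, delta_j, x_ji):
    """Equation (3): new weight from the old weight, learning rate, error term and input."""
    return w_ji + eta * delta_j * x_ji

# Hypothetical node j with output 0.5 feeding two downstream nodes.
delta_j = hidden_delta(0.5, downstream_weights=[0.2, -0.4], downstream_deltas=[0.1, 0.3])
new_w = update_weight(w_ji=0.8, eta=0.001, delta_j=delta_j, x_ji=1.5)
```

With these numbers the weighted sum of downstream errors is 0.02 - 0.12 = -0.10, so the error term is 0.25 * (-0.10) = -0.025 and the weight moves from 0.8 to 0.7999625.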
Regularization:
Over-fitting is common. The main reason is that the amount of data is not enough to support a model of high complexity. Therefore, the most straightforward solution to the over-fitting problem is to increase the amount of training data. In our experiment, the DNN is regularized by dropout.
Dropout rate:
Figure 5 represents the flowchart of the dropout operation.
In each training batch, over-fitting is reduced by randomly omitting a certain proportion (dropout_rate) of the hidden layer nodes, and the importance of the remaining hidden layer nodes is highlighted in order to increase accuracy. Dropout can be regarded as a form of model averaging, in which the estimates from different models are averaged with certain weights. In each training batch, hidden nodes are ignored at random, so the network trained in each batch is different and can be treated as a new model. Hidden nodes appear randomly with a certain probability, so there is no guarantee that any 2 hidden nodes appear at the same time; the update of the weights therefore no longer depends on the joint effect of hidden nodes with fixed relationships.
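A random dropout mask over the hidden layer can be sketched as follows. The parameter is written here as a keep probability, and the surviving activations are rescaled by it (so-called inverted dropout); both choices are assumptions, since the text does not say how dropout_rate is applied or whether rescaling is used.

```python
import numpy as np

def dropout(activations, keep_prob, rng):
    """Zero each node with probability 1 - keep_prob and rescale the survivors."""
    mask = rng.random(activations.shape) < keep_prob   # a fresh random mask per batch
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
hidden = np.ones((64, 128))             # one batch of 64 samples, 128 hidden nodes
dropped = dropout(hidden, keep_prob=0.9, rng=rng)
kept_fraction = np.count_nonzero(dropped) / dropped.size
```

Because a new mask is drawn for every batch, each batch effectively trains a different thinned network, which is the model-averaging view described above.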
Base learning rate:
The learning rate (step size) controls the learning progress of the model. During training, a dynamically changing learning rate is generally set according to the number of training rounds. At the beginning of training, a learning rate of 0.01-0.001 is preferable. After a certain number of rounds, the rate is gradually reduced. Near the end of training, the learning rate should have been reduced by a factor of more than 100. The initial learning rate chosen for this experiment is 0.001.
Decay rate:
Based on the characteristics of the different stages of the optimization process, a common approach is to use a large learning rate in the early stage to accelerate convergence and a small learning rate later to ensure stability. In this experiment, the step size and the attenuation coefficient are empirical values. Learning rate decay mechanism: the learning rate is reduced as the number of rounds grows, to 0.7 times its previous value every 500 rounds.
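The decay mechanism can be written as a small schedule function. The sketch below assumes a staircase exponential decay, in which the rate drops in discrete steps every 500 rounds. Whether the experiment used a staircase or a continuous decay is not stated, so both are offered through the hypothetical `staircase` flag.

```python
def decayed_learning_rate(base_lr, decay_rate, step, decay_steps, staircase=True):
    """Exponential decay: base_lr * decay_rate ** (step / decay_steps).

    With staircase=True the exponent is rounded down to an integer, so
    the rate changes only once every decay_steps rounds.
    """
    exponent = step // decay_steps if staircase else step / decay_steps
    return base_lr * decay_rate ** exponent


# Values quoted in the text: base rate 0.001, factor 0.7, every 500 rounds.
schedule = [decayed_learning_rate(0.001, 0.7, s, 500) for s in (0, 499, 500, 1000, 1400)]
```

Under these assumptions the rate stays at 0.001 for the first 500 rounds, then falls to 0.0007, and reaches 0.00049 from round 1000 onward.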
Iteration steps:
If the number of iterations is too small, the network does not learn enough. If the number of iterations is too large (over-learning), the learning result is also unsatisfactory.
RESULT ANALYSIS
The initial parameters of this experiment are preset to a learning rate of 0.0001, a decay rate of 0.9, a dropout rate of 0.9, and 2000 iterations.
The best parameters found in this experiment are a learning rate of 0.001, a decay rate of 0.7, a dropout rate of 0.9999, and 1350 or 1400 iterations.
In Fig. 9 and Fig. 10, we use the best parameters of this experiment to determine the optimal number of iterations. The table and the line graph show that the results are better at 1700 and 1400 iterations. Because the amount of data is small, the optimal number of iterations cannot be determined accurately.
To study the optimal number of iterations more carefully, we run another experiment. Since the test set already reaches 100% accuracy at 1000 iterations, and in order to avoid undesirable phenomena such as over-fitting while still ensuring good recognition accuracy, we select 1300, 1350, 1400, 1450 and so on up to 1800 iterations as the test points of this experiment, in steps of 50 iterations. Because the accuracy of machine learning varies from run to run, we measure 15 sets of data for each group and calculate the average accuracy of these 15 sets. As shown in Fig. 11, the accuracy is quite high at 1350, 1400 and 1600 iterations, so in the next few experiments we use these three iteration counts as test points.
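The evaluation protocol of this experiment (candidate iteration counts from 1300 to 1800 in steps of 50, with 15 runs averaged per candidate) can be sketched as follows. The function `train_and_evaluate` is a hypothetical stand-in that returns a random number instead of training a network, so the values it produces are placeholders and not the results of the experiment.

```python
import random
from statistics import mean


def train_and_evaluate(iterations, rng):
    """Hypothetical stand-in for one training run.

    A real implementation would train the network for `iterations` steps
    and return its accuracy on the test set. Here a random placeholder
    value is returned so that the sketch runs on its own.
    """
    return rng.uniform(0.9, 1.0)


def average_accuracy(candidates, trials, rng):
    """Average the accuracy of `trials` independent runs per candidate."""
    return {n: mean(train_and_evaluate(n, rng) for _ in range(trials)) for n in candidates}


candidates = list(range(1300, 1801, 50))   # 1300, 1350, ..., 1800
results = average_accuracy(candidates, trials=15, rng=random.Random(0))
best = max(results, key=results.get)       # candidate with the highest mean
```

Averaging over repeated runs reduces the influence of random initialization and random dropout masks on the comparison between iteration counts.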
As shown in Fig. 12, 13, 14 and 15, in the fourth experiment we change the learning rate first, because the learning rate in the previous experiment was too low. As can be seen from the table and the three line charts, the accuracy is highest when the learning rate is 0.001, so we use 0.001 as the best learning rate.
As shown in Fig. 16, 17, 18 and 19, for all three iteration counts the model has the best accuracy when the decay rate is 0.7.
Finally, we need to determine the optimal dropout rate. The higher the dropout rate, the more nodes are removed and the more random our data is. Therefore, starting from the initial dropout rate of 0.9, we choose 0.9999 as the best dropout rate. With these parameters, the final results are also closer to ideal.
In conclusion, from the above sets of comparative experiments, we finally determine a learning rate of 0.001, a decay rate of 0.7, a dropout rate of 0.9999, and 1350 or 1400 iterations as the best parameters.
EDITORIAL NOTE
There is one page in the claims only.

Claims (2)

1. A method of recognition of vehicle type based on deep learning, in which using structure Input -> [[CONV -> Relu] * M -> MaxPooling] * N -> [FC -> Relu] * P -> FC -> SoftMax as the Convolutional Neural Network construction mode and Full Connected Neural Network construction mode, M, N and P depends on the situation.
2. A method of recognition of vehicle type based on deep learning mentioned as claim 1, in which selection of model parameters: M=4 in the convolution, filters=32, size/strides = 2x2, Drop rate = 0.99, Base learning rate = 0.001, Decay rate = 0.7, iteration steps = 1600, full Connected Neural Network: the node of hidden layer is node=128, the above parameters make the training results more accurate so that good model can be obtained.
AU2018102037A 2018-12-09 2018-12-09 A method of recognition of vehicle type based on deep learning Ceased AU2018102037A4 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2018102037A AU2018102037A4 (en) 2018-12-09 2018-12-09 A method of recognition of vehicle type based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU2018102037A AU2018102037A4 (en) 2018-12-09 2018-12-09 A method of recognition of vehicle type based on deep learning

Publications (1)

Publication Number Publication Date
AU2018102037A4 true AU2018102037A4 (en) 2019-01-17

Family

ID=65009427

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2018102037A Ceased AU2018102037A4 (en) 2018-12-09 2018-12-09 A method of recognition of vehicle type based on deep learning

Country Status (1)

Country Link
AU (1) AU2018102037A4 (en)

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109978847B (en) * 2019-03-19 2023-07-04 东南大学 Automatic inhaul cable sleeve disease identification method based on transfer learning and inhaul cable robot
CN109978847A (en) * 2019-03-19 2019-07-05 东南大学 Drag-line housing disease automatic identifying method based on transfer learning Yu drag-line robot
CN110008360A (en) * 2019-04-09 2019-07-12 河北工业大学 Vehicle target image data base method for building up comprising specific background image
CN110852358A (en) * 2019-10-29 2020-02-28 中国科学院上海微系统与信息技术研究所 Vehicle type distinguishing method based on deep learning
CN111160100A (en) * 2019-11-29 2020-05-15 南京航空航天大学 Lightweight depth model aerial photography vehicle detection method based on sample generation
CN111027445A (en) * 2019-12-04 2020-04-17 安徽工程大学 Target identification method for marine ship
CN111582213A (en) * 2020-05-15 2020-08-25 北京铁科时代科技有限公司 Automobile identification method based on Centernet
CN111767860A (en) * 2020-06-30 2020-10-13 阳光学院 Method and terminal for realizing image recognition through convolutional neural network
CN111895931A (en) * 2020-07-17 2020-11-06 嘉兴泊令科技有限公司 Coal mine operation area calibration method based on computer vision
CN111895931B (en) * 2020-07-17 2021-11-26 嘉兴泊令科技有限公司 Coal mine operation area calibration method based on computer vision
CN112101117A (en) * 2020-08-18 2020-12-18 长安大学 Expressway congestion identification model construction method and device and identification method
CN112329569A (en) * 2020-10-27 2021-02-05 武汉理工大学 Freight vehicle state real-time identification method based on image deep learning system
CN112329569B (en) * 2020-10-27 2024-02-09 武汉理工大学 Freight vehicle state real-time identification method based on image deep learning system
CN113205107A (en) * 2020-11-02 2021-08-03 哈尔滨理工大学 Vehicle type recognition method based on improved high-efficiency network
CN112508036A (en) * 2020-11-16 2021-03-16 杭州电子科技大学 Handwritten digit recognition method based on convolutional neural network and codes
CN113128578B (en) * 2021-04-08 2022-07-19 青岛农业大学 Screening method for good peanut seeds
CN113128578A (en) * 2021-04-08 2021-07-16 青岛农业大学 Peanut excellent seed screening system and screening method thereof
CN113065653A (en) * 2021-04-27 2021-07-02 北京工业大学 Design method of lightweight convolutional neural network for mobile terminal image classification
CN113065653B (en) * 2021-04-27 2024-05-28 北京工业大学 Design method of lightweight convolutional neural network for mobile terminal image classification
CN113297936A (en) * 2021-05-17 2021-08-24 北京工业大学 Volleyball group behavior identification method based on local graph convolution network
CN113297936B (en) * 2021-05-17 2024-05-28 北京工业大学 Volleyball group behavior identification method based on local graph convolution network
CN114627342A (en) * 2022-03-03 2022-06-14 北京百度网讯科技有限公司 Training method, device and equipment of image recognition model based on sparsity
CN114627342B (en) * 2022-03-03 2024-09-06 北京百度网讯科技有限公司 Sparsity-based image recognition model training method, device and equipment
CN115050092A (en) * 2022-05-20 2022-09-13 宁波明家智能科技有限公司 Lip reading algorithm and system for intelligent driving
CN115731436A (en) * 2022-09-21 2023-03-03 东南大学 Highway vehicle image retrieval method based on deep learning fusion model
CN115731436B (en) * 2022-09-21 2023-09-26 东南大学 Highway vehicle image retrieval method based on deep learning fusion model
CN115953486A (en) * 2022-12-30 2023-04-11 国网电力空间技术有限公司 Automatic coding method for direct-current T-shaped tangent tower component inspection image
CN115953486B (en) * 2022-12-30 2024-04-12 国网电力空间技术有限公司 Automatic encoding method for inspection image of direct-current T-shaped tangent tower part

Similar Documents

Publication Publication Date Title
AU2018102037A4 (en) A method of recognition of vehicle type based on deep learning
CN110135267B (en) Large-scene SAR image fine target detection method
CN108052911B (en) Deep learning-based multi-mode remote sensing image high-level feature fusion classification method
CN110414377B (en) Remote sensing image scene classification method based on scale attention network
CN106228185B (en) A kind of general image classifying and identifying system neural network based and method
CN112270347A (en) Medical waste classification detection method based on improved SSD
CN110309856A (en) Image classification method, the training method of neural network and device
CN110097145A (en) One kind being based on CNN and the pyramidal traffic contraband recognition methods of feature
CN107480261A (en) One kind is based on deep learning fine granularity facial image method for quickly retrieving
CN108427921A (en) A kind of face identification method based on convolutional neural networks
EP4163831A1 (en) Neural network distillation method and device
CN108416318A (en) Diameter radar image target depth method of model identification based on data enhancing
CN109344891A (en) A kind of high-spectrum remote sensing data classification method based on deep neural network
CN112529146B (en) Neural network model training method and device
CN112084890B (en) Method for identifying traffic signal sign in multiple scales based on GMM and CQFL
CN111950583B (en) Multi-scale traffic signal sign recognition method based on GMM (Gaussian mixture model) clustering
CN111524140B (en) Medical image semantic segmentation method based on CNN and random forest method
CN113298032A (en) Unmanned aerial vehicle visual angle image vehicle target detection method based on deep learning
CN113033321A (en) Training method of target pedestrian attribute identification model and pedestrian attribute identification method
CN116468740A (en) Image semantic segmentation model and segmentation method
CN114266757A (en) Diabetic retinopathy classification method based on multi-scale fusion attention mechanism
CN114648667A (en) Bird image fine-granularity identification method based on lightweight bilinear CNN model
CN114170519A (en) High-resolution remote sensing road extraction method based on deep learning and multidimensional attention
CN114065831A (en) Hyperspectral image classification method based on multi-scale random depth residual error network
CN112633169A (en) Pedestrian recognition algorithm based on improved LeNet-5 network

Legal Events

Date Code Title Description
FGI Letters patent sealed or granted (innovation patent)
MK22 Patent ceased section 143a(d), or expired - non payment of renewal fee or expiry