WO2018157862A1

WO2018157862A1 - Vehicle type recognition method and device, storage medium and electronic device

Info

Publication number: WO2018157862A1
Application number: PCT/CN2018/077898
Authority: WO
Inventors: 郑克松; 张力; 徐浩; 申玉
Original assignee: 腾讯科技（深圳）有限公司
Priority date: 2017-03-02
Filing date: 2018-03-02
Publication date: 2018-09-07
Also published as: CN108304754A

Abstract

Disclosed are a vehicle type recognition method and device, a storage medium and an electronic device. The method comprises: acquiring a request to recognise a vehicle type of a vehicle in a target image; using a pre-set neural network model to recognise the vehicle in the target image as a target vehicle type, wherein the pre-set neural network model is obtained by training a deep convolutional neural network model for performing vehicle type recognition using a training set, the training set is an image set obtained by performing first pre-processing on images of a plurality of vehicle types, the first pre-processing is used for enhancing the robustness of the pre-set neural network model, and the plurality of vehicle types comprising the target vehicle type; and in response to the request, returning a recognition result comprising the target vehicle type. The present application solves the technical problem in the prior art of low accuracy when performing vehicle type recognition.

Description

Vehicle identification method and device, storage medium, electronic device

The present application claims priority to the Chinese Patent Application, the entire disclosure of which is hereby incorporated by reference.

Technical field

The present application relates to the field of monitoring, and in particular to a method and device for identifying a vehicle type, a storage medium, and an electronic device.

Background art

In intelligent video surveillance, vehicle identification technology is an important pre-processing step for public security image detection and traffic state analysis. With the development of information technology, vehicle identification technology has advanced considerably, and the current technology differs markedly from the traditional one. In the traditional sense, vehicle identification can only distinguish the approximate class of a vehicle, such as small, medium, or large vehicles. In the current sense, vehicle identification technology classifies the vehicle model features extracted from an image of the vehicle face region to determine the brand and model to which the vehicle belongs. With advances in computer technology, vehicle identification based on vehicle face features has gradually become practical: it can identify not only the vehicle brand but also the series and model year within that brand, which greatly expands the application of the technology in related fields.

Current vehicle identification algorithms mainly fall into two categories: template-matching-based methods and statistical-pattern-recognition-based methods. Both place high demands on the image (such as illumination, angle, sharpness, and occlusion), have a low recognition rate, and lack robustness.

In view of the low accuracy of vehicle identification in the related art, an effective solution has not yet been proposed.

Summary of the invention

The embodiment of the present application provides a method and device for identifying a vehicle type, a storage medium, and an electronic device, so as to at least solve the technical problem of low accuracy of vehicle type recognition in the related art.

According to an aspect of the embodiments of the present application, a method for identifying a vehicle type is provided, including: obtaining a request for vehicle type recognition of a vehicle in a target picture; using a preset neural network model to identify the vehicle in the target picture as a target vehicle type, wherein the preset neural network model is obtained by training a deep convolutional neural network model for performing vehicle type recognition using a training set, the training set is a picture set obtained by performing first pre-processing on pictures of a plurality of vehicle types, the first pre-processing is used to enhance the robustness of the preset neural network model, and the plurality of vehicle types include the target vehicle type; and, in response to the request, returning a recognition result including the target vehicle type.

According to another aspect of the embodiments of the present application, there is also provided an identification device for a vehicle type, comprising: an acquisition unit configured to acquire a request for vehicle type recognition of a vehicle in a target picture; an identification unit configured to use a preset neural network model to identify the vehicle in the target picture as a target vehicle type, wherein the preset neural network model is obtained by training a deep convolutional neural network model for vehicle type recognition using a training set, the training set is a picture set obtained by performing first pre-processing on pictures of a plurality of vehicle types, the first pre-processing is used to enhance the robustness of the preset neural network model, and the plurality of vehicle types include the target vehicle type; and a response unit configured to return, in response to the request, a recognition result including the target vehicle type.

According to another aspect of an embodiment of the present application, there is also provided a storage medium comprising a stored program, wherein the program is configured to execute any of the methods described above at runtime.

According to another aspect of an embodiment of the present application, there is also provided an electronic device comprising a memory, a processor, and a computer program stored on the memory and operable on the processor, the processor being configured to execute any of the above methods by means of the computer program.

In the embodiment of the present application, the deep convolutional neural network model for performing vehicle type recognition is trained using the training set to obtain a preset neural network model, and when a request for vehicle type recognition of the vehicle in the target picture is obtained, identification is performed by the preset neural network model. Because the training set is a set of pictures obtained by performing first pre-processing on pictures of a plurality of vehicle types, and the first pre-processing enhances the robustness of the preset neural network model, the influence of the environment and the shooting angle on vehicle identification can be eliminated. This solves the technical problem of low accuracy of vehicle identification in the related art and achieves the technical effect of improving the accuracy of vehicle identification.

Brief description of the drawings

The drawings described herein are intended to provide a further understanding of the present application and form a part of this application. In the drawings:

FIG. 1 is a schematic diagram of a hardware environment of a method of identifying a vehicle type according to an embodiment of the present application;

FIG. 2 is a flow chart of an alternative vehicle type identification method in accordance with an embodiment of the present application;

FIG. 3 is a flow chart of an alternative vehicle type identification method in accordance with an embodiment of the present application;

FIG. 4 is a flow chart of an alternative vehicle type identification method in accordance with an embodiment of the present application;

FIG. 5 is a flow chart of an alternative vehicle type identification method in accordance with an embodiment of the present application;

FIG. 6 is a flow chart of an alternative vehicle type identification method in accordance with an embodiment of the present application;

FIG. 7 is a schematic diagram of an optional vehicle type identification device according to an embodiment of the present application;

FIG. 8 is a schematic diagram of an optional vehicle type identification device according to an embodiment of the present application;

FIG. 9 is a structural block diagram of a terminal according to an embodiment of the present application.

Detailed description

The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present application. The described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without departing from the inventive scope shall fall within the scope of the application.

It should be noted that the terms "first", "second" and the like in the specification and claims of the present application and the above-mentioned drawings are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence. It is to be understood that the data so used may be interchanged where appropriate, so that the embodiments of the present application described herein can be implemented in a sequence other than those illustrated or described herein. In addition, the terms "comprises" and "has", and any variants thereof, are intended to cover a non-exclusive inclusion; for example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to such processes, methods, products, or devices.

First, some of the nouns or terms that appear in the process of describing the embodiments of the present application are applicable to the following explanations:

Robustness: the term is a transliteration of the English word "robust", which means sturdy and strong. It is the key to a system's survival in abnormal and dangerous situations. For example, whether computer software avoids hanging or crashing under input errors, disk failures, network overload, or intentional attacks is a matter of the software's robustness. "Robustness" refers to the property that a control system maintains certain performance under perturbation of certain parameters (structure, size). According to different definitions of performance, it can be divided into stability robustness and performance robustness. A fixed controller designed with the robustness of a closed-loop system as its target is called a robust controller.

Artificial neural network: Artificial Neural Networks (ANNs), also referred to as neural networks (NNs) or connection models, are algorithmic mathematical models that mimic the behavioral characteristics of animal neural networks and perform distributed parallel information processing. Such a network relies on the complexity of the system and processes information by adjusting the interconnections among a large number of internal nodes.

According to an embodiment of the present application, a method embodiment of a method for identifying a vehicle type is provided.

Optionally, in the present embodiment, the above method for identifying a vehicle type can be applied to a hardware environment constituted by the server 102 and the terminal 104 as shown in FIG. 1. As shown in FIG. 1, the server 102 is connected to the terminal 104 through a network, and the hardware environment may further include a database 106 for providing a data storage service for the server. The network includes but is not limited to a wide area network, a metropolitan area network, or a local area network, and the terminal 104 is not limited to a PC, a mobile phone, a tablet, or the like. The method for identifying a vehicle type of the embodiment of the present application may be performed by the server 102 (specifically, the following steps S202 to S206 may be performed), by the terminal 104, or jointly by the server 102 and the terminal 104. When the terminal 104 performs the method, it may also be performed by a client installed on the terminal.

FIG. 2 is a flow chart of an alternative vehicle type identification method according to an embodiment of the present application. As shown in FIG. 2, the method may include the following steps:

Step S202, acquiring a request for vehicle type recognition of the vehicle in the target picture, for example, the request triggered by clicking "query model" as shown in FIG. 1;

Step S204, using a preset neural network model to identify the vehicle in the target picture as the target vehicle type, wherein the preset neural network model is obtained by training a deep convolutional neural network model for vehicle type recognition using a training set, the training set is a picture set obtained by performing first pre-processing on pictures of a plurality of vehicle types, the first pre-processing is used to enhance the robustness of the preset neural network model, and the plurality of vehicle types include the target vehicle type;

Step S206, in response to the request, returning the recognition result including the target vehicle type. As shown in FIG. 1, the recognition result may be displayed on the user terminal, for example "recognition result: XX brand XX model".

Through the above steps S202 to S206, the deep convolutional neural network model for performing vehicle type recognition is trained using the training set to obtain a preset neural network model, and when a request for vehicle type recognition of the vehicle in the target picture is obtained, identification can be performed by the preset neural network model. Because the training set is a set of pictures obtained by performing the first pre-processing on pictures of a plurality of vehicle types, and the first pre-processing enhances the robustness of the preset neural network model, the influence of the environment and the shooting angle on vehicle identification can be eliminated. This solves the technical problem of low accuracy of vehicle identification in the related art and achieves the technical effect of improving the accuracy of vehicle identification.

The target picture is a picture of a car taken by the user at will. The pictures of the plurality of vehicle types are pictures of known vehicle types collected in advance; to improve the efficiency and accuracy of recognition, multiple pictures can be taken of each vehicle type from different angles, and the training set is then obtained from them through pre-processing.

Optionally, the robustness described above includes performance robustness and stable robustness.

The above neural network model refers to a deep convolutional neural network model including a plurality of hidden layers (ie, convolutional layers) and feature extraction layers.

It should be noted that in an ordinary deep neural network (DNN), each lower-layer neuron can connect to all upper-layer neurons. The potential problem is an explosion in the number of parameters, which makes the network prone both to overfitting during training and to falling into local optima. Images contain intrinsic local patterns (such as contours and boundaries) that can be exploited, so concepts from image processing can be combined with neural network techniques; the deep convolutional neural network model of the present application achieves this purpose.

In the deep convolutional neural network model, upper-layer and lower-layer neurons are not all directly connected; instead, they are connected through "convolution kernels" (i.e., convolutional layers) as intermediaries. The same convolution kernel is shared within every image, and the image retains its original positional relationships after the convolution operation. Precisely because the deep convolutional neural network model limits the number of parameters and exploits local structure, it reduces the parameter count and improves the robustness of the network.

For example, suppose a color image with four ARGB channels (transparency plus red, green, and blue, corresponding to four images of the same size) is to be recognized. Assume a convolution kernel size of 2*2 and a total of 10 convolution kernels (w1 to w10, each used to learn a different structural feature). Convolving the ARGB image with w1 yields the first image of the hidden layer; the top-left pixel of this hidden-layer image is the weighted sum of the pixels in the 2*2 regions at the top-left corners of the four input images, and the remaining pixels of the image for this kernel are obtained in the same way. Taking the other convolution kernels into account likewise, the hidden layer contains 10 "images", each responding to a different feature of the original image, and the computation continues layer by layer with the same structure. In addition, operations such as max-pooling in the deep convolutional neural network model further improve robustness.
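The ARGB example can be made concrete with a small numerical sketch. This is only an illustration under stated assumptions: the 6*6 image size, the random pixel values and weights, and the helper name `conv2d_valid` are hypothetical and do not come from the application; no padding, stride, bias, or activation is applied.

```python
import numpy as np

def conv2d_valid(image, kernels):
    """Slide every kernel over the image without padding.

    image:   array of shape (C, H, W), here C = 4 for A, R, G, B
    kernels: array of shape (K, C, kh, kw), one kernel per output "image"
    returns: array of shape (K, H - kh + 1, W - kw + 1)
    """
    c, h, w = image.shape
    k, kc, kh, kw = kernels.shape
    assert c == kc, "kernel depth must match the number of input channels"
    out = np.zeros((k, h - kh + 1, w - kw + 1))
    for i in range(h - kh + 1):
        for j in range(w - kw + 1):
            patch = image[:, i:i + kh, j:j + kw]  # the kh*kw region in all channels
            # Weighted sum over channels and kernel positions, once per kernel.
            out[:, i, j] = np.tensordot(kernels, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out

rng = np.random.default_rng(0)
argb = rng.random((4, 6, 6))            # four same-sized channel images (assumed 6*6)
w = rng.standard_normal((10, 4, 2, 2))  # w1..w10, each 2*2 across the four channels
hidden = conv2d_valid(argb, w)
print(hidden.shape)  # (10, 5, 5): ten hidden-layer "images"
```

Here `hidden[0, 0, 0]` equals the weighted sum of the top-left 2*2 regions of the four input channels under w1, and the whole layer uses only 10 * 4 * 2 * 2 = 160 shared weights regardless of the image size.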

The Deep Convolutional Neural Network (DCNN) includes convolutional layers and pooling layers (i.e., feature classification layers); a plurality of convolution-pooling units constitute the feature representation, which can be applied to two-dimensional image recognition. The DCNN can be thought of as a DNN with a two-dimensional discrete convolution operation.

The structure of the DCNN includes a plurality of feature extraction layers (i.e., convolutional layers) and feature mapping layers (i.e., feature classification layers). In a convolutional layer, the input of each neuron is connected to a local receptive field of the previous layer and extracts the local feature of that field; once a local feature is extracted, its positional relationship to other features is also determined. Each computing layer of the network is composed of multiple feature maps; each feature map is a plane, and all neurons on the plane have equal weights. The feature mapping structure can use a sigmoid function with a small influence-function kernel as the activation function of the convolutional network, so that the feature maps are shift-invariant. In addition, because the neurons on one mapping plane share weights, the number of free parameters of the network is reduced. Each convolutional layer in the convolutional neural network can be followed by a computing layer that performs local averaging and secondary extraction; this characteristic two-stage feature extraction structure reduces the feature resolution.

For vehicle type identification, a template-based matching method can be used. Its disadvantages are that templates are difficult to establish and that the vehicle type cannot be recognized correctly when the vehicle in the image is rotated or changes scale, or even when a small area is occluded. A method based on statistical pattern recognition can also be used; it requires that the probability distribution of each category be known and that the number of categories for decision classification be fixed, and it is not robust to factors such as illumination and occlusion, that is, it is easily affected by them.

Compared with the above vehicle identification methods, the deep convolutional network model of the present application performs better at image recognition. The deep convolutional neural network can solve the problems encountered by traditional vehicle identification algorithms, namely high requirements on the image (such as illumination, angle, sharpness, and occlusion), a low recognition rate, and poor robustness; it avoids the influence of factors such as illumination and occlusion on vehicle identification and improves the robustness of recognition.

In the above embodiment, before acquiring the request for vehicle type recognition of the vehicle in the target picture, the training may be performed as follows to obtain a preset neural network model: performing a first pre-processing on the pictures of the plurality of vehicle models to obtain a training set. The training set is used to train the deep convolutional neural network model for vehicle type recognition, and a preset neural network model is obtained.

Optionally, performing the first pre-processing on the pictures of the plurality of vehicle types includes processing each of the pictures as follows, with the picture being processed regarded as the current picture: performing a first processing operation and a second processing operation on the current picture to obtain a first picture, wherein the first picture is a picture in the training set, the first processing operation includes random rotation and random cropping, which are used to eliminate the influence of the image acquisition angle on vehicle identification, and the second processing operation is used to eliminate the influence of the image acquisition environment on vehicle identification.

Optionally, performing the first processing operation and the second processing operation on the current picture includes: performing at least one of the following on the current picture to obtain a second picture: size adjustment, random rotation, random cropping, Gaussian smoothing, brightness adjustment, and saturation adjustment (i.e., the first processing operation); performing histogram equalization on the second picture to obtain a third picture; and performing whitening on the third picture to obtain the first picture (histogram equalization and whitening being the second processing operation).
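The second processing operation (histogram equalization followed by whitening) can be sketched as below. This is a minimal illustration, not the application's implementation: it assumes an 8-bit single-channel picture, and it interprets "whitening" as per-image standardization (zero mean, unit variance), which the application does not specify. The function names and the synthetic low-contrast input are hypothetical.

```python
import numpy as np

def equalize_histogram(gray):
    """Histogram-equalize an 8-bit grayscale picture (uint8 array of shape (H, W))."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]                 # CDF value at the darkest level present
    denom = max(cdf[-1] - cdf_min, 1)         # avoid division by zero on flat pictures
    lut = np.clip(np.round((cdf - cdf_min) / denom * 255), 0, 255).astype(np.uint8)
    return lut[gray]                          # map every pixel through the lookup table

def whiten(picture):
    """Assumed meaning of "whitening": shift to zero mean and scale to unit variance."""
    x = picture.astype(np.float64)
    std = max(x.std(), 1.0 / np.sqrt(x.size))  # guard against a constant picture
    return (x - x.mean()) / std

rng = np.random.default_rng(0)
# Synthetic low-contrast "second picture": gray levels squeezed into 100..120.
second = rng.integers(100, 121, size=(32, 32), dtype=np.uint8)
third = equalize_histogram(second)   # contrast stretched over the full 0..255 range
first = whiten(third)                # data that would be fed to the network
print(third.min(), third.max())      # 0 255
```

Equalization reduces the dependence on overall lighting and contrast, and standardization removes differences in mean brightness between pictures, which is consistent with the stated goal of reducing the influence of the acquisition environment.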

Optionally, when performing the first processing operation and the second processing operation on the current picture, only one of the operations "size adjustment, random rotation, random cropping, Gaussian smoothing, brightness adjustment, and saturation adjustment" may be performed: for example, only random rotation, to improve the model's ability to identify vehicles from different angles; only brightness adjustment, to improve the model's ability to identify vehicles at different brightness levels; or only size adjustment, to improve the model's ability to identify vehicles with different appearances. Several of these operations may also be performed. When several are performed, they may follow the order "size adjustment, random rotation, random cropping, Gaussian smoothing, brightness adjustment, saturation adjustment", or the order may be chosen as required. The selected operations may be, for example, size adjustment and random rotation; random cropping, Gaussian smoothing, and brightness adjustment; size adjustment, random rotation, brightness adjustment, and saturation adjustment; or even all of size adjustment, random rotation, random cropping, Gaussian smoothing, brightness adjustment, and saturation adjustment.

It should be noted that one or more of the above operations may be selected according to which abilities of the model are to be improved. Selecting several operations at once amounts to improving several abilities of the model in a single training run (on the same training set), which can clearly improve training efficiency.

Optionally, using the training set to train the deep convolutional neural network model for vehicle type recognition to obtain the preset neural network model includes: identifying, through the multiple convolutional layers of the deep convolutional neural network model, a plurality of pieces of first feature information of all pictures in the training set that belong to each vehicle type; and extracting, through the feature classification layer of the deep convolutional neural network model, a feature set of each vehicle type from all the first feature information of that vehicle type, thereby obtaining the preset neural network model, wherein the feature set includes second feature information, among all the first feature information, that indicates the vehicle type, and the feature set is saved in the preset neural network model.

In the random rotation and random cropping described above, random rotation can be performed multiple times, each yielding one picture, so that multiple random rotations yield multiple pictures; similarly, random cropping can be performed multiple times to obtain multiple pictures. Training on pictures produced by multiple random rotations and random crops covers a richer range of shooting conditions (shooting angle, captured content, etc.), so more features can be extracted and learned. This helps the model learn the differences between vehicle types and identify more accurately vehicles photographed from different angles.
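Generating several training pictures from one source picture by repeated random rotation and random cropping might look like the following sketch. The rotation range, crop size, number of variants, nearest-neighbour interpolation, and all function names are assumptions made for illustration; the application does not fix any of them.

```python
import numpy as np

def rotate_nearest(img, angle_deg):
    """Rotate an (H, W) or (H, W, C) picture about its centre, keeping the size.

    Uses inverse mapping with nearest-neighbour sampling; pixels that fall
    outside the source picture are left at zero.
    """
    h, w = img.shape[:2]
    theta = np.deg2rad(angle_deg)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    # For each output pixel, compute the source coordinates it is sampled from.
    src_x = np.cos(theta) * (xs - cx) + np.sin(theta) * (ys - cy) + cx
    src_y = -np.sin(theta) * (xs - cx) + np.cos(theta) * (ys - cy) + cy
    sx = np.rint(src_x).astype(int)
    sy = np.rint(src_y).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.zeros_like(img)
    out[valid] = img[sy[valid], sx[valid]]
    return out

def random_crop(img, crop_h, crop_w, rng):
    """Cut a crop_h * crop_w window at a uniformly random position."""
    h, w = img.shape[:2]
    top = rng.integers(0, h - crop_h + 1)
    left = rng.integers(0, w - crop_w + 1)
    return img[top:top + crop_h, left:left + crop_w]

def augment(img, n_variants, max_angle, crop_h, crop_w, rng):
    """Produce n_variants pictures, each randomly rotated and then randomly cropped."""
    variants = []
    for _ in range(n_variants):
        angle = rng.uniform(-max_angle, max_angle)
        variants.append(random_crop(rotate_nearest(img, angle), crop_h, crop_w, rng))
    return variants

rng = np.random.default_rng(0)
picture = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)  # stand-in for a photo
variants = augment(picture, n_variants=5, max_angle=15.0, crop_h=56, crop_w=56, rng=rng)
print(len(variants), variants[0].shape)  # 5 (56, 56, 3)
```

Each call draws a fresh angle and crop window, so one source picture contributes several differently framed training pictures.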

In the technical solution provided in step S202, acquiring the request for vehicle type recognition of the vehicle in the target picture includes: receiving a request for vehicle type recognition sent to the server by a first client, wherein the first client is connected to the server.

A function interface (ie, a preset interface) for vehicle type identification is provided on the server, so that the server can receive a request sent by the client for vehicle type identification on the preset interface.

It should be noted that the function interface is a general-purpose interface that a client or a web page can call for vehicle type identification. For example, a communication application uses the function interface to send a captured picture to the server; after the server completes recognition, it returns the recognition result including the target vehicle type, through the preset interface, to the object that called the preset interface.

In the technical solution provided in step S204, before the preset neural network model is used to identify the vehicle in the target picture as the target vehicle type, second pre-processing is performed on the target picture. When the preset neural network model is used for identification, the picture data of the target picture after the second pre-processing is used as the input of the preset neural network model to identify the vehicle in the target picture as the target vehicle type.

Optionally, performing the second pre-processing on the target picture includes: performing at least one of the following on the target picture to obtain a fourth picture: cropping according to the car frame, size adjustment, Gaussian smoothing, brightness adjustment, and saturation adjustment (for example, only size adjustment; only brightness adjustment; size adjustment and Gaussian smoothing; or brightness adjustment and saturation adjustment); performing histogram equalization on the fourth picture to obtain a fifth picture; and performing whitening on the fifth picture to obtain the picture data to be input into the preset neural network model.

Optionally, using the picture data of the target picture after the second pre-processing as the input of the preset neural network model to identify the vehicle in the target picture as the target vehicle type includes: identifying third feature information of the target picture through the multiple convolutional layers of the preset neural network model; obtaining the matching degree between the third feature information of the target picture and the feature information in the feature set of each vehicle type, for example, by determining the number of pieces of third feature information of the target picture that are the same as feature information in the feature set of a vehicle type, and taking the ratio of that number to the total number of pieces of feature information in that feature set as the matching degree; and determining, as the target vehicle type, the vehicle type corresponding to a target matching degree among the matching degrees of the plurality of vehicle types, wherein the target matching degrees are the first N matching degrees when the matching degrees are arranged from largest to smallest, and N is a positive integer.
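The ratio-based matching degree in the example above can be sketched as follows. For illustration only, each piece of feature information is assumed to be a hashable identifier so that "the same feature information" reduces to set intersection; the vehicle type names, feature labels, and function names are hypothetical.

```python
def matching_degree(target_features, model_features):
    """Share of a vehicle type's feature set that also appears in the target picture."""
    if not model_features:
        return 0.0
    return len(target_features & model_features) / len(model_features)

def top_n_models(target_features, feature_sets, n):
    """Rank vehicle types by matching degree, largest first, and keep the first n."""
    scores = {
        model: matching_degree(target_features, features)
        for model, features in feature_sets.items()
    }
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]

# Hypothetical feature sets saved with the model, one per vehicle type.
feature_sets = {
    "brand_a_model_1": {"f1", "f2", "f3", "f4"},
    "brand_a_model_2": {"f1", "f5", "f6", "f7"},
    "brand_b_model_1": {"f2", "f3", "f8", "f9", "f10"},
}
# Hypothetical third feature information extracted from the target picture.
target = {"f1", "f2", "f3", "f9"}

print(top_n_models(target, feature_sets, n=2))
# [('brand_a_model_1', 0.75), ('brand_b_model_1', 0.6)]
```

In this toy case the three matching degrees are 3/4, 1/4, and 3/5, so with N = 2 the first and third vehicle types are returned.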

N is a positive integer smaller than the number of vehicle types, and the target matching degrees are the top N of all matching degrees when sorted from largest to smallest.

In the technical solution provided in step S206, returning the recognition result including the target vehicle type in response to the request includes: after the target matching degrees (i.e., the N largest matching degrees) are obtained, returning the N matching degrees to the client through the function interface, and the client displays them to the user.

In the method provided by the present application, a vehicle identification framework based on a deep convolutional neural network is proposed for the vehicle identification problem. Its image pre-processing incorporates traditional digital image processing techniques, which solves the problems of traditional vehicle identification methods: high requirements on the input picture and a low recognition rate. The scheme is robust when identifying vehicle types, is insensitive to factors such as illumination, noise, rotation, and partial occlusion, and greatly improves the recognition rate. Unlike existing methods, it requires no manual design or extraction of image features; a CNN is used to design, extract, and optimize features automatically. This not only reduces the workload but also improves the recognition rate.

As an optional embodiment, when performing picture recognition with the method provided by the present application, the training pictures (i.e., the current pictures) are pre-processed and used as the input of the deep convolutional neural network; training is then started, and the resulting network parameters are obtained and saved. When operating online, the network parameters are reloaded; the picture input by the user is pre-processed and used as the network input; the probability distribution over the categories is obtained, and the N categories with the largest probabilities are returned as the output. Embodiments of the present application are described in detail below with reference to FIGS. 3 through 6.

Step S302, acquiring a training data set.

Training pictures are collected into a data set. Data collection is very important: the quality of the collected data directly determines the quality of the model parameters. In practice, training pictures for each vehicle type can be obtained through manual collection and through cooperation with professional automotive websites.

Step S304, preprocessing the data in the data set.

Before a training picture is input into the deep convolutional neural network, it is first pre-processed. The pre-processing is very important: different pre-processing procedures have a significant impact on the prediction result. The pre-processing designed in this solution is also an important source of its good robustness and other advantages.

Step S306, the preprocessed data is input into the deep convolutional neural network.

When choosing the deep convolutional neural network, care should be taken to select a network that can handle a large number of categories and can distinguish fine details, because the number of vehicle type categories is large and the differences in appearance between cars, especially between different models of the same brand, are very small.

A deep convolutional neural network is selected in advance, such as Google's Inception-V3. An important feature of the Inception-V3 network is the decomposition of large convolution kernels into small ones, such as the decomposition of a 7*7 convolution into two one-dimensional convolutions (1*7 and 7*1). This speeds up computation while increasing the depth and the nonlinearity of the network, which enhances its feature extraction and representation capabilities.
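The saving from this decomposition can be checked with simple arithmetic. The sketch below counts the weights of one 7*7 layer against a 1*7 layer followed by a 7*1 layer; the channel count of 192 is an assumed value chosen for illustration, and bias terms are ignored.

```python
def conv_weights(kernel_h, kernel_w, in_channels, out_channels):
    """Number of weights in one convolution layer (biases ignored)."""
    return kernel_h * kernel_w * in_channels * out_channels

channels = 192  # assumed channel count, not taken from the application

# One 7*7 convolution versus a 1*7 convolution followed by a 7*1 convolution.
full = conv_weights(7, 7, channels, channels)
factored = (conv_weights(1, 7, channels, channels)
            + conv_weights(7, 1, channels, channels))
ratio = factored / full
```

Under these assumptions the factored pair needs 14 weights per channel pair instead of 49, that is, 2/7 of the original count, and it adds one extra layer (and one extra nonlinearity) to the network.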

In a deep convolutional neural network, the convolutional layers can be regarded as a feature extractor. To make feature extraction invariant to scale, rotation, illumination, and so on, traditional image feature extractors use techniques such as image pyramids and whitening. For deep convolutional neural networks, these problems can be solved by data preprocessing and an appropriate network structure. For scale invariance, Inception-V3 adapts to images of objects at different scales by stacking convolution kernels of different sizes.

In this scheme, a deep convolutional neural network can be used as the feature extractor for vehicle model images, and softmax is used as the classifier. During training, richer preprocessing techniques are used to address problems such as rotation invariance and illumination.

In step S308, a deep convolutional neural network is obtained.

The preprocessing method for the training process is shown in Figure 4:

Step S402, adjusting the size of the picture to be input.

To fit the input layer of the deep convolutional network, the picture is resized to the size that the input layer requires; for example, if the input layer requires 299 * 299, the picture can be resized to 299 * 299.

In step S404, in order to enhance the adaptability of the network to the image after the object is rotated, it is necessary to randomly rotate the picture.

In step S406, in order to enhance the adaptability of the network when the object is partially occluded, the picture may be randomly cropped.
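A minimal sketch of random cropping is given below, assuming the picture is held as a two-dimensional list of pixel values. The function name `random_crop` and the window size are hypothetical; the application does not specify how the crop window is chosen.

```python
import random

def random_crop(image, crop_h, crop_w, rng=random):
    """Cut a crop_h x crop_w window at a random position from a 2-D image."""
    height, width = len(image), len(image[0])
    if crop_h > height or crop_w > width:
        raise ValueError("crop window is larger than the image")
    top = rng.randint(0, height - crop_h)    # inclusive bounds
    left = rng.randint(0, width - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]

# A 5 x 6 test image whose pixel value encodes its position (row * 10 + column).
image = [[r * 10 + c for c in range(6)] for r in range(5)]
patch = random_crop(image, 3, 4, rng=random.Random(0))
```

Because a different window is drawn each time, the network sees the vehicle with different parts cut off, which approximates partial occlusion during training.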

In step S408, in order to remove the Gaussian noise and enhance the adaptability of the network to the input image, the Gaussian smoothing processing may be performed on the image.
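Gaussian smoothing can be sketched as a separable filter: a one-dimensional Gaussian kernel is applied along the rows and then along the columns. The kernel radius, the value of sigma, and the border handling (edge replication) below are assumptions made for illustration only.

```python
import math

def gaussian_kernel(radius, sigma):
    """Normalised 1-D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_rows(image, kernel):
    """Convolve every row with the kernel, replicating the border pixels."""
    radius = len(kernel) // 2
    width = len(image[0])
    smoothed = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, weight in enumerate(kernel):
                xi = min(max(x + k - radius, 0), width - 1)  # clamp at edges
                acc += weight * row[xi]
            new_row.append(acc)
        smoothed.append(new_row)
    return smoothed

def transpose(image):
    return [list(column) for column in zip(*image)]

def gaussian_smooth(image, radius=1, sigma=1.0):
    """Separable Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(radius, sigma)
    horizontal = smooth_rows(image, kernel)
    return transpose(smooth_rows(transpose(horizontal), kernel))

# A single bright outlier pixel is pulled towards its neighbours.
noisy = [[10, 10, 10], [10, 100, 10], [10, 10, 10]]
smoothed = gaussian_smooth(noisy)
flat = gaussian_smooth([[5] * 4 for _ in range(4)])
```

The isolated outlier in `noisy` is attenuated, while a uniform picture such as `flat` is left unchanged because the kernel weights sum to one.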

In step S410, in order to enhance the adaptability to the darker image, the brightness of the picture may be randomly processed.

In step S412, in order to enhance the adaptability of the network to different saturation images, the saturation of the picture may be adjusted.
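The brightness and saturation adjustments of steps S410 and S412 can be sketched per pixel in HSV colour space using Python's standard `colorsys` module. The function name `adjust_pixel` and the ranges from which the random amounts are drawn are assumptions; the application does not state them.

```python
import colorsys
import random

def adjust_pixel(rgb, brightness_delta, saturation_scale):
    """Shift brightness (V) and scale saturation (S) of one 8-bit RGB pixel."""
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    v = min(max(v + brightness_delta, 0.0), 1.0)   # keep within [0, 1]
    s = min(max(s * saturation_scale, 0.0), 1.0)
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    return tuple(round(channel * 255) for channel in (r, g, b))

# Random amounts; the distributions and their parameters are assumed values.
rng = random.Random(0)
delta = rng.gauss(0.0, 0.1)
scale = rng.uniform(0.8, 1.2)

pixel = (100, 150, 200)
augmented = adjust_pixel(pixel, delta, scale)
unchanged = adjust_pixel(pixel, 0.0, 1.0)
grey = adjust_pixel(pixel, 0.0, 0.0)
brightest = adjust_pixel(pixel, 1.0, 1.0)
```

A zero brightness shift with a saturation scale of one leaves the pixel as it was, and a saturation scale of zero turns it grey; in training, a different random pair would be drawn for each picture.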

Step S414, after the saturation is adjusted, histogram equalization is performed, which enhances the contrast of the image while making the distribution of the input pixel values more uniform.

A histogram is also called a mass distribution map. If the pixels of an image occupy many gray levels and are evenly distributed among them, the image tends to have high contrast and varied gray tones. Histogram equalization is a transformation function that achieves this effect automatically, using only the histogram of the input image. Its basic idea is to broaden the gray levels that contain many pixels and to compress the gray levels that contain few pixels, thereby expanding the dynamic range of the pixel values and improving the contrast and the variation of gray tones, which makes the image clearer.
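The transformation can be sketched for a grayscale picture as follows, using the common formulation based on the cumulative histogram. This is a minimal illustration that assumes integer gray values in the range 0 to 255 stored in a two-dimensional list; it is not presented as the implementation used in the application.

```python
def equalize_histogram(image, levels=256):
    """Histogram-equalise a 2-D grayscale image with integer gray values."""
    histogram = [0] * levels
    for row in image:
        for value in row:
            histogram[value] += 1

    # Cumulative histogram: number of pixels at or below each gray level.
    cdf, running = [], 0
    for count in histogram:
        running += count
        cdf.append(running)

    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Only one gray level is present; there is nothing to stretch.
        return [row[:] for row in image]

    # Map each gray level so that the cumulative histogram becomes linear.
    lookup = [round(max(c - cdf_min, 0) * (levels - 1) / (total - cdf_min))
              for c in cdf]
    return [[lookup[value] for value in row] for row in image]

low_contrast = [[100, 101], [102, 103]]
equalized = equalize_histogram(low_contrast)   # [[0, 85], [170, 255]]
flat = equalize_histogram([[7, 7], [7, 7]])
```

In the example, four neighbouring gray levels are spread over the full range from 0 to 255, which is the expansion of dynamic range described above.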

In step S416, a whitening process is performed, that is, the image data is decentralized and then normalized, after which it can be used as the input of the network.

Optionally, the whitening process is similar to the Principal Component Analysis (PCA) algorithm. For example, assume that the training data is the third picture. Because adjacent pixels in the third picture are strongly correlated, the input used for training is redundant, and the purpose of whitening is to reduce this redundancy, that is, decentralization. After whitening, the input of the deep convolutional neural network has the following properties: the correlation between features (of the input picture) is lower, and all features have the same variance (such as the unit variance commonly set in image processing).
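As a simplified stand-in for the whitening described above, the sketch below performs per-image standardization: the mean is removed and the result is scaled to unit variance. This yields the zero-mean, unit-variance property but does not decorrelate neighbouring pixels; full PCA whitening would additionally rotate the data onto its principal components. The function name and the `eps` guard are assumptions.

```python
import math

def standardize(image, eps=1e-8):
    """Scale a 2-D image to zero mean and unit variance (simplified whitening)."""
    pixels = [float(value) for row in image for value in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    scale = max(math.sqrt(variance), eps)   # guard against a constant image
    return [[(value - mean) / scale for value in row] for row in image]

whitened = standardize([[1, 2], [3, 4]])
values = [v for row in whitened for v in row]
mean_after = sum(values) / len(values)
variance_after = sum((v - mean_after) ** 2 for v in values) / len(values)
constant = standardize([[9, 9], [9, 9]])
```

After the call, `mean_after` is approximately 0 and `variance_after` approximately 1; a constant picture maps to all zeros.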

Optionally, the normalization process normalizes the image data, including but not limited to its length, width, and gray values. For example, the gray value of a pixel lies in the interval 0 to 255, so a pixel with gray value N can be normalized to N/255.

The process of making predictions is shown in Figure 5:

Step S502, acquiring parameters of the deep convolutional neural network.

Step S504, obtaining a car picture.

Step S506, pre-processing the acquired car picture.

Step S508, the pre-processed car picture is input into the deep convolutional neural network.

In step S510, the result of the output of the deep convolutional neural network is obtained.

It should be noted that the model parameters are initialized from a Gaussian distribution. For a convolution kernel, a two-dimensional Gaussian distribution is used to initialize the kernel parameters at the corresponding positions. The random parameters of the preprocessing process, such as the amount by which the brightness is randomly raised, are likewise drawn from a Gaussian distribution, and the corresponding parameters are the parameters of that Gaussian distribution.
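One possible reading of this initialization is that each kernel weight is drawn independently from a Gaussian distribution. The sketch below follows that reading; the mean of 0 and the standard deviation of 0.01 are assumed values, and the application does not state which values it uses.

```python
import random

def init_kernel(height, width, mean=0.0, std=0.01, rng=random):
    """Draw every weight of a height x width kernel from a Gaussian."""
    return [[rng.gauss(mean, std) for _ in range(width)]
            for _ in range(height)]

# A seeded generator makes the illustration reproducible.
kernel = init_kernel(3, 3, rng=random.Random(42))
```

Small random weights of this kind break the symmetry between kernels so that different kernels can learn different features during training.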

The preprocessing for prediction is shown in Figure 6:

Step S602, adjusting the size of the picture to be input.

Step S604, detecting a rough frame of the car in the picture.

In step S606, the picture is cropped according to the detected rough frame of the car.

Step S608, performing Gaussian smoothing on the picture.

In step S610, the brightness of the picture is randomly processed.

In step S612, the saturation of the picture is adjusted.

In step S614, a histogram equalization process is performed on the picture.

In step S616, the picture is whitened.

It should be noted that the pre-processing in the prediction process does not require rotation or random cropping; instead, a simple car detector is used to locate the approximate position of the car. The reason is that if the car is not in the foreground of a picture taken by the user, the network is likely to misclassify it.
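Cropping to the detected position can be sketched as follows. The detector itself is outside the scope of this sketch; the box format `(top, left, bottom, right)` with exclusive lower bounds, the optional margin, and the function name `crop_to_box` are all assumptions.

```python
def crop_to_box(image, box, margin=0):
    """Crop a 2-D image to a (top, left, bottom, right) box plus a margin."""
    top, left, bottom, right = box
    height, width = len(image), len(image[0])
    # Expand the box by the margin, then clip it to the image borders.
    top = max(top - margin, 0)
    left = max(left - margin, 0)
    bottom = min(bottom + margin, height)
    right = min(right + margin, width)
    return [row[left:right] for row in image[top:bottom]]

# A 4 x 5 test image whose pixel value encodes its position (row * 10 + column).
image = [[r * 10 + c for c in range(5)] for r in range(4)]
rough_box = (1, 1, 3, 4)   # hypothetical output of a car detector
car = crop_to_box(image, rough_box)              # [[11, 12, 13], [21, 22, 23]]
car_with_margin = crop_to_box(image, rough_box, margin=1)
```

Keeping a small margin around the rough frame is one way to tolerate an imprecise detector, at the cost of including some background.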

In summary, the method provided by the present application proposes a vehicle type identification framework based on a deep convolutional neural network. Image processing is combined with traditional digital image processing techniques to solve two problems of traditional vehicle type identification methods: their high requirements on the input picture and their low recognition rate. The scheme is robust when identifying the vehicle type, is insensitive to factors such as illumination, noise, rotation, and partial occlusion, and greatly improves the recognition rate. Unlike existing methods, image features do not need to be designed or extracted manually; the CNN designs, extracts, and optimizes the features automatically, which both reduces the workload and improves the recognition rate.

It should be noted that, for simplicity of description, the foregoing method embodiments are each expressed as a series of combined actions, but those skilled in the art should understand that the present application is not limited by the described order of actions, because certain steps may be performed in other orders or concurrently in accordance with the present application. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application.

Through the description of the above embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, or of course by hardware, but in many cases the former is the better implementation. Based on such understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product that is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the various embodiments of the present application.

According to an embodiment of the present application, there is also provided an identification device for a vehicle type for implementing the identification method of the above-described vehicle type. FIG. 7 is a schematic diagram of an optional vehicle type identification device according to an embodiment of the present application. As shown in FIG. 7, the device may include an acquisition unit 72, an identification unit 74, and a response unit 76.

The obtaining unit 72 is configured to acquire a request for vehicle type recognition of the vehicle in the target picture;

The identifying unit 74 is configured to use the preset neural network model to identify the vehicle in the target picture as the target vehicle type, wherein the preset neural network model is obtained by training the deep convolutional neural network model for performing vehicle type recognition using the training set, the training set is a set of pictures obtained by performing the first pre-processing on pictures of a plurality of vehicle models, the first pre-processing is used to enhance the robustness of the preset neural network model, and the plurality of vehicle models include the target vehicle type;

The response unit 76 is configured to return a recognition result including the target vehicle type in response to the request.

It should be noted that the obtaining unit 72 in this embodiment may be used to perform step S202 in the embodiment of the present application. The identifying unit 74 in this embodiment may be used to perform step S204 in the embodiment of the present application. The response unit 76 can be used to perform step S206 in the embodiment of the present application.

It should be noted that the foregoing modules are the same as the examples and application scenarios implemented by the corresponding steps, but are not limited to the contents disclosed in the foregoing embodiments. It should be noted that the foregoing module may be implemented in a hardware environment as shown in FIG. 1 as part of the device, and may be implemented by software or by hardware.

Through the above modules, the deep convolutional neural network model for performing vehicle type recognition is trained using the training set to obtain the preset neural network model, and when a request for vehicle type recognition of the vehicle in the target picture is acquired, the vehicle type is identified using the preset neural network model. Because the training set is a set of pictures obtained by performing the first pre-processing on the pictures of the plurality of vehicle models, and the first pre-processing enhances the robustness of the preset neural network model, that is, it eliminates the influence of the environment and the shooting angle on vehicle type identification, the technical problem of low accuracy of vehicle type identification in the related art can be solved, thereby achieving the technical effect of improving the accuracy of vehicle type identification.

It should be noted that in an ordinary fully connected deep neural network (DNN), each neuron in a lower layer can be connected to all neurons in the layer above. The potential problem is an explosion in the number of parameters, which makes the network prone both to over-fitting during training and to falling into local optima. Images, however, contain inherent local patterns (such as contours and boundaries), so concepts from image processing should be combined with neural network technology; the deep convolutional neural network model of this application can be used to achieve this.
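The growth in parameters can be illustrated with a short calculation comparing one fully connected layer with one convolutional layer applied to a 299 * 299 colour picture. The layer sizes (1000 output neurons, 32 kernels of size 3 * 3) are assumed values chosen only for illustration.

```python
def dense_params(inputs, outputs):
    """Weights plus biases of a fully connected layer."""
    return inputs * outputs + outputs

def conv_params(kernel_h, kernel_w, in_channels, out_channels):
    """Weights plus biases of a convolutional layer (shared across positions)."""
    return kernel_h * kernel_w * in_channels * out_channels + out_channels

pixels = 299 * 299 * 3                 # one 299 * 299 RGB input picture
dense = dense_params(pixels, 1000)     # assumed: 1000 fully connected neurons
conv = conv_params(3, 3, 3, 32)        # assumed: 32 kernels of size 3 * 3
```

Under these assumptions the fully connected layer needs 268,204,000 parameters, while the convolutional layer needs 896, because each kernel is reused at every position of the picture and only looks at a local neighbourhood.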

Compared with related vehicle type identification methods, the deep convolutional network model of the present application performs better on image recognition. The deep convolutional neural network can solve the problems encountered by traditional vehicle type identification algorithms, namely their high requirements on the input picture (such as illumination, angle, sharpness, and occlusion), low recognition rate, and poor robustness; it avoids the influence of illumination, occlusion, and other factors on vehicle type identification and improves the robustness of recognition.

Optionally, as shown in FIG. 8, the apparatus of the present application further includes: a processing unit 82, configured to perform the first pre-processing on the pictures of the plurality of vehicle models to obtain the training set before the request for vehicle type recognition of the vehicle in the target picture is acquired; and a training unit 84, configured to train the deep convolutional neural network model for performing vehicle type recognition using the training set to obtain the preset neural network model.

The processing unit is further configured to process each of the pictures of the plurality of vehicle models as follows, where each picture is regarded as a current picture: performing a first processing operation and a second processing operation on the current picture to obtain a first picture, wherein the first picture is regarded as a picture in the training set, the first processing operation includes random rotation and random cropping, which are used to eliminate the influence of the image capture angle on vehicle type recognition, and the second processing operation is used to eliminate the influence of the image capture environment on vehicle type recognition.

Optionally, the processing unit includes: a first processing module, configured to perform at least one of the following processes on the current picture to obtain a second picture: size adjustment, random rotation, random cropping, Gaussian smoothing, brightness adjustment, and saturation adjustment; a second processing module, configured to perform histogram equalization on the second picture to obtain a third picture; and a third processing module, configured to perform whitening on the third picture to obtain the first picture.

Optionally, the training unit includes: an identification module, configured to identify, through the plurality of convolution layers of the deep convolutional neural network model, a plurality of pieces of first feature information of all pictures belonging to each vehicle type in the training set; and a module configured to extract, through the feature classification layer of the deep convolutional neural network model, a feature set of each vehicle type from all the first feature information of that vehicle type, thereby obtaining the preset neural network model, wherein the feature set includes the second feature information, among all the first feature information, that is used to indicate the vehicle type, and the feature set is saved in the preset neural network model.

Optionally, the identifying unit is further configured to perform a second pre-processing on the target picture before the preset neural network model is used to identify the vehicle in the target picture as the target vehicle type, and, when identifying the vehicle in the target picture as the target vehicle type, to use the picture data of the target picture that has undergone the second pre-processing as the input of the preset neural network model.

Optionally, the identifying unit includes a pre-processing module, configured to: perform at least one of the following processes on the target picture to obtain a fourth picture: cropping according to the car frame, resizing, Gaussian smoothing, brightness adjustment, and saturation adjustment; perform histogram equalization on the fourth picture to obtain a fifth picture; and perform whitening on the fifth picture to obtain the picture data to be input into the preset neural network model.

Optionally, the identifying unit includes an identifying module, configured to: identify third feature information of the target picture by using the plurality of convolution layers of the preset neural network model; acquire a matching degree between the third feature information of the target picture and the feature information in the feature set of each vehicle type; and determine, as the target vehicle type, the vehicle type corresponding to a target matching degree among the plurality of matching degrees of the plurality of vehicle types, wherein the target matching degree includes the first N matching degrees when the plurality of matching degrees are arranged in descending order of value, and N is a positive integer.
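The selection of the first N matching degrees can be sketched as follows. The application does not define how the matching degree is computed, so cosine similarity between feature vectors is used here purely as an assumed stand-in; the feature vectors and class names are likewise hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Assumed matching degree: cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def best_matches(query, class_features, n):
    """Return the n vehicle types with the highest matching degree."""
    scores = {name: cosine_similarity(query, features)
              for name, features in class_features.items()}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]

# Hypothetical per-type feature vectors and a hypothetical query feature.
class_features = {
    "type_a": [1.0, 0.0, 0.0],
    "type_b": [0.0, 1.0, 0.0],
    "type_c": [1.0, 1.0, 0.0],
}
matches = best_matches([0.9, 0.1, 0.0], class_features, n=2)
```

With N set to 2, the two vehicle types whose features best match the query are returned in descending order of matching degree.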

The obtaining unit is further configured to receive a request for the vehicle type identification sent by the first client to the server, where the first client connects to the server through the Internet.

It should be noted that the function interface is a universal interface, and any client or webpage can call the interface for vehicle type identification. For example, in a communication application, the function interface is used to send the captured picture to the server. After the server obtains the recognition result, the object including the recognition result of the target vehicle type is returned through the preset interface.

It should be noted that the foregoing modules are the same as the examples and application scenarios implemented by the corresponding steps, but are not limited to the contents disclosed in the foregoing embodiments. It should be noted that the foregoing module may be implemented in a hardware environment as shown in FIG. 1 as part of the device, and may be implemented by software or by hardware, where the hardware environment includes a network environment.

According to another aspect of an embodiment of the present application, there is also provided a storage medium (also referred to as a memory), the storage medium comprising a stored program, wherein the program is configured to execute any of the methods described above at runtime.

According to an embodiment of the present application, there is also provided a server or terminal (also referred to as an electronic device) for implementing the identification method of the above-described vehicle type.

FIG. 9 is a structural block diagram of a terminal according to an embodiment of the present application. As shown in FIG. 9, the terminal may include one or more processors 901 (only one is shown in FIG. 9), a memory 903, and a transmission device 905 (such as the transmitting device in the above embodiment). As shown in FIG. 9, the terminal may further include an input/output device 907.

The memory 903 can be used to store software programs and modules, such as the program instructions/modules corresponding to the vehicle type identification method and device in the embodiments of the present application. The processor 901 performs various functional applications and data processing by running the software programs and modules stored in the memory 903, thereby implementing the above vehicle type identification method. The memory 903 can include high speed random access memory, and can also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid state memory. In some examples, the memory 903 can further include memory remotely located relative to the processor 901, which can be connected to the terminal over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The transmission device 905 described above is for receiving or transmitting data via a network, and can also be used for data transmission between the processor and the memory. Specific examples of the above network may include a wired network and a wireless network. In one example, the transmission device 905 includes a Network Interface Controller (NIC) that can be connected to other network devices and routers via a network cable to communicate with the Internet or a local area network. In one example, the transmission device 905 is a Radio Frequency (RF) module for communicating with the Internet wirelessly.

Optionally, the memory 903 is configured to store an application.

The processor 901 can call, through the transmission device 905, the application stored in the memory 903 to perform the following steps: acquiring a request for vehicle type recognition of the vehicle in the target picture; identifying, using the preset neural network model, the vehicle in the target picture as the target vehicle type, wherein the preset neural network model is obtained by training a deep convolutional neural network model for vehicle type recognition using a training set, the training set is a picture set obtained by performing the first pre-processing on pictures of a plurality of vehicle models, the first pre-processing is used to enhance the robustness of the preset neural network model, and the plurality of vehicle models include the target vehicle type; and in response to the request, returning the recognition result including the target vehicle type.

The processor 901 is further configured to: perform the first pre-processing on the pictures of the plurality of vehicle models to obtain the training set; and train the deep convolutional neural network model for performing vehicle type recognition using the training set to obtain the preset neural network model.

According to the embodiment of the present application, the deep convolutional neural network model for performing vehicle type recognition is trained using the training set to obtain the preset neural network model, and when a request for vehicle type recognition of the vehicle in the target picture is acquired, the vehicle type is identified using the preset neural network model. Because the training set is a set of pictures obtained by performing the first pre-processing on the pictures of the plurality of vehicle models, and the first pre-processing enhances the robustness of the preset neural network model, that is, it eliminates the influence of the environment and the shooting angle on vehicle type identification, the technical problem of low accuracy of vehicle type identification in the related art can be solved, thereby achieving the technical effect of improving the accuracy of vehicle type identification.

Optionally, for specific examples in this embodiment, reference may be made to the examples described in the foregoing embodiments, and details are not described herein again.

A person skilled in the art can understand that the structure shown in FIG. 9 is only illustrative, and the terminal can be a terminal device such as a smart phone (such as an Android mobile phone or an iOS mobile phone), a tablet computer, a palmtop computer, a mobile Internet device (MID), or a PAD. FIG. 9 does not limit the structure of the above electronic device. For example, the terminal may also include more or fewer components (such as a network interface or a display device) than shown in FIG. 9, or have a configuration different from that shown in FIG. 9.

A person of ordinary skill in the art may understand that all or part of the steps of the foregoing embodiments may be completed by a program instructing hardware related to the terminal device. The program may be stored in a computer readable storage medium, and the storage medium may include a flash disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Embodiments of the present application also provide a storage medium. Optionally, in this embodiment, the above storage medium may be used to store program code for executing the vehicle type identification method.

Optionally, in this embodiment, the foregoing storage medium may be located on at least one of the plurality of network devices in the network shown in the foregoing embodiment.

Optionally, in the present embodiment, the storage medium is arranged to store program code for performing the following steps:

S1, obtaining a request for vehicle type identification of the vehicle in the target picture;

S2, using a preset neural network model to identify the vehicle in the target picture as the target vehicle type, wherein the preset neural network model is obtained by training the deep convolutional neural network model for vehicle type recognition using the training set, the training set is a picture set obtained by performing the first pre-processing on pictures of a plurality of vehicle models, the first pre-processing is used to enhance the robustness of the preset neural network model, and the plurality of vehicle models include the target vehicle type;

S3, in response to the request, returning a recognition result including the target vehicle type.

Optionally, in this embodiment, the foregoing storage medium may include, but is not limited to, various media that can store program code, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.

The serial numbers of the embodiments of the present application are merely for the description, and do not represent the advantages and disadvantages of the embodiments.

The integrated unit in the above embodiment, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in the above-described computer readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product that is stored in a storage medium and includes a number of instructions for causing one or more computer devices (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the various embodiments of the present application.

In the above-mentioned embodiments of the present application, the description of each embodiment has its own emphasis, and for the parts that are not detailed in a certain embodiment, reference may be made to the related descriptions of other embodiments.

In the several embodiments provided by the present application, it should be understood that the disclosed client may be implemented in other manners. The device embodiments described above are merely illustrative. For example, the division of the units is only a division by logical function, and in actual implementation there may be other manners of division; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, unit, or module, and may be electrical or in another form.

The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit. The above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.

The above description is only a preferred embodiment of the present application. It should be noted that those skilled in the art can also make several improvements and refinements without departing from the principles of the present application, and these improvements and refinements should also be considered to fall within the scope of protection of this application.

Claims

A method for identifying a vehicle includes:

Obtaining a request to identify a vehicle in a target picture;

Identifying, by using a preset neural network model, the vehicle in the target picture as a target vehicle type, wherein the preset neural network model is obtained by training a deep convolutional neural network model for performing vehicle type recognition using a training set, the training set is a set of pictures obtained by performing a first pre-processing on pictures of a plurality of vehicle models, the first pre-processing is used for enhancing the robustness of the preset neural network model, and the plurality of vehicle models include the target vehicle type;

In response to the request, a recognition result including the target vehicle type is returned.
The method of claim 1, wherein, prior to obtaining the request to perform vehicle type identification of the vehicle in the target picture, the method further comprises:

Performing the first pre-processing on a picture of a plurality of the vehicle models to obtain the training set;

The deep convolutional neural network model for performing vehicle type recognition is trained using the training set to obtain the preset neural network model.
The method of claim 2 wherein said performing said first pre-processing on a plurality of pictures of said vehicle type comprises:

Each of the pictures of the plurality of the models is processed as follows, wherein each of the pictures is regarded as a current picture:

Performing a first processing operation and a second processing operation on the current picture to obtain a first picture, wherein the first picture is regarded as a picture in the training set, the first processing operation includes random rotation and random cropping, the random rotation and random cropping are used to eliminate the influence of the image capture angle on vehicle type recognition, and the second processing operation is used to eliminate the influence of the image capture environment on vehicle type recognition.
The method of claim 3, wherein performing the first processing operation and the second processing operation on the current picture comprises:

Performing at least one of the following processes on the current picture: size adjustment, random rotation, random cropping, Gaussian smoothing processing, brightness adjustment, and saturation adjustment to obtain a second picture;

Performing a histogram equalization process on the second picture to obtain a third picture;

Performing whitening on the third picture to obtain the first picture.
The method according to claim 2, wherein training the deep convolutional neural network model for performing vehicle type recognition using the training set to obtain the preset neural network model comprises:

Identifying, by the plurality of convolution layers of the deep convolutional neural network model, a plurality of first feature information of all pictures belonging to each of the vehicle models in the training set;

Extracting, by the feature classification layer of the deep convolutional neural network model, a feature set of each of the vehicle models from all of the first feature information of that vehicle model, thereby obtaining the preset neural network model, wherein the feature set includes the second feature information, among all the first feature information, that is used to indicate the vehicle type, and the feature set is saved in the preset neural network model.
The method of claim 1, wherein:

Before using the preset neural network model to identify that the vehicle in the target picture is the target vehicle type, the method further includes: performing a second pre-processing on the target picture;

Identifying, by using the preset neural network model, that the vehicle in the target picture is the target vehicle type comprises: using picture data of the target picture that has undergone the second pre-processing as an input of the preset neural network model, to identify that the vehicle in the target picture is the target vehicle type.
The method of claim 6 wherein the second pre-processing of the target picture comprises:

Performing at least one of the following processes on the target picture to obtain a fourth picture: cropping according to the vehicle frame, size adjustment, Gaussian smoothing, brightness adjustment, and saturation adjustment;

Performing a histogram equalization process on the fourth picture to obtain a fifth picture;

Performing whitening on the fifth picture to obtain the picture data to be input into the preset neural network model.
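The cropping and size adjustment that open the second pre-processing might look like the sketch below. It rests on two assumptions that are not in the claims: the vehicle frame is read as a bounding box `(left, top, width, height)` in pixel coordinates, and size adjustment is done by nearest-neighbour sampling. The names `crop_to_frame` and `resize_nearest` are hypothetical.

```python
def crop_to_frame(picture, frame):
    """Crop a 2-D list of pixels to frame = (left, top, width, height)."""
    left, top, width, height = frame
    return [row[left:left + width] for row in picture[top:top + height]]


def resize_nearest(picture, out_height, out_width):
    """Resize a 2-D list of pixels by nearest-neighbour sampling."""
    in_height, in_width = len(picture), len(picture[0])
    return [
        [picture[r * in_height // out_height][c * in_width // out_width]
         for c in range(out_width)]
        for r in range(out_height)
    ]


# Hypothetical 4 x 6 target picture whose pixel value encodes its position.
target_picture = [[r * 10 + c for c in range(6)] for r in range(4)]
vehicle_frame = (1, 1, 4, 2)  # assumed bounding box of the vehicle

cropped = crop_to_frame(target_picture, vehicle_frame)
fourth_picture = resize_nearest(cropped, 4, 4)  # fixed model input size
```

Cropping first keeps only the vehicle region, so the subsequent resize to a fixed input size does not spend resolution on background.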
The method according to claim 6, wherein using the picture data of the target picture that has undergone the second pre-processing as an input of the preset neural network model to identify that the vehicle in the target picture is the target vehicle type comprises:

Identifying, by the plurality of convolution layers of the preset neural network model, third feature information of the target picture;

Obtaining a matching degree between the third feature information of the target picture and the feature information in the feature set of each of the vehicle models;

Determining that the vehicle type corresponding to a target matching degree among the plurality of matching degrees of the plurality of vehicle models is the target vehicle type, wherein the target matching degree includes the first N matching degrees when the plurality of matching degrees are arranged in descending order, N being a positive integer.
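The matching and top-N selection described in this claim can be illustrated with a short sketch. The claims do not say how the matching degree is computed, so this example assumes cosine similarity between feature vectors and takes, for each vehicle model, the best match over its feature set. The function names, vehicle type labels, and feature values are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Assumed matching degree between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_n_vehicle_types(third_feature, feature_sets, n=1):
    """Return the n (vehicle type, matching degree) pairs with the highest degree."""
    degrees = {
        vehicle_type: max(cosine_similarity(third_feature, f) for f in features)
        for vehicle_type, features in feature_sets.items()
    }
    ranked = sorted(degrees.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Hypothetical feature sets saved in the preset model, one per vehicle model.
feature_sets = {
    "type_a": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "type_b": [[0.0, 1.0, 0.0]],
    "type_c": [[0.0, 0.0, 1.0]],
}
third_feature = [0.8, 0.2, 0.0]  # features extracted from the target picture

result = top_n_vehicle_types(third_feature, feature_sets, n=2)
```

With N greater than 1 the caller receives several candidate vehicle types ranked by matching degree, rather than a single answer.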
The method according to any one of claims 1 to 8, wherein acquiring the request for vehicle type recognition of the vehicle in the target picture comprises:

Receiving the request for performing vehicle type identification sent by a first client to the server, wherein the first client is connected to the server via the Internet.
The method according to any one of claims 1 to 8, wherein

Acquiring the request for vehicle type identification of the vehicle in the target picture comprises: receiving, on a preset interface, the request for performing vehicle type identification sent to the server, wherein the preset interface is a function interface provided by the server for vehicle type identification;

Returning, in response to the request, the recognition result including the target vehicle type comprises: returning, through the preset interface, the recognition result including the target vehicle type to the object that invokes the preset interface.
A vehicle type recognition device, comprising:

An obtaining unit configured to obtain a request for vehicle type recognition of a vehicle in the target picture;

An identification unit configured to identify, by using a preset neural network model, that the vehicle in the target picture is a target vehicle type, wherein the preset neural network model is obtained by training a deep convolutional neural network model for performing vehicle type recognition using a training set, the training set is a set of pictures obtained by performing first pre-processing on pictures of a plurality of vehicle models, the first pre-processing is used to enhance robustness of the preset neural network model, and the plurality of vehicle models include the target vehicle type;

A response unit configured to return, in response to the request, a recognition result including the target vehicle type.
The apparatus of claim 11, wherein the apparatus further comprises:

A processing unit configured to perform the first pre-processing on the pictures of the plurality of vehicle models to obtain the training set before the request for vehicle type identification of the vehicle in the target picture is acquired;

A training unit configured to train the deep convolutional neural network model for performing vehicle type recognition using the training set to obtain the preset neural network model.
The apparatus according to claim 12, wherein the processing unit is further configured to perform the following processing on each of a plurality of pictures of the vehicle models, wherein each picture is regarded as a current picture while being processed: performing a first processing operation and a second processing operation on the current picture to obtain a first picture, wherein the first picture is used as a picture in the training set, the first processing operation includes a random rotation operation and a random cropping operation, the random rotation and random cropping operations are used to eliminate the influence of the image capturing angle on vehicle type recognition, and the second processing operation is used to eliminate the influence of the image capturing environment on vehicle type recognition.
The apparatus of claim 13, wherein the processing unit comprises:

A first processing module configured to perform at least one of the following processes on the current picture to obtain a second picture: size adjustment, random rotation, random cropping, Gaussian smoothing processing, brightness adjustment, and saturation adjustment;

A second processing module configured to perform a histogram equalization process on the second picture to obtain a third picture;

A third processing module configured to perform whitening on the third picture to obtain the first picture.
The apparatus of claim 12 wherein said training unit comprises:

An identification module configured to identify, by the plurality of convolution layers of the deep convolutional neural network model, a plurality of first feature information of all pictures belonging to each of the vehicle models in the training set;

A training module configured to obtain the preset neural network model after a feature classification layer of the deep convolutional neural network model extracts a feature set of each of the vehicle models from all of the first feature information of that vehicle model, wherein the feature set includes second feature information that, among all the first feature information, indicates the vehicle type, and the feature set is saved in the preset neural network model.
A storage medium, wherein the storage medium stores a computer program, the computer program being arranged to perform the method of any one of claims 1 to 10 when executed.
An electronic device comprising a memory and a processor, wherein the memory stores a computer program, the processor being arranged to run the computer program to perform the method of any one of claims 1 to 10.