CN112418214B

CN112418214B - Vehicle identification code identification method and device, electronic equipment and storage medium

Info

Publication number: CN112418214B
Application number: CN202011232933.5A
Authority: CN
Inventors: 关鹏; 赏宇; 孙玉川
Original assignee: Beijing Love Car Technology Co ltd
Current assignee: Beijing Love Car Technology Co ltd
Priority date: 2020-11-06
Filing date: 2020-11-06
Publication date: 2024-07-23
Anticipated expiration: 2040-11-06
Also published as: CN112418214A

Abstract

The invention provides a vehicle identification code identification method and device, electronic equipment and a storage medium. The method comprises: acquiring an original image to be identified, wherein the original image contains a vehicle identification code; intercepting the area where the vehicle identification code is located from the original image to obtain a target image, and acquiring a classification identifier of the target image, wherein the classification identifier is used for representing the area attribute of the position in a vehicle at which the vehicle identification code in the target image is located; and identifying the vehicle identification code contained in the target image through a text recognition model suitable for the classification identifier. The text recognition model is obtained by training on a plurality of sample pictures with known contents, the sample pictures are obtained by randomly combining at least one character sample, the character samples are obtained by intercepting characters from at least one vehicle identification code sample under the classification identifier, and each character sample contains a single character or a plurality of continuous characters. The method thereby improves VIN code recognition efficiency and the accuracy of the recognition result.

Description

Vehicle identification code identification method and device, electronic equipment and storage medium

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a vehicle identification code identification method, a device, an electronic apparatus, and a storage medium.

Background

The vehicle identification code (VIN, Vehicle Identification Number), also called the frame number, consists of 17 characters and is defined according to national vehicle management standards. It encodes the manufacturer, model year, vehicle model, body type and code, engine code, and assembly plant of the vehicle. Correctly interpreting the VIN code is important for correctly identifying the vehicle model, so that diagnosis and maintenance can be performed correctly.

In the related art, when vehicle related information is identified based on a VIN code, the user is required to manually enter the 17 characters of the VIN code. Because the components of the VIN code follow no regular pattern, the user has to enter them carefully. As a result, filling in the VIN code takes the user a long time and has a high error rate, which affects both the identification efficiency and the accuracy of the identification result.

Disclosure of Invention

The embodiment of the invention provides a vehicle identification code identification method, a device, electronic equipment and a storage medium, which are used for solving the problems of low identification efficiency and low accuracy of the identification result that arise because VIN codes conventionally have to be entered manually by a user.

In order to solve the technical problems, the invention is realized as follows:

in a first aspect, an embodiment of the present invention provides a vehicle identification code identification method, including:

Acquiring an original image to be identified, wherein the original image contains a vehicle identification code;

Intercepting an area where a vehicle identification code is located from the original image to obtain a target image, and acquiring a classification identifier of the target image, wherein the classification identifier is used for representing an area attribute of the position of the vehicle identification code in the target image in a vehicle;

Identifying a vehicle identification code contained in the target image through a text identification model suitable for the classification identifier;

the text recognition model is obtained through training sample pictures with a plurality of known contents, the sample pictures are obtained through random combination of at least one character sample, the character sample is obtained through intercepting characters in at least one vehicle recognition code sample under the classification mark, and the character sample contains single characters or a plurality of continuous characters.

Optionally, the step of capturing an area where the vehicle identification code is located from the original image to obtain a target image and obtaining a classification identifier of the target image includes:

Text detection is carried out on the original image so as to locate and intercept the area where the vehicle identification code is located from the original image, and a target image is obtained;

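And classifying the target image to obtain a classification identifier of the target image.

Optionally, the step of capturing an area where the vehicle identification code is located from the original image to obtain a target image and obtaining a classification identifier of the target image includes:

Classifying the original image to obtain a classification identifier of the original image, wherein the classification identifier of the original image is used for representing the regional attribute of the position of the vehicle identification code in the original image in the vehicle;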

Performing text detection on the original image through a text detection model matched with the classification identifier so as to locate and intercept an area where a vehicle identification code is located from the original image to obtain a target image, and taking the classification identifier of the original image as the classification identifier of the target image;

The text detection model is obtained through training a plurality of training samples with known text detection results, and the classification identification of the training samples is consistent with the classification identification of the original image.

Optionally, the area attribute includes at least one of a window glass, an automobile nameplate, and a driving license.

Optionally, the step of identifying the vehicle identification code contained in the target image through a text identification model applicable to the classification identifier includes:

responding to the region attribute represented by the classification mark as the window glass or the automobile nameplate, and identifying the vehicle identification code contained in the target image through a first text identification model;

And identifying the vehicle identification code contained in the target image through a second text identification model in response to the regional attribute characterized by the classification identifier as a driving license.

Optionally, before the step of identifying the vehicle identification code contained in the target image by the text identification model applicable to the classification identifier, the method further includes:

aiming at any sort of sort identification, at least one vehicle identification code sample under the sort identification is obtained;

Detecting and intercepting characters in each vehicle identification code sample to obtain a character sample set corresponding to the classification identifier, wherein the character sample set comprises a plurality of character samples;

According to character samples contained in the character sample set, randomly combining to obtain a plurality of sample pictures;

And training a text recognition model suitable for the classification mark through the sample picture.

Optionally, the step of detecting and intercepting the characters in each vehicle identification code sample to obtain a character sample set corresponding to the classification identifier includes:

aiming at any vehicle identification code sample, carrying out graying treatment on the vehicle identification code sample, and splitting a target row where the vehicle identification code is located from the vehicle identification code sample subjected to graying treatment by a horizontal projection method;

Performing binarization processing on the target row, and performing contour detection and contour denoising on the target row subjected to the binarization processing to obtain a character sample contained in the vehicle identification code sample;

And constructing and obtaining a character sample set of the classification identifier according to character samples contained in all the vehicle identification code samples under the same classification identifier.
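The character interception steps above can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: images are plain nested lists, the fixed threshold of 128 is an assumption, and a column-projection split stands in for the contour detection and contour denoising named in the text. All function names are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of grayscale pixel values, 0 (black) to 255 (white)


def to_gray(rgb: List[List[Tuple[int, int, int]]]) -> Image:
    """Graying treatment using the common BT.601 luma weights."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for r, g, b in row] for row in rgb]


def binarize(gray: Image, threshold: int = 128) -> Image:
    """Binarization: 1 marks ink (dark pixel), 0 marks background."""
    return [[1 if p < threshold else 0 for p in row] for row in gray]


def runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Half-open [start, end) index ranges where the profile is non-zero."""
    out, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            out.append((start, i))
            start = None
    if start is not None:
        out.append((start, len(profile)))
    return out


def split_target_row(gray: Image, threshold: int = 128) -> Image:
    """Horizontal projection: count dark pixels per row and keep the tallest band."""
    profile = [sum(row) for row in binarize(gray, threshold)]
    bands = runs(profile)
    if not bands:
        return []
    top, bottom = max(bands, key=lambda band: band[1] - band[0])
    return gray[top:bottom]


def cut_characters(row: Image, threshold: int = 128, min_width: int = 1) -> List[Image]:
    """Split the target row at empty columns (a stand-in for contour detection).

    Pieces narrower than min_width are dropped, a crude form of denoising.
    Touching characters share columns, so one piece may hold several characters.
    """
    ink = binarize(row, threshold)
    column_profile = [sum(column) for column in zip(*ink)]
    return [
        [line[left:right] for line in row]
        for left, right in runs(column_profile)
        if right - left >= min_width
    ]
```

Because touching characters are never separated by an empty column, a single cut can contain several continuous characters, which matches the statement that a character sample contains a single character or a plurality of continuous characters.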

Optionally, the numbers of characters contained in the sample pictures corresponding to the same classification identifier are not all the same, and the text recognition model comprises a convolutional neural network, a recurrent neural network and a connectionist temporal classification (CTC) network which are sequentially cascaded.

Optionally, the recurrent neural network includes a gated recurrent unit (GRU) network.
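To illustrate why a cascaded model ending in a connectionist temporal classification (CTC) stage can handle sample pictures of differing character counts, the following minimal sketch shows greedy CTC decoding of per-frame scores. It assumes index 0 is the blank symbol and that the alphabet is the set of characters permitted in a VIN (digits and capital letters except I, O and Q); the function name and the score layout are hypothetical, and the source does not specify the decoding strategy.

```python
from typing import List, Sequence

BLANK = 0  # index reserved for the CTC blank symbol (assumption of this sketch)
ALPHABET = "0123456789ABCDEFGHJKLMNPRSTUVWXYZ"  # VIN characters: no I, O or Q


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]]) -> str:
    """Pick the best class per frame, merge repeated classes, then drop blanks."""
    best = [max(range(len(frame)), key=lambda k: frame[k]) for frame in frame_scores]
    chars: List[str] = []
    previous = BLANK
    for index in best:
        # A class is emitted only when it differs from the previous frame,
        # so a blank between two identical classes yields a doubled character.
        if index != BLANK and index != previous:
            chars.append(ALPHABET[index - 1])
        previous = index
    return "".join(chars)
```

The number of frames depends only on the picture width, while the decoded string length depends on how many non-blank runs appear, so the same network can output strings of any length up to the frame count.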

In a second aspect, an embodiment of the present invention provides a vehicle identification code identifying device, including:

The image acquisition module is used for acquiring an original image to be identified, wherein the original image contains a vehicle identification code;

The VIN code detection module is used for intercepting an area where the vehicle identification code is located from the original image to obtain a target image, and acquiring a classification identifier of the target image, wherein the classification identifier is used for representing an area attribute of the position where the vehicle identification code is located in the vehicle in the target image;

The VIN code recognition module is used for recognizing the vehicle recognition code contained in the target image through a text recognition model suitable for the classification identifier;

Optionally, the VIN code detection module includes:

the first VIN code detection sub-module is used for carrying out text detection on the original image so as to locate and intercept the area where the vehicle identification code is located from the original image and obtain a target image;

And the first classification sub-module is used for classifying the target image and acquiring a classification identifier of the target image.

Optionally, the VIN code detection module includes:

The second classification sub-module is used for classifying the original image to obtain a classification identifier of the original image, wherein the classification identifier of the original image is used for representing the regional attribute of the position of the vehicle identification code in the original image in the vehicle;

The second VIN code detection sub-module is used for carrying out text detection on the original image through a text detection model matched with the classification identifier so as to locate and intercept the area where the vehicle identification code is located from the original image to obtain a target image, and the classification identifier of the original image is used as the classification identifier of the target image;

Optionally, the VIN code recognition module includes:

The first VIN code recognition sub-module is used for responding to the region attribute represented by the classification identifier as the window glass or the automobile nameplate and recognizing the vehicle identification code contained in the target image through a first text recognition model;

and the second VIN code recognition sub-module is used for responding to the regional attribute represented by the classification identifier as a driving license and recognizing the vehicle identification code contained in the target image through a second text recognition model.

Optionally, the apparatus further comprises:

The VIN code sample acquisition module is used for acquiring at least one vehicle identification code sample under any classification identifier;

The character sample set construction module is used for detecting and intercepting characters in each vehicle identification code sample to obtain a character sample set corresponding to the classification identifier, wherein the character sample set comprises a plurality of character samples;

the sample picture construction module is used for randomly combining a plurality of sample pictures according to the character samples contained in the character sample set;

And the model training module is used for training a text recognition model applicable to the classification mark through the sample picture.

Optionally, the character sample set construction module includes:

the VIN code row extraction submodule is used for carrying out grey treatment on any vehicle identification code sample, and splitting a target row where the vehicle identification code is located from the grey treated vehicle identification code sample by a horizontal projection method;

the character sample interception sub-module is used for carrying out binarization processing on the target row, carrying out contour detection and contour denoising on the target row after the binarization processing, and obtaining a character sample contained in the vehicle identification code sample;

And the character sample set construction submodule is used for constructing and obtaining the character sample set of the classification identifier according to the character samples contained in all the vehicle identification code samples under the same classification identifier.

In a third aspect, an embodiment of the present invention further provides an electronic device, including: a memory, a processor and a computer program stored on the memory and executable on the processor, which when executed by the processor performs the steps of the vehicle identification code identification method according to the first aspect.

In a fourth aspect, an embodiment of the present invention further provides a computer readable storage medium, where a computer program is stored on the computer readable storage medium, where the computer program when executed by a processor implements the steps of the vehicle identification code identification method according to the first aspect.

In the embodiment of the invention, text recognition models adapted to different region attributes are set by distinguishing the region attribute of the position in the vehicle at which the VIN code in the original image is located, and when a text recognition model is trained, a variety of sample pictures can be obtained by splitting and recombining existing vehicle identification code samples. This improves the identification efficiency of VIN codes and the accuracy of identification results.

The foregoing description is only an overview of the technical solution of the present invention. In order that the technical means of the present invention may be understood more clearly and implemented in accordance with the contents of the specification, and in order to make the above and other objects, features and advantages of the present invention more readily apparent, specific embodiments of the present invention are set forth below.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of steps of a method for identifying a vehicle identification code in an embodiment of the invention;

FIG. 2 is a schematic diagram of a VIN code recognition process for an original image according to an embodiment of the present invention;

FIG. 3 is a flow chart of steps of another vehicle identification code identification method in an embodiment of the invention;

FIG. 4 is a schematic flow chart of the method for intercepting characters in VIN code samples according to the embodiment of the invention;

FIG. 5 is a schematic flow chart of a method for combining sample pictures based on a character sample set in an embodiment of the invention;

FIG. 6 is a schematic structural view of a vehicle identification code recognition device in an embodiment of the present invention;

FIG. 7 is a schematic structural view of another vehicle identification code recognition device in the embodiment of the invention;

FIG. 8 is a schematic diagram of a hardware structure of an electronic device in an embodiment of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Referring to fig. 1, a flowchart of steps of a vehicle identification code recognition method according to an embodiment of the present invention is shown.

Step 110, obtaining an original image to be identified, wherein the original image contains a vehicle identification code;

step 120, intercepting an area where a vehicle identification code is located from the original image to obtain a target image, and acquiring a classification identifier of the target image, wherein the classification identifier is used for representing an area attribute of a position where the vehicle identification code in the target image is located in a vehicle;

Step 130, identifying the vehicle identification code contained in the target image through a text identification model suitable for the classification identifier; the text recognition model is obtained through training sample pictures with a plurality of known contents, the sample pictures are obtained through random combination of at least one character sample, the character sample is obtained through intercepting characters in at least one vehicle recognition code sample under the classification mark, and the character sample contains single characters or a plurality of continuous characters.

Compared with traditional OCR (Optical Character Recognition) scenarios (printed matter, scanned documents, etc.), the VIN code recognition scenario mainly extracts and recognizes text information from images such as photographs and frames of video, and therefore mainly faces the following challenges: complex imaging, such as noise, blur, lighting changes and deformation; complex characters, affected by factors such as font, font size, color, wear, and arbitrary stroke width and stroke direction; and complex backgrounds, affected by factors such as missing layout and background interference.

In order to solve the above problems, in the embodiment of the present invention, the VIN code identification procedure may be optimized in the following two aspects. First, in text line extraction, text line information is extracted using the idea of general target detection. Second, in text line recognition: when text lines are recognized by conventional OCR, training based on convolutional neural networks and the like does effectively improve the recognition rate, but such approaches cope poorly with touching, blurred or deformed characters and cannot accurately recognize text that cannot be accurately segmented. The VIN code identification scheme in the embodiment of the invention can instead recognize the text line directly based on sequence modeling.

Specifically, the original image to be identified may be any image that can be directly obtained and that includes the VIN code, for example a picture taken by an electronic device such as a camera or a mobile phone, a frame of a captured video, or a screenshot of a video or picture. The original image may also include content other than the VIN code. For example, the VIN code may appear on the front windshield, the automobile nameplate and the driving license, each of which may also carry other information, so a picture taken of the front windshield, the nameplate or the driving license generally captures that other information as well.

Therefore, in the embodiment of the invention, in order to improve the accuracy of the VIN code recognition result, after the original image to be recognized including the vehicle recognition code is acquired, the area where the vehicle recognition code is located is intercepted from the original image to obtain the target image, so that in the subsequent VIN code recognition process, only the area where the VIN code is located is recognized, and the interference of other information except the VIN code in the original image on the recognition result is effectively avoided. Specifically, the area where the vehicle identification code is located may be intercepted from the original image by any available method to obtain the target image, which is not limited in this embodiment of the present invention. For example, the text line of the VIN code in the original image may be extracted by using the idea of general target detection, so as to obtain the target image. Specifically, text lines of the VIN code in the original image may be extracted by pre-training a machine learning model for detecting the VIN code, which is not limited to the embodiment of the present invention.

In addition, as mentioned above, the VIN code may be located at different positions such as the automobile front windshield, the automobile nameplate and the driving license. Because the materials at these positions differ, attributes of the VIN code such as character size, character shape and color, as well as attributes such as the background color, may differ, and these differences can affect the accuracy of the VIN code identification result. Thus, in the embodiment of the present invention, different text recognition models may be set for the different positions. In order to assign a suitable text recognition model to the target image extracted from the current original image, a classification identifier of the target image is acquired, wherein the classification identifier is used to characterize the area attribute of the position in the vehicle at which the vehicle identification code in the target image is located, for example, which of the front windshield, the nameplate and the driving license it is on.

In the embodiment of the present invention, the classification identifier of the target image may be obtained in any available manner, which is not limited to the embodiment of the present invention. For example, a classification model may be trained in advance to obtain a classification identifier of the target image, the classification model may be any machine learning model or a combination of multiple machine learning models, and a training sample of the classification model may include target images under multiple different classification identifiers. For example, a ResNet network may be used as a classification model.

After the target image is obtained by intercepting the original image and the classification identifier of the target image is obtained, the vehicle identification code contained in the corresponding target image can be identified through a text identification model suitable for the corresponding classification identifier.

The text recognition model suitable for a given classification identifier is obtained by training on a plurality of sample pictures with known contents (that is, the specific content of the VIN code is known). In practical applications, obtaining training data is the hardest part of training a text recognition model: a large amount of data is needed to ensure accuracy, yet samples are hard to collect for scenes such as window glass. A large number of samples therefore has to be generated from a small number of samples. To this end, character samples under each classification identifier can be obtained by intercepting characters from at least one vehicle identification code sample under that classification identifier, and the character samples are then drawn and randomly combined into sample pictures for training the text recognition model suitable for that classification identifier, so that a large number of sample pictures are generated from a small number of vehicle identification code samples.

In the embodiment of the present invention, the characters in the vehicle identification code sample may be intercepted in any available manner, which is not limited in the embodiment of the present invention. Moreover, when the characters in a vehicle identification code sample are intercepted, characters may touch or stick together, so an intercepted character sample may contain a single character or a plurality of continuous characters. In addition, the number of character samples forming a sample picture can be set as required, and may be either a fixed value or a variable value, which is not limited in the embodiment of the present invention.
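The random combination of character samples into labelled sample pictures can be sketched as follows. This is a minimal illustration under stated assumptions: each character sample is a pair of a pixel grid (nested lists) and its known text, samples are concatenated left to right, and shorter samples are padded with white rows. The names `compose_sample`, `pad_to_height`, `min_parts` and `max_parts` are hypothetical, not taken from the source.

```python
import random
from typing import List, Sequence, Tuple

Image = List[List[int]]           # rows of grayscale pixel values
CharSample = Tuple[Image, str]    # (pixels, known text: one or more characters)


def pad_to_height(img: Image, height: int, fill: int = 255) -> Image:
    """Append background rows below the image so that it has the given height."""
    width = len(img[0])
    return img + [[fill] * width for _ in range(height - len(img))]


def compose_sample(
    samples: Sequence[CharSample],
    rng: random.Random,
    min_parts: int = 1,
    max_parts: int = 6,
) -> Tuple[Image, str]:
    """Draw a random number of character samples and join them side by side.

    The label of the resulting picture is the concatenation of the sample texts,
    so the picture content is known without any manual annotation.
    """
    count = rng.randint(min_parts, max_parts)
    chosen = [rng.choice(samples) for _ in range(count)]
    height = max(len(img) for img, _ in chosen)
    padded = [pad_to_height(img, height) for img, _ in chosen]
    picture = [sum((part[y] for part in padded), []) for y in range(height)]
    label = "".join(text for _, text in chosen)
    return picture, label
```

Setting `min_parts` equal to `max_parts` gives a fixed number of character samples per picture, while a wider range gives pictures of varying length, corresponding to the fixed and variable options described above.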

Fig. 2 is a schematic diagram of a process for identifying VIN codes in an original image. Firstly, an area where a vehicle identification code is located is intercepted from an original image, a target image is obtained, and a classification identifier of the target image is obtained, wherein the area circled by a rectangular frame in the original image in fig. 2 is the area where the target image is located, and then the vehicle identification code contained in the target image can be identified through a text identification model suitable for the classification identifier.

In addition, in practical applications, the VIN code may follow certain business rules; for example, the VIN code may include a check digit used to verify whether the VIN code is erroneous. In that case, in order to improve the accuracy of the identification result and reduce VIN identification errors, the identification result may be subjected to a secondary check based on the business rules of the VIN. If the VIN code identification result conforms to the business rules, it can be confirmed as the correct VIN code for the corresponding original image and output to subsequent operations. If it does not conform, the procedure may return to step 120 or step 130 and perform VIN code identification again, until an identification result conforming to the business rules is obtained or, after a preset upper limit of repetitions is reached, a VIN code identification failure result is returned.
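As one possible instance of such a business rule, the sketch below validates the widely used position-9 check digit (letters are mapped to numbers, each position is weighted, and the sum modulo 11 gives the check character, with 10 written as X) and retries recognition up to a preset limit. The source does not state which rule is applied, so the choice of this scheme is an assumption; `recognize_once` is a hypothetical callable standing in for a re-run of step 120 or step 130.

```python
from typing import Callable, Optional

# Letter-to-number mapping of the position-9 check digit scheme (I, O, Q are not allowed).
_VALUES = {
    **{str(d): d for d in range(10)},
    **dict(zip("ABCDEFGH", range(1, 9))),
    **dict(zip("JKLMN", range(1, 6))),
    "P": 7,
    "R": 9,
    **dict(zip("STUVWXYZ", range(2, 10))),
}
# Position weights; the check digit itself (position 9) has weight 0.
_WEIGHTS = (8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2)


def check_digit_ok(vin: str) -> bool:
    """Return True if the VIN has 17 valid characters and a matching check digit."""
    vin = vin.upper()
    if len(vin) != 17 or any(c not in _VALUES for c in vin):
        return False
    remainder = sum(_VALUES[c] * w for c, w in zip(vin, _WEIGHTS)) % 11
    return vin[8] == ("X" if remainder == 10 else str(remainder))


def recognize_with_check(
    recognize_once: Callable[[], str], max_attempts: int = 3
) -> Optional[str]:
    """Re-run recognition until the result passes the check or the limit is reached."""
    for _ in range(max_attempts):
        vin = recognize_once()
        if check_digit_ok(vin):
            return vin
    return None  # corresponds to returning a VIN code identification failure result
```

A check of this kind only detects errors; it cannot tell which character was misread, which is why the flow above falls back to repeating the detection or recognition step.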

In the embodiment of the invention, text recognition models adapted to different region attributes are set by distinguishing the region attribute of the position in the vehicle at which the VIN code in the original image is located, and when a text recognition model is trained, a variety of sample pictures can be obtained by splitting and recombining existing vehicle identification code samples. This improves the identification efficiency of VIN codes and the accuracy of identification results.

Optionally, in the embodiment of the present invention, the step of capturing an area where the vehicle identification code is located from the original image to obtain a target image and obtaining a classification identifier of the target image may specifically include:

step A121, performing text detection on the original image to locate and intercept the area where the vehicle identification code is located from the original image, so as to obtain a target image;

And step A122, classifying the target image to obtain a classification identifier of the target image.

Alternatively, the step may specifically include:

Step B121, classifying the original image to obtain a classification identifier of the original image, wherein the classification identifier of the original image is used for representing the regional attribute of the position of the vehicle identification code in the original image in the vehicle;

Step B122, performing text detection on the original image through a text detection model matched with the classification identifier so as to locate and intercept an area where a vehicle identification code is located from the original image, thereby obtaining a target image, and taking the classification identifier of the original image as the classification identifier of the target image; the text detection model is obtained through training a plurality of training samples with known text detection results, and the classification identification of the training samples is consistent with the classification identification of the original image.

In the embodiment of the invention, text detection can first be performed on the original image to locate and intercept the area where the vehicle identification code is located, thereby obtaining the target image. The intercepted target image is then classified, for example by the ResNet network mentioned above, to obtain its classification identifier. In this case, original images under different classification identifiers can all be processed by the same text detection model, and the original image does not need to be classified before detection, which improves text detection efficiency.

In the embodiment of the present invention, text detection may be performed on the original image in any available manner, which is not limited in the embodiment of the present invention. For example, text detection may be performed through AdvancedEAST, and so on.

In addition, the original image can be classified first to obtain a classification identifier of the original image, wherein the classification identifier of the original image is used for representing the regional attribute of the position of the vehicle identification code in the original image in the vehicle; further, text detection is carried out on the original image through a text detection model matched with the classification identifier, so that an area where a vehicle identification code is located is positioned and intercepted from the original image, a target image is obtained, and the classification identifier of the original image is used as the classification identifier of the target image; the text detection model is obtained through training a plurality of training samples with known text detection results, and the classification identification of the training samples is consistent with the classification identification of the original image.

At this time, the original image is classified to obtain the classification identifier, so when the text detection is performed on the target image, the text detection can be performed on the original image through the text detection model adapted to the classification identifier, so as to locate and intercept the area where the vehicle identification code is located from the original image, obtain the target image, and use the classification identifier of the original image as the classification identifier of the target image. Then, in order to adapt to different classification identifications, text detection models adapting to various classification identifications can be trained in advance, and the text detection model adapting to each classification identification can be obtained through training samples of known text detection results under a plurality of corresponding classification identifications.

The text detection model may be any machine learning model or a combination of multiple machine learning models, which is not limited in the embodiment of the present invention. For example, the text detection model may be set as a combination of a CNN (Convolutional Neural Network) and an RNN (Recurrent Neural Network), with the CNN and the RNN cascaded in sequence.

As can be seen from the foregoing, of the two ways of intercepting the area where the vehicle identification code is located from the original image to obtain the target image and its classification identifier, the first way needs to train only one text detection model, while the second way needs to train a separate text detection model for each classification identifier. In the embodiment of the present invention, either way may be selected as required, which is not limited in the embodiment of the present invention.

Optionally, in an embodiment of the present invention, the area attribute includes at least one of a window glass, an automobile nameplate, and a driving license.

Referring to fig. 3, in an embodiment of the present invention, the step 130 may further include:

Step 131, identifying a vehicle identification code contained in the target image through a first text identification model in response to the region attribute characterized by the classification identifier being a vehicle window glass or a vehicle nameplate;

And step 132, identifying the vehicle identification code contained in the target image through a second text identification model in response to the region attribute characterized by the classification identifier as a driving license.

The automobile nameplate containing the VIN code, also called the vehicle nameplate, is generally placed at a location at the front of the vehicle that is easy to observe, for example above the front passenger door inside the vehicle. Compared with the VIN code on a paper driving license, the VIN code on the nameplate is relatively similar to the VIN code on the window glass. Therefore, in the embodiment of the invention, in order to reduce the training cost of the text recognition model while improving the accuracy of the VIN code recognition result, the window glass and the automobile nameplate can share one text recognition model, namely the first text recognition model, and the driving license can use another text recognition model, namely the second text recognition model.

If the region attribute represented by the classification identifier is a vehicle window glass or a vehicle nameplate, the vehicle identification code contained in the target image can be identified through the first text identification model, and if the region attribute represented by the classification identifier is a driving license, the vehicle identification code contained in the target image can be identified through the second text identification model.
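The routing described in steps 131 and 132 can be sketched as a simple lookup. This is a minimal illustration only: the string labels for the region attributes and the returned model names are hypothetical placeholders, since the embodiment does not specify how classification identifiers are encoded.

```python
# Hypothetical encodings of the three region attributes named in the text.
WINDOW_GLASS = "window_glass"
AUTOMOBILE_NAMEPLATE = "automobile_nameplate"
DRIVING_LICENSE = "driving_license"


def select_model_name(region_attribute: str) -> str:
    """Return the (hypothetical) name of the text recognition model to use."""
    # Window glass and nameplate share the first text recognition model.
    if region_attribute in (WINDOW_GLASS, AUTOMOBILE_NAMEPLATE):
        return "first_text_recognition_model"
    # The driving license uses the second text recognition model.
    if region_attribute == DRIVING_LICENSE:
        return "second_text_recognition_model"
    raise ValueError(f"unknown region attribute: {region_attribute!r}")
```

In a real system the returned name would be replaced by a loaded model object; raising on an unknown attribute is a design choice made here so that a misclassified image is not silently sent to the wrong model.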

Correspondingly, the character samples combined into the sample pictures for training the first text recognition model can be obtained by intercepting characters in at least one vehicle identification code sample under the window glass and automobile nameplate classifications, and the character samples combined into the sample pictures for training the second text recognition model can be obtained by intercepting characters in at least one vehicle identification code sample under the driving license classification.

The first text recognition model and the second text recognition model are both text recognition models, and the model structures of the first text recognition model and the second text recognition model may be the same or different, which is not limited in this embodiment of the present invention.

Referring to fig. 3, in an embodiment of the present invention, before step 130, it may further include:

Step 10, for any classification identifier, obtaining at least one vehicle identification code sample under the classification identifier;

Step 20, detecting and intercepting characters in each vehicle identification code sample to obtain a character sample set corresponding to the classification identifier, wherein the character sample set comprises a plurality of character samples;

Step 30, randomly combining the character samples contained in the character sample set to obtain a plurality of sample pictures;

Step 40, training a text recognition model applicable to the classification identifier through the sample pictures.

As described above, in order to improve the accuracy of the VIN code recognition result, text recognition models adapted to the various classification identifiers may be trained separately. To do so, sample pictures suitable for training each text recognition model are first constructed, and in order to improve the diversity of the sample pictures, a corresponding character sample set can be set for each classification identifier.

Specifically, for any classification identifier, at least one vehicle identification code sample under that classification identifier may be obtained, and the characters in each vehicle identification code sample may then be detected and intercepted to obtain the character sample set corresponding to the classification identifier, where the character sample set includes a plurality of character samples. In the embodiment of the present invention, the vehicle identification code samples under each classification identifier may be obtained in any available manner, which is not limited in the embodiment of the present invention.

The vehicle identification code sample may be understood as a sample of the complete VIN code or a sample of a partial fragment of the VIN code, which is not limited in this embodiment of the present invention. For example, in general, a VIN code is composed of seventeen characters, and a vehicle identification code sample may include a complete VIN code composed of seventeen characters, or may include an incomplete VIN code composed of less than seventeen characters, that is, a partial character in a VIN code.

Moreover, in the embodiment of the present invention, the characters in each of the vehicle identification code samples may be detected and intercepted in any available manner, which is not limited in the embodiment of the present invention. For example, when characters are detected, each detected character can be marked with a rectangular box, and the characters are then intercepted according to the marked rectangular boxes to obtain the character samples.

When training the text recognition model adapted to a certain classification identifier, a plurality of sample pictures can be obtained by randomly combining the character samples contained in the character sample set corresponding to that classification identifier, and the text recognition model applicable to the classification identifier is then trained with these sample pictures. Moreover, since the specific characters contained in each character sample can be identified at the same time during character detection, the characters contained in a combined sample picture are known once the character samples that make it up are known.
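The random combination can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: a character sample is modeled as a hypothetical `(label, image)` pair, the image is a list of pixel rows, and all character samples are assumed to have been resized to the same height beforehand.

```python
import random
from typing import List, Tuple

# Assumed representation: (known characters, image as a list of pixel rows).
CharSample = Tuple[str, List[List[int]]]


def combine_samples(char_set: List[CharSample], rng: random.Random,
                    min_len: int = 1, max_len: int = 20):
    """Build one sample picture by concatenating randomly chosen character samples."""
    count = rng.randint(min_len, max_len)          # random length
    chosen = [rng.choice(char_set) for _ in range(count)]  # random content
    # The label is known because each character sample carries its own label.
    label = "".join(sample_label for sample_label, _ in chosen)
    height = len(chosen[0][1])
    rows: List[List[int]] = [[] for _ in range(height)]
    for _, image in chosen:
        for y in range(height):
            rows[y].extend(image[y])               # paste side by side
    return label, rows
```

Because the length is drawn at random, repeated calls yield pictures whose character counts differ, and the ground-truth label comes for free from the pieces used.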

Of course, in the embodiment of the present invention, the approach of recombining character samples into sample pictures may be used to train the text recognition model for only some of the classification identifiers (for example, the classification identifier of the front windshield of the automobile, a kind of window glass for which VIN codes are difficult to collect). For other classification identifiers (for example, the classification identifier of the driving license, for which VIN codes are easy to collect), the text recognition model applicable to the corresponding classification identifier may be trained directly with VIN code samples whose characters are known.

Moreover, if multiple classification identifiers share the same text recognition model, that model may be trained with the sample pictures under all of the corresponding classification identifiers.

Optionally, in an embodiment of the present invention, the step 20 may further include:

Step 21, for any vehicle identification code sample, performing grayscale processing on the vehicle identification code sample, and splitting the target line where the vehicle identification code is located from the grayscale-processed sample by a horizontal projection method;

Step 22, performing binarization processing on the target line, and performing contour detection and contour denoising on the binarized target line to obtain the character samples contained in the vehicle identification code sample;

Step 23, constructing the character sample set of the classification identifier from the character samples contained in all the vehicle identification code samples under the same classification identifier.

In practical applications, a vehicle identification code sample may contain only the vehicle identification code, or may also contain other reference information related to the vehicle identification code, such as a bar code. In general, the vehicle identification code is a single line of characters. Therefore, in order to improve the accuracy of the detection result when detecting the character samples contained in the vehicle identification code sample, and thus the accuracy of the character samples, the target line where the vehicle identification code is located may first be split from the vehicle identification code sample.

Specifically, for any vehicle identification code sample, grayscale processing may be performed on the sample, and the target line where the vehicle identification code is located may be split from the grayscale-processed sample by a horizontal projection method.

The horizontal projection method imagines many horizontal straight lines drawn across the text image; some of these lines pass through text areas, and some pass between text lines. For each line, the number of black pixels (the text portion is black) that the line crosses is recorded and taken as the value at that line's y coordinate, which yields a profile in which the length at each y coordinate represents the number of black pixels in that row. Rows inside a text area have a non-zero value because they contain text, while blank rows between text lines have a value of 0. The resulting profile therefore alternates between runs of non-zero values and runs of zeros. When these values are traversed, a 0 indicates an inter-line gap. If a run of non-zero values (a text line) is followed by a run of zeros (an inter-line gap) and then by another run of non-zero values (a text line), the image contains multiple lines of text; otherwise it does not. The y coordinates whose value is 0 also give the segmentation points between text lines, so the lines can be split there.
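The projection and splitting logic can be sketched in plain Python. This is a minimal illustration that assumes the image has already been reduced to rows of 0 (background) and 1 (black text pixel); the function names are hypothetical, and choosing which of the returned lines holds the VIN code is left out.

```python
from typing import List, Tuple


def horizontal_projection(binary: List[List[int]]) -> List[int]:
    """Count the black pixels in each row (one value per y coordinate)."""
    return [sum(row) for row in binary]


def split_text_lines(binary: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (start_y, end_y) for each text line; end_y is exclusive."""
    profile = horizontal_projection(binary)
    lines = []
    start = None
    for y, value in enumerate(profile):
        if value > 0 and start is None:
            start = y                      # entering a text line
        elif value == 0 and start is not None:
            lines.append((start, y))       # a zero row closes the text line
            start = None
    if start is not None:                  # text runs to the bottom edge
        lines.append((start, len(profile)))
    return lines
```

An image is multi-line exactly when `split_text_lines` returns more than one range, which matches the value, zero, value pattern described above.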

In addition, before horizontal projection, the text image of the vehicle identification code sample can be subjected to morphological processing, of which erosion and dilation are the most common operations. Erosion shrinks the foreground areas in the image to a certain degree, smoothing rough edges and thinning the characters so that densely packed characters do not merge into one another. Dilation expands the foreground areas in the image to a certain degree, filling small holes inside them and, to some extent, joining characters into solid blocks. The opening and closing operations are combinations of erosion and dilation.
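Erosion, dilation, opening, and closing can be illustrated on a small binary image. The sketch below is an assumption-laden toy version: it uses a fixed 3x3 square neighborhood, treats 1 as foreground, and simply clips the neighborhood at the image border; a production pipeline would normally rely on an image-processing library instead.

```python
from typing import List

Image = List[List[int]]


def _neighborhood(img: Image, y: int, x: int) -> List[int]:
    """Pixels in the 3x3 window around (y, x), clipped at the image border."""
    h, w = len(img), len(img[0])
    return [img[ny][nx]
            for ny in range(max(0, y - 1), min(h, y + 2))
            for nx in range(max(0, x - 1), min(w, x + 2))]


def dilate(img: Image) -> Image:
    # A pixel becomes foreground if any pixel in its window is foreground.
    return [[int(any(_neighborhood(img, y, x))) for x in range(len(img[0]))]
            for y in range(len(img))]


def erode(img: Image) -> Image:
    # A pixel stays foreground only if every pixel in its window is foreground.
    return [[int(all(_neighborhood(img, y, x))) for x in range(len(img[0]))]
            for y in range(len(img))]


def opening(img: Image) -> Image:
    return dilate(erode(img))   # removes small specks


def closing(img: Image) -> Image:
    return erode(dilate(img))   # fills small holes
```

Opening applies erosion first and so deletes isolated noise pixels, while closing applies dilation first and so fills small holes inside a character stroke.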

In addition, in order to facilitate the detection of characters, the target line may be binarized, that is, the gray value of each pixel in the image is set to 0 or 255, so that the whole image shows a clear black-and-white effect. An image contains a target object, background, and noise. To extract the target object directly from a multi-valued digital image, the most common method is to set a threshold T and use T to divide the pixels of the image into two groups: pixels greater than T and pixels less than T. This is a special case of gray-scale transformation and is called binarization of the image. The value of T can be set as required, which is not limited in the embodiment of the invention. In the embodiment of the invention, the vehicle identification code sample may be an image, and the target line split from it may be understood as an image containing only the target line.
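Fixed-threshold binarization is short enough to show directly. In this minimal sketch the function name is hypothetical, and assigning pixels exactly equal to T to the 0 group is an assumption, since the text only distinguishes values greater than and less than T.

```python
from typing import List


def binarize(gray: List[List[int]], threshold: int) -> List[List[int]]:
    """Map each gray value to 255 if it exceeds the threshold, otherwise 0."""
    return [[255 if pixel > threshold else 0 for pixel in row] for row in gray]
```

For example, with T = 128 a row of gray values 10, 200, 128, 129 becomes 0, 255, 0, 255.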

Contour detection and contour denoising are then performed on the binarized target line to obtain the character samples contained in the vehicle identification code sample. In the embodiment of the present invention, the contour detection and the contour denoising can be performed in any available manner, which is not limited in the embodiment of the present invention.
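Since any available method may be used, the sketch below swaps in 4-connected component labeling as a simple stand-in for contour detection, and treats discarding components below a minimum pixel area as the denoising step. Both the stand-in technique and the `min_area` parameter are assumptions made for illustration, not the method of the embodiment.

```python
from collections import deque
from typing import List, Tuple

Box = Tuple[int, int, int, int, int]  # (x0, y0, x1, y1, area), inclusive corners


def find_components(binary: List[List[int]]) -> List[Box]:
    """Bounding boxes of 4-connected foreground regions (non-zero pixels)."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            x0 = x1 = x
            y0 = y1 = y
            area = 0
            while queue:  # breadth-first flood fill of one region
                cy, cx = queue.popleft()
                area += 1
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((x0, y0, x1, y1, area))
    return boxes


def character_boxes(binary: List[List[int]], min_area: int = 2) -> List[Box]:
    """Drop tiny regions as noise and order the rest from left to right."""
    return sorted(box for box in find_components(binary) if box[4] >= min_area)
```

Each returned box can then be used to crop one character sample from the target line; characters made of several disconnected strokes would need extra merging logic that is not shown here.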

After character samples containing a single character or several consecutive characters are intercepted from the target line and before the character sample set is generated, each character sample can be manually corrected and labeled, and the character sample set is formed from the corrected character samples.

Fig. 4 is a schematic flow chart of intercepting characters in a VIN code sample by means of OpenCV. Here the input VIN code sample contains a VIN code and a bar code, and processing such as graying, line splitting by horizontal projection, adaptive binarization, contour detection, and contour denoising is performed on the sample in sequence to obtain character samples containing a single character or several consecutive characters.

Optionally, in the embodiment of the present invention, the character lengths contained in the sample pictures corresponding to the same classification identifier are not completely consistent, and the text recognition model includes a convolutional neural network, a recurrent neural network, and a connectionist temporal classification (CTC) network that are cascaded in sequence.

In practical applications, the number of characters contained in the VIN code is limited, that is, the number of characters is also a recognition target of the text recognition model. When training a text recognition model for recognizing VIN codes, if the number of characters contained in the adopted sample pictures is the same, the sensitivity of the trained text recognition model to the number of characters is easily affected, and when other non-VIN codes composed of a plurality of characters are input, the non-VIN codes can be recognized as VIN codes, so that the accuracy of the VIN code recognition result is affected.

Therefore, in the embodiment of the invention, in order to improve the sensitivity of the text recognition model to the number of characters, when constructing the sample pictures corresponding to any classification identifier, sample pictures with different lengths and different contents can be generated through different combinations of the character samples in the character sample set corresponding to that classification identifier; that is, the character lengths contained in the sample pictures corresponding to the same classification identifier are not completely consistent. In addition, when the text recognition model is trained, sample pictures whose character length meets the VIN code requirement can be marked as positive samples and the others as negative samples. Of course, in the embodiment of the invention, positive and negative samples can also be marked as required without reference to the number of characters, which is not limited in the embodiment of the invention.

Fig. 5 is a schematic flow chart of generating sample pictures with different lengths and different contents by different combinations based on a character sample set. The upper dotted rectangular box is a schematic diagram of a character sample contained in a character sample set, and the lower dotted rectangular box is a schematic diagram of a sample picture obtained based on the combination of the character samples shown by the upper dotted rectangular box.

As a result, the sample pictures have the advantages of random content, random length, and abundant quantity. In addition, in the embodiment of the invention, sample enhancement processing, such as flip transformation, random cropping, scale transformation, contrast transformation, and noise disturbance, can be performed on the combined sample pictures, which further improves the diversity of the sample pictures and the model training effect.
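Three of the listed enhancements can be sketched on a grayscale image stored as rows of 0 to 255 values. The function names, the choice of 128 as the contrast pivot, and the uniform integer noise are all illustrative assumptions; the embodiment does not fix any of these details.

```python
import random
from typing import List

Gray = List[List[int]]


def _clamp(value: float) -> int:
    """Round and keep the value inside the valid 0..255 range."""
    return max(0, min(255, round(value)))


def horizontal_flip(gray: Gray) -> Gray:
    """Flip transformation: mirror each row left to right."""
    return [row[::-1] for row in gray]


def adjust_contrast(gray: Gray, factor: float) -> Gray:
    """Contrast transformation: scale each pixel's distance from mid-gray (128)."""
    return [[_clamp(128 + factor * (p - 128)) for p in row] for row in gray]


def add_noise(gray: Gray, rng: random.Random, max_delta: int = 10) -> Gray:
    """Noise disturbance: add a random integer in [-max_delta, max_delta] per pixel."""
    return [[_clamp(p + rng.randint(-max_delta, max_delta)) for p in row] for row in gray]
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible when a fixed seed is used.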

The text recognition model may include a convolutional neural network (CNN), a recurrent neural network (RNN), and a connectionist temporal classification (CTC) network that are cascaded in sequence. That is, the text recognition model may adopt the CTC (Connectionist Temporal Classification) loss function. The text recognition model may include L convolutional neural networks, M recurrent neural networks, and N CTC networks cascaded in sequence, where L, M, and N are positive integers whose specific values can be set as required, which is not limited in the embodiment of the present invention.
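At inference time, the per-frame outputs of a CTC-trained model are usually turned into text by collapsing repeats and dropping blanks. The sketch below shows greedy (best-path) CTC decoding only; it assumes a hypothetical alphabet string whose index 0 is the blank symbol, and it is not taken from the embodiment, which does not describe its decoding step.

```python
from typing import List, Sequence


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      alphabet: str, blank: int = 0) -> str:
    """Pick the best class per frame, merge repeats, then remove blanks."""
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores]
    decoded: List[str] = []
    previous = None
    for index in best:
        # Emit a character only when the class changes and is not the blank.
        if index != previous and index != blank:
            decoded.append(alphabet[index])
        previous = index
    return "".join(decoded)
```

Because the output length is not tied to the number of frames, the decoded string can have any number of characters, so a length check such as `len(text) == 17` can be applied afterwards to screen out non-VIN strings.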

Optionally, in an embodiment of the present invention, the recurrent neural network includes a gated recurrent unit network.

In practical applications, the LSTM (Long Short-Term Memory) network and the GRU (Gated Recurrent Unit) network are both types of recurrent neural network; the GRU has fewer parameters and therefore converges more easily.
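The parameter difference can be made concrete with the textbook single-layer formulas: an LSTM has four gate-like weight blocks and a GRU has three, each block holding an input matrix, a recurrent matrix, and one bias vector. The sketch below uses that textbook assumption; specific frameworks may count differently, for example by keeping two bias vectors per block.

```python
def rnn_layer_params(input_size: int, hidden_size: int, blocks: int) -> int:
    """Weights per block: hidden*(input + hidden) matrix entries plus hidden biases."""
    return blocks * (hidden_size * (input_size + hidden_size) + hidden_size)


def lstm_params(input_size: int, hidden_size: int) -> int:
    return rnn_layer_params(input_size, hidden_size, blocks=4)  # input, forget, cell, output


def gru_params(input_size: int, hidden_size: int) -> int:
    return rnn_layer_params(input_size, hidden_size, blocks=3)  # update, reset, candidate
```

With an input size of 64 and a hidden size of 128, this gives 98,816 parameters for the LSTM layer and 74,112 for the GRU layer, a reduction of one quarter.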

Referring to fig. 6, a schematic diagram of a vehicle identification code recognition device according to an embodiment of the present invention is shown.

The vehicle identification code identification device of the embodiment of the invention comprises: the image acquisition module 210, the VIN code detection module 220, and the VIN code identification module 230.

The functions of the modules and the interaction relationship between the modules are described in detail below.

An image obtaining module 210, configured to obtain an original image to be identified, where the original image includes a vehicle identification code;

the VIN code detection module 220 is configured to intercept an area where a vehicle identification code is located from the original image, obtain a target image, and obtain a classification identifier of the target image, where the classification identifier is used to characterize an area attribute of a position where the vehicle identification code is located in a vehicle in the target image;

A VIN code recognition module 230, configured to recognize a vehicle identification code included in the target image through a text recognition model applicable to the classification identifier;

Optionally, the VIN code detection module 220 may further include:

Referring to fig. 7, in an embodiment of the present invention, the VIN code identification module 230 may further include:

A first VIN code recognition sub-module 231, configured to recognize, through a first text recognition model, a vehicle identification code included in the target image in response to the region attribute characterized by the classification identifier being a window glass or an automobile nameplate;

The second VIN code recognition sub-module 232 is configured to recognize, through a second text recognition model, a vehicle identification code included in the target image in response to the region attribute characterized by the classification identifier being a driving license.

Referring to fig. 7, in an embodiment of the present invention, the apparatus may further include:

The VIN code sample acquiring module 310 is configured to acquire, for any one of the classification identifiers, at least one vehicle identification code sample under the classification identifier;

The character sample set construction module 320 is configured to detect and intercept characters in each of the vehicle identifier samples, and obtain a character sample set corresponding to the classification identifier, where the character sample set includes a plurality of character samples;

The sample picture construction module 330 is configured to randomly combine to obtain a plurality of sample pictures according to the character samples included in the character sample set;

Model training module 340 is configured to train a text recognition model applicable to the classification identifier through the sample picture.

Optionally, in an embodiment of the present invention, the character sample set construction module 320 may further include:

The vehicle identification code identifying device provided by the embodiment of the invention can realize each process realized in the method embodiments of fig. 1 and 3, and in order to avoid repetition, the description is omitted here.

Preferably, the embodiment of the present invention further provides an electronic device, including a processor, a memory, and a computer program stored on the memory and executable on the processor. When executed by the processor, the computer program implements each process of the above vehicle identification code identification method embodiment and can achieve the same technical effects; to avoid repetition, the description is omitted here.

The embodiment of the invention also provides a computer readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements each process of the above vehicle identification code identification method embodiment and can achieve the same technical effects; to avoid repetition, the description is omitted here. The computer readable storage medium is, for example, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disk.

Fig. 8 is a schematic diagram of a hardware structure of an electronic device implementing various embodiments of the present invention.

The electronic device 500 includes, but is not limited to: radio frequency unit 501, network module 502, audio output unit 503, input unit 504, sensor 505, display unit 506, user input unit 507, interface unit 508, memory 509, processor 510, and power source 511. It will be appreciated by those skilled in the art that the electronic device structure shown in fig. 8 is not limiting of the electronic device and that the electronic device may include more or fewer components than shown, or may combine certain components, or a different arrangement of components. In the embodiment of the invention, the electronic equipment comprises, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a palm computer, a vehicle-mounted terminal, a wearable device, a pedometer and the like.

It should be understood that, in the embodiment of the present invention, the radio frequency unit 501 may be used to receive and send signals while information is being sent and received or during a call. Specifically, it receives downlink data from a base station and passes the data to the processor 510 for processing; in addition, it transmits uplink data to the base station. Typically, the radio frequency unit 501 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 501 may also communicate with networks and other devices through a wireless communication system.

The electronic device provides wireless broadband internet access to the user through the network module 502, such as helping the user to send and receive e-mail, browse web pages, access streaming media, and the like.

The audio output unit 503 may convert audio data received by the radio frequency unit 501 or the network module 502 or stored in the memory 509 into an audio signal and output as sound. Also, the audio output unit 503 may also provide audio output (e.g., a call signal reception sound, a message reception sound, etc.) related to a specific function performed by the electronic device 500. The audio output unit 503 includes a speaker, a buzzer, a receiver, and the like.

The input unit 504 is used for receiving an audio or video signal. The input unit 504 may include a graphics processor (Graphics Processing Unit, GPU) 5041 and a microphone 5042. The graphics processor 5041 processes image data of still pictures or video obtained by an image capturing device (e.g., a camera) in a video capturing mode or an image capturing mode. The processed image frames may be displayed on the display unit 506. The image frames processed by the graphics processor 5041 may be stored in the memory 509 (or other storage medium) or transmitted via the radio frequency unit 501 or the network module 502. The microphone 5042 may receive sound and process it into audio data. In a phone call mode, the processed audio data may be converted into a format that can be transmitted to a mobile communication base station via the radio frequency unit 501 and then output.

The electronic device 500 also includes at least one sensor 505, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor includes an ambient light sensor that can adjust the brightness of the display panel 5061 according to the brightness of ambient light, and a proximity sensor that can turn off the display panel 5061 and/or the backlight when the electronic device 500 is moved to the ear. As one of the motion sensors, the accelerometer sensor can detect the acceleration in all directions (generally three axes), and can detect the gravity and direction when stationary, and can be used for recognizing the gesture of the electronic equipment (such as horizontal and vertical screen switching, related games, magnetometer gesture calibration), vibration recognition related functions (such as pedometer and knocking), and the like; the sensor 505 may further include a fingerprint sensor, a pressure sensor, an iris sensor, a molecular sensor, a gyroscope, a barometer, a hygrometer, a thermometer, an infrared sensor, etc., which are not described herein.

The display unit 506 is used to display information input by a user or information provided to the user. The display unit 506 may include a display panel 5061, and the display panel 5061 may be configured in the form of a liquid crystal display (Liquid Crystal Display, LCD), an organic light-emitting diode (Organic Light-Emitting Diode, OLED), or the like.

The user input unit 507 is operable to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the electronic device. Specifically, the user input unit 507 includes a touch panel 5071 and other input devices 5072. Touch panel 5071, also referred to as a touch screen, may collect touch operations thereon or thereabout by a user (e.g., operations of the user on touch panel 5071 or thereabout using any suitable object or accessory such as a finger, stylus, etc.). Touch panel 5071 may include two parts, a touch detection device and a touch controller. The touch detection device detects the touch azimuth of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 510, and receives and executes commands sent by the processor 510. In addition, the touch panel 5071 may be implemented in various types such as resistive, capacitive, infrared, and surface acoustic wave. In addition to the touch panel 5071, the user input unit 507 may include other input devices 5072. In particular, other input devices 5072 may include, but are not limited to, physical keyboards, function keys (e.g., volume control keys, switch keys, etc.), trackballs, mice, joysticks, and so forth, which are not described in detail herein.

Further, the touch panel 5071 may be overlaid on the display panel 5061, and when the touch panel 5071 detects a touch operation thereon or thereabout, the touch operation is transmitted to the processor 510 to determine a type of touch event, and then the processor 510 provides a corresponding visual output on the display panel 5061 according to the type of touch event. Although in fig. 8, the touch panel 5071 and the display panel 5061 are two independent components for implementing the input and output functions of the electronic device, in some embodiments, the touch panel 5071 and the display panel 5061 may be integrated to implement the input and output functions of the electronic device, which is not limited herein.

The interface unit 508 is an interface for connecting an external device to the electronic apparatus 500. For example, the external devices may include a wired or wireless headset port, an external power (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 508 may be used to receive input (e.g., data information, power, etc.) from an external device and transmit the received input to one or more elements within the electronic apparatus 500 or may be used to transmit data between the electronic apparatus 500 and an external device.

The memory 509 may be used to store software programs as well as various data. The memory 509 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, phonebook, etc.) created according to the use of the handset, etc. In addition, the memory 509 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device.

The processor 510 is a control center of the electronic device, connects various parts of the entire electronic device using various interfaces and lines, and performs various functions of the electronic device and processes data by running or executing software programs and/or modules stored in the memory 509, and calling data stored in the memory 509, thereby performing overall monitoring of the electronic device. Processor 510 may include one or more processing units; preferably, the processor 510 may integrate an application processor that primarily handles operating systems, user interfaces, applications, etc., with a modem processor that primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 510.

The electronic device 500 may also include a power supply 511 (e.g., a battery) for powering the various components, and preferably the power supply 511 may be logically connected to the processor 510 via a power management system that performs functions such as managing charging, discharging, and power consumption.

In addition, the electronic device 500 includes some functional modules, which are not shown, and will not be described herein.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or alternatively by means of hardware, although in many cases the former is the preferred implementation. Based on such understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) and comprising instructions for causing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method according to the embodiments of the present invention.

The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the above-described embodiments, which are merely illustrative and not restrictive. Those having ordinary skill in the art may devise many other forms without departing from the spirit of the present invention and the scope of the claims, and all such forms fall within the protection of the present invention.

Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.

In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.

The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk, etc.

The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims

1. A vehicle identification code identification method, characterized by comprising:

Intercepting an area where a vehicle identification code is located from the original image to obtain a target image, and acquiring a classification identifier of the target image, wherein the classification identifier is used for representing an area attribute of the position of the vehicle identification code in the target image in a vehicle; the regional attribute comprises at least one of window glass, an automobile nameplate and a driving license;

2. The method according to claim 1, wherein the step of capturing the area where the vehicle identification code is located from the original image to obtain a target image, and obtaining the classification identifier of the target image includes:

3. The method according to claim 1, wherein the step of capturing the area where the vehicle identification code is located from the original image to obtain a target image, and obtaining the classification identifier of the target image includes:

4. The method according to claim 1, wherein the step of identifying the vehicle identification code contained in the target image by a text recognition model adapted to the classification identifier comprises:

5. The method according to any one of claims 1-4, further comprising, before the step of identifying the vehicle identification code contained in the target image by a text recognition model adapted to the classification identifier:

6. The method of claim 5, wherein the step of detecting and intercepting the characters in each of the vehicle identification code samples to obtain the set of character samples corresponding to the classification identifier comprises:

7. The method of any one of claims 1-4, wherein character lengths contained in sample pictures corresponding to the same classification identifier are not completely identical, and the text recognition model comprises a convolutional neural network, a recurrent neural network, and a connectionist temporal classification (CTC) network, which are sequentially cascaded.

8. The method of claim 7, wherein the recurrent neural network comprises a gated recurrent unit (GRU) network.
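By way of illustration only, and without limiting the claims, the final stage of the cascaded text recognition model of claim 7 (commonly known as connectionist temporal classification, CTC) can be read out with a best-path decoding step. The following is a minimal sketch under stated assumptions: the alphabet, the blank index, the function names `ctc_greedy_decode` and `one_hot`, and the toy per-frame scores are hypothetical and do not appear in the embodiments. The convolutional and recurrent stages are not modeled; their output is stood in for by hand-built score vectors.

```python
from itertools import groupby

# Assumed alphabet: VIN codes use digits and capital letters except I, O and Q.
ALPHABET = "0123456789ABCDEFGHJKLMNPRSTUVWXYZ"
BLANK = len(ALPHABET)  # assumed index of the CTC blank label


def ctc_greedy_decode(frame_scores):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # 1. Pick the highest-scoring label for every time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # 2. Merge consecutive identical labels into one.
    collapsed = [label for label, _ in groupby(best_path)]
    # 3. Remove blank labels and map the rest back to characters.
    return "".join(ALPHABET[label] for label in collapsed if label != BLANK)


def one_hot(label, size=len(ALPHABET) + 1):
    """Toy score vector that puts all of its mass on a single label."""
    return [1.0 if i == label else 0.0 for i in range(size)]


# Toy frames standing in for the recurrent network output; "-" marks a blank.
frames = [
    one_hot(BLANK) if c == "-" else one_hot(ALPHABET.index(c))
    for c in "LL-L11"
]
decoded = ctc_greedy_decode(frames)
print(decoded)
```

Because repeated labels are merged before blanks are removed, the blank frame between the two runs of "L" is what allows a doubled character to survive decoding. The same property lets one model read sample pictures whose character lengths differ.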

9. A vehicle identification code recognition device, characterized by comprising:

The VIN code detection module is used for intercepting an area where the vehicle identification code is located from the original image to obtain a target image, and acquiring a classification identifier of the target image, wherein the classification identifier is used for representing an area attribute of the position where the vehicle identification code is located in the vehicle in the target image; the regional attribute comprises at least one of window glass, an automobile nameplate and a driving license;

10. An electronic device, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, performs the steps of the vehicle identification code identification method according to any one of claims 1 to 8.

11. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the vehicle identification code identification method according to any one of claims 1 to 8.