CN110543877A - Identification recognition method, training method and device of model thereof, and electronic system
- Publication number: CN110543877A
- Application number: CN201910834664.0A
- Authority: CN (China)
- Prior art keywords: identification, characters, identifier, preset, character
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06N3/045: Combinations of networks (G06N3/02 Neural networks; G06N3/04 Architecture, e.g. interconnection topology)
- G06N3/088: Non-supervised learning, e.g. competitive learning (G06N3/08 Learning methods)
- G06V10/22: Image preprocessing by selection of a specific region containing or referencing a pattern; locating or processing of specific regions to guide the detection or recognition
- G06V30/153: Segmentation of character regions using recognition of characters or words (G06V30/10 Character recognition)
Abstract
The invention provides an identification recognition method, a training method for its model, an identification recognition device and an electronic system. The identification recognition method comprises the following steps: extracting the position information of the characters in the identifier contained in the picture to be recognized through a preset first feature extraction network; extracting a feature map of the identifier through a preset second feature extraction network; and weighting the feature values in the feature map according to the position information so as to recognize the characters in the identifier. Because the feature map of the identifier is weighted by the position information of the characters in the identifier, an attention mechanism based on position information is introduced into the feature map, which effectively distinguishes the position of each character and improves the recognition accuracy.
Description
Technical Field
The invention relates to the technical field of image processing, and in particular to an identification recognition method, a training method and device for its model, and an electronic system.
Background
With the continuous development of transportation in China, the popularization of intelligent transportation systems has become increasingly important. As an important component of an intelligent transportation system, the license plate recognition system plays an important role in making traffic management, public security enforcement and similar work intelligent. It can be widely applied in fields such as traffic flow detection, traffic control and guidance, vehicle management in airports, ports and communities, monitoring of illegal vehicles (for example, non-stop automatic tolling and red-light running), and vehicle safety and theft prevention, and therefore has broad application prospects.
In the related art, when recognizing a license plate or another identifier containing a plurality of characters, the characters in the identifier are mostly recognized based on a single feature map; it is difficult to effectively distinguish the position of each character, which in turn affects the accuracy of identification recognition.
Disclosure of Invention
The invention aims to provide an identification recognition method, a training method and device for its model, and an electronic system, so as to improve the accuracy of identification recognition.
In a first aspect, the present invention provides an identification recognition method, the method comprising: acquiring a picture to be recognized containing an identifier; extracting the position information of the characters in the identifier in the picture to be recognized through a preset first feature extraction network; extracting a feature map of the identifier through a preset second feature extraction network; weighting the feature values in the feature map according to the position information; and recognizing the characters in the identifier according to the processed feature map.
Further, the first feature extraction network comprises a convolutional layer and a fully connected layer; the step of extracting the position information of the characters in the identifier in the picture to be recognized through a preset first feature extraction network comprises: extracting feature data of the picture to be recognized through the convolutional layer; and inputting the feature data into the fully connected layer and outputting the position coordinates of the characters in the identifier.
Further, the identifier comprises a plurality of lines of characters; before the step of extracting the feature map of the identifier through a preset second feature extraction network, the method further includes: correcting the identifier to obtain an identifier containing a single line of characters.
Further, the step of correcting the identifier includes: extracting, through the first feature extraction network, the vertex coordinates of the identifier in the picture to be recognized and the coordinates of the endpoints of the boundary line between two adjacent lines of characters in the identifier; and correcting the identifier according to the vertex coordinates and the boundary line endpoint coordinates.
Further, the step of correcting the identifier according to the vertex coordinates and the boundary line endpoint coordinates includes: calculating a perspective transformation matrix according to the vertex coordinates and the boundary line endpoint coordinates; performing perspective transformation on the identifier according to the perspective transformation matrix to obtain a transformed identifier; splitting each line of characters in the transformed identifier according to the boundary line endpoint coordinates in the transformed identifier; and splicing the split lines of characters in sequence into a single line of characters.
Further, after the step of correcting the identifier according to the vertex coordinates and the boundary line endpoint coordinates, the method further includes: correcting the position information of the characters in the identifier according to the vertex coordinates and the boundary line endpoint coordinates.
Further, the step of correcting the position information of the characters in the identifier according to the vertex coordinates and the boundary line endpoint coordinates includes: calculating a perspective transformation matrix according to the vertex coordinates and the boundary line endpoint coordinates; and performing perspective transformation on the position information of the characters in the identifier according to the perspective transformation matrix to obtain the transformed position information.
Further, the identifier comprises a plurality of characters; the step of weighting the feature values in the feature map according to the position information includes: for each character, generating a normal distribution map corresponding to the current character according to the position information of the current character, wherein in the normal distribution map the value at the center position of the current character is the largest and values decrease with distance from the center position; and multiplying each feature value in the feature map by the value at the corresponding position in the normal distribution map to obtain the feature map corresponding to the current character.
Further, the step of recognizing the characters in the identifier according to the processed feature map includes: for each character, inputting the feature map corresponding to the current character into a preset fully connected network and outputting a probability distribution sequence, wherein the probability distribution sequence comprises a plurality of probability values, each probability value indicating the probability that the current character is a given preset character; and determining the preset character corresponding to the maximum probability value in the probability distribution sequence as the current character.
In a second aspect, the present invention provides a training method for an identification recognition model, the method including: determining a current training picture based on a preset training set, wherein the current training picture is annotated with standard identification characters and the standard position coordinates of the characters in the identifier; inputting the current training picture into a preset first feature extraction network and outputting the predicted position coordinates of the characters in the identifier in the current training picture; calculating a loss value of the predicted position coordinates according to the standard position coordinates and a preset first loss function; extracting a feature map of the identifier through a preset second feature extraction network; weighting the feature values in the feature map according to the predicted position coordinates; recognizing the characters in the identifier according to the processed feature map to obtain a recognition result; calculating a loss value of the recognition result according to the standard identification characters and a preset second loss function; and continuing to execute the step of determining a current training picture based on the preset training set until the loss value of the predicted position coordinates converges and the loss value of the recognition result converges, so as to obtain the identification recognition model.
In a third aspect, the present invention provides an identification recognition apparatus, including: an acquisition module for acquiring a picture to be recognized containing an identifier; a first extraction module for extracting the position information of the characters in the identifier in the picture to be recognized through a preset first feature extraction network; a second extraction module for extracting the feature map of the identifier through a preset second feature extraction network; a first processing module for weighting the feature values in the feature map according to the position information; and a first recognition module for recognizing the characters in the identifier according to the processed feature map.
In a fourth aspect, the present invention provides a training apparatus for an identification recognition model, the apparatus comprising: a determining module for determining a current training picture based on a preset training set, wherein the current training picture is annotated with standard identification characters and the standard position coordinates of the characters in the identifier; an output module for inputting the current training picture into a preset first feature extraction network and outputting the predicted position coordinates of the characters in the identifier in the current training picture; a first calculation module for calculating a loss value of the predicted position coordinates according to the standard position coordinates and a preset first loss function; a third extraction module for extracting the feature map of the identifier through a preset second feature extraction network; a second processing module for weighting the feature values in the feature map according to the predicted position coordinates; a second recognition module for recognizing the characters in the identifier according to the processed feature map to obtain a recognition result; a second calculation module for calculating a loss value of the recognition result according to the standard identification characters and a preset second loss function; and an execution module for continuing to execute the step of determining a current training picture based on the preset training set until the loss value of the predicted position coordinates converges and the loss value of the recognition result converges, so as to obtain the identification recognition model.
In a fifth aspect, the present invention provides an electronic system, comprising an image acquisition device, a processing device and a storage device; the image acquisition device is used for acquiring preview video frames or image data; the storage device stores a computer program which, when executed by the processing device, performs the identification recognition method of the first aspect or the training method for the identification recognition model of the second aspect.
In a sixth aspect, the present invention provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processing device, it performs the steps of the identification recognition method of the first aspect or the training method for the identification recognition model of the second aspect.
According to the identification recognition method, the training method for its model, the device and the electronic system described above, the position information of the characters in the identifier contained in the picture to be recognized is extracted through a preset first feature extraction network; the feature map of the identifier is extracted through a preset second feature extraction network; and the feature values in the feature map are weighted according to the position information so as to recognize the characters in the identifier. Because the feature map of the identifier is weighted by the position information of the characters in the identifier, an attention mechanism based on position information is introduced into the feature map, which effectively distinguishes the position of each character and improves the recognition accuracy.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a schematic structural diagram of an electronic system according to an embodiment of the present invention;
Fig. 2 is a flowchart of an identification recognition method according to an embodiment of the present invention;
Fig. 3 is a flowchart of another identification recognition method according to an embodiment of the present invention;
Fig. 4 is a flowchart of another identification recognition method according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of the positions of the identification coordinates according to an embodiment of the present invention;
Fig. 6 is a schematic diagram of a network structure according to an embodiment of the present invention;
Fig. 7 is a signal flow diagram according to an embodiment of the present invention;
Fig. 8 is a schematic signal flow diagram according to another embodiment of the present invention;
Fig. 9 is a flowchart of another identification recognition method according to an embodiment of the present invention;
Fig. 10 is a schematic diagram of another network structure according to an embodiment of the present invention;
Fig. 11 is a flowchart of a training method for an identification recognition model according to an embodiment of the present invention;
Fig. 12 is a schematic signal flow diagram according to another embodiment of the present invention;
Fig. 13 is a schematic structural diagram of an identification recognition apparatus according to an embodiment of the present invention;
Fig. 14 is a schematic structural diagram of a training apparatus for an identification recognition model according to an embodiment of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the following embodiments, and it should be understood that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Taking license plate recognition as an example, the recognition process usually first detects the position of the license plate, then extracts the plate region and recognizes it. License plate recognition technology is gradually maturing, but several problems remain. First, because the plate may be photographed from a very oblique angle, the resulting photo can be tilted by a large angle, which makes detection and recognition considerably harder; moreover, the detection result is normally a rectangular frame and cannot be completely accurate. Second, a small number of double-row license plates exist; current recognition models handle single- and double-row plates within the same model and have poor compatibility with double-row plates. Third, the same feature map is used to recognize all characters, so the position of each character cannot be effectively distinguished. Fourth, current recognition models execute the positioning and recognition algorithms sequentially, making the execution flow complex.
Based on the problem that existing recognition methods can hardly distinguish the position of each character effectively, which in turn affects the accuracy of identification recognition, the embodiments of the invention provide an identification recognition method and a training method, device and electronic system for its model.
Example one:
First, an example electronic system 100 for implementing the identification recognition method and the training method and apparatus of its model according to the embodiment of the present invention is described with reference to Fig. 1.
As shown in Fig. 1, an electronic system 100 includes one or more processing devices 102, one or more storage devices 104, an input device 106, an output device 108, and one or more image capture devices 110, which are interconnected via a bus system 112 and/or another type of connection mechanism (not shown). It should be noted that the components and structure of the electronic system 100 shown in Fig. 1 are exemplary only, not limiting; the electronic system may have other components and structures as desired.
The processing device 102 may be a gateway or an intelligent terminal, or a device including a Central Processing Unit (CPU) or other form of processing unit having data processing capability and/or instruction execution capability, and may process data of other components in the electronic system 100 and may control other components in the electronic system 100 to perform desired functions.
The storage device 104 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory. The non-volatile memory may include, for example, Read Only Memory (ROM), a hard disk, flash memory, and the like. The computer-readable storage medium may store one or more computer program instructions, which the processing device 102 may execute to implement the client functionality (implemented by the processing device) and/or other desired functionality in the embodiments of the invention described below. Various applications and various data, such as data used and/or generated by the applications, may also be stored in the computer-readable storage medium.
The input device 106 may be a device used by a user to input instructions and may include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like.
The output device 108 may output various information (e.g., images or sounds) to the outside (e.g., a user), and may include one or more of a display, a speaker, and the like.
The image capture device 110 may capture preview video frames or pictures and store them in the storage device 104 for use by other components.
The devices in the example electronic system for implementing the identification recognition method and the training method and apparatus of its model according to the embodiment of the present invention may be disposed integrally or in a distributed manner, for example by integrating the processing device 102, the storage device 104, the input device 106 and the output device 108, while placing the image capture device 110 at a position where the target image can be captured. When the above devices of the electronic system are integrated, the electronic system may be implemented as an intelligent terminal such as a camera, a smart phone, a tablet computer or a vehicle-mounted terminal.
Example two:
This embodiment provides an identification recognition method, which is executed by the processing device in the above electronic system; the processing device may be any device or chip with data processing capability. The processing device may process the received information independently, or may be connected to a server to analyze and process the information jointly and upload the processing result to the cloud.
The identifier may be a license plate, a billboard, a road sign, a traffic sign or the like; as shown in Fig. 2, the identification recognition method includes the following steps:
Step S202, acquiring a picture to be recognized containing an identifier.
The picture to be recognized may be a video frame acquired by a vehicle-mounted device or a monitoring apparatus, or a picture acquired by other equipment. The identifier generally refers to the part of the picture to be recognized, and may include characters such as Chinese characters, letters or numbers.
Step S204, extracting the position information of the characters in the identifier in the picture to be recognized through a preset first feature extraction network.
The first feature extraction network may be implemented by various convolutional neural networks, such as a residual network or a VGG network. The position information of the characters in the identifier may be the coordinate information of each character in the identifier, specifically the center position coordinates, the upper-left corner position coordinates, and the like. The center position coordinate information of each character is extracted from the picture to be recognized through the preset first feature extraction network.
Step S206, extracting the feature map of the identifier through a preset second feature extraction network.
The second feature extraction network can also be realized by various convolutional neural networks, such as a residual network or a VGG network. The feature map usually includes a multi-layer two-dimensional matrix; the number of layers (or channels) of the two-dimensional matrix output by the second feature extraction network and the attributes of the features in each layer can be preset.
In step S208, a weighting process is performed on the feature values in the feature map based on the position information.
For each character in the identifier, a weight distribution map corresponding to the current character is generated according to the coordinate information of the current character and a preset weight design rule, and the feature values in the feature map are weighted according to the weight distribution map. In the weight distribution map, taking the coordinates of the current character as a reference, a higher weight is set within a preset range around the coordinates of the current character, and lower weights are set at other positions. Typically, one weight distribution map is generated for each character based on that character's position.
Step S210, recognizing the characters in the identifier according to the processed feature map.
In actual implementation, the processed feature map may be further processed using convolutional layers, fully connected layers, and the like. The processed feature map strengthens the features of certain characters, which benefits recognition. In one scheme, each character corresponds to one processed feature map in which the features of the corresponding character are strengthened and the other features are weakened, so that the strengthened features receive more attention and the recognition accuracy of the character is improved.
The embodiment of the invention thus provides an identification recognition method: the position information of the characters in the identifier contained in the picture to be recognized is extracted through a preset first feature extraction network; the feature map of the identifier is extracted through a preset second feature extraction network; and the feature values in the feature map are weighted according to the position information so as to recognize the characters in the identifier. Because the feature map of the identifier is weighted by the position information of the characters in the identifier, an attention mechanism based on position information is introduced into the feature map, which effectively distinguishes the position of each character and improves the recognition accuracy.
Example three:
This embodiment provides another identification recognition method, implemented on the basis of the above embodiment; it mainly describes the specific process of extracting the position information of the characters in the identifier in the picture to be recognized through a preset first feature extraction network, where the first feature extraction network comprises convolutional layers and a fully connected layer. As shown in Fig. 3, the method comprises the following steps:
Step S302, acquiring a picture to be recognized containing an identifier.
As an example, when the identifier is a license plate, the picture to be recognized may be a partial region of the original picture. Specifically, when the position of the identifier, such as the license plate position, is detected from the acquired picture, the detected license plate information may be incomplete because the candidate frame is small. In this case, the candidate frame can be expanded according to a preset expansion scheme to ensure that it contains the complete license plate information. The expansion direction may extend along the diagonals through the four vertices of the license plate rectangle, and the expansion size may be a multiple of the diagonal, for example one time. The expansion may also be performed as follows: expand outward up, down, left and right with each vertex coordinate as a reference; for the upper-left vertex, expand to the left and upward; for the lower-left vertex, expand to the left and downward; for the upper-right vertex, expand to the right and upward; for the lower-right vertex, expand to the right and downward; the specific expansion distance can be set by the expansion size described above. The expanded region includes the image region containing the identifier and an annular region around it, where the annular region is the region obtained by expansion; the expanded region is cut out of the original picture to obtain the picture to be recognized.
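For illustration, the following is a minimal sketch of the diagonal expansion, assuming an axis-aligned candidate box; the function name, the `scale` default and the clipping behaviour are assumptions for the example, not details fixed by this embodiment:

```python
def expand_box(x1, y1, x2, y2, img_w, img_h, scale=1.0):
    """Move each corner of the box (x1, y1)-(x2, y2) outward along the box
    diagonal by `scale` times the diagonal length, clipped to the image.
    Moving a corner by scale * diagonal along the diagonal direction shifts
    it by scale * width horizontally and scale * height vertically."""
    dx = scale * (x2 - x1)
    dy = scale * (y2 - y1)
    nx1, ny1 = max(0, int(x1 - dx)), max(0, int(y1 - dy))
    nx2, ny2 = min(img_w, int(x2 + dx)), min(img_h, int(y2 + dy))
    return nx1, ny1, nx2, ny2
```

The region cropped with the expanded coordinates then contains the plate plus the surrounding annular region described above.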
Step S304, extracting the feature data of the picture to be recognized through the convolutional layers.
There may be several convolutional layers, with excitation layers, pooling layers and the like inserted between them. The convolutional layers are mainly used for feature extraction, obtaining the feature data corresponding to the picture to be recognized; this feature data is usually a multi-dimensional feature vector.
Step S306, inputting the feature data into the fully connected layer and outputting the position coordinates of the characters in the identifier.
Each node of a fully connected layer is connected to all nodes of the previous layer and integrates the features extracted by the convolutional layers in order to classify the picture to be recognized. Specifically, the output of the convolutional layers may be connected to each fully connected layer, so that the feature data is input into each fully connected layer to obtain the position coordinates of the characters in the identifier; in actual implementation, the position coordinates are usually taken as the coordinates of the center position of the corresponding character.
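As a rough sketch of such a first feature extraction network (the patent only specifies convolutional layers followed by a fully connected output; the layer sizes, the 7-character count and the PyTorch framing below are assumptions):

```python
import torch
import torch.nn as nn

class PositionNet(nn.Module):
    """Regresses the (x, y) centre of each character in the identifier."""
    def __init__(self, num_chars=7):
        super().__init__()
        self.conv = nn.Sequential(              # feature-extraction convolutions
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 8)),
        )
        self.fc = nn.Linear(128 * 4 * 8, num_chars * 2)  # (x, y) per character

    def forward(self, x):
        feat = self.conv(x).flatten(1)          # multi-dimensional feature vector
        coords = self.fc(feat)                  # integrated by the fully connected layer
        return coords.view(x.size(0), -1, 2)    # (batch, num_chars, 2)
```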
Step S308, extracting the feature map of the identifier through a preset second feature extraction network.
Step S310, weighting the feature values in the feature map according to the position information.
Step S312, recognizing the characters in the identifier according to the processed feature map.
This embodiment of the invention thus provides another identification recognition method, which describes in detail the specific process of extracting the position information of the characters in the identifier in the picture to be recognized through a preset first feature extraction network.
Example four:
This embodiment provides another identification recognition method, implemented on the basis of the above embodiment; as shown in Fig. 4, the method includes the following steps:
Step S402, acquiring a picture to be recognized containing an identifier.
Step S404, extracting the position information of the characters in the identifier in the picture to be recognized through a preset first feature extraction network.
Step S406, correcting the identifier to obtain an identifier containing a single line of characters.
The identifier may be deformed in the picture to be recognized under the influence of the shooting angle; for example, a rectangular identifier may appear as a trapezoid or a parallelogram. In addition, the identifier in the picture to be recognized may include multiple lines of characters, which existing methods recognize poorly. Therefore, in order to restore the original shape of the identifier, the identifier in the picture to be recognized can be corrected within the recognition model, so that the corrected identifier is rectangular and the multiple lines of characters are horizontally spliced into a single line. Based on this, the above step S406 can be realized by the following steps 01 to 03:
Step 01, extracting, through the first feature extraction network, the vertex coordinates of the identifier in the picture to be recognized and the coordinates of the endpoints of the boundary line between two adjacent lines of characters in the identifier.
For ease of understanding, Fig. 5 provides a schematic diagram of the positions of the identification coordinates, using a license plate picture as an example. The vertex coordinates of the identifier may specifically be the coordinate information of the four vertices of the license plate, such as the positions marked by "+" in the figure; the vertex coordinates can be used to obtain accurate positioning of the license plate frame. The boundary line endpoint coordinates may specifically be the coordinate information of the two endpoints of the boundary between two adjacent lines of characters, such as the positions marked by "-" in the figure; they can be used to identify the boundary between the upper and lower lines of a double-line license plate. If the license plate is a single-line license plate, the lower boundary line of the license plate is used instead. The vertex coordinates and boundary line endpoint coordinates in the picture to be recognized are detected through a positioning network; the remaining marked positions in the figure are the position coordinates of the characters in the identifier.
Step 02, correcting the identifier according to the vertex coordinates and the boundary line endpoint coordinates.
Based on the detected vertex coordinates and boundary line endpoint coordinates, the picture to be recognized is corrected using a preset correction network: the corrected identifier is restored to a rectangle, and multiple lines of characters are horizontally spliced into a single line. Based on this, the above step 02 can be realized by the following steps 021 to 024:
Step 021, calculating a perspective transformation matrix according to the vertex coordinates and the boundary line endpoint coordinates.
In one approach, the perspective transformation matrix may be calculated from the detected vertex and boundary line endpoint coordinates together with preset corrected vertex and boundary line endpoint coordinates. The preset corrected vertex coordinates may specifically be the coordinate information of the four vertices of the corrected picture to be generated; the preset corrected boundary line endpoint coordinates may specifically be the coordinate information of the two endpoints of the boundary between two adjacent lines of characters in the corrected picture to be generated. First, the correspondence between the preset corrected four vertices and two boundary endpoints and the four vertices and two boundary endpoints before correction is determined, that is, which preset corrected vertex corresponds to which vertex of the picture before correction. Then, the mapping matrix between the preset corrected coordinates and the coordinates before correction is calculated with a transformation matrix generator; this mapping matrix is the perspective transformation matrix. In this way, the accuracy of the perspective transformation matrix is supervised by the preset corrected coordinates, making the matrix more accurate.
In another approach, the perspective transformation matrix may be obtained based on the STN (Spatial Transformer Network) principle. An STN consists of three parts: a localization network, a grid generator and a sampler. The STN needs to be trained in advance; after training, the picture to be recognized before correction is input into the STN, the transformation parameters are predicted by the localization network, and these parameters form the perspective transformation matrix, which is then used to correct the picture to be recognized.
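A possible sketch of the first approach, assuming OpenCV: a homography is fitted from the six detected points (four plate vertices plus two boundary line endpoints) to their preset corrected positions. The rectified size and row-split value below are illustrative assumptions:

```python
import cv2
import numpy as np

def rectify_plate(plate_img, vertices, divider, size=(192, 96), row_split=48):
    """vertices: 4x2 array (top-left, top-right, bottom-left, bottom-right);
    divider: 2x2 array (left and right boundary line endpoints)."""
    W, H = size
    src = np.float32(np.vstack([vertices, divider]))      # detected coordinates
    dst = np.float32([[0, 0], [W, 0], [0, H], [W, H],     # preset corrected vertices
                      [0, row_split], [W, row_split]])    # preset corrected boundary line
    M, _ = cv2.findHomography(src, dst)                   # perspective transformation matrix
    return cv2.warpPerspective(plate_img, M, (W, H)), M
```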
Step 022, performing perspective transformation on the identifier according to the perspective transformation matrix to obtain a transformed identifier.
The perspective transformation matrix is used to perform perspective transformation on the identifier in the picture to be recognized, correcting it into an orthographic (front-view) form; for example, a parallelogram-shaped identifier becomes rectangular after the perspective transformation.
Step 023, splitting each line of characters in the transformed identifier according to the boundary line endpoint coordinates in the transformed identifier.
According to the mapping relation and the picture to be recognized before correction, simple interpolation is applied to the transformed identifier to make it clearer; the identifier is then split at the boundary line endpoint coordinates, so that a multi-line character picture is split into several single-line character pictures. For example, if the identifier is a double-line license plate, the upper and lower parts of the transformed double-line license plate are each corrected into rectangles, where the upper part is the picture region formed by the two vertices at the upper end of the license plate and the two boundary endpoints, and the lower part is the picture region formed by the two vertices at the lower end of the license plate and the two boundary endpoints. If the identifier is a single-line license plate, the lower boundary line endpoints coincide with the two vertices at the lower end of the license plate, and the picture region of the lower part is empty.
Step 024, splicing the split lines of characters in sequence into a single line of characters.
The split single-line character pictures are spliced in order from top to bottom into one whole single-line character picture. If the identifier is a single-line license plate, the picture region of the lower part is empty, so sequential splicing still yields a single-line picture. Because the recognition process cannot distinguish in advance whether a license plate has a single line or a double line, all single- and double-line license plates are corrected into single-line license plates through this process, which improves the accuracy of identification recognition.
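A sketch of the splitting-and-splicing step under the same assumptions; resizing the lower row to the upper row's height before concatenation is an illustrative choice, not something the embodiment prescribes:

```python
import cv2
import numpy as np

def to_single_row(rectified, row_split=48):
    """Split a rectified plate at the boundary line and splice the rows
    horizontally. For a single-row plate the lower part is empty, so the
    upper part is returned unchanged."""
    upper, lower = rectified[:row_split], rectified[row_split:]
    if lower.size == 0:
        return upper
    h = upper.shape[0]                                   # common target height
    lower = cv2.resize(lower, (int(lower.shape[1] * h / lower.shape[0]), h))
    return np.hstack([upper, lower])                     # top row first, then bottom
```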
Step 03, correcting the position information of the characters in the identifier according to the vertex coordinates and the boundary line endpoint coordinates.
Since the identifier may be deformed in the picture to be recognized by the shooting angle, the characters in it are deformed correspondingly. Based on the detected vertex coordinates and boundary line endpoint coordinates, the picture to be recognized is corrected with the preset correction network, and the positions of the characters in the corrected identifier are likewise corrected into the orthographic form. Based on this, step 03 can be realized by the following steps 031 to 032:
Step 031, calculating a perspective transformation matrix according to the vertex coordinates and the boundary line endpoint coordinates.
In one approach, the perspective transformation matrix may be calculated from the vertex and boundary line endpoint coordinates together with the preset corrected vertex and boundary line endpoint coordinates. In another approach, the perspective transformation matrix may be obtained based on the STN (Spatial Transformer Network) principle, as described in the foregoing embodiment; details are not repeated here.
Step 032, performing perspective transformation on the position information of the characters in the identifier according to the perspective transformation matrix to obtain transformed position information.
During the perspective transformation, the deformation of the characters in the identifier is corrected along with the transformation, yielding the position information of the characters after the perspective transformation, which facilitates the subsequent recognition of the identifier's content. The transformed position information may be the coordinate information of each character in the transformed identifier, specifically the center coordinate information of the character.
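Continuing the OpenCV sketch, the character centres can be mapped through the same matrix; `cv2.perspectiveTransform` applies a 3x3 homography to a set of points:

```python
import cv2
import numpy as np

def transform_positions(char_centers, M):
    """char_centers: Nx2 array of (x, y) character centres in the original
    picture; M: the perspective transformation matrix from rectification.
    Returns the Nx2 transformed centres in the corrected picture."""
    pts = np.float32(char_centers).reshape(-1, 1, 2)
    return cv2.perspectiveTransform(pts, M).reshape(-1, 2)
```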
Step S408, extracting the feature map of the identifier through a preset second feature extraction network.
Step S410, performing weighting processing on the feature values in the feature map according to the position information.
Step S412, recognizing the characters in the identifier according to the processed feature map.
This further identification recognition method provided by the embodiment of the present invention adds a step of correcting the identifier to obtain an identifier containing a single line of characters, and describes the specific process of this step in detail: the vertex coordinates of the identifier in the picture to be recognized and the coordinates of the endpoints of the boundary line between two adjacent lines of characters are extracted through the preset first feature extraction network, and the identifier and the position information within it are corrected according to these coordinates. The correction of the identifier includes performing perspective transformation on the identifier through the perspective transformation matrix, splitting each line of characters in the transformed identifier, and splicing the split lines into a single line of characters. Because the feature map of the identifier is weighted by the position information of the characters in the identifier, an attention mechanism based on position information is introduced into the feature map, which effectively distinguishes the position of each character and improves the accuracy of identification recognition.
Based on the above embodiments, in order to further explain the above identification recognition method, a schematic diagram of a network structure is provided in Fig. 6 to describe the process of correcting the identifier.
As shown in Fig. 6, the picture to be recognized is processed by a correction network, which outputs the corrected picture. The correction network includes a positioning network, a transformation matrix generator and a sampler. The positioning network detects the coordinate information in the picture to be recognized, including the vertex coordinates and the boundary line endpoint coordinates. The transformation matrix generator calculates the mapping matrix between the preset corrected vertex and boundary line endpoint coordinates and those before correction, i.e., the perspective transformation matrix, and outputs the transformed (rectangular) identifier through this matrix. The sampler performs simple interpolation based on the rectangular identifier and the picture to be recognized in order to generate the corrected identifier.
As an example, when the identifier in the picture to be recognized is a double-row license plate, referring to the signal flow diagram shown in Fig. 7: the positioning network outputs the coordinate information of the four vertices of the picture to be recognized and of the two endpoints of the boundary line between the two rows of characters (the marked positions in the figure); the transformation matrix generator then outputs the transformed identifier, that is, the upper and lower parts of the double-row license plate are each corrected into rectangular identifiers; the sampler applies simple interpolation to the rectangular identifiers and outputs the interpolated identifiers, which are spliced in sequence into a single row of characters, namely the corrected identifier.
As another example, when the identifier in the picture to be recognized is a single-row license plate, referring to the signal flow diagram shown in Fig. 8: the positioning network outputs the coordinate information of the four vertices of the picture to be recognized and of the two endpoints of the lower boundary line of the single-row license plate (the marked positions in the figure); because the plate has a single row, the two boundary line endpoints coincide with the two vertices at the lower end of the plate. The transformation matrix generator then outputs the transformed identifier, that is, the single-row license plate is corrected into a rectangular identifier; the sampler applies simple interpolation and outputs the interpolated identifier, which is spliced in sequence into a single row of characters. Since the picture region of the lower part of a single-row plate is empty, the result after splicing is still a single row of characters, namely the corrected identifier.
Example five:
The embodiment provides another identification method, which is implemented on the basis of the embodiment; this embodiment mainly describes a specific process of performing weighting processing on feature values in a feature map according to location information, and a specific process of recognizing characters in a logo according to the processed feature map, as shown in fig. 9, the method includes the following steps:
Step S902, acquiring a picture to be recognized containing an identifier.
Step S904, extracting the position information of the characters in the identifier in the picture to be recognized through a preset first feature extraction network.
Step S906, extracting the feature map of the identifier through a preset second feature extraction network.
Step S908, for each character, generating a normal distribution map corresponding to the current character according to the position information of the current character; in the normal distribution map, the value at the center position of the current character is the largest, and values decrease with distance from the center position.
Based on the corrected picture to be recognized, the position coordinates of each character in the identifier are obtained, and a distribution that decays spatially from 1 to 0, such as a normal distribution, is generated centered on the position coordinates of the current character. Taking the second character as the current character as an example, the normal distribution table shown in Table 1 is generated, where the position with value 1 is the position of the current character and the position with value 0 is the character position farthest from the current character.
Table 1
| 0.8 | 1 | 0.8 | 0.7 | 0.5 | 0.2 | 0 |
Step S910, multiplying each feature value in the feature map by the value at the corresponding position in the normal distribution map to obtain the feature map corresponding to the current character.
When the current character is recognized, the distribution map corresponding to the current character is multiplied position-by-position with the feature map of the corrected picture to be recognized, yielding the feature map corresponding to the current character; that is, recognition of the current character focuses on the current character's position, playing the role of attention.
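A minimal sketch of this attention weighting, assuming a PyTorch feature map and an isotropic Gaussian as the normal distribution map; the spread `sigma` is an assumed parameter, since the embodiment does not fix it:

```python
import torch

def position_attention(feature_map, center, sigma=2.0):
    """feature_map: (C, H, W) tensor; center: (cx, cy) of the current
    character in feature-map coordinates. The weight peaks at 1 at the
    character centre and decays towards 0 with distance."""
    C, H, W = feature_map.shape
    ys = torch.arange(H, dtype=torch.float32).view(H, 1)
    xs = torch.arange(W, dtype=torch.float32).view(1, W)
    cx, cy = center
    dist2 = (xs - cx) ** 2 + (ys - cy) ** 2
    weights = torch.exp(-dist2 / (2 * sigma ** 2))   # (H, W) normal distribution map
    return feature_map * weights                     # broadcast over all channels
```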
Step S912, for each character, inputting the feature map corresponding to the current character into a preset fully connected network and outputting a probability distribution sequence, wherein the probability distribution sequence comprises a plurality of probability values, each probability value indicating the probability that the current character is a given preset character.
The preset fully connected network mainly extracts different types of information from the feature map produced by the convolutional layers, and the number of classifications can be determined by the number of characters to be recognized. The preset characters can be understood as all characters that may appear on a license plate, including Chinese characters, letters, numbers and the like. Taking a license plate picture as an example, it usually contains 7 characters, so classification is performed 7 times: the feature map corresponding to each character is input into the preset fully connected network, and a probability distribution sequence for the current character is output to obtain the classification result of each character, where each probability value in the sequence indicates the probability that the current character is a given preset character.
Step S914, determining the preset character corresponding to the maximum probability value in the probability distribution sequence as the current character.
Based on the probability distribution sequence, the maximum probability value is selected, and its corresponding preset character is determined to be the current character.
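A sketch of this per-character classification, assuming one fully connected head per character position; the vocabulary size and feature dimension are illustrative assumptions:

```python
import torch.nn as nn

feat_dim = 4096      # assumed size of each flattened, position-weighted feature map
NUM_CLASSES = 68     # assumed vocabulary: Chinese characters + letters + digits
heads = nn.ModuleList(nn.Linear(feat_dim, NUM_CLASSES) for _ in range(7))

def recognize(per_char_features):
    """per_char_features: list of 7 flattened feature vectors, one per
    character, each already weighted by that character's attention map."""
    result = []
    for head, feat in zip(heads, per_char_features):
        probs = head(feat).softmax(dim=-1)   # probability distribution sequence
        result.append(probs.argmax(dim=-1))  # index of the preset character
    return result
```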
In this further identification recognition method provided by the embodiment of the present invention, a corresponding normal distribution map is generated for each character according to the position information of the current character; each feature value in the feature map is multiplied by the value at the corresponding position in the normal distribution map to obtain the feature map corresponding to the current character, which is input into a preset fully connected network to output the corresponding probability distribution sequence; the preset character with the maximum probability value is determined as the current character. Because the feature map of the identifier is weighted by the position information of the characters in the identifier, an attention mechanism based on position information is introduced into the feature map, which effectively distinguishes the position of each character and improves the recognition accuracy.
Based on the above embodiments, in order to further explain the above identification recognition method, another network structure diagram is provided in Fig. 10 to describe the process of recognizing the characters in an identifier.
As shown in Fig. 10, the network structure includes a first feature extraction network, a correction network and a second feature extraction network. The first and second feature extraction networks both include convolutional layers and fully connected layers: the convolutional layers are mainly used for feature extraction, obtaining the feature data corresponding to the picture to be recognized, and the fully connected layers extract different types of information from that feature data. The correction network corrects the picture to be recognized and restores it to a rectangle. In the second feature extraction network, the convolutional layers are usually connected to several fully connected layers, one per character to be recognized; taking a license plate picture with its usual 7 characters as an example, 7 fully connected layers are needed to recognize the 7 characters. Connecting the correction network and the convolutional layers of the second feature extraction network to every fully connected layer means the following: based on the picture corrected by the correction network, the position coordinates of each character in the identifier are obtained, and a normal distribution map centered on the current character's position coordinates is generated for each character; the normal distribution map of each character is multiplied position-by-position with the feature map extracted by the convolutional layers to obtain the feature map corresponding to the current character, and the feature map of each character is input into its fully connected layer, realizing the recognition of each character in the picture to be recognized.
Example six:
In order to implement the above identification recognition method, this embodiment further provides a training method for the identification recognition model. In actual training, two training schemes can be designed according to whether the data in the training set contains ground-truth values for the license plate vertices, boundary line endpoints and character position coordinates. One is positioning-unsupervised: the positioning result is not supervised with a loss value, and the whole network is supervised only by the loss of the recognition network. The other is positioning-supervised: a smooth L1 loss is calculated between the positioning result and the ground-truth four-vertex coordinates of the license plate, and the loss of the whole network is the sum of the positioning loss and the recognition loss. For the positioning-supervised scheme, this embodiment provides the training method for the identification recognition model shown in Fig. 11, which includes the following steps:
Step S1102, a current training picture is determined based on a preset training set; the current training picture is annotated with standard identifier characters and the standard position coordinates of the characters in the identifier.
The current training picture may contain rectangular identifiers photographed from different angles, or identifiers shaped as triangles, pentagons, or other polygons photographed from different angles. It may be obtained from an image database, captured from a video frame, or taken directly by a camera. The standard identifier characters may include Chinese characters, letters, or numbers, and the standard position coordinates give the position of each character in the standard identifier characters, for example the center position coordinates of each character.
Step S1104, the current training picture is input into a preset first feature extraction network, which outputs the predicted position coordinates of the characters in the identifier in the current training picture.
The first feature extraction network, which contains a convolutional layer and a fully connected layer, can be implemented by various convolutional neural networks, such as a residual network (ResNet) or a VGG network. The predicted position coordinates are the position coordinates of each character in the identifier in the current training picture, specifically the center position coordinates of the characters: the preset first feature extraction network outputs the center position coordinates of each character in the current training picture, which serve as the predicted position coordinates of the corresponding characters.
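A minimal sketch of such a first feature extraction network is shown below; the layer sizes and the fixed count of 7 characters are assumptions for illustration:

```python
import torch
import torch.nn as nn

class FirstFeatureNetwork(nn.Module):
    """Convolutional layers plus a fully connected layer that regresses the
    center position coordinates (x, y) of each character."""
    def __init__(self, num_chars=7):
        super().__init__()
        self.num_chars = num_chars
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                 # (B, 64, 1, 1)
        )
        self.fc = nn.Linear(64, num_chars * 2)       # fully connected layer

    def forward(self, image):
        feat = self.conv(image).flatten(1)           # (B, 64)
        # Predicted position coordinates: one (x, y) center per character.
        return self.fc(feat).view(-1, self.num_chars, 2)
```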
In step S1106, a loss value of the predicted position coordinate is calculated according to the standard position coordinate and a preset first loss function.
The preset first loss function generally integrates two functions. One calculates the probability of each group of position coordinates and determines the group with the maximum probability as the predicted position coordinates; this may be a Softmax function or another probability regression function. The other calculates the loss value of the predicted position coordinates as the gap between the coordinates output by the model and the standard position coordinates; this may be a cross entropy function or another function for evaluating model loss.
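For the supervised positioning scheme of this embodiment, the loss value of the predicted position coordinates can also be the smooth L1 loss mentioned above; a minimal sketch with assumed tensor shapes:

```python
import torch
import torch.nn.functional as F

pred_coords = torch.randn(8, 7, 2)   # predicted center coordinates, batch of 8
std_coords = torch.randn(8, 7, 2)    # standard (annotated) position coordinates
# Loss value of the predicted position coordinates: smooth L1, per the
# supervised positioning scheme of Example six.
loc_loss = F.smooth_l1_loss(pred_coords, std_coords)
```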
Step S1108, the feature map of the identifier is extracted through a preset second feature extraction network.
The second feature extraction network can likewise be implemented by various convolutional neural networks, such as a residual network (ResNet) or a VGG network, and includes a convolutional layer and a fully connected layer.
In step S1110, a weighting process is performed on the feature values in the feature map based on the predicted position coordinates.
For each character in the training picture, a weight distribution map corresponding to the current character is generated from the character's predicted position coordinates and a preset weight design rule, and the feature values in the feature map are weighted according to this weight distribution map.
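One possible preset weight design rule, matching the normal distribution maps described elsewhere in this document, is a 2D Gaussian centered on the character; a sketch with an assumed sigma:

```python
import numpy as np

def gaussian_weight_map(center_xy, height, width, sigma=4.0):
    """2D normal distribution map: largest at the character center and
    decreasing with distance from it. sigma is an assumed hyperparameter."""
    ys, xs = np.mgrid[0:height, 0:width]
    cx, cy = center_xy
    dist2 = (xs - cx) ** 2 + (ys - cy) ** 2
    return np.exp(-dist2 / (2.0 * sigma ** 2))   # peak value 1.0 at the center

# Example: weight map for a character centered at (10, 8) on a 16x32 feature
# map; weighting is then an element-wise multiply with the feature map.
wmap = gaussian_weight_map((10, 8), height=16, width=32)
```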
In step S1112, the characters in the identifier are recognized according to the processed feature map to obtain a recognition result.
Step S1114, a loss value of the recognition result is calculated according to the standard identifier characters and a preset second loss function.
The preset second loss function likewise integrates two functions. One calculates the probability of each candidate recognition result and determines the character with the maximum probability as the recognition result; this may be a Softmax function or another probability regression function. The other calculates the loss value of the recognition result as the gap between the recognition result output by the model and the standard identifier character; this may be a cross entropy function or another function for evaluating model loss.
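A small sketch of the two functions such a second loss function integrates, for a single character head (the 65-class alphabet size is an assumption):

```python
import torch
import torch.nn.functional as F

logits = torch.randn(1, 65)             # raw output of one character head (65 assumed classes)
probs = F.softmax(logits, dim=1)        # probability distribution over the preset characters
label = torch.tensor([12])              # index of the standard (true) character
loss = F.cross_entropy(logits, label)   # equals -log(probs[0, 12])
```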
Step S1116, the step of determining the current training picture based on the preset training set is repeated until both the loss value of the predicted position coordinates and the loss value of the recognition result converge, yielding the identification recognition model.
In actual implementation, if the loss value of the predicted position coordinates has not converged, or the loss value of the recognition result has not converged, the step of determining the current training picture based on the preset training set continues to be executed. Once both loss values converge, the identification recognition model is obtained; this model ensures the accuracy of both the predicted position coordinates of the characters in the identifier and the identification recognition result.
In the above manner, the predicted position coordinates of the characters in the identifier are output by the first feature extraction network, and their loss value is calculated by the first loss function; the feature map of the identifier is extracted by the preset second feature extraction network, the feature values in the feature map are weighted according to the predicted position coordinates to obtain the character recognition result, and the loss value of the recognition result is calculated by the second loss function; the initial model is then trained on both loss values until they converge, yielding the identification recognition model. Because this model weights the feature map of the identifier with the position information of the characters, it introduces a position-based attention mechanism into the feature map, effectively distinguishes the position of each character, and improves the accuracy of identification recognition.
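Putting the pieces together, an illustrative training loop for this scheme might look as follows; the model interface, convergence tolerance, and epoch limit are assumptions, not values from the patent:

```python
import torch
import torch.nn.functional as F

def train(model, loader, optimizer, tol=1e-3, max_epochs=100):
    # model(image) is assumed to return (pred_coords, char_logits);
    # tol and max_epochs are illustrative convergence settings.
    for _ in range(max_epochs):
        loc_sum, rec_sum = 0.0, 0.0
        for image, std_coords, std_chars in loader:
            pred_coords, char_logits = model(image)
            loc_loss = F.smooth_l1_loss(pred_coords, std_coords)
            rec_loss = F.cross_entropy(char_logits.flatten(0, 1), std_chars.flatten())
            # Total network loss: positioning loss plus recognition loss.
            (loc_loss + rec_loss).backward()
            optimizer.step()
            optimizer.zero_grad()
            loc_sum += loc_loss.item()
            rec_sum += rec_loss.item()
        # Training stops only when BOTH loss values have converged.
        if loc_sum / len(loader) < tol and rec_sum / len(loader) < tol:
            break
    return model
```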
Based on the network structure diagram of fig. 10 provided in the foregoing embodiment, and to aid understanding of the training method of the identification recognition model, another signal flow diagram is provided, as shown in fig. 12. The picture to be recognized first passes through the convolutional layer of the first feature extraction network, which extracts its feature map, and then through the fully connected layer of the first feature extraction network, which outputs the position coordinates of each character, the four vertex coordinates, and the boundary line endpoint coordinates; the first loss function computes the loss value between the predicted position coordinates of each character and the real position coordinates. The correction network uses the four vertex coordinates and the boundary line endpoint coordinates to produce a corrected picture and the corrected position coordinates of each character. The convolutional layer of the second feature extraction network then extracts the feature map of the corrected picture; the normal distribution map of each character is multiplied position-wise with this feature map to obtain the feature map of the current character; the feature map of each character is fed into its own fully connected layer, which outputs the recognition result of that character; and the second loss function computes the loss value between the recognition result of each character and the real license plate number.
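The correction step can be sketched with standard OpenCV calls; the target plate size and the row split height are illustrative assumptions:

```python
import cv2
import numpy as np

def rectify_double_row_plate(img, vertices, W=192, H=96, split_y=40):
    """Correct a skewed double-row plate and splice it into a single row.
    `vertices` are the four predicted plate corners (TL, TR, BR, BL);
    W, H and split_y (the corrected boundary line height) are assumptions."""
    src = np.float32(vertices)
    dst = np.float32([[0, 0], [W, 0], [W, H], [0, H]])
    M = cv2.getPerspectiveTransform(src, dst)     # perspective transformation matrix
    plate = cv2.warpPerspective(img, M, (W, H))   # rectified, rectangular plate
    # Character position coordinates can be mapped with the same matrix via
    # cv2.perspectiveTransform(points.reshape(-1, 1, 2), M).
    top, bottom = plate[:split_y], plate[split_y:]
    bottom = cv2.resize(bottom, (W, split_y))     # match row heights
    return np.hstack([top, bottom])               # single row of characters
```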
Example seven:
Corresponding to the above embodiment of the identification recognition method, fig. 13 shows a schematic structural diagram of an identification recognition device. The device includes: an obtaining module 130, configured to obtain a picture to be recognized containing an identifier; a first extraction module 131, configured to extract, through a preset first feature extraction network, the position information of the characters in the identifier in the picture to be recognized; a second extraction module 132, configured to extract a feature map of the identifier through a preset second feature extraction network; a first processing module 133, configured to weight the feature values in the feature map according to the position information; and a first recognition module 134, configured to recognize the characters in the identifier according to the processed feature map.
The identification recognition device provided by the embodiment of the invention extracts, through the preset first feature extraction network, the position information of the characters in the identifier contained in the picture to be recognized; extracts the feature map of the identifier through the preset second feature extraction network; and weights the feature values in the feature map according to the position information so as to recognize the characters in the identifier. By weighting the feature map of the identifier with the position information of its characters, the device introduces a position-based attention mechanism into the feature map, effectively distinguishes the position of each character, and improves the accuracy of identification recognition.
Further, the first feature extraction network comprises a convolutional layer and a fully connected layer; the first extraction module 131 is further configured to extract feature data of the picture to be recognized through the convolutional layer, input the feature data into the fully connected layer, and output the position coordinates of the characters in the identifier.
Further, the identifier comprises a plurality of lines of characters; as shown in fig. 13, the device further includes a correction processing module 135, configured to perform correction processing on the identifier to obtain an identifier containing a single row of characters.
Further, the correction processing module 135 is further configured to: extract, through the first feature extraction network, the vertex coordinates of the identifier and the boundary line endpoint coordinates between two adjacent lines of characters in the identifier in the picture to be recognized; and perform correction processing on the identifier according to the vertex coordinates and the boundary line endpoint coordinates.

Further, the correction processing module 135 is further configured to: calculate a perspective transformation matrix according to the vertex coordinates and the boundary line endpoint coordinates; perform perspective transformation on the identifier according to the perspective transformation matrix to obtain a transformed identifier; split each line of characters in the transformed identifier according to the boundary line endpoint coordinates in the transformed identifier; and splice the split character sequences of each row into a single row of characters.

Further, the correction processing module 135 is further configured to: perform correction processing on the position information of the characters in the identifier according to the vertex coordinates and the boundary line endpoint coordinates.

Further, the correction processing module 135 is further configured to: calculate a perspective transformation matrix according to the vertex coordinates and the boundary line endpoint coordinates; and perform perspective transformation on the position information of the characters in the identifier according to the perspective transformation matrix to obtain the transformed position information.
Further, the identifier comprises a plurality of characters; the first processing module 133 is further configured to: for each character, generate a normal distribution map corresponding to the current character according to the character's position information, where the value at the current character's center position is the largest and values decrease with distance from the center; and multiply each feature value in the feature map by the value at the corresponding position in the normal distribution map to obtain the feature map corresponding to the current character.
Further, the first recognition module 134 is further configured to: for each character, input the feature map corresponding to the current character into a preset fully connected network and output a probability distribution sequence, where the sequence contains a plurality of probability values, each indicating the probability that the current character is a particular preset character; and determine the preset character corresponding to the maximum probability value in the probability distribution sequence as the current character.
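A hedged sketch of this decoding rule, using a hypothetical preset character set:

```python
import torch

# Hypothetical preset character set; the real set depends on the application.
CHARSET = list("0123456789ABCDEFGHJKLMNPQRSTUVWXYZ")

def decode(char_logits):
    # char_logits: (num_chars, len(CHARSET)), one probability distribution
    # sequence per character; argmax picks the maximum probability value.
    return "".join(CHARSET[i] for i in char_logits.argmax(dim=1).tolist())

plate = decode(torch.randn(7, len(CHARSET)))   # e.g. a 7-character result
```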
The implementation principle and technical effects of the identification recognition device provided by the embodiment of the present invention are the same as those of the foregoing identification recognition method embodiment; for brevity of description, where this device embodiment is silent, reference may be made to the corresponding content of the method embodiment.
Example eight:
Corresponding to the above embodiment of the training method of the identification recognition model, fig. 14 shows a schematic structural diagram of a training device for the identification recognition model. The device includes: a determining module 140, configured to determine a current training picture based on a preset training set, where the current training picture is annotated with standard identifier characters and the standard position coordinates of the characters in the identifier; an output module 141, configured to input the current training picture into a preset first feature extraction network and output the predicted position coordinates of the characters in the identifier in the current training picture; a first calculating module 142, configured to calculate the loss value of the predicted position coordinates according to the standard position coordinates and a preset first loss function; a third extraction module 143, configured to extract the feature map of the identifier through a preset second feature extraction network; a second processing module 144, configured to weight the feature values in the feature map according to the predicted position coordinates; a second recognition module 145, configured to recognize the characters in the identifier according to the processed feature map to obtain a recognition result; a second calculating module 146, configured to calculate the loss value of the recognition result according to the standard identifier characters and a preset second loss function; and an execution module 147, configured to continue executing the step of determining the current training picture based on the preset training set until the loss value of the predicted position coordinates and the loss value of the recognition result both converge, yielding the identification recognition model.
The training device for the identification recognition model outputs, through the first feature extraction network, the predicted position coordinates of the characters in the identifier, and calculates their loss value through the first loss function; it extracts the feature map of the identifier through the preset second feature extraction network, weights the feature values in the feature map according to the predicted position coordinates to obtain the character recognition result, and calculates the loss value of the recognition result through the second loss function; it then trains the initial identification recognition model on both loss values until they converge, obtaining the identification recognition model. The device integrates the positioning function and the recognition function into one model, which makes effective use of both kinds of information, improves recognition accuracy, and simplifies the pipeline as an end-to-end network. Because the model weights the feature map of the identifier with the position information of the characters in the identifier, it introduces a position-based attention mechanism into the feature map, effectively distinguishes the position of each character, and further improves the accuracy of identification recognition.
Example nine:
An embodiment of the present invention provides an electronic system, including: an image acquisition device, a processing device, and a storage device. The image acquisition device is configured to acquire preview video frames or image data. The storage device stores a computer program which, when executed by the processing device, performs the steps of the above identification recognition method or the above training method of the identification recognition model.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working process of the electronic system described above may refer to the corresponding process in the foregoing method embodiments, and is not described herein again.
An embodiment of the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processing device, performs the above identification recognition method or the above training method of the identification recognition model.
The computer program product of the identification recognition method, the training method and device of its model, and the electronic system provided by the embodiments of the present invention includes a computer-readable storage medium storing program code, and the instructions contained in the program code may be used to execute the methods described in the foregoing method embodiments.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and/or the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In addition, in the description of the embodiments of the present invention, unless otherwise explicitly specified or limited, the terms "mounted", "connected", and "coupled" are to be construed broadly: for example, as a fixed connection, a removable connection, or an integral connection; as a mechanical or an electrical connection; or as a direct connection, an indirect connection through an intermediate medium, or an internal communication between two elements. The specific meanings of these terms in the present invention can be understood by those skilled in the art according to the specific circumstances.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
in the description of the present invention, it should be noted that the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc., indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are only for convenience of description and simplicity of description, but do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (14)
1. An identification recognition method, the method comprising:
acquiring a picture to be recognized containing an identifier;
extracting the position information of the characters in the identification in the picture to be recognized through a preset first feature extraction network;
extracting a feature map of the identifier through a preset second feature extraction network;
performing weighting processing on the feature values in the feature map according to the position information; and
recognizing the characters in the identifier according to the processed feature map.
2. The method of claim 1, wherein the first feature extraction network comprises a convolutional layer and a fully connected layer;
The step of extracting the position information of the characters in the identifier in the picture to be recognized through a preset first feature extraction network comprises the following steps:
extracting feature data of the picture to be recognized through the convolutional layer; and
inputting the feature data into the fully connected layer, and outputting the position coordinates of the characters in the identifier.
3. The method of claim 1, wherein the identifier comprises a plurality of lines of characters;
before the step of extracting the feature map of the identifier through a preset second feature extraction network, the method further comprises: performing correction processing on the identifier to obtain an identifier containing a single row of characters.
4. The method of claim 3, wherein the step of performing correction processing on the identifier comprises:
extracting, through the first feature extraction network, the vertex coordinates of the identifier and the boundary line endpoint coordinates between two adjacent lines of characters in the identifier in the picture to be recognized; and
performing correction processing on the identifier according to the vertex coordinates and the boundary line endpoint coordinates.
5. The method of claim 4, wherein the step of performing correction processing on the identifier according to the vertex coordinates and the boundary line endpoint coordinates comprises:
calculating a perspective transformation matrix according to the vertex coordinates and the boundary line endpoint coordinates;
performing perspective transformation on the identifier according to the perspective transformation matrix to obtain a transformed identifier;
splitting each line of characters in the transformed identifier according to the boundary line endpoint coordinates in the transformed identifier; and
splicing the split character sequences of each row into a single row of characters.
6. The method of claim 4, wherein after the step of performing correction processing on the identifier according to the vertex coordinates and the boundary line endpoint coordinates, the method further comprises:
performing correction processing on the position information of the characters in the identifier according to the vertex coordinates and the boundary line endpoint coordinates.
7. The method according to claim 6, wherein the step of performing correction processing on the position information of the characters in the identifier according to the vertex coordinates and the boundary line endpoint coordinates comprises:
calculating a perspective transformation matrix according to the vertex coordinates and the boundary line endpoint coordinates; and
performing perspective transformation on the position information of the characters in the identifier according to the perspective transformation matrix to obtain the transformed position information.
8. The method of claim 1, wherein the identifier comprises a plurality of characters;
The step of performing weighting processing on the feature values in the feature map according to the position information includes:
generating, for each character, a normal distribution map corresponding to the current character according to the position information of the current character, wherein in the normal distribution map the value at the center position of the current character is the largest and values decrease with distance from the center position; and
multiplying each feature value in the feature map by the value at the corresponding position in the normal distribution map to obtain the feature map corresponding to the current character.
9. The method of claim 8, wherein the step of recognizing the characters in the identifier according to the processed feature map comprises:
inputting, for each character, the feature map corresponding to the current character into a preset fully connected network and outputting a probability distribution sequence, wherein the probability distribution sequence comprises a plurality of probability values, each indicating the probability that the current character is a particular preset character; and
determining the preset character corresponding to the maximum probability value in the probability distribution sequence as the current character.
10. A training method for an identification recognition model, the method comprising:
determining a current training picture based on a preset training set, wherein the current training picture is annotated with standard identifier characters and the standard position coordinates of the characters in the identifier;
inputting the current training picture into a preset first feature extraction network, and outputting the predicted position coordinates of the characters in the identifier in the current training picture;
calculating a loss value of the predicted position coordinates according to the standard position coordinates and a preset first loss function;
extracting a feature map of the identifier through a preset second feature extraction network;
performing weighting processing on the feature values in the feature map according to the predicted position coordinates;
recognizing the characters in the identifier according to the processed feature map to obtain a recognition result;
calculating a loss value of the recognition result according to the standard identifier characters and a preset second loss function; and
continuing to execute the step of determining the current training picture based on the preset training set until the loss value of the predicted position coordinates converges and the loss value of the recognition result converges, to obtain an identification recognition model.
11. An identification recognition apparatus, the apparatus comprising:
an acquisition module, configured to acquire a picture to be recognized containing an identifier;
a first extraction module, configured to extract, through a preset first feature extraction network, the position information of the characters in the identifier in the picture to be recognized;
a second extraction module, configured to extract a feature map of the identifier through a preset second feature extraction network;
a first processing module, configured to perform weighting processing on the feature values in the feature map according to the position information; and
a first recognition module, configured to recognize the characters in the identifier according to the processed feature map.
12. A training apparatus for an identification recognition model, the apparatus comprising:
a determining module, configured to determine a current training picture based on a preset training set, wherein the current training picture is annotated with standard identifier characters and the standard position coordinates of the characters in the identifier;
an output module, configured to input the current training picture into a preset first feature extraction network and output the predicted position coordinates of the characters in the identifier in the current training picture;
a first calculation module, configured to calculate a loss value of the predicted position coordinates according to the standard position coordinates and a preset first loss function;
a third extraction module, configured to extract a feature map of the identifier through a preset second feature extraction network;
a second processing module, configured to perform weighting processing on the feature values in the feature map according to the predicted position coordinates;
a second recognition module, configured to recognize the characters in the identifier according to the processed feature map to obtain a recognition result;
a second calculation module, configured to calculate a loss value of the recognition result according to the standard identifier characters and a preset second loss function; and
an execution module, configured to continue executing the step of determining the current training picture based on the preset training set until the loss value of the predicted position coordinates converges and the loss value of the recognition result converges, to obtain the identification recognition model.
13. An electronic system, characterized in that the electronic system comprises: the device comprises an image acquisition device, a processing device and a storage device;
the image acquisition device is configured to acquire preview video frames or image data; and
the storage device has stored thereon a computer program which, when executed by the processing device, performs the identification recognition method according to any one of claims 1 to 9 or the training method of the identification recognition model according to claim 10.
14. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processing device, performs the steps of the identification recognition method according to any one of claims 1 to 9 or of the training method of the identification recognition model according to claim 10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910834664.0A CN110543877A (en) | 2019-09-04 | 2019-09-04 | Identification recognition method, training method and device of model thereof and electronic system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110543877A true CN110543877A (en) | 2019-12-06 |
Family
ID=68711269
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910834664.0A Pending CN110543877A (en) | 2019-09-04 | 2019-09-04 | Identification recognition method, training method and device of model thereof and electronic system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110543877A (en) |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20191206 |