CN112446370B - Method for identifying text information of nameplate of power equipment - Google Patents
- Publication number: CN112446370B
- Application number: CN202011327387.3A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/90335—Query processing
- G06F16/90344—Query processing by using string matching techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
- G06V10/464—Salient features, e.g. scale invariant feature transforms [SIFT] using a plurality of salient features, e.g. bag-of-words [BoW] representations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/146—Aligning or centring of the image pick-up or image-field
- G06V30/1475—Inclination or skew detection or correction of characters or of image to be recognised
- G06V30/1478—Inclination or skew detection or correction of characters or of image to be recognised of characters or characters lines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a method for identifying text information of a nameplate of power equipment, which comprises the following steps: s1, acquiring an input image; s2, positioning a power equipment nameplate in the input image by using a target detection algorithm based on deep learning, and extracting positioning information; s3, calculating a text inclination angle of the positioned electric power nameplate image by utilizing perspective transformation to obtain an inclination angle of each pixel point of the nameplate area; s4, detecting text information in the nameplate image by utilizing a text detection algorithm based on deep learning and combining the inclination angle information to obtain a nameplate text detection result; and S5, automatically identifying text character information in the nameplate text detection result by using a text identification algorithm based on deep learning, and obtaining an identification result of the nameplate text information of the power equipment. The invention realizes the automatic acquisition of the information of the power equipment, solves the key problem of the automatic management of the information of the power equipment, and improves the efficiency and the accuracy of the information acquisition of the power equipment.
Description
Technical Field
The invention relates to the technical field of image recognition, in particular to a method for recognizing text information of a nameplate of power equipment.
Background
Nameplates fitted to the various devices operating in a power system carry the device's brand, manufacturer, model, product name and assorted electrical parameters. During power production, power companies must be familiar with these technical parameters in order to understand device performance and to record and archive them. Automatically identifying and collecting the text on power equipment nameplates is therefore of great significance for improving the equipment-management level and efficiency of a power system.
At present, power equipment nameplates are mostly read with OCR technology, which achieves a high recognition rate on clear printed fonts. However, because OCR optically converts the characters of a paper document into a black-and-white dot-matrix image file, specular reflection occurs when imaging steel-stamped text engraved on a metal surface, and the recognition rate on such stamped information is very low, making recognition by the OCR principle difficult. A new method suited to recognizing power equipment nameplate text against complex backgrounds is therefore needed.
Disclosure of Invention
In order to solve the defects in the background art, the invention aims to provide a method for identifying the text information of the nameplate of the power equipment, which realizes the automatic acquisition of the information of the power equipment, solves the key problem of the automatic management of the information of the power equipment and improves the efficiency and the accuracy of the information acquisition of the power equipment.
The aim of the invention can be achieved by the following technical scheme:
a method for identifying text information of a nameplate of a power device, comprising the following steps:
s1, acquiring an input image;
s2, positioning a power equipment nameplate in the input image by using a target detection algorithm based on deep learning, and extracting positioning information;
s3, calculating a text inclination angle of the positioned electric power nameplate image by utilizing perspective transformation to obtain an inclination angle of each pixel point of the nameplate area;
s4, detecting text information in the nameplate image by utilizing a text detection algorithm based on deep learning and combining the inclination angle information to obtain a nameplate text detection result;
and S5, automatically identifying text character information in the nameplate text detection result by using a text identification algorithm based on deep learning, and obtaining an identification result of the nameplate text information of the power equipment.
Preferably, step S2 includes:
S201, processing an input image with a convolutional neural network and a residual network to generate three feature maps of different scales, at 1/32, 1/16 and 1/8 of the input image size respectively;
s202, carrying out regression prediction by combining anchor parameters of a priori prediction frame based on the feature map of the input image, and positioning a nameplate area in the input image.
Preferably, the loss function L of the neural network in step S201 comprises three parts, namely coordinate loss, target loss and classification loss, specifically:
L = λ_coord·L_coord + (λ_obj·L_obj + λ_noobj·L_noobj) + λ_cls·L_cls (1)
In formula (1), L is the total loss; L_coord is the coordinate loss; L_cls is the classification loss; L_obj and L_noobj are the losses for candidate boxes that do and do not contain a target, respectively; λ_coord, λ_obj, λ_noobj and λ_cls are the weights of the different losses.
When the prediction box and the ground-truth box are A and B respectively, and C is the smallest convex set enclosing A and B, the coordinate loss is the GIoU loss:
GIoU = IoU − |C \ (A ∪ B)| / |C|, L_coord = 1 − GIoU (2)(3),
L_obj = −α·(1 − y*)^γ·log y* (4),
L_noobj = −(1 − α)·(y*)^γ·log(1 − y*) (5),
In formulas (2)–(5), IoU is the intersection-over-union; p_i is the output class probability and p̂_i the corresponding label value; y* is the model's predicted value and y the true sample label, taking the value 1 in formula (4) and 0 in formula (5); γ is a weighting factor and α is a balance factor.
Preferably, step S3 includes:
s301, detecting a straight line in a nameplate image through Hough transformation;
S302, fitting the perspective-transformation coefficients μ1 and μ2 linearly from the slopes and intercepts of the detected line set, calculating the slope of each pixel in the image, and generating an inclination-angle matrix.
Preferably, step S4 includes:
s401, predicting texts in the nameplate image by using a text detection algorithm based on deep learning, extracting image features by using a ResNet50 network, merging all feature layers, gradually merging upper layer features with lower layer features by an up-sampling and convolution network, and combining the inclination angle matrix in the step S3 to obtain a plurality of text detection candidate frames and coordinate positions and confidence information thereof;
and S402, processing the candidate frames according to the coordinate positions and the confidence degrees of the preliminarily obtained text detection candidate frames, and finally obtaining the text detection result of the nameplate image.
Preferably, the loss function of the deep-learning text detection algorithm in step S401 consists of a text-score loss and a geometry loss, specifically satisfying:
L = L_s + λ_g·L_g (6),
L_s = 1 − 2|Y ∩ Y*| / (|Y| + |Y*|) (7),
L_g = L_AABB + λ_θ·L_θ (8),
L_AABB = −log(|R ∩ R*| / |R ∪ R*|) (9),
L_θ = 1 − cos(θ − θ*) (10),
In formulas (6)–(10), L is the total loss; L_s is the text-score loss; L_g is the geometry loss and λ_g its weight; Y is the matrix of predicted text scores and Y* the corresponding matrix of label values; |Y ∩ Y*| is the sum of the element-wise product of the two matrices; |Y| and |Y*| are the numbers of elements in each matrix; R and θ are the predicted axis-aligned rectangle and its inclination angle, and R*, θ* are the corresponding ground-truth labels.
Preferably, the candidate boxes are processed in step S402 as follows: all text detection candidate boxes are traversed, and those that contain characters with confidence above a threshold are retained; the intersection-over-union of adjacent candidate boxes is evaluated, and boxes are merged when it exceeds a threshold; after one traversal, non-maximum suppression is applied to the remaining candidate boxes to obtain the nameplate text detection result.
Preferably, step S5 passes the nameplate text detection result through a convolutional layer, a recurrent layer and a transcription layer to obtain the recognition result of the power equipment nameplate text, and specifically comprises the following steps:
S501, extracting a feature sequence from the nameplate text image with a CNN (convolutional neural network) in the convolutional layer, and feeding the feature sequence into the recurrent layer;
S502, learning the feature sequence in the recurrent layer with an LSTM (long short-term memory) network, and predicting the label distribution of the text sequence;
S503, converting the label distribution of the text sequence into a recognition result with the CTC algorithm in the transcription layer;
S504, correcting the text recognition result to obtain the power equipment nameplate text information recognition result.
Preferably, in step S501 the feature sequence of the nameplate text image is extracted with the CNN as follows: the nameplate text image is converted to a gray-scale image and scaled to a fixed height; convolutional-layer downsampling then yields an output feature sequence of height 1 and width 1/4 that of the nameplate text image, with 512 feature values per column.
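As a sanity check on these dimensions, the shape of the feature sequence can be computed directly. A minimal sketch follows; the fixed height of 32 pixels is an assumption (a common choice for CRNN-style recognizers), since the text only specifies scaling to "a fixed height":

```python
def feature_sequence_shape(img_w: int, img_h: int, target_h: int = 32):
    """Output shape (height, sequence length, channels) of the CNN stack.

    target_h is assumed; the source only says the image is scaled to a
    fixed height. Width is downsampled 4x by the convolutional layers,
    and each column of the sequence carries 512 feature values.
    """
    scaled_w = round(img_w * target_h / img_h)  # keep aspect ratio at fixed height
    return 1, scaled_w // 4, 512
```

For a 256x64 text crop this gives a sequence of 32 feature vectors of dimension 512, one per 4-pixel-wide slice of the scaled image.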
Preferably, the text recognition result is corrected in step S504 as follows: first, a dictionary base is built from high-frequency words in power equipment nameplate text; the similarity between the recognition result output by the transcription layer and the words in the dictionary is then calculated. When the similarity exceeds a set threshold, the recognition result is replaced by the dictionary word; when it is below the threshold, the original recognition result is retained. The similarity is defined by formulas (11)–(12), in which w_1 is the character string of the text recognition result, of length l_{w1}; w_2 is the character string to be matched in the dictionary, of length l_{w2} and word-frequency rank r_{w2} in the dictionary; N_dict is the number of words in the dictionary; λ is the dictionary-ranking weight; d(w_1, w_2) is the degree of difference between w_1 and w_2; and s(w_1, w_2) is the similarity between them.
The invention has the beneficial effects that:
according to the invention, the deep learning-based algorithm is utilized to automatically identify the text information of the nameplate of the power equipment, and the result is corrected, so that the end-to-end automatic identification of the nameplate text of the power equipment from nameplate positioning and text detection to text identification is intelligently and accurately realized, the automatic acquisition of the information of the power equipment is realized, the key problem of the automatic management of the information of the power equipment is solved, and the efficiency and the accuracy of the information acquisition of the power equipment are improved.
Drawings
The invention is further described below with reference to the accompanying drawings.
FIG. 1 is a general flow chart of a method of identifying text information of a nameplate of an electrical device in accordance with the present invention;
FIG. 2 is a flowchart of a method step S2 of the identification of the text information of the nameplate of the electrical equipment of the present invention;
FIG. 3 is a flowchart of method step S4 of the identification of the text information of the nameplate of the electrical equipment of the present invention;
FIG. 4 is a diagram showing the implementation effect of step S4 of the method for recognizing the text information of the nameplate of the electric power equipment;
fig. 5 is a flowchart of a method step S5 of the identification of the text information of the nameplate of the electrical equipment according to the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In the description of the present invention, it should be understood that the terms "open," "upper," "lower," "thickness," "top," "middle," "length," "inner," "peripheral," and the like indicate orientation or positional relationships, merely for convenience in describing the present invention and to simplify the description, and do not indicate or imply that the components or elements referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus should not be construed as limiting the present invention.
As shown in fig. 1, a method for identifying text information of a nameplate of a power equipment comprises the following steps:
s1, acquiring an input image;
s2, positioning a power equipment nameplate in the input image by using a target detection algorithm based on deep learning, and extracting positioning information;
s3, calculating a text inclination angle of the positioned electric power nameplate image by utilizing perspective transformation to obtain an inclination angle of each pixel point of the nameplate area;
s4, detecting text information in the nameplate image by utilizing a text detection algorithm based on deep learning and combining the inclination angle information to obtain a nameplate text detection result;
and S5, automatically identifying text character information in the nameplate text detection result by using a text identification algorithm based on deep learning, and obtaining an identification result of the nameplate text information of the power equipment.
The step S1 specifically comprises the following steps:
the image containing the power equipment nameplate is obtained through means of manually shooting the power equipment photo by an operation and maintenance person, automatically shooting the power equipment photo by the unmanned aerial vehicle, extracting the power equipment photo by a monitoring video and the like, and the image is used as an input image of the method.
In one example, as shown in fig. 2, the step 2 specifically includes:
S201, data preprocessing. The collected images containing power equipment nameplates are annotated: the nameplate region is boxed, and power equipment nameplates, power equipment warning plates and the like are given corresponding labels; the collected images are also expanded with data-augmentation operations such as rotation, scaling and cropping.
S202, searching the neural network super parameters in advance. Selecting various super parameters of the network model by adopting a random parameter searching mode, including but not limited to GIoU loss weight, classification loss weight, initial learning rate and the like, and adopting the weighted values of mAP and F1 as the basis for measuring the comprehensive performance index of the model and selecting the super parameters; selecting a group of super parameters with the best performance according to the basis in the earlier training, and randomly changing the super parameters on the basis to serve as parameters of the next training; when the maximum number of iterations is reached, the update is stopped.
The weighted combination of mAP and F1 is M = λ1 × F1 + λ2 × mAP; in this example λ1 and λ2 are taken as 0.3 and 0.7, respectively;
The rule for the random hyperparameter change is ζ_i = ζ_i × (r_i × b_i + 1)², where ζ_i is the i-th hyperparameter, r_i is a random number drawn from the standard normal distribution, and b_i is the hyperparameter's floating coefficient, set in this example to 0.2 for ordinary hyperparameters and 0.02 for hyperparameters with a small floating range.
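A minimal sketch of this random-search step, assuming the hyperparameters are held in a plain dict keyed by name:

```python
import random

def composite_metric(f1: float, map_score: float,
                     lambda_f1: float = 0.3, lambda_map: float = 0.7) -> float:
    """Weighted score M = lambda_1 * F1 + lambda_2 * mAP used to rank
    hyperparameter sets."""
    return lambda_f1 * f1 + lambda_map * map_score

def perturb(params: dict, float_coeffs: dict, rng: random.Random) -> dict:
    """Randomly perturb the current best hyperparameters:
    zeta_i <- zeta_i * (r_i * b_i + 1)^2, with r_i standard normal."""
    new = {}
    for name, value in params.items():
        r = rng.gauss(0.0, 1.0)          # standard normal random number
        b = float_coeffs.get(name, 0.2)  # 0.2 by default, 0.02 for low-float params
        new[name] = value * (r * b + 1.0) ** 2
    return new
```

Each training round would score its model with `composite_metric`, keep the best dict seen so far, and call `perturb` on it to produce the next candidate until the iteration budget is spent.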
S203, anchor parameter clustering, namely acquiring anchor parameters by adopting K-means clustering based on IoU distance to obtain anchor parameters of 9 clustering centers in order to enable the anchor parameters to be as close as possible to the size of the nameplate to be detected.
The IoU-based distance is d(box, center) = 1 − IoU(box, center), where box is a sample box in the dataset, center is a cluster-center box, and IoU is their intersection-over-union.
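A sketch of this clustering under the usual assumption that boxes are compared by width and height only (aligned at a common corner), as in YOLO-style anchor clustering:

```python
import random

def iou_wh(box, center):
    """IoU of two (width, height) boxes aligned at a common top-left corner."""
    w1, h1 = box
    w2, h2 = center
    inter = min(w1, w2) * min(h1, h2)
    union = w1 * h1 + w2 * h2 - inter
    return inter / union

def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    """K-means with distance d(box, center) = 1 - IoU(box, center):
    assign each box to the center of highest IoU, then recompute centers."""
    rng = random.Random(seed)
    centers = rng.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            j = max(range(k), key=lambda i: iou_wh(b, centers[i]))  # min distance
            clusters[j].append(b)
        centers = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c))
            if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers
```

Run on the annotated nameplate boxes with k = 9, the resulting centers serve directly as the prior anchor sizes for the detector.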
S204, processing an input image by using a convolutional neural network and a residual network to generate a feature map of the input image, processing the input image by using a Darknet-53 network to extract a feature map with the dimension of 1/32 of the input image, and performing upsampling and tensor connection to obtain a feature map with the dimension of 1/16 and 1/8 of the input image;
s205, positioning a nameplate area in the input image based on the feature map of the input image, wherein the method comprises the following steps:
Regression prediction is carried out on the feature maps of the input image in combination with the anchor parameters of the prior prediction boxes. The result is a feature map at each of three scales, of shape N × N × M × L, where N is 1/32, 1/16 and 1/8 of the input image size respectively, giving the number of feature cells; M is the number of anchor boxes preset per feature cell; and L comprises the prediction-box position information, the nameplate confidence and the category information;
Non-maximum suppression is then applied to all prediction boxes, and the nameplate region in the image is located.
The neural-network loss function in steps S204 and S205 above comprises three parts, namely coordinate loss, target loss and classification loss: L = λ_coord·L_coord + (λ_obj·L_obj + λ_noobj·L_noobj) + λ_cls·L_cls, where L is the total loss, L_coord the coordinate loss, L_cls the classification loss, and L_obj and L_noobj the losses for candidate boxes that do and do not contain a target; the weights λ_coord, λ_obj, λ_noobj and λ_cls of the different losses are all tuned as hyperparameters;
When the prediction box and the ground-truth box are A and B respectively, and C is the smallest convex set enclosing A and B, the coordinate loss is the GIoU loss:
GIoU = IoU − |C \ (A ∪ B)| / |C|, L_coord = 1 − GIoU
L_obj = −α·(1 − y*)^γ·log y*
L_noobj = −(1 − α)·(y*)^γ·log(1 − y*)
where IoU is the intersection-over-union; p_i is the output class probability and p̂_i the corresponding label value; y* is the model's predicted value and y the true sample label, taking the value 1 in the former formula and 0 in the latter; γ is a weighting factor that reduces the loss on easily classified samples so the model attends more to hard, easily misclassified samples; α is a balance factor addressing the imbalance between positive and negative samples.
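The GIoU term and the focal-style objectness losses above can be sketched as scalar functions; boxes are taken as (x1, y1, x2, y2) corners, and the defaults α = 0.25, γ = 2 are assumed values, since the text leaves them unspecified:

```python
import math

def giou(box_a, box_b):
    """Generalized IoU for axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    # C: smallest enclosing box of A and B (the convex set for rectangles)
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c_area = cw * ch
    return inter / union - (c_area - union) / c_area

def focal_obj_loss(y_pred, y_true, alpha=0.25, gamma=2.0):
    """Focal objectness loss: L_obj when y_true == 1, L_noobj otherwise."""
    if y_true == 1:
        return -alpha * (1.0 - y_pred) ** gamma * math.log(y_pred)
    return -(1.0 - alpha) * y_pred ** gamma * math.log(1.0 - y_pred)
```

The coordinate loss is then 1 − giou(pred, truth); unlike plain IoU, GIoU stays informative (and negative) for disjoint boxes, which is why it is a common choice for box regression.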
The nameplate is located by the method of step S2; its effect is compared with the YOLOv3 algorithm, and the mAP values of the different models are shown in Table 1:
TABLE 1 mAP values for different models
The step S3 specifically includes:
s301, converting an input image into a gray scale image and performing Gaussian filtering;
S302, performing edge detection on the image; once enough edges are detected, detecting straight lines by Hough transform and discarding lines with unreasonable inclination angles until a sufficient set of lines is obtained;
S303, fitting the perspective-transformation coefficients μ1 and μ2 linearly from the slopes and intercepts of the detected line set, calculating the slope of each pixel in the image, and generating an inclination-angle matrix.
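One way to read S303 is that under perspective the slope k of a nominally horizontal text line varies linearly with its intercept b, k = μ1·b + μ2, so the two coefficients can be fitted by least squares over the detected lines and the angle matrix filled row by row. Both the linear model and the per-row evaluation are assumptions about the method:

```python
import math

def fit_slope_model(lines):
    """Least-squares fit of k = mu1 * b + mu2 over detected lines,
    given as (slope k, intercept b) pairs."""
    n = len(lines)
    sb = sum(b for _, b in lines)
    sk = sum(k for k, _ in lines)
    sbb = sum(b * b for _, b in lines)
    sbk = sum(b * k for k, b in lines)
    mu1 = (n * sbk - sb * sk) / (n * sbb - sb * sb)
    mu2 = (sk - mu1 * sb) / n
    return mu1, mu2

def angle_matrix(width, height, mu1, mu2):
    """Per-pixel inclination angle in degrees: each image row y gets the
    slope mu1 * y + mu2 predicted by the fitted model."""
    return [[math.degrees(math.atan(mu1 * y + mu2)) for _ in range(width)]
            for y in range(height)]
```

In practice `lines` would come from `cv2.HoughLines` after Gaussian filtering and edge detection, with implausibly steep lines discarded before fitting.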
As shown in fig. 3, the step S4 specifically includes:
S401, label processing. Because the text labels in the collected power equipment nameplates are usually irregular quadrilaterals, while the neural network requires rectangular labels with inclination angles, the manually annotated quadrilateral boxes are shrunk inward by 0.3 times their side lengths to reduce manual-annotation error; a score label is then generated from the shrunk quadrilateral, with pixels inside the box set to 1 and those outside set to 0; finally the minimum bounding rectangle of the quadrilateral annotation is generated, and an axis-aligned rectangle label and an inclination-angle label are produced from the distance of each point in the box to each side of the rectangle.
S402, image preprocessing. Data cleaning and image augmentation are performed on the image data: reading the labeling information, and deleting the nonstandard labeling frames (such as misplacement of the sequence of each vertex of the rectangular frame, too small frame and the like) in the labeling information; and (5) performing the operations of enlarging, cutting, scaling and the like on the image.
S403, predicting the text in the nameplate image by using a text detection algorithm based on deep learning, wherein the method comprises the following steps:
extracting image features with a ResNet50 network to obtain feature maps at 1/32, 1/16, 1/8 and 1/4 of the input image scale; merging all feature layers, with upper-layer features progressively merged into lower-layer features through upsampling and convolution networks; and finally, combining the inclination-angle matrix from step S3, obtaining a number of text detection candidate boxes together with their coordinate positions and confidence information, output as an axis-aligned rectangle vector of 5 channels and a rotation-angle vector of 1 channel, the inclination angle ranging over (−45°, 45°];
S404, processing the candidate boxes according to the preliminarily obtained coordinate positions and confidences of the text-detection candidate boxes to obtain the final nameplate text detection result, as follows:
all text-detection candidate boxes are traversed, and boxes that contain characters and whose confidence exceeds a threshold are retained; the intersection-over-union (IoU) of adjacent candidate boxes is computed, and boxes whose IoU exceeds a threshold are merged;
after one traversal, non-maximum suppression is applied to the remaining candidate boxes to obtain the nameplate text detection result.
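The confidence screening and non-maximum suppression of step S404 can be sketched as follows; the thresholds are illustrative assumptions, and the adjacent-box merging step is folded into the greedy suppression for brevity.

```python
def iou(a, b):
    """Intersection-over-union of two axis-parallel boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def nms(boxes, scores, score_thr=0.5, iou_thr=0.3):
    """Keep boxes above score_thr, then greedy non-maximum suppression:
    highest-scoring box first, discarding any later box whose IoU with a
    kept box reaches iou_thr. Returns indices of the kept boxes."""
    keep = []
    order = sorted((i for i, s in enumerate(scores) if s >= score_thr),
                   key=lambda i: -scores[i])
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_thr for j in keep):
            keep.append(i)
    return keep
```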
The neural-network loss function in steps S403 and S404 consists of a text score loss and a geometry loss: L = L_s + λ_g·L_g, where L is the total loss, L_s the text score loss, L_g the geometry loss, and λ_g the geometry-loss weight, set to 1 in this example.
The score loss is L_s = 1 − 2|Y ∩ Y*| / (|Y| + |Y*|), where Y is the predicted text-score matrix, Y* the corresponding label matrix, |Y ∩ Y*| the element-wise product of the two matrices summed over all entries, and |Y| and |Y*| the sums of the elements of each matrix.
The geometry loss is L_g = L_AABB + λ_θ·L_θ, with L_AABB = −log IoU(R, R*) and L_θ = 1 − cos(θ − θ*), where R and θ are the predicted axis-parallel rectangle and its inclination angle, R* and θ* are the corresponding ground-truth labels, and the angle-loss weight λ_θ is set to 10.
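The loss terms above can be sketched numerically. This assumes the EAST-style box parameterisation, where each pixel's box is given as distances (top, right, bottom, left) to the four rectangle sides; the dice form of the score loss and the −log IoU box loss follow the formulas as reconstructed above.

```python
import numpy as np

def dice_score_loss(Y, Ystar):
    """L_s = 1 - 2|Y ∩ Y*| / (|Y| + |Y*|) on score maps in [0, 1]."""
    inter = np.sum(Y * Ystar)
    return 1.0 - 2.0 * inter / (np.sum(Y) + np.sum(Ystar))

def aabb_loss(R, Rstar):
    """L_AABB = -log IoU for boxes given as (top, right, bottom, left)
    distances from a pixel to the rectangle sides (assumed EAST-style)."""
    area = lambda d: (d[0] + d[2]) * (d[1] + d[3])
    ih = min(R[0], Rstar[0]) + min(R[2], Rstar[2])
    iw = min(R[1], Rstar[1]) + min(R[3], Rstar[3])
    inter = ih * iw
    union = area(R) + area(Rstar) - inter
    return -np.log(inter / union)

def angle_loss(theta, theta_star):
    """L_theta = 1 - cos(theta - theta*)."""
    return 1.0 - np.cos(theta - theta_star)

def total_loss(Y, Ystar, R, Rstar, theta, theta_star,
               lam_g=1.0, lam_theta=10.0):
    """L = L_s + lam_g * (L_AABB + lam_theta * L_theta), with the weights
    lam_g = 1 and lam_theta = 10 used in this example."""
    return (dice_score_loss(Y, Ystar)
            + lam_g * (aabb_loss(R, Rstar)
                       + lam_theta * angle_loss(theta, theta_star)))
```

A perfect prediction (identical score maps, boxes, and angles) yields zero total loss, which is a quick sanity check on the three terms.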
The nameplate text is detected by the method of step S4; the detection effect is shown in Figure 4. The method is compared with the standard EAST (Efficient and Accurate Scene Text Detector) algorithm on the test set, and the F1 evaluation index is listed in Table 2:
Table 2 Evaluation index F1
As shown in fig. 5, the step S5 specifically includes:
S501, extracting a feature sequence from the nameplate text image in the convolutional layer using a CNN and inputting it into the recurrent layer, as follows:
the nameplate text image is converted to a grey-scale image and scaled to a fixed height, 32 pixels in this example, giving an input of size 32 × W × 1, where W is the width of the scaled input image;
the convolutional layers downsample the input so that the output feature sequence has height 1 and width 1/4 of the nameplate text image width, with 512 feature values per position; the feature map size is 1 × (W/4) × 512;
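The grey-scale conversion, height-32 rescaling, and the resulting feature-sequence shape can be sketched as follows. The nearest-neighbour resize is a dependency-free stand-in for the usual bilinear resize, and `crnn_feature_shape` only reports the 1 × (W/4) × 512 geometry rather than running a real CNN.

```python
import numpy as np

def preprocess_text_image(img, target_h=32):
    """Convert an (H, W, 3) RGB text crop to grey and scale it to a fixed
    height of 32 pixels, keeping the aspect ratio. Nearest-neighbour
    index sampling stands in for the usual bilinear resize."""
    grey = img.mean(axis=2)
    h, w = grey.shape
    new_w = max(1, round(w * target_h / h))
    ys = (np.arange(target_h) * h / target_h).astype(int)
    xs = (np.arange(new_w) * w / new_w).astype(int)
    return grey[np.ix_(ys, xs)][..., None]      # shape (32, W', 1)

def crnn_feature_shape(width, channels=512, width_stride=4):
    """Shape of the CNN feature map fed to the recurrent layer: height 1,
    width 1/4 of the input width, 512 feature channels."""
    return (1, width // width_stride, channels)
```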
S502, in the recurrent layer, an LSTM learns the feature sequence and predicts the label distribution of the text sequence;
s503, converting label distribution of a text sequence into a recognition result by using a standard CTC algorithm in a transcription layer;
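The transcription step can be illustrated with the greedy variant of CTC decoding (take the best label per frame, collapse repeats, drop blanks); a full CTC decoder may instead run beam search over the whole label distribution.

```python
def ctc_greedy_decode(label_probs, blank=0):
    """Greedy CTC decoding of a per-frame label distribution.

    `label_probs` is a list of frames, each a list of class probabilities
    with index `blank` reserved for the CTC blank. The best path is taken
    frame-by-frame, consecutive repeats are collapsed, and blanks are
    removed, yielding the label index sequence."""
    best = [max(range(len(frame)), key=frame.__getitem__)
            for frame in label_probs]
    out, prev = [], blank
    for k in best:
        if k != prev and k != blank:
            out.append(k)
        prev = k
    return out
```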
s504, correcting the text recognition result to obtain the text information recognition result of the nameplate of the power equipment, wherein the method comprises the following steps:
establishing a dictionary base according to high-frequency words in the nameplate text of the power equipment;
calculating the similarity between the recognition result output by the transcription layer and the words in the dictionary; when the similarity is greater than a set threshold, the recognition result is replaced by the dictionary word, and when the similarity is less than the threshold, the original recognition result is retained.
Here, w1 is the text-recognition result string, with length l(w1); w2 is the string to be matched in the dictionary, with length l(w2) and word-frequency rank r(w2) in the dictionary; N_dict is the number of words in the dictionary; λ is the dictionary ranking weight, set to 0.1 in this example; D(w1, w2) is the degree of difference between w1 and w2; and S(w1, w2) is the similarity between w1 and w2.
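The dictionary-based correction can be sketched as follows. The exact similarity formula of Eqs. (11)-(12) is not recoverable from this copy, so the edit-distance similarity with a small word-frequency bonus below is an assumed stand-in that uses the same quantities (D, λ, the frequency rank, and N_dict); the 0.8 threshold is likewise illustrative.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (a stand-in for the
    difference degree D(w1, w2))."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def correct(word, dictionary, lam=0.1, threshold=0.8):
    """Replace `word` with the most similar dictionary entry when the
    similarity clears `threshold`, else keep the recognition result.

    `dictionary` is assumed sorted by descending word frequency; the
    similarity combines normalised edit distance with a small bonus
    weighted by lam for high-frequency words (an assumed stand-in for
    Eqs. (11)-(12))."""
    best_w, best_s = word, 0.0
    n = len(dictionary)
    for rank, cand in enumerate(dictionary):
        d = edit_distance(word, cand)
        s = 1.0 - d / max(len(word), len(cand))  # base similarity
        s += lam * (1.0 - rank / n)              # frequency-rank bonus
        if s > best_s:
            best_w, best_s = cand, s
    return best_w if best_s >= threshold else word
```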
With the method provided by the invention, the final character-level recognition accuracy on power-equipment nameplate text reaches 93.2%, and the text-line recognition accuracy reaches 82.3%.
In the description of the present specification, the descriptions of the terms "one embodiment," "example," "specific example," and the like, mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The foregoing has shown and described the basic principles, principal features and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, and that the above embodiments and descriptions are merely illustrative of the principles of the present invention, and various changes and modifications may be made without departing from the spirit and scope of the invention, which is defined in the appended claims.
Claims (7)
1. The method for identifying the text information of the nameplate of the power equipment is characterized by comprising the following steps of:
s1, acquiring an input image;
s2, positioning a power equipment nameplate in the input image by using a target detection algorithm based on deep learning, and extracting positioning information;
s3, calculating a text inclination angle of the positioned electric power nameplate image by utilizing perspective transformation to obtain an inclination angle of each pixel point of the nameplate area;
s4, detecting text information in the nameplate image by utilizing a text detection algorithm based on deep learning and combining the inclination angle information to obtain a nameplate text detection result;
s5, automatically identifying text character information in the nameplate text detection result by using a text identification algorithm based on deep learning to obtain an identification result of the nameplate text information of the power equipment;
step S5 is to process the nameplate text detection result through a convolution layer, a recurrent layer, and a transcription layer to obtain a recognition result of the nameplate text information of the power equipment, and specifically includes the following steps:
S501, extracting a feature sequence of the nameplate text image in the convolution layer by using a CNN convolutional neural network, and inputting the feature sequence into the recurrent layer;
S502, using an LSTM long short-term memory network in the recurrent layer to learn the feature sequence and predict the label distribution of the text sequence;
s503, converting label distribution of a text sequence into a recognition result by utilizing a CTC algorithm in a transcription layer;
s504, correcting the text recognition result to obtain a text information recognition result of the nameplate of the power equipment;
in the step S501, the feature sequence of extracting the nameplate text image by using CNN is as follows: converting the nameplate text image into a gray level image, and scaling to a fixed height; downsampling by a convolution layer to ensure that the height of the output characteristic sequence is 1, the width is 1/4 of the nameplate text image, and 512 characteristic values are contained;
the step of correcting the text recognition result in the step S504 is as follows: firstly, establishing a dictionary base according to high-frequency words in a nameplate text of the power equipment; calculating the similarity between the recognition result output by the transcription layer and words in the dictionary, when the similarity is larger than a set threshold value, replacing the recognition result by words in the dictionary library, and when the similarity is smaller than the set threshold value, retaining the original recognition result; wherein, define:
in the formulae (11)-(12), w1 is the text-recognition result string, with length l(w1); w2 is the string to be matched in the dictionary, with length l(w2) and word-frequency rank r(w2) in the dictionary; N_dict is the number of words in the dictionary; λ is the dictionary ranking weight; D(w1, w2) is the degree of difference between w1 and w2; and S(w1, w2) is the similarity between w1 and w2.
2. The method for identifying text information of a nameplate of a power equipment according to claim 1, wherein the step S2 includes:
s201, processing an input image by using a convolutional neural network and a residual network to generate three feature graphs with different scales, wherein the scales of the feature graphs are 1/32,1/16 and 1/8 of the input image respectively;
s202, carrying out regression prediction by combining anchor parameters of a priori prediction frame based on the feature map of the input image, and positioning a nameplate area in the input image.
3. The method for identifying text information of a nameplate of a power equipment according to claim 2, wherein the loss function L of the neural network in the step S201 includes three parts, namely, a coordinate loss, a target loss and a classification loss, specifically:
L=λ coord L coord +(λ obj L obj +λ noobj L noobj )+λ cls L cls (1)
in the formula (1), L represents the total loss, L coord Representing the loss of coordinates, L cls Representing the classification loss, L obj And L noobj Loss of presence and absence of targets in the candidate boxes, respectively; lambda (lambda) coord 、λ obj 、λ noobj And lambda (lambda) cls Weights for coordinate loss, presence of target loss, absence of target loss, and classification loss, respectively;
when the prediction frame and the real frame are A and B respectively, and C is the minimum convex set of A and B:
GIoU = IoU − |C \ (A ∪ B)| / |C| (2)
L coord = 1 − GIoU (3)
L obj =-α(1-y * ) γ log y * (4)
L noobj =-(1-α)(y * ) γ log(1-y * ) (5)
IoU in the formulas (2)-(5) is the intersection-over-union; p i is the output class probability and p i * the corresponding label value; y * is the predicted value of the model; γ is a weighting factor and α is a balance factor.
4. The method for identifying text information of a nameplate of a power equipment according to claim 1, wherein the step S3 includes:
S301, detecting straight lines in the nameplate image through the Hough transform;
S302, performing linear fitting of the perspective-transformation coefficients μ1 and μ2 from the slopes and intercepts of the detected set of lines, calculating the slope at each pixel point in the image, and generating the inclination-angle matrix.
5. The method for identifying text information of a nameplate of a power equipment according to claim 1, wherein the step S4 includes:
s401, predicting texts in the nameplate image by using a text detection algorithm based on deep learning, extracting image features by using a ResNet50 network, merging all feature layers, gradually merging upper layer features with lower layer features by an up-sampling and convolution network, and combining the inclination angle matrix in the step S3 to obtain a plurality of text detection candidate frames and coordinate positions and confidence information thereof;
and S402, processing the candidate frames according to the coordinate positions and the confidence degrees of the preliminarily obtained text detection candidate frames, and finally obtaining the text detection result of the nameplate image.
6. The method for identifying text information of a nameplate of a power equipment according to claim 5, wherein the loss function of the text detection algorithm based on deep learning in the step S401 is composed of text score loss and geometric shape loss, and specifically satisfies the following relation:
L = L s + λ g L g (6)
L s = 1 − 2|Y ∩ Y * | / (|Y| + |Y * |) (7)
L g = L AABB + λ θ L θ (8)
L AABB = −log IoU(R, R * ) (9)
L θ = 1 − cos(θ − θ * ) (10)
in the formulae (6)-(10), L is the total loss, L s is the text score loss, L g is the geometry loss, and λ g is the geometry-loss weight; Y represents the predicted text-score matrix and Y * the corresponding label matrix; |Y ∩ Y * | denotes the element-wise product of the two matrices summed over all entries, and |Y| and |Y * | are the sums of the elements of each matrix; R and θ are the predicted axis-parallel rectangular box and its inclination angle, and R * , θ * are the corresponding real labels.
7. The method for identifying text information of a nameplate of a power apparatus according to claim 5, wherein the processing of the candidate frames in step S402 is: traversing all text detection candidate frames, screening out candidate frames which contain characters and have confidence higher than a threshold, computing the intersection-over-union of adjacent candidate frames, and merging the candidate frames when the intersection-over-union is greater than a threshold; and performing non-maximum suppression on the remaining candidate frames after one traversal to obtain the nameplate text detection result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011327387.3A CN112446370B (en) | 2020-11-24 | 2020-11-24 | Method for identifying text information of nameplate of power equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112446370A CN112446370A (en) | 2021-03-05 |
CN112446370B true CN112446370B (en) | 2024-03-29 |
Family
ID=74737363
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011327387.3A Active CN112446370B (en) | 2020-11-24 | 2020-11-24 | Method for identifying text information of nameplate of power equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112446370B (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113191343A (en) * | 2021-03-31 | 2021-07-30 | 成都飞机工业(集团)有限责任公司 | Aviation wire identification code automatic identification method based on convolutional neural network |
CN113033559A (en) * | 2021-04-19 | 2021-06-25 | 深圳市华汉伟业科技有限公司 | Text detection method and device based on target detection and storage medium |
CN113989793A (en) * | 2021-11-08 | 2022-01-28 | 成都天奥集团有限公司 | Graphite electrode embossed seal character recognition method |
CN113920497B (en) * | 2021-12-07 | 2022-04-08 | 广东电网有限责任公司东莞供电局 | Nameplate recognition model training method, nameplate recognition method and related devices |
CN114387590A (en) * | 2021-12-23 | 2022-04-22 | 东软集团股份有限公司 | Material information input method and device, storage medium and electronic equipment |
CN114863084B (en) * | 2022-04-19 | 2024-06-25 | 北京化工大学 | Nameplate recognition and leaning equipment based on deep learning target detection |
CN115424121B (en) * | 2022-07-30 | 2023-10-13 | 南京理工大学紫金学院 | Electric power pressing plate switch inspection method based on computer vision |
CN115187881A (en) * | 2022-09-08 | 2022-10-14 | 国网江西省电力有限公司电力科学研究院 | Power equipment nameplate identification and platform area compliance automatic checking system and method |
CN116110036B (en) * | 2023-04-10 | 2023-07-04 | 国网江西省电力有限公司电力科学研究院 | Electric power nameplate information defect level judging method and device based on machine vision |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109409355A (en) * | 2018-08-13 | 2019-03-01 | 国网陕西省电力公司 | A kind of method and device of novel transformer nameplate identification |
CN110956171A (en) * | 2019-11-06 | 2020-04-03 | 广州供电局有限公司 | Automatic nameplate identification method and device, computer equipment and storage medium |
CN111310861A (en) * | 2020-03-27 | 2020-06-19 | 西安电子科技大学 | License plate recognition and positioning method based on deep neural network |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108416377B (en) * | 2018-02-26 | 2021-12-10 | 阿博茨德(北京)科技有限公司 | Information extraction method and device in histogram |
2020-11-24: CN CN202011327387.3A patent/CN112446370B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN112446370A (en) | 2021-03-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112446370B (en) | Method for identifying text information of nameplate of power equipment | |
WO2020259060A1 (en) | Test paper information extraction method and system, and computer-readable storage medium | |
CN110837835B (en) | End-to-end scene text identification method based on boundary point detection | |
US20210374466A1 (en) | Water level monitoring method based on cluster partition and scale recognition | |
WO2019192397A1 (en) | End-to-end recognition method for scene text in any shape | |
CN111460927B (en) | Method for extracting structured information of house property evidence image | |
CN100565559C (en) | Image text location method and device based on connected component and support vector machine | |
CN112580507B (en) | Deep learning text character detection method based on image moment correction | |
CN110598693A (en) | Ship plate identification method based on fast-RCNN | |
CN111553346A (en) | Scene text detection method based on character region perception | |
CN114694165A (en) | Intelligent PID drawing identification and redrawing method | |
CN113971809A (en) | Text recognition method and device based on deep learning and storage medium | |
CN111626292A (en) | Character recognition method of building indication mark based on deep learning technology | |
CN114241469A (en) | Information identification method and device for electricity meter rotation process | |
CN116612478A (en) | Off-line handwritten Chinese character scoring method, device and storage medium | |
CN116259008A (en) | Water level real-time monitoring method based on computer vision | |
Yu et al. | Tiny vehicle detection for mid-to-high altitude UAV images based on visual attention and spatial-temporal information | |
CN114758341A (en) | Intelligent contract image identification and contract element extraction method and device | |
CN111832497B (en) | Text detection post-processing method based on geometric features | |
CN111914706B (en) | Method and device for detecting and controlling quality of text detection output result | |
CN113313678A (en) | Automatic sperm morphology analysis method based on multi-scale feature fusion | |
CN117218672A (en) | Deep learning-based medical records text recognition method and system | |
CN112364687A (en) | Improved Faster R-CNN gas station electrostatic sign identification method and system | |
CN116704512A (en) | Instrument identification method and system integrating semantic and visual information | |
CN116258908A (en) | Ground disaster prediction evaluation classification method based on unmanned aerial vehicle remote sensing image data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||