CN112446370A - Method for recognizing text information of nameplate of power equipment - Google Patents
- Publication number
- CN112446370A (application numbers CN202011327387.3A, CN202011327387A)
- Authority
- CN
- China
- Prior art keywords
- nameplate
- text
- information
- power equipment
- image
- Prior art date
- Legal status (the listed status is an assumption and is not a legal conclusion; no legal analysis has been performed)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/90335—Query processing
- G06F16/90344—Query processing by using string matching techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
- G06V10/464—Salient features, e.g. scale invariant feature transforms [SIFT] using a plurality of salient features, e.g. bag-of-words [BoW] representations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/146—Aligning or centring of the image pick-up or image-field
- G06V30/1475—Inclination or skew detection or correction of characters or of image to be recognised
- G06V30/1478—Inclination or skew detection or correction of characters or of image to be recognised of characters or characters lines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a method for recognizing text information on a nameplate of electric power equipment, comprising the following steps: S1, acquiring an input image; S2, locating the power equipment nameplate in the input image and extracting positioning information using a target detection algorithm based on deep learning; S3, calculating the text inclination angle of the located nameplate image using perspective transformation to obtain the inclination angle of each pixel point in the nameplate area; S4, detecting the text in the nameplate image using a deep-learning-based text detection algorithm combined with the inclination angle information to obtain a nameplate text detection result; and S5, automatically recognizing the text characters in the nameplate text detection result using a deep-learning-based text recognition algorithm to obtain the power equipment nameplate text information recognition result. The invention realizes automatic collection of power equipment information, solves the key problem of automatic management of power equipment information, and improves the efficiency and accuracy of power equipment information collection.
Description
Technical Field
The invention relates to the technical field of image recognition, in particular to a method for recognizing text information of a nameplate of electric power equipment.
Background
In the power production process, an electric power company needs to be familiar with the technical parameters of its equipment, so that equipment performance can be understood and the technical parameters can be recorded and filed. Automatic identification and collection of power equipment nameplate text information therefore has important significance for improving the equipment management level and management efficiency of the electric power system.
At present, OCR recognition technology is mostly adopted for recognizing power equipment nameplates. This technology achieves a high recognition rate for clear printed fonts, but because it optically converts characters into a black-and-white dot-matrix image file, a steel-seal text engraved on a metal surface produces optical reflections when the image is acquired; the recognition rate for seal-engraved information is therefore very low, and identification by the OCR recognition principle is often difficult. A new method suitable for power equipment nameplate text recognition against a complex background is therefore needed.
Disclosure of Invention
In order to solve the defects mentioned in the background art, the invention aims to provide a method for recognizing the text information of the nameplate of the power equipment.
The purpose of the invention can be realized by the following technical scheme:
a method for recognizing text information of a nameplate of electric power equipment comprises the following steps:
s1, acquiring an input image;
s2, positioning the power equipment nameplate in the input image and extracting positioning information by using a target detection algorithm based on deep learning;
s3, performing text inclination angle calculation on the positioned electric power nameplate image by utilizing perspective transformation to obtain an inclination angle of each pixel point in the nameplate area;
s4, detecting text information in the nameplate image by utilizing a text detection algorithm based on deep learning and combining the inclination angle information to obtain a nameplate text detection result;
and S5, automatically recognizing text character information in the nameplate text detection result by using a text recognition algorithm based on deep learning to obtain a nameplate text information recognition result of the power equipment.
Preferably, step S2 includes:
s201, processing an input image by using a convolutional neural network and a residual error network to generate three feature maps with different scales, wherein the scales of the feature maps are 1/32, 1/16 and 1/8 of the input image respectively;
- s202, performing regression prediction based on the feature maps of the input image combined with the prior anchor parameters, and locating the nameplate region in the input image.
Preferably, the loss function L of the neural network in step S201 includes three parts, namely, coordinate loss, target loss and classification loss, specifically:
L = λ_coord·L_coord + (λ_obj·L_obj + λ_noobj·L_noobj) + λ_cls·L_cls (1)

In formula (1), L represents the total loss; L_coord denotes the coordinate loss; L_cls represents the classification loss; L_obj and L_noobj are the target losses for candidate boxes that do and do not contain a target, respectively; λ_coord, λ_obj, λ_noobj and λ_cls are the weights of the different losses.

When the prediction box and the real box are A and B, respectively, and C is the minimum convex set (smallest enclosing box) of A and B:

GIoU = IoU − |C − (A ∪ B)| / |C| (2)

L_coord = 1 − GIoU (3)

L_obj = −α(1 − y*)^γ · log(y*) (4)

L_noobj = −(1 − α)(y*)^γ · log(1 − y*) (5)

In formulas (2) to (5), IoU is the intersection-over-union ratio; p_i is the output class probability and p̂_i the corresponding label value; y* is the predicted value of the model and y is the real value of the sample label, with y = 1 for formula (4) and y = 0 for formula (5); γ is a weight factor and α is a balance factor.
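The coordinate and target losses above can be sketched in a few lines of Python; the α and γ values below are illustrative defaults, not values fixed by the invention:

```python
import math

def giou(a, b):
    # a, b: axis-parallel boxes (x1, y1, x2, y2); C is the smallest box
    # enclosing both, so GIoU = IoU - |C - (A ∪ B)| / |C|  (formula (2))
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    c_area = ((max(a[2], b[2]) - min(a[0], b[0]))
              * (max(a[3], b[3]) - min(a[1], b[1])))
    return inter / union - (c_area - union) / c_area

def coord_loss(a, b):
    return 1.0 - giou(a, b)  # formula (3)

def obj_loss(y_star, alpha=0.25, gamma=2.0):
    # formula (4): focal loss when the candidate box contains a target (y = 1)
    return -alpha * (1.0 - y_star) ** gamma * math.log(y_star)

def noobj_loss(y_star, alpha=0.25, gamma=2.0):
    # formula (5): focal loss when the candidate box holds no target (y = 0)
    return -(1.0 - alpha) * y_star ** gamma * math.log(1.0 - y_star)
```

As formula (2) intends, disjoint boxes are penalized by how far apart they lie inside their enclosing box, which a plain IoU loss cannot express.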
Preferably, step S3 includes:
s301, detecting a straight line in the nameplate image through Hough transformation;
s302, according to the detected straight line set, performing linear fitting of the perspective transformation coefficients μ1 and μ2 using the slopes and intercepts of the lines, calculating the slope of each pixel point in the image, and generating a tilt angle matrix.
Preferably, step S4 includes:
s401, predicting texts in the nameplate image by using a text detection algorithm based on deep learning, extracting image features by adopting a ResNet50 network, merging all feature layers, gradually fusing upper-layer features with lower-layer features through an upsampling and convolution network, and combining the inclination angle matrix in the step S3 to obtain a plurality of text detection candidate frames, coordinate positions of the text detection candidate frames and confidence degree information of the text detection candidate frames;
s402, processing the candidate frame according to the preliminarily obtained coordinate position and confidence of the text detection candidate frame, and finally obtaining a text detection result of the nameplate image.
Preferably, the loss function of the text detection algorithm based on deep learning in step S401 is composed of a text score loss and a geometric shape loss, and specifically satisfies the following relations:

L = L_s + λ_g·L_g (6)

L_s = 1 − 2|Y ∩ Y*| / (|Y| + |Y*|) (7)

L_g = L_AABB + λ_θ·L_θ (8)

L_AABB = −log IoU(R, R*) (9)

L_θ = 1 − cos(θ − θ*) (10)

In formulas (6) to (10), L is the total loss; L_s is the text score loss; L_g is the geometric loss; λ_g is the loss weight; Y represents the predicted value matrix of the text scores and Y* the corresponding label value matrix; |Y ∩ Y*| represents the overlap between the two matrices, computed as their element-wise product, summed; |Y| and |Y*| are the numbers of elements in the respective matrices; R and θ are the predicted axis-parallel rectangular frame and its inclination angle, and R* and θ* are the corresponding real labels.
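A minimal sketch of this combined score/geometry loss, with score maps flattened to lists; the dice-style overlap and the λθ = 10 weighting follow the formulas above, while the helper names are our own:

```python
import math

def score_loss(y_pred, y_true):
    # formula (7): the overlap |Y ∩ Y*| is the element-wise product of the
    # two score maps, summed
    inter = sum(p * t for p, t in zip(y_pred, y_true))
    return 1.0 - 2.0 * inter / (sum(y_pred) + sum(y_true))

def rect_iou(a, b):
    # IoU of two axis-parallel rectangles (x1, y1, x2, y2)
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union

def geometry_loss(r_pred, r_true, theta_pred, theta_true, lam_theta=10.0):
    # formula (8): L_g = L_AABB + λθ·Lθ, with L_AABB = -log IoU(R, R*)
    # (formula (9)) and Lθ = 1 - cos(θ - θ*) (formula (10))
    l_aabb = -math.log(rect_iou(r_pred, r_true))
    l_theta = 1.0 - math.cos(theta_pred - theta_true)
    return l_aabb + lam_theta * l_theta

def total_loss(y_pred, y_true, r_pred, r_true, th_pred, th_true, lam_g=1.0):
    # formula (6): L = L_s + λg·L_g
    return score_loss(y_pred, y_true) + lam_g * geometry_loss(
        r_pred, r_true, th_pred, th_true)
```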
Preferably, the processing of the candidate frames in step S402 is: traversing all text detection candidate frames and screening out the candidate frames that contain characters and have a confidence higher than a threshold; performing intersection-over-union judgment on adjacent candidate frames and merging them when the intersection ratio is greater than a threshold; and then performing non-maximum suppression on the candidate frames remaining after this traversal to obtain the nameplate text detection result.
Preferably, step S5 processes the nameplate text detection result through a convolutional layer, a recurrent layer and a transcription layer to obtain the power equipment nameplate text information recognition result, and specifically includes the following steps:

s501, in the convolutional layer, extracting the feature sequence of the nameplate text image using a CNN convolutional neural network and inputting the feature sequence into the recurrent layer;

s502, in the recurrent layer, learning the feature sequence using an LSTM long short-term memory recurrent network and predicting the label distribution of the text sequence;

s503, in the transcription layer, converting the label distribution of the text sequence into a recognition result using the CTC algorithm;

s504, correcting the text recognition result to obtain the power equipment nameplate text information recognition result.
Preferably, the extraction of the feature sequence of the nameplate text image using CNN in step S501 is as follows: firstly, the nameplate text image is converted into a gray-scale image and scaled to a fixed height; convolutional-layer down-sampling is then performed so that the height of the output feature sequence is 1 and its width is 1/4 of the nameplate text image width, with each position of the sequence containing 512 feature values.
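The stated output dimensions can be checked with a small shape calculation. The pooling layout below is a typical CRNN-style configuration and is assumed, since the text only fixes the resulting height 1, width W/4 and 512 feature values:

```python
# (height, width) pooling strides of the convolutional stack -- a common
# CRNN layout, assumed here; only the first two stages halve the width,
# so the sequence length comes out as W/4
POOLS = [(2, 2), (2, 2), (2, 1), (2, 1)]

def feature_sequence_shape(img_w, img_h=32, channels=512):
    h, w = img_h, img_w
    for ph, pw in POOLS:
        h //= ph
        w //= pw
    h -= 1  # a final unpadded conv collapses the remaining height 2 -> 1 (assumed)
    return h, w, channels
```

For a 32-pixel-high crop of width 100 this yields a sequence of 25 steps, each carrying 512 features, matching the height-1, width-W/4 description.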
Preferably, the step of correcting the text recognition result in step S504 is: firstly, establishing a dictionary library from high-frequency words in power equipment nameplate text; then calculating the similarity between the recognition result output by the transcription layer and the words in the dictionary, replacing the recognition result with the dictionary word when the similarity is greater than a set threshold, and keeping the original recognition result when it is less than the threshold; wherein the following are defined:

In formulas (11) to (12): w1 is the text recognition result string (with its length); w2 is a character string to be matched in the dictionary (with its own length and its word-frequency ranking in the dictionary); N_dict is the number of vocabulary entries in the dictionary; λ is the dictionary ranking weight; D(w1, w2) is the degree of difference between w1 and w2; and S(w1, w2) is the similarity between w1 and w2.
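Since formulas (11) and (12) are not reproduced in this text, the sketch below uses a plausible instantiation only: Levenshtein edit distance for the difference D(w1, w2) and a rank-boosted normalized similarity for S(w1, w2). The function names, the λ value and the threshold are all illustrative:

```python
def edit_distance(a, b):
    # Levenshtein distance, a usual choice for D(w1, w2) -- an assumption,
    # as the patent's own formula is not reproduced here
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def similarity(w1, w2, rank, n_dict, lam=0.1):
    # hypothetical S(w1, w2): normalized edit-distance similarity with a
    # small boost for words ranked high (frequent) in the dictionary
    diff = edit_distance(w1, w2) / max(len(w1), len(w2))
    return (1.0 - diff) * (1.0 + lam * (n_dict - rank) / n_dict)

def correct(result, dictionary, threshold=0.8, lam=0.1):
    # replace the recognition result with the most similar dictionary word,
    # but only when that similarity exceeds the set threshold
    best, best_s = result, threshold
    for rank, word in enumerate(dictionary, 1):
        s = similarity(result, word, rank, len(dictionary), lam)
        if s > best_s:
            best, best_s = word, s
    return best
```

A near-miss such as "voltage1" would be snapped to the dictionary entry "voltage", while a string unlike any dictionary word is left unchanged.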
The invention has the beneficial effects that:
the method and the device realize automatic identification of the nameplate text information of the electric power equipment by utilizing the algorithm based on deep learning, correct the result, intelligently and accurately realize end-to-end automatic identification of the nameplate text of the electric power equipment from nameplate positioning and text detection text identification, realize automatic acquisition of the information of the electric power equipment, solve the key problem of automatic management of the information of the electric power equipment and improve the efficiency and the accuracy of the information acquisition of the electric power equipment.
Drawings
The invention will be further described with reference to the accompanying drawings.
FIG. 1 is an overall flow chart of a method for identification of textual information for a nameplate of an electrical power apparatus of the present invention;
FIG. 2 is a flowchart of the method step S2 of the identification of nameplate text information of the power equipment of the present invention;
FIG. 3 is a flowchart of the method step S4 of the identification of nameplate text information of the power equipment of the present invention;
FIG. 4 is a diagram of an embodiment of a step S4 of the method for recognizing nameplate text information of an electrical device according to the present invention;
fig. 5 is a flowchart of the method step S5 of the identification of nameplate text information of the power equipment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it is to be understood that the terms "opening," "upper," "lower," "thickness," "top," "middle," "length," "inner," "peripheral," and the like are used in an orientation or positional relationship that is merely for convenience in describing and simplifying the description, and do not indicate or imply that the referenced component or element must have a particular orientation, be constructed and operated in a particular orientation, and thus should not be considered as limiting the present invention.
As shown in fig. 1, a method for recognizing textual information of a nameplate of an electrical device includes the following steps:
s1, acquiring an input image;
s2, positioning the power equipment nameplate in the input image and extracting positioning information by using a target detection algorithm based on deep learning;
s3, performing text inclination angle calculation on the positioned electric power nameplate image by utilizing perspective transformation to obtain an inclination angle of each pixel point in the nameplate area;
s4, detecting text information in the nameplate image by utilizing a text detection algorithm based on deep learning and combining the inclination angle information to obtain a nameplate text detection result;
and S5, automatically recognizing text character information in the nameplate text detection result by using a text recognition algorithm based on deep learning to obtain a nameplate text information recognition result of the power equipment.
Step S1 specifically includes:
the method includes the steps that operation and maintenance personnel manually shoot pictures of the power equipment, an unmanned aerial vehicle automatically shoots the pictures of the power equipment, monitoring videos and extracting the pictures of the power equipment and the like, images containing nameplates of the power equipment are obtained, and the images are used as input images of the method.
In one example, as shown in fig. 2, step S2 specifically includes:
s201, preprocessing the data. Marking the acquired image containing the power equipment nameplate, framing out a nameplate area, and marking a corresponding label on the power equipment nameplate, the power equipment notice plate and the like; and carrying out data augmentation operations such as rotation, scaling, clipping and the like on the acquired image, and expanding a data set.
S202, pre-searching the neural network hyper-parameter. Selecting various hyper-parameters of the network model by adopting a random parameter search mode, wherein the various hyper-parameters include but are not limited to GIoU loss weight, classification loss weight, initial learning rate and the like, and adopting the weighted value of mAP and F1 as the basis for measuring the comprehensive performance index of the model and selecting the hyper-parameters; selecting a group of hyper-parameters with the best performance according to the above basis in the early training, and randomly changing the hyper-parameters on the basis to be used as the parameters of the next training; when the maximum number of iterations is reached, the update is stopped.
Wherein the weighted value of mAP and F1 is defined as M = λ1 × F1 + λ2 × mAP; in this example, λ1 and λ2 are taken as 0.3 and 0.7, respectively.
The hyper-parameter random change rule is ζi' = ζi × (ri × bi + 1)^2, where ζi is the i-th hyper-parameter, ri is a random number obeying the standard normal distribution, and bi is the floating factor of the hyper-parameter; in this example, bi is set to 0.2 for normal hyper-parameters and 0.02 for the smaller ones.
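The selection criterion and the perturbation rule above can be sketched as follows; which hyper-parameters receive the smaller floating factor is an assumption (here, the initial learning rate):

```python
import random

SMALL_FACTOR = {"initial_lr"}  # which hyper-parameters get b = 0.02 is assumed

def composite_metric(f1, mAP, lam1=0.3, lam2=0.7):
    # M = λ1·F1 + λ2·mAP, the model-selection criterion of the example
    return lam1 * f1 + lam2 * mAP

def perturb(params, rng=random):
    # ζi' = ζi × (ri·bi + 1)^2 with ri ~ N(0, 1); bi = 0.2 normally,
    # 0.02 for the smaller hyper-parameters
    out = {}
    for name, value in params.items():
        b = 0.02 if name in SMALL_FACTOR else 0.2
        r = rng.gauss(0.0, 1.0)
        out[name] = value * (r * b + 1.0) ** 2
    return out
```

Squaring keeps every perturbed hyper-parameter non-negative while letting values drift multiplicatively around their current setting.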
S203, clustering anchor parameters, in order to enable the anchor parameters to be as close to the size of the nameplate to be measured as possible, acquiring the anchor parameters by adopting K-means clustering based on IoU distance, and finally obtaining the anchor parameters of 9 clustering centers.
The IoU-based distance calculation formula is d(box, centroid) = 1 − IoU(box, centroid), where box represents a sample frame in the data set, centroid represents the cluster-center frame, and IoU is the intersection-over-union of the two.
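A minimal IoU-distance K-means sketch: boxes are (width, height) pairs aligned at a common corner, as is usual for anchor clustering, and the deterministic area-spread initialization is our own choice:

```python
def iou_wh(a, b):
    # IoU of two (width, height) boxes aligned at a common corner
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)

def kmeans_anchors(boxes, k=9, iters=50):
    # d(box, centroid) = 1 - IoU(box, centroid); a plain Lloyd-style loop
    by_area = sorted(boxes, key=lambda b: b[0] * b[1])
    # spread the initial centroids over the boxes sorted by area (assumed init)
    centroids = [by_area[i * (len(by_area) - 1) // max(k - 1, 1)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            i = min(range(k), key=lambda j: 1.0 - iou_wh(b, centroids[j]))
            clusters[i].append(b)
        centroids = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c))
            if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return sorted(centroids)
```

Using 1 − IoU instead of Euclidean distance groups boxes by shape overlap rather than raw size, which is what makes the resulting anchors fit the nameplate frames.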
S204, processing the input image by using a convolutional neural network and a residual error network to generate a feature map of the input image, processing the input image by using a Darknet-53 network to extract the feature map with the dimension of the input image 1/32, and then obtaining the feature maps with the dimensions of the input images 1/16 and 1/8 by means of upsampling and tensor connection;
s205, based on the feature map of the input image, locating a nameplate region in the input image, including:
Regression prediction is performed based on the feature maps of the input image combined with the prior anchor parameters. The regression prediction results are feature maps at three scales of the form N×N×M×L, where N is 1/32, 1/16 and 1/8 of the input image size (and also the number of feature units per side), M is the number of anchor boxes preset per feature unit, and L contains the prediction box position information, the nameplate confidence and the category information;
Non-maximum suppression is then performed on all prediction frames, and the nameplate area in the image is located.
The loss function of the neural network in steps S204 and S205 includes three parts, coordinate loss, target loss and classification loss: L = λ_coord·L_coord + (λ_obj·L_obj + λ_noobj·L_noobj) + λ_cls·L_cls, where L represents the total loss; L_coord denotes the coordinate loss; L_cls represents the classification loss; L_obj and L_noobj are the target losses for candidate boxes that do and do not contain a target, respectively; λ_coord, λ_obj, λ_noobj and λ_cls are the weights of the different losses, adjusted as hyper-parameters.

When the prediction box and the real box are A and B, respectively, and C is the minimum convex set of A and B:

GIoU = IoU − |C − (A ∪ B)| / |C|

L_coord = 1 − GIoU

L_obj = −α(1 − y*)^γ · log(y*)

L_noobj = −(1 − α)(y*)^γ · log(1 − y*)

where IoU is the intersection-over-union ratio; p_i is the output class probability and p̂_i its corresponding label value; y* is the predicted value of the model and y is the real value of the sample label (y = 1 in the former formula, y = 0 in the latter); γ is a weight factor that reduces the loss of easily classified samples so that the model focuses on samples that are hard to recognize and easy to misclassify; α is a balance factor used to address the imbalance between positive and negative samples.
The nameplate is positioned by the method of step S2 of the invention and the effect is compared with the YOLOv3 algorithm; the mAP values of the different models are shown in Table 1:
TABLE 1 mAP values of different models
The step S3 specifically includes:
s301, converting an input image into a gray-scale image, and performing Gaussian filtering;
s302, performing edge detection on the image; when enough edges are detected, performing straight-line detection using the Hough transform and removing lines with unreasonable inclination angles to obtain a sufficient number of lines;
s303, according to the detected straight line set, performing linear fitting of the perspective transformation coefficients μ1 and μ2 using the slopes and intercepts of the lines, calculating the slope of each pixel point in the image, and generating a tilt angle matrix.
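Steps S302 and S303 can be sketched as a least-squares fit followed by a per-pixel angle evaluation. The linear model slope ≈ μ1·intercept + μ2 is an assumed reading of how the two perspective coefficients are fitted from the detected lines:

```python
import math

def fit_perspective_coeffs(lines):
    # lines: [(slope, intercept)]; least-squares fit slope ≈ μ1·intercept + μ2
    # (assumed model: under perspective, the local text slope varies roughly
    # linearly with the line's vertical position)
    n = len(lines)
    sx = sum(b for _, b in lines)
    sy = sum(m for m, _ in lines)
    sxx = sum(b * b for _, b in lines)
    sxy = sum(b * m for m, b in lines)
    mu1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    mu2 = (sy - mu1 * sx) / n
    return mu1, mu2

def tilt_angle_matrix(h, w, mu1, mu2):
    # per-pixel inclination angle (degrees) from the fitted slope at each row
    return [[math.degrees(math.atan(mu1 * y + mu2)) for _ in range(w)]
            for y in range(h)]
```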
As shown in fig. 3, the step S4 specifically includes:
S401, label processing. Because the text labels in the collected power equipment nameplates are irregular quadrilateral labels while the neural network requires rectangular labels with an inclination angle, the manually labeled quadrilateral frames are first shrunk by 0.3 times the side length to reduce manual labeling error; a score label is then generated from the shrunken quadrilateral frame, with pixels inside the frame set to 1 and pixels outside set to 0; finally, the minimum circumscribed rectangle of the quadrilateral label is generated, and an axis-parallel rectangular frame label and an inclination angle label are generated from the distance of each point in the frame to each side of the rectangle.
S402, preprocessing the image. Data cleaning and image augmentation of image data: reading the marking information, and deleting irregular marking frames (such as dislocation of the vertex sequence of the rectangular frame, small frame and the like) in the marking information; and performing augmentation operations such as cropping and zooming on the image.
S403, predicting the text in the nameplate image by using a text detection algorithm based on deep learning, wherein the method comprises the following steps:
Image features are extracted using a ResNet50 network to obtain feature maps at 1/32, 1/16, 1/8 and 1/4 of the input image scale; all feature layers are merged by gradually fusing upper-layer features with lower-layer features through up-sampling and convolution networks; and finally, combining the tilt angle matrix of step S3, a number of text detection candidate frames are obtained together with their coordinate positions and confidence information. The output contains 5 channels: a 4-channel axis-parallel rectangular frame vector and a 1-channel rotation angle vector, with the inclination angle in the range (−45°, 45°);
S404, processing the candidate boxes according to the preliminarily obtained coordinate positions and confidences of the text detection candidate boxes to obtain the final nameplate text detection result, comprising:
traversing all the text detection candidate boxes, screening out the candidate boxes which contain characters and whose confidence is higher than a threshold, computing the intersection-over-union (IoU) of adjacent candidate boxes, and merging candidate boxes whose IoU exceeds a threshold;
and then applying non-maximum suppression to the candidate boxes remaining after this single traversal to obtain the nameplate text detection result.
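The screening, IoU-based merging and non-maximum suppression of step S404 can be sketched as follows. The thresholds (0.8, 0.5, 0.3) and the union-box merging rule are illustrative assumptions, since the text does not fix them:

```python
def iou(a, b):
    """Intersection-over-union of two axis-parallel boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def filter_merge_nms(boxes, scores, conf_th=0.8, merge_th=0.5, nms_th=0.3):
    """Keep boxes above conf_th, merge neighbours whose IoU exceeds
    merge_th (here: replace the pair by their bounding union), then
    apply standard non-maximum suppression to the remainder."""
    kept = [(b, s) for b, s in zip(boxes, scores) if s >= conf_th]
    merged = []
    for b, s in kept:                       # one merging pass, as in the text
        for i, (m, ms) in enumerate(merged):
            if iou(b, m) > merge_th:
                merged[i] = ((min(b[0], m[0]), min(b[1], m[1]),
                              max(b[2], m[2]), max(b[3], m[3])), max(s, ms))
                break
        else:
            merged.append((b, s))
    merged.sort(key=lambda t: -t[1])        # NMS: highest score first
    out = []
    for b, s in merged:
        if all(iou(b, o) <= nms_th for o, _ in out):
            out.append((b, s))
    return out
```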
The loss function of the neural network in the above steps S403 and S404 is composed of a text score loss and a geometry loss: L=Ls+λgLg, wherein L is the total loss, Ls is the text score loss, Lg is the geometry loss, and λg is the loss weight, set to 1 in this example;
wherein Ls=1-2|Y∩Y*|/(|Y|+|Y*|), in which Y denotes the predicted-value matrix of text scores and Y* the corresponding label-value matrix; |Y∩Y*| denotes the overlap between Y and Y*, computed as the element-wise product of the two matrices summed; |Y| and |Y*| are respectively the numbers of elements in the two matrices;
and Lg=LAABB+λθLθ with Lθ=1-cos(θ-θ*), where R and θ are respectively the predicted axis-parallel rectangular box area and its inclination angle, R* and θ* are the corresponding true labels, and the loss weight λθ is set to 10.
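Assembled from the definitions above (dice-style score loss from the matrix overlap, LAABB plus angle term, λg = 1, λθ = 10), a minimal sketch of the total loss might look like this; `iou_pred_true` stands in for the IoU of the predicted and true axis-parallel rectangles, which the text leaves implicit:

```python
import numpy as np

def detection_loss(Y, Y_true, theta, theta_true, iou_pred_true,
                   lam_g=1.0, lam_theta=10.0):
    """Total loss L = Ls + lam_g * Lg as described in the text.

    Ls is the dice-style score loss built from the overlap |Y ∩ Y*|
    (element-wise product, summed); Lg = L_AABB + lam_theta * L_theta
    with L_theta = 1 - cos(theta - theta*).  A sketch under the stated
    assumptions, not the patent's exact code.
    """
    inter = float((Y * Y_true).sum())
    Ls = 1.0 - 2.0 * inter / (Y.sum() + Y_true.sum())
    L_aabb = -np.log(iou_pred_true)                    # EAST-style AABB term
    L_theta = float(np.mean(1.0 - np.cos(theta - theta_true)))
    Lg = L_aabb + lam_theta * L_theta
    return Ls + lam_g * Lg
```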
The nameplate text is detected by the method of step S4 of the invention; the detection effect is shown in FIG. 4. Compared with the standard EAST (Efficient and Accurate Scene Text Detector) algorithm on the test set, the evaluation index F1 is shown in Table 2:
TABLE 2 evaluation index F1
As shown in fig. 5, the step S5 specifically includes:
S501, extracting a feature sequence of the nameplate text image by using a CNN in the convolutional layer, and inputting the feature sequence into the recurrent layer, comprising the following steps:
converting the nameplate text image to grayscale and scaling it to a fixed height, fixed to 32 pixels in this example, giving an input size of 32 × W × 1, where W is the width of the scaled input image;
performing convolutional down-sampling so that the output feature sequence has a height of 1 and a width 1/4 that of the nameplate text image, and comprises 512 feature values; the feature map size is 1 × (W/4) × 512;
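The S501 preprocessing and the resulting feature-map shape can be sketched as follows, assuming a simple mean-channel grayscale and nearest-neighbour resize (the text does not specify the interpolation method):

```python
import numpy as np

def preprocess(img):
    """Grayscale + scale to fixed height 32, as described: the CRNN
    input is 32 x W x 1, with W chosen to preserve the aspect ratio."""
    if img.ndim == 3:
        img = img.mean(axis=2)                 # to grayscale
    h, w = img.shape
    new_w = max(1, round(w * 32 / h))          # keep aspect ratio
    ys = (np.arange(32) * h / 32).astype(int)  # nearest-neighbour rows
    xs = (np.arange(new_w) * w / new_w).astype(int)
    resized = img[ys][:, xs]
    return resized[..., None]                  # shape 32 x W x 1

def feature_shape(input_w):
    """Shape of the CNN output sequence for a 32 x W x 1 input:
    height 1, width W/4, 512 feature values per step."""
    return (1, input_w // 4, 512)
```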
S502, predicting the label distribution of the text sequence by using an LSTM to learn the feature sequence in the recurrent layer;
S503, converting the label distribution of the text sequence into a recognition result by using the standard CTC algorithm in the transcription layer;
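At decode time, the CTC transcription of S503 reduces to collapsing repeated labels and removing blanks from the per-frame best labels; a minimal greedy-decoding sketch:

```python
def ctc_greedy_decode(label_seq, blank=0):
    """Collapse a per-frame best-label sequence into output label indices
    the standard CTC way: merge consecutive repeats, then drop blanks.
    `label_seq` is the argmax label at each time step."""
    out, prev = [], None
    for l in label_seq:
        if l != prev and l != blank:
            out.append(l)
        prev = l
    return out
```

A character at two consecutive time steps is emitted once, while the same character repeated across a blank is emitted twice, which is exactly what distinguishes CTC output from naive deduplication.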
S504, correcting the text recognition result to obtain the power equipment nameplate text information recognition result, comprising the following steps:
establishing a dictionary base according to high-frequency words in the nameplate text of the power equipment;
calculating the similarity between the recognition result output by the transcription layer and the words in the dictionary, and replacing the recognition result with the words in the dictionary library when the similarity is greater than a set threshold value; and when the similarity is smaller than a set threshold value, keeping the original recognition result.
Here, w1 is the text recognition result string and w2 a character string to be matched in the dictionary; the word-frequency rank of w2 in the dictionary and Ndict, the number of vocabulary entries in the dictionary, enter the definition together with λ, the dictionary ranking weight, set to 0.1 in this example; d(w1, w2) is the degree of difference between w1 and w2; s(w1, w2) is the similarity between them.
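Since formulas (11)-(12) are not reproduced here, the following is one plausible instantiation of the S504 correction: Levenshtein distance for d(w1, w2), a rank-penalised normalised similarity for s(w1, w2) with λ = 0.1 from the text, and an assumed acceptance threshold of 0.7:

```python
def edit_distance(a, b):
    """Levenshtein distance, a common choice for the difference d(w1, w2)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[-1] + 1,            # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def similarity(w1, w2, rank, n_dict, lam=0.1):
    """One plausible s(w1, w2): normalised edit similarity penalised by
    the word's frequency rank (the exact formulas are not reproduced)."""
    d = edit_distance(w1, w2)
    return 1.0 - d / max(len(w1), len(w2)) - lam * rank / n_dict

def correct(result, dictionary, lam=0.1, threshold=0.7):
    """Replace the recognition result by the best dictionary word when the
    similarity exceeds the threshold, else keep the original result.
    Rank is approximated by the word's position in the list."""
    n = len(dictionary)
    best = max(dictionary,
               key=lambda w: similarity(result, w, dictionary.index(w), n, lam))
    if similarity(result, best, dictionary.index(best), n, lam) >= threshold:
        return best
    return result
```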
By the above method, the character-level recognition accuracy of power equipment nameplate text reaches 93.2%, and the text-line recognition accuracy reaches 82.3%.
In the description herein, references to the description of "one embodiment," "an example," "a specific example" or the like are intended to mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The foregoing shows and describes the general principles, essential features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are described in the specification and illustrated only to illustrate the principle of the present invention, but that various changes and modifications may be made therein without departing from the spirit and scope of the present invention, which fall within the scope of the invention as claimed.
Claims (10)
1. A method for recognizing text information of a nameplate of electric power equipment is characterized by comprising the following steps:
S1, acquiring an input image;
S2, positioning the power equipment nameplate in the input image and extracting positioning information by using a target detection algorithm based on deep learning;
S3, performing text inclination angle calculation on the positioned electric power nameplate image by utilizing perspective transformation to obtain an inclination angle of each pixel point in the nameplate area;
S4, detecting text information in the nameplate image by utilizing a text detection algorithm based on deep learning and combining the inclination angle information to obtain a nameplate text detection result;
and S5, automatically recognizing text character information in the nameplate text detection result by using a text recognition algorithm based on deep learning to obtain a nameplate text information recognition result of the power equipment.
2. The method for recognizing nameplate text information of an electric power apparatus as claimed in claim 1, wherein the step S2 includes:
S201, processing the input image by using a convolutional neural network and a residual network to generate three feature maps with different scales, the scales of the feature maps being 1/32, 1/16 and 1/8 of the input image respectively;
S202, performing regression prediction based on the feature maps of the input image combined with the prior anchor box parameters, and positioning the nameplate region in the input image.
3. The method for recognizing textual information of a nameplate of electric power equipment as recited in claim 2, wherein the loss function L of the neural network in step S201 comprises three parts, namely coordinate loss, target loss and classification loss, specifically:
L=λcoordLcoord+(λobjLobj+λnoobjLnoobj)+λclsLcls (1)
in formula (1), L denotes the total loss; Lcoord denotes the coordinate loss; Lcls denotes the classification loss; Lobj and Lnoobj are respectively the losses when a target is present and absent in the candidate box; λcoord, λobj, λnoobj and λcls are the weights of the different losses;
when the prediction box and the real box are A and B, respectively, and C is the minimum convex set containing A and B:
Lcoord=1-GIoU (2),
GIoU=IoU-|C\(A∪B)|/|C| (3),
Lobj=-α(1-y*)^γ log y* (4),
Lnoobj=-(1-α)(y*)^γ log(1-y*) (5),
in formulas (2) to (5), IoU is the intersection-over-union ratio; pi is the output class probability and pi* the corresponding label value; y* is the model's predicted value and y the true sample label, y being 1 in formula (4) and 0 in formula (5); γ is a weighting factor and α a balancing factor.
4. The method for recognizing nameplate text information of an electric power apparatus as claimed in claim 1, wherein the step S3 includes:
S301, detecting straight lines in the nameplate image through Hough transformation;
S302, according to the detected set of straight lines, performing linear fitting of the perspective transformation coefficients μ1 and μ2 from the slopes and intercepts, calculating the slope of each pixel point in the image, and generating the inclination angle matrix.
5. The method for recognizing nameplate text information of an electric power apparatus as claimed in claim 1, wherein the step S4 includes:
S401, predicting texts in the nameplate image by using a text detection algorithm based on deep learning: extracting image features by adopting a ResNet50 network, merging all feature layers by gradually fusing upper-layer features with lower-layer features through an upsampling and convolution network, and combining the inclination angle matrix of step S3 to obtain a plurality of text detection candidate boxes together with their coordinate positions and confidence information;
S402, processing the candidate boxes according to the preliminarily obtained coordinate positions and confidences of the text detection candidate boxes to finally obtain the text detection result of the nameplate image.
6. The method for recognizing textual information of nameplate of electric power equipment as recited in claim 5, wherein the loss function of the text detection algorithm based on deep learning in the step S401 is composed of text score loss and geometric shape loss, and satisfies the following relations:
L=Ls+λgLg (6),
Ls=1-2|Y∩Y*|/(|Y|+|Y*|) (7),
Lg=LAABB+λθLθ (8),
LAABB=-log IoU(R,R*) (9),
Lθ=1-cos(θ-θ*) (10),
in formulas (6) to (10), L is the total loss; Ls is the text score loss; Lg is the geometric loss; λg is the loss weight; Y denotes the predicted-value matrix of text scores and Y* the corresponding label-value matrix; |Y∩Y*| denotes the overlap between Y and Y*, computed as the element-wise product of the two matrices summed; |Y| and |Y*| are respectively the numbers of elements in the two matrices; R and θ are the predicted axis-parallel rectangular box area and its inclination angle, and R*, θ* are the corresponding real labels.
7. The method for recognizing the nameplate text information of the electric power equipment as claimed in claim 5, wherein the processing of the candidate boxes in step S402 is: traversing all text detection candidate boxes, screening out the candidate boxes which contain characters and whose confidence is higher than a threshold, computing the intersection-over-union of adjacent candidate boxes and merging candidate boxes whose intersection-over-union exceeds a threshold; and then applying non-maximum suppression to the candidate boxes remaining after this single traversal to obtain the nameplate text detection result.
8. The method for recognizing nameplate text information of electric power equipment as claimed in claim 1, wherein the step S5 processes the nameplate text detection result through a convolutional layer, a recurrent layer and a transcription layer to obtain the nameplate text information recognition result of the electric power equipment, and specifically includes the following steps:
S501, extracting a feature sequence of the nameplate text image in the convolutional layer by using a CNN convolutional neural network, and inputting the feature sequence into the recurrent layer;
S502, in the recurrent layer, learning the feature sequence by using an LSTM long short-term memory recurrent network, and predicting the label distribution of the text sequence;
S503, converting the label distribution of the text sequence into a recognition result by using the CTC algorithm in the transcription layer;
S504, correcting the text recognition result to obtain the nameplate text information recognition result of the electric power equipment.
9. The method for identifying nameplate text information of electric power equipment as claimed in claim 8, wherein the extraction of the feature sequence of the nameplate text image by using the CNN in step S501 is as follows: firstly, converting the nameplate text image into a grayscale image and scaling it to a fixed height; then performing convolutional down-sampling so that the output feature sequence has a height of 1 and a width 1/4 that of the nameplate text image, and comprises 512 feature values.
10. The method for recognizing the textual information of the nameplate of the power equipment as recited in claim 8, wherein the correction of the text recognition result in step S504 is: firstly, establishing a dictionary library according to high-frequency words in the nameplate text of the power equipment; then calculating the similarity between the recognition result output by the transcription layer and the words in the dictionary, replacing the recognition result with the dictionary word when the similarity is greater than a set threshold, and keeping the original recognition result when the similarity is less than the set threshold; wherein the definitions are:
in formulas (11) to (12), w1 is the text recognition result string and w2 a character string to be matched in the dictionary; the word-frequency rank of w2 in the dictionary and Ndict, the number of vocabulary entries in the dictionary, enter the definitions together with λ, the dictionary ranking weight; d(w1, w2) is the degree of difference between w1 and w2; s(w1, w2) is the similarity between them.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011327387.3A CN112446370B (en) | 2020-11-24 | 2020-11-24 | Method for identifying text information of nameplate of power equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011327387.3A CN112446370B (en) | 2020-11-24 | 2020-11-24 | Method for identifying text information of nameplate of power equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112446370A true CN112446370A (en) | 2021-03-05 |
CN112446370B CN112446370B (en) | 2024-03-29 |
Family
ID=74737363
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011327387.3A Active CN112446370B (en) | 2020-11-24 | 2020-11-24 | Method for identifying text information of nameplate of power equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112446370B (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113191343A (en) * | 2021-03-31 | 2021-07-30 | 成都飞机工业(集团)有限责任公司 | Aviation wire identification code automatic identification method based on convolutional neural network |
CN113920497A (en) * | 2021-12-07 | 2022-01-11 | 广东电网有限责任公司东莞供电局 | Nameplate recognition model training method, nameplate recognition method and related devices |
CN114863084A (en) * | 2022-04-19 | 2022-08-05 | 北京化工大学 | Nameplate identification and attachment equipment based on deep learning target detection |
CN115187881A (en) * | 2022-09-08 | 2022-10-14 | 国网江西省电力有限公司电力科学研究院 | Power equipment nameplate identification and platform area compliance automatic checking system and method |
CN115424121A (en) * | 2022-07-30 | 2022-12-02 | 南京理工大学紫金学院 | Power pressing plate switch inspection method based on computer vision |
CN116110036A (en) * | 2023-04-10 | 2023-05-12 | 国网江西省电力有限公司电力科学研究院 | Electric power nameplate information defect level judging method and device based on machine vision |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109409355A (en) * | 2018-08-13 | 2019-03-01 | 国网陕西省电力公司 | A kind of method and device of novel transformer nameplate identification |
US20190266435A1 (en) * | 2018-02-26 | 2019-08-29 | Abc Fintech Co., Ltd. | Method and device for extracting information in histogram |
CN110956171A (en) * | 2019-11-06 | 2020-04-03 | 广州供电局有限公司 | Automatic nameplate identification method and device, computer equipment and storage medium |
CN111310861A (en) * | 2020-03-27 | 2020-06-19 | 西安电子科技大学 | License plate recognition and positioning method based on deep neural network |
-
2020
- 2020-11-24 CN CN202011327387.3A patent/CN112446370B/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190266435A1 (en) * | 2018-02-26 | 2019-08-29 | Abc Fintech Co., Ltd. | Method and device for extracting information in histogram |
CN109409355A (en) * | 2018-08-13 | 2019-03-01 | 国网陕西省电力公司 | A kind of method and device of novel transformer nameplate identification |
CN110956171A (en) * | 2019-11-06 | 2020-04-03 | 广州供电局有限公司 | Automatic nameplate identification method and device, computer equipment and storage medium |
CN111310861A (en) * | 2020-03-27 | 2020-06-19 | 西安电子科技大学 | License plate recognition and positioning method based on deep neural network |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113191343A (en) * | 2021-03-31 | 2021-07-30 | 成都飞机工业(集团)有限责任公司 | Aviation wire identification code automatic identification method based on convolutional neural network |
CN113920497A (en) * | 2021-12-07 | 2022-01-11 | 广东电网有限责任公司东莞供电局 | Nameplate recognition model training method, nameplate recognition method and related devices |
CN114863084A (en) * | 2022-04-19 | 2022-08-05 | 北京化工大学 | Nameplate identification and attachment equipment based on deep learning target detection |
CN115424121A (en) * | 2022-07-30 | 2022-12-02 | 南京理工大学紫金学院 | Power pressing plate switch inspection method based on computer vision |
CN115424121B (en) * | 2022-07-30 | 2023-10-13 | 南京理工大学紫金学院 | Electric power pressing plate switch inspection method based on computer vision |
CN115187881A (en) * | 2022-09-08 | 2022-10-14 | 国网江西省电力有限公司电力科学研究院 | Power equipment nameplate identification and platform area compliance automatic checking system and method |
CN116110036A (en) * | 2023-04-10 | 2023-05-12 | 国网江西省电力有限公司电力科学研究院 | Electric power nameplate information defect level judging method and device based on machine vision |
Also Published As
Publication number | Publication date |
---|---|
CN112446370B (en) | 2024-03-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112446370B (en) | Method for identifying text information of nameplate of power equipment | |
CN110837835B (en) | End-to-end scene text identification method based on boundary point detection | |
WO2020259060A1 (en) | Test paper information extraction method and system, and computer-readable storage medium | |
US20200302248A1 (en) | Recognition system for security check and control method thereof | |
WO2019192397A1 (en) | End-to-end recognition method for scene text in any shape | |
US20210374466A1 (en) | Water level monitoring method based on cluster partition and scale recognition | |
CN100565559C (en) | Image text location method and device based on connected component and support vector machine | |
CN112733822B (en) | End-to-end text detection and identification method | |
CN108805076B (en) | Method and system for extracting table characters of environmental impact evaluation report | |
CN110598693A (en) | Ship plate identification method based on fast-RCNN | |
CN112307919B (en) | Improved YOLOv 3-based digital information area identification method in document image | |
CN114155527A (en) | Scene text recognition method and device | |
CN111553346A (en) | Scene text detection method based on character region perception | |
CN115019103A (en) | Small sample target detection method based on coordinate attention group optimization | |
CN114241469A (en) | Information identification method and device for electricity meter rotation process | |
CN113688821B (en) | OCR text recognition method based on deep learning | |
CN114694130A (en) | Method and device for detecting telegraph poles and pole numbers along railway based on deep learning | |
Zhang et al. | A vertical text spotting model for trailer and container codes | |
CN111832497B (en) | Text detection post-processing method based on geometric features | |
CN113971809A (en) | Text recognition method and device based on deep learning and storage medium | |
CN114332473A (en) | Object detection method, object detection device, computer equipment, storage medium and program product | |
CN112633116B (en) | Method for intelligently analyzing PDF graphics context | |
CN113435441A (en) | Bi-LSTM mechanism-based four-fundamental operation formula image intelligent batch modification method | |
CN114494678A (en) | Character recognition method and electronic equipment | |
CN113313678A (en) | Automatic sperm morphology analysis method based on multi-scale feature fusion |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||