CN115713772A - Transformer substation panel character recognition method, system, equipment and storage medium - Google Patents

Transformer substation panel character recognition method, system, equipment and storage medium

Info

Publication number
CN115713772A
Authority
CN
China
Prior art keywords
character
panel
transformer substation
recognition
character recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211097849.6A
Other languages
Chinese (zh)
Inventor
陈中
李冰融
谭林林
娄骐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN202211097849.6A
Publication of CN115713772A
Legal status: Pending

Landscapes

  • Character Discrimination (AREA)

Abstract

The invention provides a transformer substation panel character recognition method, system, equipment and storage medium, and relates to the field of power grid inspection and maintenance. The method comprises: obtaining an original picture and preprocessing it; performing character recognition processing on the preprocessed picture; constructing a transformer substation panel character information database from the provided instrument panel information photos; and correcting the character recognition result according to the database to obtain a final recognition result. The character information recognized by CNOCR is optimized in combination with the panel text information database, so that the panel character information is recognized with high accuracy and the reliability of the recognition result is improved. Lightweight OCR is used for recognition, so that recognition is fast and the data volume is small while accuracy remains high.

Description

Transformer substation panel character recognition method, system, equipment and storage medium
Technical Field
The invention relates to the technical field of power grid inspection maintenance, in particular to a transformer substation panel character recognition method, a transformer substation panel character recognition system, transformer substation panel character recognition equipment and a storage medium.
Background
Optical character recognition (OCR) traditionally refers to the process of recognizing an image of textual material and returning it in a computer-readable text form. A more difficult application is scene text recognition. In recent years, owing to the rapid development of neural networks, the ability to recognize regular scene text has improved greatly, and features extracted by neural networks have come to dominate over handcrafted features. After many years of development, character recognition technology has achieved considerable results in many fields in China. However, current character recognition technology still has many shortcomings, and designing a character recognition system with a high recognition rate, high robustness and high accuracy remains the goal of much research. As new character recognition algorithms continue to be proposed and optimized and computing power improves, character recognition is applied ever more widely. As power system operation becomes more intelligent, deep-learning-based character recognition is gradually being applied in the power field, especially to text information processing in power systems, most commonly the processing of text on power equipment and on drawings.
Current character recognition technology has many shortcomings: the accuracy of recognizing panel character information is low, the reliability of the recognition result is insufficient, and recognition is slow.
Disclosure of Invention
(I) Technical problem to be solved
In view of the shortcomings of the prior art, the invention provides a transformer substation panel character recognition method, system, equipment and storage medium, which address the low accuracy of panel character recognition, the insufficient reliability of the recognition result and the slow recognition speed of current character recognition technology.
(II) Technical solution
To achieve the above purpose, the invention is realized by the following technical solution:
In one aspect, a transformer substation panel character recognition method is provided, the method including:
acquiring an original picture, and preprocessing the original picture;
performing character recognition processing on the preprocessed pictures;
constructing a transformer substation panel character information database according to the provided instrument panel information photos;
and correcting the character recognition processing result according to the transformer substation panel character information database to obtain a final recognition result.
Preferably, the preprocessing the original picture specifically includes:
preprocessing the original picture by graying and binarization: parameters are selected according to the characteristics of the application scene, the image pixel matrix is extracted and grayed, and binarization is then performed by threshold judgment; wherein the graying function is:
GRAY=0.114B+0.587G+0.299R
where GRAY is the gray value matrix of the grayed picture, and R, G and B are the matrices formed by the R, G and B components of each pixel of the original picture.
Preferably, the character recognition processing is performed on the preprocessed picture, and the character recognition processing includes:
improving the CNOCR algorithm, and performing character recognition processing on the preprocessed picture through the improved CNOCR algorithm;
the improvement of the CNOCR algorithm is specifically as follows:
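in the binary classification case, for a sample x of class 1, the normalized exponential function (softmax) loss of the CNOCR algorithm drives the optimization toward:
$$\|W_1\|\,\|x\|\cos(\theta_1) > \|W_2\|\,\|x\|\cos(\theta_2)$$
to make the classification stricter, the following limit is imposed:
$$\|W_1\|\,\|x\|\cos(m\theta_1) > \|W_2\|\,\|x\|\cos(\theta_2), \quad 0 \le \theta_1 \le \frac{\pi}{m}$$
which gives:
$$\|W_1\|\,\|x\|\cos(\theta_1) \ge \|W_1\|\,\|x\|\cos(m\theta_1) > \|W_2\|\,\|x\|\cos(\theta_2)$$
wherein $x$ is the output feature extracted by the feature layer, $x_i$ is the $i$-th output feature extracted by the feature layer, $W$ is the class weight matrix of the classification layer, $y_i$ is the number of the class to which $x_i$ belongs, $W_{y_i}$ is the $y_i$-th column of $W$, $\theta_j$ is the angle between $x_i$ and $W_j$ (so that $\cos\theta_j$ is their cosine similarity), $m$ is a positive integer that sets the angular margin, and $k$ is a non-negative integer serving as a parameter.
The normalized exponential function loss function is:
$$L_i = -\log\left(\frac{e^{\|W_{y_i}\|\,\|x_i\|\cos(\theta_{y_i})}}{\sum_{j} e^{\|W_j\|\,\|x_i\|\cos(\theta_j)}}\right)$$
After the decision margin is added, the final large-margin normalized exponential function loss function is:
$$L_i = -\log\left(\frac{e^{\|W_{y_i}\|\,\|x_i\|\,\psi(\theta_{y_i})}}{e^{\|W_{y_i}\|\,\|x_i\|\,\psi(\theta_{y_i})} + \sum_{j \ne y_i} e^{\|W_j\|\,\|x_i\|\cos(\theta_j)}}\right), \quad \psi(\theta) = (-1)^k\cos(m\theta) - 2k, \quad \theta \in \left[\frac{k\pi}{m}, \frac{(k+1)\pi}{m}\right], \quad k \in \{0, \dots, m-1\}$$
The optimal inter-class margin for each scene is determined by adjusting the size of $m$, so that the best recognition effect is achieved.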
Preferably, the constructing of the transformer substation panel text information database includes:
preliminarily inputting the text content of the panel according to the information photo of the instrument panel;
normalizing the different expressions that appear on different equipment: before matching, the fixed-value list is normalized, common irregular names are normalized, and different expressions with the same meaning are grouped under the same subject, that is, divided into a standard name and the secondary names under that standard name, with all secondary names under the same standard name given the same meaning in the electrical logic relationship;
counting the information of each panel, and placing high-frequency standard names first and, within the same subject, high-frequency secondary names first;
and finally obtaining a character information database ordered from high to low by frequency of occurrence, that is, a weighted character information database.
Preferably, the correcting the character recognition processing result according to the transformer substation panel character information database includes:
eliminating, according to the panel character information database, irrelevant garbled characters produced by noise in the result;
calculating the similarity between each database entry and the content to be matched by a fuzzy matching algorithm that combines keywords with the Levenshtein distance and uses the longest subsequence shared by the string being matched and the panel information, and selecting the entry with the highest similarity as output;
the Levenshtein distance is the minimum number of edits required to change one character string into the other, and the similarity between the two character strings is determined according to the Levenshtein distance.
Preferably, the state transition equation of the Levenshtein distance is as follows:
$$d_{lev_{a,b}}(i,j) = \begin{cases} \max(i,j) & \text{if } \min(i,j) = 0 \\ \min \begin{cases} d_{lev_{a,b}}(i-1,j) + 1 \\ d_{lev_{a,b}}(i,j-1) + 1 \\ d_{lev_{a,b}}(i-1,j-1) + 1_{(a_i \ne b_j)} \end{cases} & \text{otherwise} \end{cases}$$
wherein $d_{lev_{a,b}}(i,j)$ represents the Levenshtein distance between the first $i$ characters of $a$ and the first $j$ characters of $b$, and the indicator $1_{(a_i \ne b_j)}$ is 0 when $a_i = b_j$ and 1 otherwise.
Preferably, the character recognition processing result is corrected according to the transformer substation panel character information database by the following specific steps:
(1) Taking the result of character recognition processing on the image, traverse all panel information names in the substation panel text information database, matching each against the recognition result string character by character, and initialize the database index $k = 0$;
(2) Let $S_k$ be the panel information name currently being matched in the database, and initialize the number of characters correctly matched against the recognition result string, $P_k = 0$;
(3) Start the comparison from the first character of the recognition result string, and initialize the character index $i = 0$;
(4) Compare the $i$-th character of $S_k$ with the $i$-th character of the recognition result string; if they are the same, set $P_k = P_k + 1$;
(5) If $i$ is smaller than the length of the recognition result string, set $i = i + 1$ and repeat step (4); otherwise save $P_k$ and use it to calculate the similarity $D_k$ between $S_k$ and the recognition result string; if $D_k = 1$ and the strings also match by Levenshtein distance, exit the loop and output $S_k$ as the result; otherwise set $k = k + 1$;
the specific calculation formula is as follows:
$$D_k = \frac{P_k}{l_{S_k}}$$
wherein $P_k$ is the number of correctly matched characters and $l_{S_k}$ is the length of the panel information name string;
(6) If $k$ is less than the number of strings in the database, repeat from step (2); otherwise end the program;
(7) Select the information content with the highest matching degree as the final output.
In another aspect, a transformer substation panel character recognition system is provided, the system including:
an acquisition unit for acquiring an original picture;
the picture processing unit is used for preprocessing an original picture;
the character recognition unit is used for carrying out character recognition processing on the preprocessed pictures;
the database construction unit is used for constructing a transformer substation panel character information database according to the provided instrument panel information photos;
and the correcting unit is used for correcting the character recognition processing result according to the transformer substation panel character information database to obtain a final recognition result.
In yet another aspect, an apparatus is provided, the apparatus comprising:
one or more processors;
a memory for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the substation panel text recognition method described above.
In yet another aspect, a computer-readable storage medium is provided, in which a computer program is stored, which program, when executed by a processor, implements the substation panel text recognition method described above.
(III) advantageous effects
The transformer substation panel character recognition method, system, equipment and storage medium are combined with the panel text information database. When panel entries are close in form or noise interference is heavy in a specific implementation, the character information recognized by CNOCR is optimized against the database, so that the panel character information is recognized with high accuracy and the reliability of the recognition result is improved. For application scenes with limited storage space, lightweight OCR is used for recognition, so that recognition is fast and the data volume is small while accuracy remains high.
Drawings
FIG. 1 is a flow chart of a method for recognizing characters on a panel of a transformer substation according to the present invention;
FIG. 2 is a flowchart illustrating a text recognition method according to an embodiment of the present invention;
FIG. 3 is a flow chart illustrating a method for constructing a database of textual information on a panel of a substation according to the present invention;
FIG. 4 is a flowchart illustrating an implementation of the recognition result correction method based on the panel information database according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings of the present invention, and it is to be understood that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Examples
As shown in fig. 1, an embodiment of the present invention provides a transformer substation panel character recognition method, which is characterized in that the method includes:
acquiring an original picture, and preprocessing the original picture;
performing character recognition processing on the preprocessed picture;
constructing a transformer substation panel character information database according to the provided instrument panel information photos;
and correcting the character recognition processing result according to the transformer substation panel character information database to obtain a final recognition result.
First, the original picture is preprocessed by graying and binarization, and the improved CNOCR then performs character recognition on it. Because the characters on a device panel have a fixed color and a dark gray level that differs markedly from the panel background and from other noise, parameters are selected according to the characteristics of the application scene, the image pixel matrix is extracted and grayed, and binarization is then performed by threshold judgment. The graying function is:
GRAY=0.114B+0.587G+0.299R
where GRAY is the gray value matrix of the grayed picture, and R, G and B are the matrices formed by the R, G and B components of each pixel of the original picture.
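By way of a non-limiting illustration, the graying and threshold binarization described above might be sketched with NumPy as follows. The function names, the B, G, R channel order of the input array, the threshold of 128 and the convention that dark pixels are text are assumptions made for this sketch and are not specified by the description.

```python
import numpy as np

def to_gray(bgr: np.ndarray) -> np.ndarray:
    """Weighted graying: GRAY = 0.114*B + 0.587*G + 0.299*R.

    Assumes the last axis holds the channels in B, G, R order.
    """
    b, g, r = bgr[..., 0], bgr[..., 1], bgr[..., 2]
    return 0.114 * b + 0.587 * g + 0.299 * r

def binarize(gray: np.ndarray, threshold: float = 128.0) -> np.ndarray:
    """Threshold judgment: pixels darker than the threshold become 1, others 0.

    Treating dark pixels as text is an assumption; the threshold would be
    chosen per application scene.
    """
    return (gray < threshold).astype(np.uint8)

# A 1x3 test image: one pure-blue, one pure-green and one white pixel.
img = np.array([[[255, 0, 0], [0, 255, 0], [255, 255, 255]]], dtype=np.float64)
gray = to_gray(img)
mask = binarize(gray)
```

In practice the threshold would be tuned per scene, as the description notes, rather than fixed at a single value.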
Because the device panel information is relatively fixed and mostly in Chinese, the improved CNOCR algorithm is used for character recognition. The characters on the panel are small and the shooting definition is limited, so characters with similar shapes are hard to distinguish and the recognition results are easily confused. The original normalized exponential function (softmax) loss of CNOCR has only one decision surface, which makes confusion between two similar results likely; a large-margin normalized exponential function loss is therefore adopted instead.
The large-margin normalized exponential function loss (L-Softmax) is a generalization of the original loss function: the required margin is adjustable, and intra-class compactness and inter-class separability of the learned features are explicitly encouraged. In terms of angular similarity, a preset constant generates an angular margin between classes, and defining a harder learning target offers a different perspective on the overfitting problem, so that overfitting is avoided to a certain extent, the panel character information is recognized with high accuracy, and the reliability of the recognition result is improved.
Because L-Softmax can flexibly adjust the angular margin constraint between classes, it provides a family of learning tasks of adjustable difficulty. The difficulty increases gradually with the required margin, so the learned features gain greater angular separability and two similar characters are better discriminated. In the binary classification case, for a sample x of class 1, the normalized exponential function loss drives the optimization toward:
$$\|W_1\|\,\|x\|\cos(\theta_1) > \|W_2\|\,\|x\|\cos(\theta_2)$$
To make the classification stricter, the following limit is imposed:
$$\|W_1\|\,\|x\|\cos(m\theta_1) > \|W_2\|\,\|x\|\cos(\theta_2), \quad 0 \le \theta_1 \le \frac{\pi}{m}$$
which gives:
$$\|W_1\|\,\|x\|\cos(\theta_1) \ge \|W_1\|\,\|x\|\cos(m\theta_1) > \|W_2\|\,\|x\|\cos(\theta_2)$$
wherein $x$ is the output feature extracted by the feature layer, $x_i$ is the $i$-th output feature extracted by the feature layer, $W$ is the class weight matrix of the classification layer, $y_i$ is the number of the class to which $x_i$ belongs, $W_{y_i}$ is the $y_i$-th column of $W$, $\theta_j$ is the angle between $x_i$ and $W_j$ (so that $\cos\theta_j$ is their cosine similarity), $m$ is a positive integer that sets the angular margin, and $k$ is a non-negative integer serving as a parameter.
The original normalized exponential function loss function Softmax is:
$$L_i = -\log\left(\frac{e^{\|W_{y_i}\|\,\|x_i\|\cos(\theta_{y_i})}}{\sum_{j} e^{\|W_j\|\,\|x_i\|\cos(\theta_j)}}\right)$$
After the decision margin is added, the final large-margin normalized exponential function loss function L-Softmax is:
$$L_i = -\log\left(\frac{e^{\|W_{y_i}\|\,\|x_i\|\,\psi(\theta_{y_i})}}{e^{\|W_{y_i}\|\,\|x_i\|\,\psi(\theta_{y_i})} + \sum_{j \ne y_i} e^{\|W_j\|\,\|x_i\|\cos(\theta_j)}}\right), \quad \psi(\theta) = (-1)^k\cos(m\theta) - 2k, \quad \theta \in \left[\frac{k\pi}{m}, \frac{(k+1)\pi}{m}\right], \quad k \in \{0, \dots, m-1\}$$
The larger $m$ is, the larger the inter-class margin, the harder the learning task, and the better separated the resulting classes. In practical application, the optimal inter-class margin for each scene can therefore be determined by adjusting the size of $m$, so that the best recognition effect is achieved.
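As a non-limiting sketch of how the margin parameter m changes the loss for a single sample, the following Python code evaluates a large-margin softmax loss from precomputed norms and angles. The piecewise function psi follows the commonly published L-Softmax form, psi(theta) = (-1)^k cos(m*theta) - 2k on [k*pi/m, (k+1)*pi/m], which is assumed here; the function and parameter names are hypothetical, and a real training pipeline would compute the angles from the network features.

```python
import math

def psi(theta: float, m: int) -> float:
    """Margin function: (-1)^k * cos(m*theta) - 2k for theta in [k*pi/m, (k+1)*pi/m]."""
    k = min(int(theta * m / math.pi), m - 1)  # index of the interval containing theta
    return (-1) ** k * math.cos(m * theta) - 2 * k

def l_softmax_loss(w_norms, x_norm, thetas, target, m):
    """Loss for one sample.

    w_norms[j] is ||W_j||, x_norm is ||x||, and thetas[j] is the angle
    between x and W_j. Only the target class uses psi; the other classes
    keep the plain cosine. With m = 1 this is the ordinary softmax loss.
    """
    logits = [wn * x_norm * math.cos(t) for wn, t in zip(w_norms, thetas)]
    logits[target] = w_norms[target] * x_norm * psi(thetas[target], m)
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]

# Two classes with unit norms; the sample is 60 degrees from its own class weight.
angles = [math.pi / 3, 2 * math.pi / 3]
loss_plain = l_softmax_loss([1.0, 1.0], 1.0, angles, target=0, m=1)
loss_margin = l_softmax_loss([1.0, 1.0], 1.0, angles, target=0, m=2)
```

For the same sample the loss with m = 2 is larger than with m = 1, which reflects the harder learning target that the margin imposes.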
As shown in fig. 2, for a picture to be recognized, the RGB matrix of its pixels is first obtained, the color image is converted into a grayscale image by the graying function, and the picture is binarized by comparing the gray value of each pixel with a threshold. The connectivity between each pixel and its neighbors in the binary matrix is then examined, and connected domains are labeled and ordered according to attributes such as their size or pixel distribution. Each classified connected domain is then segmented into rows and columns. Because the line spacing and character spacing of printed text are approximately uniform and characters almost never touch, the image is segmented by the projection method: the projection of the pixel values of each row onto the coordinate axis yields an unsmooth curve, which is smoothed with a Gaussian filter, and the region between adjacent troughs of the smoothed curve is a text line. Next, features are extracted from the characters. The preprocessed, low-noise image is first scaled at a fixed aspect ratio. A CNN then extracts the convolutional feature matrix of the input image, the sequence of each channel is fed into a deep LSTM network, and a deep bidirectional RNN continues to extract character sequence features on top of the convolutional features of the character picture. The invention uses a deep bidirectional LSTM network, that is, a stack of bidirectional recurrent layers. Training is then guided by the L-Softmax loss, with the training target
$$\|W_1\|\,\|x\|\cos(\theta_1) \ge \|W_1\|\,\|x\|\cos(m\theta_1) > \|W_2\|\,\|x\|\cos(\theta_2)$$
wherein $x$ is the output feature extracted by the feature layer, $x_i$ is the $i$-th output feature extracted by the feature layer, $W$ is the class weight matrix of the classification layer, $y_i$ is the number of the class to which $x_i$ belongs, $W_{y_i}$ is the $y_i$-th column of $W$, $\theta_j$ is the angle between $x_i$ and $W_j$ (so that $\cos\theta_j$ is their cosine similarity), $m$ is a positive integer that sets the angular margin, and $k$ is a non-negative integer serving as a parameter. After the character features of the input picture are matched against the features learned by the model, the characters are output.
As shown in fig. 3, the transformer substation panel text information database is constructed as follows. First, the panel text content is preliminarily entered from the instrument panel information photos. Next, the various electrical names that may appear on different power devices are normalized. To handle the different expressions that may appear on different equipment, the fixed-value list is normalized before matching according to the actual electrical meaning of each entry: common irregular names are normalized, and different expressions with the same meaning are grouped under the same subject, that is, divided into a standard name and the secondary names under that standard name, with all secondary names under the same standard name given the same meaning in the electrical logic relationship. The information of each panel is then counted; high-frequency standard names are placed first, and within the same subject high-frequency secondary names are placed first. Finally, a character information database ordered from high to low by frequency of occurrence, that is, a weighted character information database, is obtained.
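One possible way to organize such a frequency-ordered database is sketched below in Python. The dictionary layout, the function name `build_database`, the alias table and the English sample names are all hypothetical; the description only requires that standard names, and secondary names within each standard name, be ordered by frequency of occurrence.

```python
from collections import Counter

def build_database(observations, alias_to_standard):
    """Group transcribed panel names under standard names, ordered by frequency.

    observations: names entered from the panel photos (may repeat).
    alias_to_standard: maps an irregular (secondary) name to its standard name.
    """
    standard_counts = Counter()
    secondary_counts = {}
    for name in observations:
        standard = alias_to_standard.get(name, name)  # unknown names are their own standard
        standard_counts[standard] += 1
        secondary_counts.setdefault(standard, Counter())[name] += 1
    database = []
    for standard, count in standard_counts.most_common():       # high frequency first
        secondary = [n for n, _ in secondary_counts[standard].most_common()]
        database.append({"standard": standard, "count": count, "secondary": secondary})
    return database

# Hypothetical transcriptions from several panels.
seen = ["Overcurrent I", "Trip", "O/C I", "Overcurrent I", "Trip", "Overcurrent I"]
db = build_database(seen, {"O/C I": "Overcurrent I"})
```

Because entries are ordered by frequency, a matcher that scans the list from the front tends to reach the most likely names first.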
As shown in fig. 4, the recognition result correction method based on the panel information database takes the protection device panel text recognized by the CNOCR network with the improved L-Softmax loss, first deletes garbled characters that are irrelevant to the required information according to the content of the panel character information database, and then applies a fuzzy matching algorithm that combines key information matching with the Levenshtein distance, using the longest subsequence shared by the string being matched and the panel information. The Levenshtein distance is the minimum number of edits required to change one character string into another. The specific calculation equation is:
$$d_{lev_{a,b}}(i,j) = \begin{cases} \max(i,j) & \text{if } \min(i,j) = 0 \\ \min \begin{cases} d_{lev_{a,b}}(i-1,j) + 1 \\ d_{lev_{a,b}}(i,j-1) + 1 \\ d_{lev_{a,b}}(i-1,j-1) + 1_{(a_i \ne b_j)} \end{cases} & \text{otherwise} \end{cases}$$
where $d_{lev_{a,b}}(i,j)$ represents the Levenshtein distance between the first $i$ characters of $a$ and the first $j$ characters of $b$, and the indicator $1_{(a_i \ne b_j)}$ is 0 when $a_i = b_j$ and 1 otherwise. The similarity between the two character strings can then be determined, and the specific calculation formula is
$$D_k = \frac{P_k}{l_{S_k}}$$
where $P_k$ is the number of correctly matched characters and $l_{S_k}$ is the length of the panel information name string. After the similarity between each database entry and the content to be matched is obtained, the two strings are considered a fuzzy match when the similarity is greater than a set threshold, and not a match otherwise. The entry with the highest similarity is selected as output.
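The Levenshtein distance used above can be computed with the usual dynamic-programming table. The following Python sketch is one minimal way to do so; the function name is illustrative, and the row-by-row storage is an implementation choice rather than part of the described method.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))            # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]                            # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1       # indicator: 0 when the characters agree
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]

distance = levenshtein("kitten", "sitting")
```

Keeping only two rows of the table holds memory proportional to the length of the shorter dimension, which suits short panel names.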
The identification result correction method comprises the following specific steps:
(1) Taking the CNOCR-based recognition result for the protection device panel image, traverse all panel information names in the database, matching each against the recognition result string character by character, and initialize the database index $k = 0$.
(2) Let $S_k$ be the panel information name currently being matched in the database, and initialize the number of characters correctly matched against the recognition result string, $P_k = 0$.
(3) Start the comparison from the first character of the recognition result string, and initialize the character index $i = 0$.
(4) Compare the $i$-th character of $S_k$ with the $i$-th character of the recognition result string; if they are the same, set $P_k = P_k + 1$.
(5) If $i$ is smaller than the length of the recognition result string, set $i = i + 1$ and repeat step (4); otherwise save $P_k$ and use it to calculate the similarity $D_k$ between $S_k$ and the recognition result string. If $D_k = 1$ and the strings also match by Levenshtein distance, exit the loop and output $S_k$ as the result; otherwise set $k = k + 1$.
The specific calculation formula is as follows:
$$D_k = \frac{P_k}{l_{S_k}}$$
where $P_k$ is the number of correctly matched characters and $l_{S_k}$ is the length of the panel information name string.
(6) If $k$ is less than the number of strings in the database, repeat from step (2); otherwise end the program.
(7) Select the information content with the highest matching degree as the final output.
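Steps (1) to (7) can be sketched in Python as follows, as an illustration only. The position-by-position count gives $P_k$ and the ratio $D_k = P_k / l_{S_k}$; an exact match exits early, and otherwise the entry with the highest $D_k$ is returned. Interpreting "match by Levenshtein distance" as a distance of zero, and breaking ties in $D_k$ by the smaller Levenshtein distance, are assumptions of this sketch; the function names and sample panel names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance by dynamic programming (two-row table)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (0 if ca == cb else 1)))
        prev = curr
    return prev[-1]

def positional_similarity(name: str, recognized: str) -> float:
    """D_k = P_k / l_Sk, where P_k counts positions holding the same character."""
    p_k = sum(1 for x, y in zip(name, recognized) if x == y)
    return p_k / len(name) if name else 0.0

def correct(recognized: str, database):
    """Return (best panel name, its D_k); database is assumed frequency-ordered."""
    best_name, best_key = None, None
    for name in database:                          # steps (1), (2) and (6)
        d_k = positional_similarity(name, recognized)   # steps (3) to (5)
        dist = levenshtein(name, recognized)
        if d_k == 1.0 and dist == 0:               # exact match: exit early
            return name, 1.0
        key = (d_k, -dist)                         # assumed tie-break on distance
        if best_key is None or key > best_key:     # strict '>' keeps the more frequent entry
            best_name, best_key = name, key
    return best_name, (best_key[0] if best_key else 0.0)   # step (7)

# Hypothetical database and a recognition result with '0' misread for 'O'.
names = ["PUMP START", "PUMP STOP", "VALVE OPEN"]
fixed, score = correct("PUMP ST0P", names)
```

A result whose score is below 1 could then be flagged for manual review alongside the stored original image.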
With the improved recognition network and the fuzzy matching method, 98 percent of recognition results match the normalized database completely. Results whose matching degree is below 100 percent are flagged: the best candidate is still output, and the original image is stored as the basis for manual inspection.
As an embodiment of the invention, there is provided a transformer substation panel character recognition system, including:
an acquisition unit for acquiring an original picture;
the picture processing unit is used for preprocessing an original picture;
the character recognition unit is used for carrying out character recognition processing on the preprocessed pictures;
the database construction unit is used for constructing a transformer substation panel character information database according to the provided instrument panel information photos;
and the correcting unit is used for correcting the character recognition processing result according to the transformer substation panel character information database to obtain a final recognition result.
As an embodiment of the present invention, there is provided an apparatus including:
one or more processors;
a memory for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the substation panel text recognition method of the above embodiments.
As an embodiment of the present invention, there is provided a computer-readable storage medium storing a computer program which, when executed by a processor, implements the substation panel character recognition method in the above-described embodiments.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in a process, method, article, or apparatus that comprises the element.

Claims (10)

1. A transformer substation panel character recognition method is characterized by comprising the following steps:
acquiring an original picture, and preprocessing the original picture;
performing character recognition processing on the preprocessed picture;
constructing a transformer substation panel character information database according to the provided instrument panel information photos;
and correcting the character recognition processing result according to the transformer substation panel character information database to obtain a final recognition result.
2. The transformer substation panel character recognition method according to claim 1, characterized in that: the preprocessing the original picture specifically comprises the following steps:
preprocessing an original picture by adopting a graying and binarization method, selecting parameters according to different application scene characteristics, extracting an image pixel point matrix, graying the image pixel point matrix, and then performing binarization by threshold judgment; wherein the graying function is:
GRAY=0.114B+0.587G+0.299R
the GRAY is a GRAY value matrix of the grayed picture, and R, G and B are respectively matrixes formed by values of three components of R, G and B of each pixel point of the original picture.
3. The transformer substation panel character recognition method according to claim 1, characterized in that: the character recognition processing of the preprocessed picture comprises the following steps:
improving the CNOCR algorithm, and performing character recognition processing on the preprocessed picture through the improved CNOCR algorithm;
the improvement of the CNOCR algorithm is specifically as follows:
the optimization objective of the normalized exponential function (softmax) loss of the CNOCR algorithm is adjusted to:
$\|W_1\| \cdot \|x\| \cos(\theta_1) > \|W_2\| \cdot \|x\| \cos(\theta_2)$
subject to the stricter classification constraint
Figure FDA0003839082180000011
which gives:
$\|W_1\| \cdot \|x\| \cos(\theta_1) \geq \|W_1\| \cdot \|x\| \cos(n\theta_1) > \|W_2\| \cdot \|x\| \cos(\theta_2)$
where $x$ is the output feature extracted by the feature layer, $x_i$ is the i-th output feature extracted by the feature layer, $W$ is the class weight matrix of the classification layer, $y_j$ is the index of the class to which $x_i$ belongs, $W_{y_j}$ denotes the $y_j$-th column of the matrix $W$, $\cos\theta$ is the cosine distance between the two, $m$ is a positive integer representing the inner product distance, and $k$ is a non-negative integer serving as a parameter.
The normalized exponential function loss function is:
Figure FDA0003839082180000021
after the decision margin is added, the final large-margin normalized exponential function (softmax) loss function is as follows:
Figure FDA0003839082180000022
and the optimal inter-class distance for different scenes is determined by adjusting the value of m, so as to achieve the best recognition effect.
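Because the loss formulas of claim 3 survive only as figure placeholders, the following sketch assumes they follow the commonly published large-margin softmax form, in which the target-class term cos(θ) is replaced by ψ(θ) = (-1)^k cos(mθ) - 2k for θ in [kπ/m, (k+1)π/m]. The function names and the single-sample interface are hypothetical, and the patent's actual formula may differ.

```python
import numpy as np

def psi(theta, m):
    """Margin function, assumed form: (-1)^k * cos(m*theta) - 2k."""
    # k indexes the interval [k*pi/m, (k+1)*pi/m] that contains theta.
    k = np.minimum(np.floor(theta * m / np.pi), m - 1)
    return (-1.0) ** k * np.cos(m * theta) - 2.0 * k

def large_margin_softmax_loss(x, W, y, m=2):
    """Loss for one feature vector x with true class y.

    W has one column per class, matching the column convention of claim 3.
    """
    logits = (W.T @ x).astype(float)
    w_norm = np.linalg.norm(W[:, y])
    x_norm = np.linalg.norm(x)
    cos_theta = np.clip(logits[y] / (w_norm * x_norm + 1e-12), -1.0, 1.0)
    theta = np.arccos(cos_theta)
    # Only the target-class logit receives the margin.
    logits[y] = w_norm * x_norm * psi(theta, m)
    logits -= logits.max()  # numerical stability
    return -(logits[y] - np.log(np.exp(logits).sum()))

x = np.array([1.0, 2.0])
W = np.array([[1.0, 0.0, -1.0],
              [0.5, 1.0, 0.0]])
loss_plain = large_margin_softmax_loss(x, W, y=0, m=1)   # ordinary softmax loss
loss_margin = large_margin_softmax_loss(x, W, y=0, m=2)  # stricter objective
```

With m = 1 the function reduces to the ordinary softmax cross-entropy; larger m makes the target-class logit harder to satisfy, which is how adjusting m changes the inter-class distance.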
4. The transformer substation panel character recognition method according to claim 1, characterized in that: the construction of the transformer substation panel text information database comprises the following steps:
preliminarily inputting the text content of the panel according to the information photo of the instrument panel;
standardizing the different expressions that appear on different equipment: standardizing the constant value list before matching, standardizing common irregular names, and grouping different expressions with the same meaning under the same theme, that is, dividing them into a standard name and the secondary names under that standard name, the secondary names belonging to the same standard name being given the same meaning in the electrical logic relationship;
counting the information of each panel, and placing high-frequency standard names first and, within the same theme, high-frequency secondary names first;
and finally, obtaining a text information database ordered from high to low occurrence frequency, that is, a weighted text information database.
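The database construction of claim 4 can be sketched as a mapping from standard names to secondary names, ordered by observed frequency. The function name `build_weighted_database`, the data layout, and the English sample names are all hypothetical and chosen only for illustration.

```python
from collections import Counter

def build_weighted_database(aliases, observations):
    """Order standard names, and secondary names within each, by frequency.

    aliases:      {standard_name: [secondary_name, ...]}
    observations: names as they were read from the panel photos
    """
    # Map every known name (standard or secondary) to its standard name.
    to_standard = {sec: std for std, secs in aliases.items() for sec in secs}
    to_standard.update({std: std for std in aliases})

    standard_freq, name_freq = Counter(), Counter()
    for name in observations:
        std = to_standard.get(name)
        if std is None:
            continue  # unknown name: not part of the database
        standard_freq[std] += 1
        name_freq[name] += 1

    # sorted() is stable, so names with equal counts keep their input order.
    return [
        (std, sorted(aliases[std], key=lambda s: -name_freq[s]))
        for std in sorted(aliases, key=lambda s: -standard_freq[s])
    ]

aliases = {
    "overcurrent protection": ["overcurrent prot.", "OC protection"],
    "reclosing": ["reclose", "auto-reclose"],
}
observations = ["reclose", "auto-reclose", "auto-reclose", "OC protection"]
database = build_weighted_database(aliases, observations)
```

Placing frequent entries first means a matcher that exits early on a perfect match usually terminates sooner.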
5. The transformer substation panel character recognition method according to claim 1, characterized in that: correcting the character recognition processing result according to the transformer substation panel character information database, comprising the following steps of:
removing, according to the panel text information database, irrelevant garbled characters produced by noise points in the result;
calculating the similarity between each database entry and the content to be matched by using a fuzzy matching algorithm that combines keywords with the Levenshtein distance and uses the longest common subsequence of the string to be matched and the panel information, and selecting the entry with the highest similarity as the output;
the Levenshtein distance is the minimum number of edit operations required to change one of two character strings into the other, and the similarity between the two character strings is determined according to the Levenshtein distance.
6. The transformer substation panel character recognition method according to claim 5, wherein: the state transition equation of the Levenshtein distance is as follows:
Figure FDA0003839082180000031
where $\mathrm{lev}_{a,b}(m, n)$ denotes the Levenshtein distance between the first m characters of a and the first n characters of b;
Figure FDA0003839082180000032
is an indicator that equals 0 when $a_i = b_j$ and 1 otherwise.
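The Levenshtein distance of claims 5 and 6 can be sketched with the standard dynamic-programming recurrence, in which each cell is the minimum of a deletion, an insertion, and a substitution whose cost is the 0/1 indicator described above. The function name `levenshtein` is hypothetical, and the state transition equation in the missing figure is assumed to be this standard recurrence.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits that turn string a into string b."""
    # prev[j] is the distance between the first i-1 characters of a
    # and the first j characters of b.
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]  # distance from the first i characters of a to the empty string
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1  # indicator: 0 when a_i = b_j
            curr.append(min(
                prev[j] + 1,         # delete a_i
                curr[j - 1] + 1,     # insert b_j
                prev[j - 1] + cost,  # substitute a_i with b_j, or keep it
            ))
        prev = curr
    return prev[-1]

distance = levenshtein("kitten", "sitting")
```

Keeping only two rows of the table uses memory proportional to the length of b rather than to the product of both lengths.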
7. The transformer substation panel character recognition method according to claim 6, wherein: correcting the character recognition processing result according to the transformer substation panel character information database comprises the following steps:
(1) Based on the result of the character recognition processing of the image, traversing all panel information names in the transformer substation panel character information database, matching each of them character by character against the recognition result character string, and initializing the database index k = 0;
(2) Letting S_k be the panel information name currently being matched in the database, and initializing the number of characters correctly matched with the recognition result character string, P_k = 0;
(3) Starting the comparison from the first character of the recognition result character string, and initializing the character index i = 0;
(4) Comparing the i-th character of S_k with the i-th character of the recognition result character string, and if the two characters are the same, setting P_k = P_k + 1;
(5) If i is smaller than the length of the recognition result character string, setting i = i + 1 and repeating step (4); otherwise, saving P_k and calculating, from the number of correctly matched characters P_k, the similarity D_k between S_k and the recognition result character string; if D_k = 1 and the match is confirmed by the Levenshtein distance, exiting the loop and outputting S_k as the result; otherwise setting k = k + 1;
the specific calculation formula is as follows:
Figure FDA0003839082180000041
where P_k is the number of correctly matched characters and l_{S_k} is the length of the panel information name character string;
(6) If k is smaller than the number of character strings in the database, repeating from step (2); otherwise, ending the procedure;
(7) Selecting the information content with the highest matching degree as the final output.
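Steps (1) to (7) above can be sketched as the loop below. The similarity formula is lost with its figure, so this sketch assumes D_k = P_k / l_{S_k}, which is what the symbol definitions suggest. The function names and the sample database entries are hypothetical.

```python
def positional_similarity(name, recognized):
    """Assumed D_k: position-wise matches P_k divided by the length of S_k."""
    if not name:
        return 0.0
    matched = sum(1 for s, r in zip(name, recognized) if s == r)  # P_k
    return matched / len(name)

def correct_recognition(recognized, database):
    """Return the database name that best matches the recognized string."""
    best_name, best_score = None, -1.0
    for name in database:  # steps (1), (2) and (6): traverse S_k
        score = positional_similarity(name, recognized)  # steps (3) to (5)
        # Early exit on a perfect match. A Levenshtein distance of 0 is
        # equivalent to string equality, so equality is checked directly.
        if score == 1.0 and name == recognized:
            return name
        if score > best_score:
            best_name, best_score = name, score
    return best_name  # step (7): highest matching degree

database = ["Protection trip", "Reclosing", "Protection start"]
corrected = correct_recognition("Protectlon trip", database)
```

Position-wise comparison is sensitive to inserted or dropped characters, which is presumably why the claims pair it with the Levenshtein distance.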
8. A transformer substation panel character recognition system, the system comprising:
an acquisition unit for acquiring an original picture;
the picture processing unit is used for preprocessing an original picture;
the character recognition unit is used for carrying out character recognition processing on the preprocessed pictures;
the database construction unit is used for constructing a transformer substation panel character information database according to the provided instrument panel information photos;
and the correcting unit is used for correcting the character recognition processing result according to the transformer substation panel character information database to obtain a final recognition result.
9. An apparatus, characterized in that the apparatus comprises:
one or more processors;
a memory for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the substation panel text recognition method of any of claims 1-7.
10. A computer-readable storage medium storing a computer program, characterized in that the program, when executed by a processor, implements a substation panel character recognition method according to any one of claims 1-7.
CN202211097849.6A 2022-09-08 2022-09-08 Transformer substation panel character recognition method, system, equipment and storage medium Pending CN115713772A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211097849.6A CN115713772A (en) 2022-09-08 2022-09-08 Transformer substation panel character recognition method, system, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211097849.6A CN115713772A (en) 2022-09-08 2022-09-08 Transformer substation panel character recognition method, system, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115713772A true CN115713772A (en) 2023-02-24

Family

ID=85230617

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211097849.6A Pending CN115713772A (en) 2022-09-08 2022-09-08 Transformer substation panel character recognition method, system, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115713772A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116225770A (en) * 2023-04-26 2023-06-06 阿里云计算有限公司 Patch matching method, device, equipment and storage medium
CN116225770B (en) * 2023-04-26 2023-10-20 阿里云计算有限公司 Patch matching method, device, equipment and storage medium
CN117236310A (en) * 2023-10-26 2023-12-15 湖南中拓信息科技有限公司 Bill recognition method, system and readable storage medium based on OCR technology
CN117236310B (en) * 2023-10-26 2024-08-02 湖南中拓信息科技有限公司 Bill recognition method, system and readable storage medium based on OCR technology
CN118155234A (en) * 2024-05-10 2024-06-07 四川互慧软件有限公司 Information extraction method and system for medical examination report

Similar Documents

Publication Publication Date Title
CN111325203B (en) American license plate recognition method and system based on image correction
CN115713772A (en) Transformer substation panel character recognition method, system, equipment and storage medium
CN109784342B (en) OCR (optical character recognition) method and terminal based on deep learning model
CN111753828B (en) Natural scene horizontal character detection method based on deep convolutional neural network
US20240037969A1 (en) Recognition of handwritten text via neural networks
CN111860525B (en) Bottom-up optical character recognition method suitable for terminal block
CN104809481A (en) Natural scene text detection method based on adaptive color clustering
CN108898138A (en) Scene text recognition methods based on deep learning
CN109086654A (en) Handwriting model training method, text recognition method, device, equipment and medium
CN109360179B (en) Image fusion method and device and readable storage medium
WO2021232670A1 (en) Pcb component identification method and device
CN112966685B (en) Attack network training method and device for scene text recognition and related equipment
CN112069900A (en) Bill character recognition method and system based on convolutional neural network
CN117197904B (en) Training method of human face living body detection model, human face living body detection method and human face living body detection device
CN113792659B (en) Document identification method and device and electronic equipment
CN109460767A (en) Rule-based convex print bank card number segmentation and recognition methods
Ovodov Optical Braille recognition using object detection neural network
CN114937278A (en) Text content extraction and identification method based on line text box word segmentation algorithm
Ovodov Optical Braille recognition using object detection CNN
CN117076455A (en) Intelligent identification-based policy structured storage method, medium and system
CN113989485B (en) Text character segmentation method and system based on OCR (optical character recognition)
CN115713776A (en) General certificate structured recognition method and system based on deep learning
CN113392814B (en) Method and device for updating character recognition model and storage medium
Ajao et al. Yoruba handwriting word recognition quality evaluation of preprocessing attributes using information theory approach
CN114187434A (en) End-to-end license plate identification method based on raspberry pi 4B

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination