CN109389115B - Text recognition method, device, storage medium and computer equipment - Google Patents


Info

Publication number
CN109389115B
CN109389115B (granted from application CN201710687380.4A)
Authority
CN
China
Prior art keywords: character, characters, text sequence, image, text
Prior art date
Legal status: Active (assumed; not a legal conclusion)
Application number: CN201710687380.4A
Other languages: Chinese (zh)
Other versions: CN109389115A (en)
Inventor: 刘银松
Current Assignee: Tencent Technology Shanghai Co Ltd
Original Assignee: Tencent Technology Shanghai Co Ltd
Application filed by Tencent Technology Shanghai Co Ltd
Priority claimed from application CN201710687380.4A
Publication of application CN109389115A
Application granted
Publication of granted patent CN109389115B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; scene-specific elements
    • G06V 20/60: Type of objects
    • G06V 20/62: Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V 20/63: Scene text, e.g. street names
    • G06V 30/00: Character recognition; recognising digital ink; document-oriented image-based pattern recognition
    • G06V 30/10: Character recognition

Abstract

The invention relates to a text recognition method, a text recognition device, a storage medium and computer equipment, wherein the method comprises the following steps: acquiring a text sequence image; performing character recognition on the text sequence image according to the character recognition mode corresponding to each character type, recognizing characters which do not belong to the corresponding character type in the text sequence image as universal characters which do not belong to the corresponding character type, and obtaining a text sequence corresponding to each character type; selecting a text sequence from the text sequences corresponding to each character type; determining the positions of universal characters which do not belong to the corresponding character type in the selected text sequence; acquiring characters belonging to the corresponding character types at those positions in the remaining text sequences; and correcting the characters at those positions in the selected text sequence according to the acquired characters to obtain a recognition result. The scheme provided by the application improves the accuracy of text recognition.

Description

Text recognition method, device, storage medium and computer equipment
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a text recognition method, a text recognition device, a storage medium, and a computer device.
Background
With the development of computer technology, more and more text is added to images for information dissemination. Text recognition techniques are therefore increasingly used to recognize the text included in images, such as the text in business cards or photographs.
Currently, text recognition for various images mainly extracts a fixed set of character features to recognize each character. However, when the text content is complex and varied, the accuracy of the recognition result drops significantly.
Disclosure of Invention
Based on this, it is necessary to provide a text recognition method, apparatus, storage medium and computer device for the problem of low recognition accuracy in the case of complicated and diverse text contents in the conventional text recognition method.
A method of text recognition, the method comprising:
acquiring a text sequence image;
performing character recognition on the text sequence image according to the character recognition mode corresponding to each character type, recognizing characters which do not belong to the corresponding character type in the text sequence image as universal characters which do not belong to the corresponding character type, and obtaining a text sequence corresponding to each character type;
selecting a text sequence from the text sequences corresponding to each character type;
determining the position of a character which does not belong to the corresponding character type in the selected text sequence;
acquiring characters belonging to the corresponding character types at the positions in the remaining text sequences;
and correcting the characters at the positions in the selected text sequence according to the acquired characters to obtain a recognition result.
A text recognition device, the device comprising:
the first acquisition module is used for acquiring the text sequence image;
the recognition module is used for carrying out character recognition on the text sequence image according to a character recognition mode corresponding to each character type, recognizing characters which do not belong to the corresponding character type in the text sequence image as universal characters which do not belong to the corresponding character type, and obtaining a text sequence corresponding to each character type;
the selecting module is used for selecting a text sequence from text sequences corresponding to each character type;
the determining module is used for determining the position of the character which does not belong to the corresponding character type in the selected text sequence;
the second acquisition module is used for acquiring characters belonging to the corresponding character types at the positions in the remaining text sequences;
the correction module is used for correcting the characters at the positions in the selected text sequence according to the acquired characters to obtain a recognition result.
One or more non-transitory computer-readable storage media storing computer-executable instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of a text recognition method.
A computer device comprising a memory and a processor, the memory having stored therein computer readable instructions which, when executed by the processor, cause the processor to perform the steps of a text recognition method.
After the text sequence image is acquired, character recognition is performed separately according to the character recognition modes corresponding to the different character types, obtaining a text sequence for each character type. When recognition is performed with the character recognition mode corresponding to a certain character type, characters in the text sequence image that do not belong to that character type are recognized as universal characters that do not belong to that character type. A text sequence is then selected, the positions of the universal characters that do not belong to the corresponding character type are determined in the selected text sequence, and the characters at those positions in the selected text sequence are corrected using the characters at the same positions in the remaining text sequences, yielding the recognition result. Because the text sequence image is recognized separately per character type, the recognition accuracy of the characters belonging to each character type is ensured, and texts of multiple character types included in the text sequence image can all be handled even when the text content is complex and varied. Using the in-type characters of each text sequence to correct the characters at the corresponding positions in the other text sequences produces the recognition result and improves text recognition accuracy.
Drawings
FIG. 1 is a schematic diagram of the internal architecture of a computer device in one embodiment;
FIG. 2 is a flow diagram of a text recognition method in one embodiment;
FIG. 3 is a schematic diagram of a text recognition method in one embodiment;
FIG. 4 is a schematic diagram of character correction performed when the number of characters acquired in one embodiment corresponds to the number of characters at a position in a selected text sequence;
FIG. 5 is a schematic diagram of character correction performed when the number of characters acquired in one embodiment is inconsistent with the number of characters at a position in a selected text sequence;
FIG. 6 is a schematic diagram of a text recognition method in another embodiment;
FIG. 7 is a schematic flow chart of a text recognition method in a specific application scenario;
FIG. 8 is a block diagram of a text recognition device in one embodiment;
fig. 9 is a block diagram showing a structure of a text recognition apparatus in another embodiment.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
FIG. 1 is a schematic diagram of the internal architecture of a computer device in one embodiment. As shown in fig. 1, the computer device includes a processor, a nonvolatile storage medium, and an internal memory connected by a system bus. Wherein the non-volatile storage medium of the computer device may store an operating system and computer readable instructions that, when executed, cause the processor to perform a text recognition method. The processor is used to provide computing and control capabilities to support the operation of the entire computer device. The internal memory may have stored therein computer readable instructions that, when executed by the processor, cause the processor to perform a text recognition method. The computer device may be a terminal, a server, or the like. The terminal can be a desktop terminal or a mobile terminal, and the mobile terminal can be at least one of a mobile phone, a tablet computer, a notebook computer and the like. The servers may be separate physical servers or may be a cluster of physical servers. It will be appreciated by those skilled in the art that the structure shown in fig. 1 is merely a block diagram of a portion of the structure associated with the present application and is not limiting of the terminal to which the present application is applied, and that a particular terminal may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
FIG. 2 is a flow diagram of a text recognition method in one embodiment. The present embodiment is mainly exemplified by the application of the method to the computer device in fig. 1. Referring to fig. 2, the method specifically includes the steps of:
s202, acquiring a text sequence image.
Wherein a text sequence is a string of more than one character in sequence. The text sequence image is then an image comprising the text sequence. The text sequence may be a text line or a text column depending on the typesetting of the text sequence image. Text lines are text sequences in which characters are arranged substantially in the lateral direction, and text columns are text sequences in which characters are arranged substantially in the longitudinal direction.
In one embodiment, the computer device may directly acquire a text sequence image that has already been produced by text sequence segmentation. The text sequence image acquired by the computer device may be one received from another computer device, one crawled by the computer device from the internet, one obtained by the computer device through scanning or photographing, etc.
In one embodiment, the computer device may acquire an image to be subjected to text sequence segmentation processing, and then perform text sequence segmentation on the image to acquire a text sequence image. An image to be subjected to text sequence segmentation processing such as a business card image or a document image, or the like. The business card image is an image containing business card content, and can be a business card photo, a business card scanning piece, an electronic business card picture or the like. A document image is an image formed by the combination of one or more text sequences according to a particular arrangement characteristic.
In one embodiment, the computer device may detect the text sequence image from the image based on the prior arrangement features of the text sequence, due to the regular arrangement features between the different text sequences. A priori arrangement features of the text sequence such as gaps between different text lines, character spacing features within text lines or text columns, features with the center of characters within a text line or text column being approximately in a straight line, etc. The computer device may use this a priori arrangement feature to segment the different text sequence images from the image.
In one embodiment, the computer device may perform connected domain analysis on the image to extract connected domains. Since connected domains in the same text sequence can form a complete connected domain, the computer device can determine the outer contours of a plurality of connected domains which are approximately in the same straight line as the text sequence image so as to divide different text sequence images from the image.
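As an illustrative sketch only (not the patent's implementation), the connected-domain analysis above can be modeled on a binary grid. The helper name `connected_components` and the BFS flood-fill approach are assumptions; a production system would more likely use a library routine such as OpenCV's connected-components analysis. The bounding boxes returned here are the "outer contours" from which components lying roughly on one line would then be grouped into a text sequence image:

```python
from collections import deque

def connected_components(binary):
    """Label 4-connected foreground regions (value 1) in a binary grid and
    return one bounding box (top, left, bottom, right) per connected domain."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] == 1 and not seen[y][x]:
                # BFS flood fill from this unvisited foreground pixel
                q = deque([(y, x)])
                seen[y][x] = True
                top, left, bottom, right = y, x, y, x
                while q:
                    cy, cx = q.popleft()
                    top, bottom = min(top, cy), max(bottom, cy)
                    left, right = min(left, cx), max(right, cx)
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                boxes.append((top, left, bottom, right))
    return boxes
```

Components whose box centers share approximately the same vertical coordinate could then be merged into one text line image.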
S204, character recognition is carried out on the text sequence image according to the character recognition mode corresponding to each character type, characters which do not belong to the corresponding character type in the text sequence image are recognized as universal characters which do not belong to the corresponding character type, and the text sequence corresponding to each character type is obtained.
The character type is a type obtained by classifying characters according to character characteristics. Character features such as character stroke features or the language to which the character belongs.
In this embodiment, the computer device may classify characters by language, such as an English character type, a Chinese character type, a Korean character type, and the like. For the characters left over after classification by language, such as numbers and punctuation marks, the computer device may group them uniformly into a separate character type, such as an "other" character type. Alternatively, the computer device may sort the leftover characters into one of the types classified by language; for example, the English character type may include both English characters and the characters left over after classification by language.
The computer device may build a character library by character type, the character library including a plurality of characters belonging to the corresponding character type. For example, a character library established by english character type includes a large number of characters belonging to english. The character recognition method according to the character type is a recognition method for accurately recognizing the character belonging to the character type. The computer device may accurately recognize or blur recognize characters that do not belong to the character category. For example, the character recognition mode corresponding to the English character type can accurately recognize English characters, and the recognition precision of non-English characters is not required.
The universal characters are characters preset by the computer equipment and are used as recognition results of characters which do not belong to the corresponding character types when the characters are recognized according to the character recognition modes corresponding to the character types. For example, when character recognition is performed in a character recognition manner corresponding to an english character type, characters not belonging to the english character type are recognized as general characters not belonging to the english character type.
In one embodiment, for each character type, there may be a single universal character that does not belong to the corresponding character type. For example, for the English character type there may be one universal character "汉" that does not belong to the English character type, and characters that do not belong to the English character type, such as Chinese characters or Korean characters, are all recognized as "汉".
In one embodiment, for each character type, there may also be multiple universal characters that do not belong to the corresponding character type. The multiple universal characters may all belong to the same character type. For example, for the English character type there may be multiple universal characters such as "汉" and "韩" that do not belong to the English character type, with Chinese characters recognized as "汉" and Korean characters recognized as "韩". Alternatively, the multiple universal characters may correspond one-to-one to the character types other than the corresponding character type. For example, for the English character type there may be the universal characters "汉" and "한": Chinese characters are recognized as "汉" and Korean characters are recognized as "한".
In one embodiment, the computer device may first perform character type recognition on the characters in the text sequence image when performing character recognition on the text sequence image in a character recognition manner corresponding to each character type. The character type recognition may be a classification process that determines whether a character belongs to a corresponding character type or does not belong to a corresponding character type. The computer equipment can accurately identify the characters belonging to the corresponding character types, and directly uses the universal characters not belonging to the corresponding character types as the identification results of the characters not belonging to the corresponding character types.
For example, for the English character type, assume that the universal character preset by the computer device is "汉". When a text sequence image containing "我한A" is recognized according to the character recognition mode corresponding to the English character type, the first character "我" is determined not to belong to the English character type, and "汉" is taken as its recognition result; the second character "한" is likewise determined not to belong to the English character type, and "汉" is taken as its recognition result; the third character "A" is determined to belong to the English character type and is further recognized to obtain an accurate recognition result.
Character type recognition may also be a multi-classification process that determines to which character type the character belongs. The computer equipment can accurately identify the characters belonging to the corresponding character types, and directly takes the universal characters which do not belong to the corresponding character types and are the same as the character types of the characters to be identified as the identification result of the characters to be identified.
For example, suppose the universal character preset by the computer device for the Chinese character type is "汉" and the universal character for the Korean character type is "한". When a text sequence image containing "我한A" is recognized according to the character recognition mode corresponding to the English character type, the first character "我" is determined to be of the Chinese character type, and "汉" is taken as its recognition result; the second character "한" is determined to be of the Korean character type, and "한" is taken as its recognition result; the third character "A" is determined to belong to the English character type and is further recognized to obtain an accurate recognition result.
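A minimal sketch of the per-type recognition with universal characters described above. Classifying by Unicode code-point range is a crude stand-in for a real character-type classifier, and all names (`char_type`, `recognize_as`, `UNIVERSAL`) are illustrative, not from the patent:

```python
def char_type(ch):
    """Rough character-type classification by Unicode range (an assumption;
    a real system would use a trained classifier, not code-point ranges)."""
    o = ord(ch)
    if 0x4E00 <= o <= 0x9FFF:          # CJK Unified Ideographs
        return "chinese"
    if 0xAC00 <= o <= 0xD7A3:          # Hangul Syllables
        return "korean"
    if ch.isascii() and ch.isalpha():
        return "english"
    return "other"

# One universal character per foreign type, as in the one-to-one scheme above.
UNIVERSAL = {"chinese": "\u6c49", "korean": "\ud55c"}  # 汉, 한

def recognize_as(text, target_type):
    """Recognition mode for one character type: characters of the target type
    pass through (standing in for accurate recognition); every other character
    is replaced by the universal character of its own type."""
    out = []
    for ch in text:
        t = char_type(ch)
        out.append(ch if t == target_type else UNIVERSAL.get(t, "?"))
    return "".join(out)
```

Running the English-mode recognizer over "我한A" reproduces the worked example: the Chinese and Korean characters become their universal stand-ins while "A" survives.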
In one embodiment, the manner in which the computer device performs character recognition may be a recognition manner based on template matching. The character recognition mode corresponding to the character type is a recognition mode for matching by adopting a character template corresponding to the character type. For example, the character recognition mode corresponding to the English character type is a recognition mode for matching by adopting a character template corresponding to the English character type, so that the English character can be accurately recognized. If the computer equipment needs to accurately identify the non-English characters, the character templates corresponding to other character types can be adopted for matching. If the computer equipment does not need to accurately identify the non-English characters, the non-English characters can be directly identified as general characters of the non-English characters.
Specifically, the computer device may collect character templates for each character in the character library established for each character type, perform correlation matching between the character to be recognized and the collected character templates, calculate the similarity between the character to be recognized and each character template, and take the character corresponding to the character template with the maximum similarity as the recognition result, thereby obtaining the text sequence corresponding to each character type.
For example, the text sequence obtained by recognition according to the character recognition mode corresponding to the English character type is: "汉汉 My name is Addy". Here "M", "y" and "n" exist in the character library corresponding to the English character type and are characters belonging to the English character type, while "汉" is not in the character library corresponding to the English character type and is not a character of the English character type.
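The template-matching recognition described above can be sketched as follows, representing characters as tiny binary bitmaps. The pixel-agreement similarity used here is one illustrative choice of correlation measure, not necessarily the one the patent intends:

```python
def similarity(a, b):
    """Fraction of pixels on which two same-sized binary bitmaps agree."""
    total = sum(len(row) for row in a)
    agree = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return agree / total

def match_character(glyph, templates):
    """Template matching: score the glyph against every template in the
    character library and return the character with maximum similarity."""
    return max(templates, key=lambda ch: similarity(glyph, templates[ch]))
```

For example, a noisy 3x3 rendering of "I" with one flipped pixel still matches the "I" template more closely than "L".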
In one embodiment, the manner in which the computer device performs character recognition may also be a recognition manner based on feature extraction. The character recognition mode corresponding to a character type is then a recognition mode that matches using the character features corresponding to that character type. Specifically, the computer device may extract character features for each character in the character library established for each character type, extract the character features of the character to be recognized and perform correlation matching with the character features of each character in the character library, calculate the similarity between the character to be recognized and each set of character features, and take the character whose features have the maximum similarity as the recognition result, thereby obtaining the text sequence corresponding to each character type.
Specifically, the computer device may extract geometric features of the character, such as end points, bifurcation points, concave-convex portions, line segments in various directions such as horizontal, vertical, oblique, and the like, closed loops, and the like, and perform logical combination judgment according to the positions and correlations of the extracted features, so as to obtain a recognition result.
In one embodiment, the computer device may directly perform character recognition on the text sequence image according to the character recognition mode corresponding to each character type, or may perform character recognition on the single character image after segmenting the text sequence image into the single character image.
In one embodiment, a computer device may employ a machine learning model for character recognition. The machine learning model may be a neural network model, specifically a CNN (Convolutional Neural Network) model or an FCNN (Fully Convolutional Neural Network) model. The CNN model has very strong classification capability in the vision field and can accurately recognize individual characters.
S206, selecting a text sequence from the text sequences corresponding to the character types.
Specifically, the computer device may randomly select a text sequence from the text sequences corresponding to the character types. Alternatively, before selecting, the computer device may count for each text sequence the number of characters it contains that belong to its corresponding character type, and select the text sequence with the largest such count.
For example, the computer device recognizes the text sequence image by character type to obtain a text sequence a of chinese character type, a text sequence B of english character type, a text sequence C of korean character type, and a text sequence D of japanese character type. Wherein, A includes 15 Chinese characters, B includes 69 English characters, C includes 3 Korean characters, and D includes 6 Japanese characters. The computer device may choose one text sequence from the four text sequences A, B, C and D, or may choose the text sequence B that includes the most characters of the corresponding character type.
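A hedged sketch of the "pick the sequence with the most in-type characters" heuristic of step S206, where a character is counted as in-type exactly when it is not one of the universal placeholder characters (function and parameter names are illustrative):

```python
def select_sequence(sequences, universals):
    """Pick the text sequence containing the most characters of its own type,
    i.e. the fewest universal placeholders.

    sequences:  mapping of character type -> recognized text for that type
    universals: set of universal placeholder characters
    """
    def in_type_count(text):
        return sum(ch not in universals for ch in text)
    best = max(sequences, key=lambda t: in_type_count(sequences[t]))
    return best, sequences[best]
```

With the patent's example counts (15 Chinese, 69 English, 3 Korean, 6 Japanese in-type characters), this heuristic would pick text sequence B.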
S208, determining the position of the universal character which does not belong to the corresponding character type in the selected text sequence.
Specifically, after selecting a text sequence, the computer device may first determine a character type corresponding to the text sequence, then determine general characters in the text sequence that do not belong to a corresponding character type of the text sequence, and then determine positions of the characters in the selected text sequence that do not belong to the corresponding character type of the text sequence. The characters not belonging to the corresponding character type of the text sequence may be specifically characters not included in the character library of the corresponding character type of the text sequence.
In one embodiment, the computer device may traverse the characters included in the selected text sequence and, during the traversal, determine whether the traversed character is included in the character library of the corresponding character type. If the currently traversed character is included in the character library of the corresponding character type, the traversal continues; if it is not, the position of the traversed character in the selected text sequence is recorded.
In one embodiment, when the computer device performs character recognition on the text sequence image according to the character recognition mode corresponding to each character type, it can mark a character whenever it recognizes one that does not belong to the corresponding character type. After selecting a text sequence, the computer device may inspect the marked characters in the selected text sequence to determine the characters that do not belong to the corresponding character type, and further determine their positions in the selected text sequence.

In one embodiment, the position of a character that does not belong to the corresponding character type in the selected text sequence may be its position relative to the characters that do belong to the corresponding character type. For example, in a text sequence obtained by recognition with the English character type, "汉汉 My name is Addy", the position of the universal characters "汉" may be recorded as being in front of "My name is Addy".
In one embodiment, the position of a character that does not belong to the corresponding character type in the selected text sequence may instead be its absolute position in the text sequence. For example, in a text sequence obtained by recognition with the English character type, "汉汉汉 My name is Addy", the positions of the characters "汉" that do not belong to the corresponding character type are the first through third characters.
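Under the absolute-position variant, determining the universal-character positions of step S208 reduces to an index scan (an illustrative helper, not the patent's code):

```python
def universal_positions(text, universals):
    """Zero-based indices of universal placeholder characters in the
    selected text sequence (the 'absolute position' variant above)."""
    return [i for i, ch in enumerate(text) if ch in universals]
```

For the example sequence "汉汉汉 My name is Addy" this yields indices 0, 1 and 2, i.e. the first through third characters.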
S210, acquiring characters belonging to the corresponding character types at the positions in the remaining text sequences.
In particular, the computer device may traverse the remaining text sequences, and in the traversing, determine whether the character at the position in the traversed text sequence is a character belonging to the corresponding character class of the traversed text sequence. If the computer equipment judges that the character at the position in the text sequence traversed currently is a character belonging to the corresponding character type of the text sequence traversed currently, acquiring the character; if the computer equipment judges that the character at the position in the text sequence traversed currently is the character which does not belong to the corresponding character type of the text sequence traversed currently, continuing to traverse.
S212, correcting the characters at the positions in the selected text sequence according to the acquired characters to obtain a recognition result.
Specifically, after acquiring characters belonging to the corresponding character types at the positions in the text sequence left after selection, the computer equipment can compare the acquired characters according to the positions with the characters at the positions in the selected text sequence for each determined position, and correct the characters at the positions in the selected text sequence through the acquired characters when the two are detected to be inconsistent, and obtain a recognition result with higher accuracy after completing the correction of the characters at each determined position.
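Steps S210 and S212 together can be sketched as: for every universal-character position in the selected sequence, take the first non-universal character recognized at the same index in one of the remaining sequences. This simplification assumes the per-type sequences are index-aligned; FIG. 5 of the patent addresses the case where the character counts do not line up:

```python
def correct(selected, others, universals):
    """Replace each universal placeholder in the selected sequence with the
    character recognized at the same index in one of the remaining sequences,
    provided that character is not itself a universal placeholder."""
    out = list(selected)
    for i, ch in enumerate(out):
        if ch in universals:
            for other in others:
                if i < len(other) and other[i] not in universals:
                    out[i] = other[i]
                    break
    return "".join(out)
```

For example, an English-mode result "汉汉 OCR" corrected against a Chinese-mode result whose first two characters are "中文" yields "中文 OCR".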
According to the text recognition method, after the text sequence image is obtained, character recognition is performed separately per character type, obtaining a text sequence for each character type. When recognition is performed with the character recognition mode corresponding to a certain character type, characters in the text sequence image that do not belong to that character type are recognized as universal characters that do not belong to that character type. A text sequence is then selected, the positions of the universal characters that do not belong to the corresponding character type are determined in the selected text sequence, and the characters at those positions are corrected using the characters at the same positions in the remaining text sequences, yielding the recognition result. Because the text sequence image is recognized separately per character type, the recognition accuracy of the characters belonging to each character type is ensured, and texts of multiple character types included in the text sequence image can all be handled even when the text content is complex and varied. Using the in-type characters of each text sequence to correct the characters at the corresponding positions in the other text sequences produces the recognition result and improves text recognition accuracy.
Fig. 3 shows a schematic diagram of a text recognition method in one embodiment. Referring to fig. 3, after obtaining the text sequence image, the computer device performs character recognition on the text sequence image according to each character type to obtain a text sequence corresponding to each character type, and corrects the characters in the corresponding positions in other text sequences by utilizing the characters belonging to the character type in each obtained text sequence, thereby obtaining a recognition result.
In one embodiment, step S202 includes: acquiring an image to be identified; binarizing the image to be identified to obtain a text image; extracting a text texture image from the text image; determining connected domains in the text texture image; and determining the text sequence image according to the connected domain.
The image to be recognized is an image whose included text sequence is to be subjected to character recognition; specifically, it may be a business card image or a document image. Binarizing an image means mapping the gray value of each pixel in the image to one of two pixel values, so that the whole image presents a visual effect of only two pixel values.
Specifically, the computer device may employ a fixed-threshold or adaptive-threshold binarization algorithm to set the pixel values of the image to be recognized that are above and below the threshold to one of two preset pixel values, referred to as the first pixel value and the second pixel value, respectively. In the binarized image to be recognized, text is represented by the first pixel value, such as white, and the background is represented by the second pixel value, such as black.
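The fixed-threshold case can be sketched as follows. This is a minimal illustration, not the patent's implementation: the threshold of 128 and the two preset pixel values (255 for text, 0 for background) are assumptions chosen for the example.

```python
# Minimal sketch of fixed-threshold binarization. The threshold (128)
# and the two preset pixel values (255 = first value for text,
# 0 = second value for background) are illustrative assumptions.

def binarize(gray, threshold=128, first_value=255, second_value=0):
    """Map each gray value to one of two preset pixel values."""
    return [[first_value if px >= threshold else second_value for px in row]
            for row in gray]

gray = [
    [200, 30, 210],
    [40, 220, 50],
]
print(binarize(gray))  # [[255, 0, 255], [0, 255, 0]]
```

An adaptive-threshold variant would compute the threshold per local neighborhood instead of using one global value; the mapping step itself is unchanged.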
Further, the computer device may extract, from the binarized image to be recognized, the image area formed by the pixels of the first pixel value representing text, obtaining a text image. The computer device may then extract character stroke textures from the resulting text image and determine the image areas formed by the pixels that make up the stroke textures, obtaining a text texture image.
Furthermore, the computer device can perform connected-domain analysis on the text texture image to extract connected domains, and can also merge adjacent connected domains. Specifically, the computer device may adopt a stroke smoothing algorithm to analyze and merge connected domains; this algorithm connects the pixels of adjacent connected domains into a whole area, and because the connected domains within the same text sequence are relatively close to each other, the connected domains in the same text sequence can be merged into one complete connected domain.
Still further, the computer device may determine the outer contours of the plurality of connected domains that are approximately collinear as the locations of the text sequence images and record to determine the corresponding text sequence images. The computer device may also treat each connected domain separately as an independent text sequence image.
In this embodiment, after the text texture image is extracted step by step from the image to be recognized, the corresponding text sequence image is determined according to the connected domains in the text texture image, which avoids including excessive background area when determining the text sequence image, so that subsequent character recognition is more accurate.
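The connected-domain extraction step above can be sketched with a 4-connected breadth-first search over the binarized texture image. This is a minimal illustration under assumed conventions (text pixels valued 1); the stroke-smoothing merge of nearby domains described above is omitted for brevity.

```python
# A minimal sketch of connected-domain extraction on a binarized text
# texture image, using 4-connected breadth-first search. Real
# implementations would also merge nearby domains (stroke smoothing),
# which is omitted here.
from collections import deque

def connected_domains(binary, text_value=1):
    """Return a list of connected domains, each a list of (row, col) pixels."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    domains = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] == text_value and not seen[r][c]:
                queue, domain = deque([(r, c)]), []
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    domain.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] == text_value
                                and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                domains.append(domain)
    return domains

grid = [
    [1, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
]
print(len(connected_domains(grid)))  # 2
```

The text sequence image would then be taken as the bounding region of approximately collinear domains, or each domain could be treated as an independent text sequence image, as described below.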
In one embodiment, step S204 includes: recognizing, in the recognition manner corresponding to each character type, the characters belonging to the corresponding character type from the text sequence image, and recognizing the characters not belonging to the corresponding character type as universal characters; and combining the characters recognized according to each character type in turn to obtain a text sequence corresponding to each character type. Specifically, the computer device may set a corresponding recognition policy for each character type in advance. In one embodiment, for each character type, the computer device may accurately recognize the characters belonging to that character type, obtaining the characters actually represented in the image; characters not belonging to that character type are processed fuzzily and marked as universal characters not belonging to that character type, so as to distinguish the accurately recognized characters from the fuzzily processed ones.
In one embodiment, the step of recognizing, in the recognition manner corresponding to each character type, the characters belonging to the corresponding character type from the text sequence image and the universal characters not belonging to the corresponding character type includes: cutting single-word images from the text sequence image; and performing character recognition on the single-word images through the machine learning model corresponding to each character type, obtaining the characters belonging to the corresponding character type and the universal characters not belonging to the corresponding character type.
Wherein the single-word image is a rectangular image comprising individual characters, and the computer device cuts out individual single-word images from the text sequence image. The computer device can specifically segment the sequence of single-word images from the text sequence images according to prior knowledge such as the distance features, the character length features and the character proportion consistency of the text sequence images. The text sequence image may undergo image enhancement, such as increasing image contrast, before being cut.
In one embodiment, the computer device may binarize the text sequence image and then project each pixel value thereof onto the long-side direction of the text sequence image to obtain an accumulated value, and find a local maximum accumulated value or a local minimum accumulated value to perform segmentation, thereby obtaining a single-word image. If the pixel color of the text sequence image representing the character is white after binarization, searching a local minimum accumulated value; if the color of the pixel representing the character after binarization of the text sequence image is black, searching a local maximum accumulated value.
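The projection step above can be sketched as follows, assuming the text sequence image has already been binarized with text pixels mapped to 1 and background to 0, so that a local minimum accumulated value of zero marks a gap between characters; the helper names are illustrative, not part of the patent's disclosure.

```python
# A minimal sketch of projection-based segmentation: the binarized text
# sequence image is projected along its long side by accumulating each
# column, and columns whose accumulated value is a local minimum (here
# simply zero, assuming text pixels are 1 and background 0) are taken
# as cut positions.

def column_projection(binary):
    """Accumulate pixel values column by column along the long side."""
    return [sum(row[c] for row in binary) for c in range(len(binary[0]))]

def cut_columns(binary):
    """Columns with a zero accumulated value separate adjacent characters."""
    return [c for c, total in enumerate(column_projection(binary)) if total == 0]

# Two one-column "characters" separated by a blank column.
image = [
    [1, 0, 1],
    [1, 0, 1],
]
print(column_projection(image))  # [2, 0, 2]
print(cut_columns(image))        # [1]
```

With the opposite binarization convention (black text on accumulated white background), the local maxima of the projection would be searched instead, as the paragraph above notes.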
Further, the computer device may perform character recognition on the single-word image through the machine learning model after the single-word image is cut out from the text sequence image. The machine learning model corresponding to each character type can be trained in advance.
In one embodiment, the step of training a machine learning model corresponding to each character class includes: acquiring a character image sample set; adding labels of corresponding characters for character images belonging to corresponding character types in a character image sample set according to the character types, and adding labels of general characters for character images not belonging to the corresponding character types in the character image sample set; and training the machine learning model corresponding to each character type according to the character images in the character image sample set and the labels added according to the character types.
Wherein the character image sample set includes a plurality of character images, which may be generated from characters of various character types. The character image sample set used when training the machine learning model corresponding to each character type may be one unified character image sample set, or a separate character image sample set for each character type. The character image sample set corresponding to a character type is biased toward that character type: specifically, it may include a large number of character images generated from characters belonging to the corresponding character type and a small number of character images generated from characters not belonging to it.
Specifically, the machine learning model is a functional mapping from character images to the correspondingly annotated characters. Training the machine learning model on the character image sample set means adjusting the parameters of the model using character images whose mapped annotated characters are known, so that the model can predict the character mapped to a new character image, thereby recognizing the corresponding character from an image containing that character. The machine learning model may employ an SVM (Support Vector Machine) or various neural networks.
In one embodiment, the machine learning model employs a Convolutional Neural Network (CNN). CNN is an end-to-end learning method: the CNN directly accepts the pixels of the character image as input, and the number of neurons in the input layer is equal to the number of pixels of the normalized character image. After the data is input, several layers of local feature extraction and pooling are performed, the intermediate fully connected layers then perform global feature transformation, and finally the output layer produces the classification result for the task.
Specifically, the computer device may add, for each character class, labels of corresponding characters for character images belonging to the corresponding character class in the character image sample set, and add labels of general characters for character images not belonging to the corresponding character class in the character image sample set. The computer equipment respectively trains the corresponding machine learning model of each character type according to the character images in the character image sample set and the labels added according to the character types.
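The per-type labeling scheme above can be sketched as follows. This is a hedged illustration: the `(image, character, character_type)` sample format and the "□" marker for the universal character are assumptions made for the example, not values fixed by the method.

```python
# A sketch of the labeling scheme: for a given character type, character
# images of that type keep their own character as the label, while all
# other images are labeled with a single universal character. The "□"
# marker and the sample tuple format are illustrative assumptions.

GENERIC = "□"

def label_for_type(samples, char_type):
    """samples: list of (image, character, character_type) tuples."""
    return [(image, char if ctype == char_type else GENERIC)
            for image, char, ctype in samples]

samples = [
    ("img0", "A", "english"),
    ("img1", "7", "english"),   # digits handled by the English-type model here
    ("img2", "汉", "chinese"),
]
print(label_for_type(samples, "english"))
# [('img0', 'A'), ('img1', '7'), ('img2', '□')]
```

One model per character type is then trained on its own relabeled copy of the sample set, so each model only needs to discriminate finely within its own type.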
In one embodiment, the machine learning model may be obtained by iteratively adjusting parameters of a convolutional neural network trained to identify images from a set of character image samples.
In this embodiment, large-scale character data is learned by utilizing the strong learning and representation capability of the machine learning model, and the trained machine learning model achieves a better character recognition effect than traditional methods.
In this embodiment, the text sequence image is segmented to obtain single-word images, and character recognition is then performed on the single-word images using the machine learning models, so that the character recognition process for the text sequence image can be completed conveniently and efficiently.
The computer device recognizes the characters belonging to the corresponding character type from the text sequence image, recognizes the characters not belonging to the corresponding character type as universal characters, and combines the recognized characters in turn according to their order in the text sequence image, obtaining the text sequence corresponding to that character type.
After the text sequence corresponding to each character type is obtained, the computer device determines whether a character belongs to the corresponding character type according to whether the character in the obtained text sequence is a universal character. After the computer device selects a text sequence from the text sequences, it can directly query the universal characters in that text sequence; the positions of the universal characters are exactly the positions that need to be corrected using the remaining text sequences.
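Locating the positions to be corrected then reduces to finding runs of universal characters in the selected sequence, which can be sketched as follows; the "□" marker is an illustrative assumption.

```python
# A minimal sketch of locating the positions to be corrected: the
# universal characters in the selected text sequence mark exactly the
# positions whose characters must come from the remaining text
# sequences. Consecutive universal characters are grouped into one
# position span. The "□" marker is an illustrative assumption.
from itertools import groupby

GENERIC = "□"

def generic_spans(sequence):
    """Return (start, end) index spans of consecutive universal characters."""
    spans, i = [], 0
    for is_generic, run in groupby(sequence, key=lambda ch: ch == GENERIC):
        length = len(list(run))
        if is_generic:
            spans.append((i, i + length))
        i += length
    return spans

print(generic_spans("ab□□cd□"))  # [(2, 4), (6, 7)]
```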
In the above embodiment, when the text sequence image is recognized according to each character type, the characters not belonging to the corresponding character type are processed fuzzily and marked with universal characters, so that the characters to be corrected can be located quickly during character correction, the correction is completed, and a more accurate recognition result is obtained.
In one embodiment, the step of segmenting the single-word image from the text sequence image includes: selecting candidate cut points in the text sequence image along the long side of the text sequence image at a spacing shorter than the short side of the text sequence image; obtaining the segmentation confidence of each candidate cut point; determining cut points according to the segmentation confidence; and cutting single-word images from the text sequence image according to the determined cut points.
The candidate segmentation point is a candidate segmentation position and can be expressed by coordinates or a distance from the start point of the head of the text sequence image.
In one embodiment, the text sequence image is a rectangular image, the shorter sides of the text sequence image being approximately the width or height of the characters in the text sequence, and the longer sides being approximately the length of the text sequence in the text sequence image, the computer device may select the candidate cut points at a shorter spacing than the shorter sides. The distance between the candidate segmentation points can be smaller than or equal to one half, one third or one fourth of the short sides of the text sequence image.
Further, the segmentation confidence is a quantized value of the probability that the corresponding candidate segmentation point is the actual segmentation point. The computer equipment can specifically cut out corresponding pictures according to the candidate cutting points, extract image features of the cut-out pictures, sequentially input the extracted image features into the trained classifier, and output the cutting confidence of the corresponding candidate cutting points. The classifier may employ a random forest classifier. The extracted image features may be HOG (Histogram of Oriented Gradient, direction gradient histogram) features, or other features such as LBP (Local Binary Patterns, local binary pattern) features.
Further, the computer device may compare each segmentation confidence with a preset threshold, and determine a candidate segmentation point to be an actual segmentation point when its segmentation confidence is higher than the preset threshold. The computer device then performs segmentation at the segmentation points determined at each position in the text sequence image, obtaining the individual single-word images.
In the above embodiment, candidate segmentation points can be densely selected in the text sequence image, and the segmentation confidence of each candidate segmentation point is used to segment the text sequence image into single-word images, so that accurate segmentation of the text sequence image is achieved and the accuracy of subsequent text recognition is improved.
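The candidate selection and confidence thresholding above can be sketched as follows. The spacing of half the short side matches the "one half" option mentioned earlier; the stand-in `confidence` function and 0.5 threshold are illustrative assumptions replacing the trained classifier (e.g. a random forest over HOG features).

```python
# A hedged sketch of confidence-based cut-point selection: candidate cut
# points are spaced at half the short side of the text sequence image,
# and a candidate becomes an actual cut point when its segmentation
# confidence exceeds a preset threshold. The confidence callable and the
# 0.5 threshold stand in for the trained classifier described above.

def candidate_points(long_side, short_side):
    """Candidate cut points spaced at half the short side (an assumption)."""
    step = max(1, short_side // 2)
    return list(range(step, long_side, step))

def actual_cuts(candidates, confidence, threshold=0.5):
    """Keep candidates whose segmentation confidence exceeds the threshold."""
    return [x for x in candidates if confidence(x) > threshold]

cands = candidate_points(long_side=100, short_side=20)
print(cands)  # [10, 20, 30, 40, 50, 60, 70, 80, 90]
# A stand-in confidence: high only at multiples of 30.
print(actual_cuts(cands, lambda x: 0.9 if x % 30 == 0 else 0.1))  # [30, 60, 90]
```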
In one embodiment, step S212 includes: when the number of acquired characters is consistent with the number of characters at the position in the selected text sequence, replacing the characters at the position in the selected text sequence one by one with the acquired characters that correspond to them one to one in character order; and when the number of acquired characters is inconsistent with the number of characters at the position in the selected text sequence and the number of characters at the position exceeds one, replacing the characters at the position in the selected text sequence as a whole with the acquired characters.
Specifically, the computer device may count the number of characters at the position in the selected text sequence, acquire the characters belonging to the corresponding character type at that position in the text sequences remaining after selection, count the number of acquired characters, and compare the two counts. If the number of acquired characters is consistent with the number of characters at the position in the selected text sequence, the computer device considers that the characters in the text sequence image correspond one to one with the recognized characters, and replaces the characters at the position in the selected text sequence one by one, in character order, with the acquired characters that correspond to them.
If the number of acquired characters is inconsistent with the number of characters at the position in the selected text sequence, the computer device considers that some character in the text sequence image went unrecognized; when the number of characters at the position in the selected text sequence exceeds one, the computer device may replace the characters at the position in the selected text sequence as a whole with the acquired characters, so as to obtain an accurate recognition result as far as possible.
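The two-branch correction rule can be sketched as follows; the "□" marker and the keep-as-is behavior for a mismatched single-character span are illustrative assumptions, since the method only specifies the replace-whole branch for spans longer than one character.

```python
# A minimal sketch of the correction rule described above: when the
# character counts at the position match, replace one by one; when they
# differ and the position holds more than one character, replace the
# whole span. The "□" marker for universal characters and the fallback
# for a mismatched single-character span are illustrative assumptions.

def correct_span(selected, start, end, acquired):
    """Correct selected[start:end] (a run of universal characters)
    with the characters acquired from the remaining text sequences."""
    span_len = end - start
    if len(acquired) == span_len:
        # Counts match: one-to-one replacement in character order.
        return selected[:start] + acquired + selected[end:]
    if span_len > 1:
        # Counts differ and the span exceeds one character: replace whole.
        return selected[:start] + acquired + selected[end:]
    return selected  # single-character span with mismatched count: keep as-is

print(correct_span("□□ name", 0, 2, "My"))   # "My name"
print(correct_span("□□□ is", 0, 3, "name"))  # "name is"
```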
By way of example, FIG. 4 illustrates a schematic diagram of character correction in one embodiment, for the case in which the number of acquired characters is consistent with the number of characters at the position in the selected text sequence. Referring to fig. 4, the original content in the text sequence image is: "I are a Han nationality My name is Addy"; the text sequence obtained by Chinese-character-type recognition is: "I are Han nationality AA AAAA AA AAAA"; and the text sequence obtained by English-character-type recognition is: "Han Guo My name is Addy".
The computer device may select the text sequence recognized by the English character type and determine the positions of the universal characters "chinese" that do not belong to the English character type, whose number is 7. The characters belonging to the Chinese character type at those positions in the text sequence obtained by the remaining Chinese-character-type recognition also number 7. Since the two numbers are the same, the characters at those positions in the text sequence corresponding to the English character type are replaced one by one, in character order, with the corresponding characters acquired from the text sequence corresponding to the Chinese character type.
FIG. 5 illustrates a schematic diagram of character correction in one embodiment, for the case in which the number of acquired characters is inconsistent with the number of characters at the position in the selected text sequence. Referring to fig. 5, the original content in the text sequence image is: "I are a Han nationality My name is Addy"; the text sequence obtained by Chinese-character-type recognition is: "I are Han nationality AA AAAA"; and the text sequence obtained by English-character-type recognition is: "Han dynasty My name is Addy".
The computer device may select the text sequence recognized by the English character type and determine the positions of the universal characters "chinese" that do not belong to the English character type, whose number is 3. The characters belonging to the Chinese character type at those positions in the text sequence obtained by the remaining Chinese-character-type recognition ("I are Han nationality") number 7. The two numbers are different, and 3 exceeds one, so the characters at the position in the text sequence corresponding to the English character type are replaced as a whole with the characters acquired from the text sequence corresponding to the Chinese character type.
In the above embodiment, processing manners for character correction are provided for the cases in which the number of acquired characters is consistent or inconsistent with the number of characters at the position in the selected text sequence. By correcting the characters in these ways, an accurate recognition result can be obtained as far as possible.
As shown in fig. 6, in a specific embodiment, the text recognition method includes the steps of:
s602, acquiring an image to be identified; binarizing the image to be identified to obtain a text image; extracting a text texture image from the text image; determining connected domains in the text texture image; and determining the text sequence image according to the connected domain.
S604, selecting candidate cutting points in the text sequence image along the long side of the text sequence image according to the distance shorter than the short side of the text sequence image; obtaining segmentation confidence of each candidate segmentation point; determining segmentation points according to the segmentation confidence; and cutting the single-word image from the text sequence image according to the determined cutting point.
S606, acquiring a character image sample set; according to the character types, adding labels of corresponding characters for the character images belonging to the corresponding character types in the character image sample set, and adding labels of general characters for the character images not belonging to the corresponding character types in the character image sample set.
S608, training machine learning models corresponding to the character types according to the character images in the character image sample set and the labels added according to the character types.
And S610, respectively carrying out character recognition on the single-word images through the machine learning model corresponding to each character type to obtain characters belonging to the corresponding character type and universal characters not belonging to the corresponding character type.
And S612, combining the characters recognized by the character types in turn to obtain a text sequence corresponding to each character type.
S614, selecting a text sequence from the text sequences corresponding to the character types.
S616, determining the position of the universal character which does not belong to the corresponding character type in the selected text sequence.
S618, acquiring characters belonging to the corresponding character types at the positions in the selected residual text sequences.
S620, judging whether the number of the acquired characters is consistent with the number of the characters at the position in the selected text sequence; if yes, go to step S622; if not, go to step S624.
S622, replacing the characters at the positions in the selected text sequence one by one with the characters corresponding to the characters at the positions one by one according to the character sequence in the acquired characters.
S624, if the number of the characters at the position in the selected text sequence exceeds one, replacing the whole characters at the position in the selected text sequence with the acquired characters.
In this embodiment, after the text sequence image is obtained, character recognition is performed according to each character type, obtaining a text sequence corresponding to each character type; a text sequence is then selected, the positions of the characters that do not belong to the corresponding character type in the selected text sequence are determined, and the characters at those positions in the selected text sequence are corrected using the characters at the same positions in the remaining text sequences, thereby obtaining a recognition result. Because the text sequence image is recognized separately for each character type, the recognition accuracy of the characters belonging to each character type is ensured, and texts of multiple character types included in the text sequence image can all be handled even when the text content is complex and varied. The characters belonging to a given character type in the text sequence obtained by recognizing that character type are used to correct the characters at the corresponding positions in the other text sequences, so the recognition result is obtained and the text recognition accuracy is improved.
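The overall flow of steps S602 to S624 can be illustrated with a toy end-to-end sketch. The two mock recognizers below stand in for the trained per-type machine learning models, and the "□" and "A" universal-character markers, as well as the ASCII-based type test, are illustrative assumptions made only for this example.

```python
# An end-to-end toy sketch of the method: two mock per-type recognizers
# each return a text sequence in which characters outside their type are
# replaced by a universal marker, and the correction step fills the
# marked positions of the selected (English-type) sequence from the
# Chinese-type sequence. The recognizers and the "□"/"A" markers are
# illustrative stand-ins for the trained models.

def recognize_english(text):
    # English-type model: keep ASCII, mark everything else as universal.
    return "".join(ch if ch.isascii() else "□" for ch in text)

def recognize_chinese(text):
    # Chinese-type model: keep non-ASCII, mark ASCII (except spaces) as "A".
    return "".join(ch if not ch.isascii() or ch == " " else "A" for ch in text)

def correct(selected, remaining):
    # Replace each universal character with the character at the same
    # position in the remaining text sequence (counts match here).
    return "".join(r if s == "□" else s for s, r in zip(selected, remaining))

original = "汉族 My name is Addy"
english_seq = recognize_english(original)
chinese_seq = recognize_chinese(original)
print(english_seq)                        # "□□ My name is Addy"
print(correct(english_seq, chinese_seq))  # "汉族 My name is Addy"
```

The corrected output recovers the full mixed-type line, mirroring the FIG. 4 scenario in which the per-position character counts match.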
Fig. 7 shows a schematic flow chart of a text recognition method in a specific application scenario. Referring to fig. 7, this specific application scenario is text recognition in a business card image. The computer device may first perform text line detection on the business card image. After the text line is detected, the characters in the text line are respectively identified through a machine learning model corresponding to the Chinese character type and a machine learning model corresponding to the English character type. In this embodiment, other characters such as numerals and punctuations are accurately identified through a machine learning model corresponding to the english character type.
After passing through the machine learning model corresponding to the Chinese character type, the text line obtains a text sequence comprising the Chinese character obtained by accurate recognition and the general character 'A' for marking the non-Chinese character. After passing through the machine learning model corresponding to the English character type, the text line obtains a text sequence comprising English characters, numbers and punctuation marks which are accurately identified, and a general character 'Chinese' for marking Chinese characters.
The computer device can select the text sequence obtained by recognizing the English character type, determine the positions of the universal characters "chinese" that do not belong to the English character type, acquire the characters belonging to the Chinese character type at those positions in the text sequence obtained by recognizing the Chinese character type, and replace the characters at those positions in the text sequence corresponding to the English character type with the characters acquired from the text sequence corresponding to the Chinese character type.
As shown in fig. 8, in one embodiment, a text recognition device 800 is provided. Referring to fig. 8, the text recognition apparatus 800 includes a first acquisition module 801, a recognition module 802, a selection module 803, a determination module 804, a second acquisition module 805, and a correction module 806.
A first obtaining module 801, configured to obtain a text sequence image.
The recognition module 802 is configured to perform character recognition on the text sequence image according to a recognition mode corresponding to each character type, and recognize characters in the text sequence image that do not belong to the corresponding character type as universal characters that do not belong to the corresponding character type, so as to obtain a text sequence corresponding to each character type.
A selecting module 803, configured to select a text sequence from text sequences corresponding to each character type.
A determining module 804, configured to determine a location where a universal character that does not belong to a corresponding character class in the selected text sequence is located.
A second obtaining module 805 is configured to obtain characters belonging to the corresponding character types at positions in the text sequence remaining after the selection.
And a correcting module 806, configured to correct the character at the position in the selected text sequence according to the acquired character, so as to obtain a recognition result.
The text recognition device 800 performs character recognition according to each character type after acquiring the text sequence image, obtaining a text sequence corresponding to each character type. When recognition is performed in the recognition manner corresponding to a certain character type, characters in the text sequence image that do not belong to that character type are recognized as universal characters not belonging to that character type. A text sequence is then selected, the positions of the universal characters that do not belong to the corresponding character type in the selected text sequence are determined, and the characters at those positions are corrected using the characters at the same positions in the text sequences remaining after selection, thereby obtaining a recognition result. Because the text sequence image is recognized separately for each character type, the recognition accuracy of the characters belonging to each character type is ensured, and texts of multiple character types included in the text sequence image can all be handled even when the text content is complex and varied. The characters belonging to a given character type in the text sequence obtained by recognizing that character type are used to correct the characters at the corresponding positions in the other text sequences, so the recognition result is obtained and the text recognition accuracy is improved.
In one embodiment, the first obtaining module 801 is further configured to obtain an image to be identified; binarizing the image to be identified to obtain a text image; extracting a text texture image from the text image; determining connected domains in the text texture image; and determining the text sequence image according to the connected domain.
In this embodiment, after the text texture image is extracted step by step from the image to be recognized, the corresponding text sequence image is determined according to the connected domains in the text texture image, which avoids including excessive background area when determining the text sequence image, so that subsequent character recognition is more accurate.
In one embodiment, the recognition module 802 is further configured to recognize, in a corresponding recognition manner of each character type, a character belonging to the corresponding character type from the text sequence image, and a general character not belonging to the corresponding character type from the text sequence image; and combining the characters recognized according to the character types in turn to obtain a text sequence corresponding to each character type. In this embodiment, when text sequence image recognition is performed according to character types, the characters that do not belong to the corresponding character types are subjected to fuzzy processing and marked with general characters, so that when character correction is performed, the characters that need to be corrected can be quickly positioned, character correction is completed, and a more accurate recognition result is obtained.
In one embodiment, the recognition module 802 is further configured to segment a single word image from the text sequence image; and respectively carrying out character recognition on the single-word images through the corresponding machine learning model of each character type to obtain characters belonging to the corresponding character type and universal characters not belonging to the corresponding character type.
In this embodiment, the text sequence image is segmented to obtain single-word images, and character recognition is then performed on the single-word images using a machine learning model, so that the character recognition process for the text sequence image can be completed conveniently and efficiently.
In one embodiment, the recognition module 802 is further configured to select candidate segmentation points in the text sequence image along its long side, at intervals shorter than the short side of the text sequence image; to obtain a segmentation confidence for each candidate segmentation point; to determine the segmentation points according to the segmentation confidences; and to segment single-word images from the text sequence image according to the determined segmentation points.
In this embodiment, candidate segmentation points are selected densely in the text sequence image, and the image is segmented into single-word images using the segmentation confidence of each candidate point, so that accurate segmentation of the text sequence image can be achieved and the accuracy of subsequent text recognition improved.
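A minimal sketch of this segmentation scheme, assuming a binarized line image as a NumPy array (1 = ink, 0 = background) and using the fraction of background pixels in a column as the segmentation confidence (one plausible confidence measure; the patent does not fix a specific one):

```python
import numpy as np

def segment_single_words(binary_img, step=4, thresh=0.95):
    """Select candidate segmentation points every `step` columns along the
    long side (step kept shorter than the short side), score each with a
    confidence (fraction of background pixels in that column), keep the
    high-confidence candidates as segmentation points, and slice out the
    single-word images between consecutive points."""
    h, w = binary_img.shape
    assert step < h, "candidate spacing must be shorter than the short side"
    candidates = range(0, w, step)
    cuts = [x for x in candidates if 1.0 - binary_img[:, x].mean() >= thresh]
    cuts = sorted(set([0] + cuts + [w]))
    pieces = [binary_img[:, a:b] for a, b in zip(cuts, cuts[1:])]
    return [p for p in pieces if p.any()]  # keep slices that contain ink
```

On a real line image the confidence would come from a trained scorer rather than a raw pixel count; the sketch only shows how candidate points, confidences, and the final cut are chained together.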
In one embodiment, the correction module 806 is further configured to: when the number of acquired characters is consistent with the number of characters at the positions in the selected text sequence, replace the characters at those positions one by one with the correspondingly ordered acquired characters; and when the number of acquired characters is inconsistent with the number of characters at the positions and the number of characters at the positions exceeds one, replace the characters at those positions as a whole with the acquired characters.
In this embodiment, a way of performing character correction is provided both when the number of acquired characters is consistent with the number of characters at the positions in the selected text sequence and when it is not. Correcting the characters in this way yields a recognition result that is as accurate as possible.
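A hedged sketch of the two correction branches, assuming the per-type text sequences are position-aligned strings of equal length and using `#` as the general character (both are assumptions of this illustration, not requirements of the method):

```python
GENERAL = "#"

def correct_sequence(selected, remaining):
    """Replace each run of general characters in the selected sequence
    with the characters recognized at the same positions in one of the
    remaining sequences. If the acquired count matches the run length,
    characters are substituted one by one; if it differs (and the run is
    longer than one), the whole run is replaced by the acquired chars."""
    chars = list(selected)
    i = 0
    while i < len(chars):
        if chars[i] != GENERAL:
            i += 1
            continue
        j = i
        while j < len(chars) and chars[j] == GENERAL:
            j += 1                       # [i, j) is one run of generals
        acquired = ""
        for other in remaining:          # first sequence with usable chars
            seg = other[i:j].replace(GENERAL, "")
            if seg:
                acquired = seg
                break
        if len(acquired) == j - i:       # counts consistent: one-by-one
            chars[i:j] = list(acquired)
            i = j
        elif acquired and j - i > 1:     # counts differ: replace whole run
            chars[i:j] = list(acquired)
            i += len(acquired)
        else:                            # nothing acquired: leave as-is
            i = j
    return "".join(chars)

result = correct_sequence("AB##EF", ["##CD##"])   # "ABCDEF"
```

In practice the positions would be tracked per single-word image rather than by string index, but the branch structure mirrors the two cases described above.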
As shown in fig. 9, in one embodiment, the text recognition apparatus 800 further includes: training module 807.
A training module 807 is used to obtain a character image sample set; to add, for each character type, labels of the corresponding characters to the character images in the sample set that belong to that character type, and labels of the general character to those that do not; and to train the machine learning model corresponding to each character type according to the character images in the sample set and the labels added for that character type.
In this embodiment, large-scale character data are learned by exploiting the strong learning and representation capability of machine learning models, and the trained machine learning models recognize characters better than traditional methods.
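The labeling step above can be sketched as follows (a hypothetical helper, assuming samples are (image, character) pairs and `#` as the general-character label):

```python
GENERAL = "#"

def labels_for_type(samples, alphabet):
    """For one character type, label each sample image with its true
    character when it belongs to the type's alphabet, otherwise with the
    general-character label; the per-type model is trained on these."""
    return [(img, ch if ch in alphabet else GENERAL) for img, ch in samples]
```

Training one model per character type on labels built this way is what lets each model both recognize its own characters and flag everything else as a general character.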
In one embodiment, one or more computer-readable storage media storing computer-readable instructions are provided that, when executed by one or more processors, cause the one or more processors to perform the steps of: acquiring a text sequence image; performing character recognition on the text sequence image according to the recognition manner corresponding to each character type, recognizing the characters in the text sequence image that do not belong to that character type as general characters, and obtaining the text sequence corresponding to each character type; selecting a text sequence from the text sequences corresponding to the character types; determining the positions of the general characters, which do not belong to the corresponding character type, in the selected text sequence; acquiring the characters belonging to the corresponding character types at those positions in the remaining, unselected text sequences; and correcting the characters at those positions in the selected text sequence according to the acquired characters to obtain a recognition result.
In one embodiment, acquiring a text sequence image includes: acquiring an image to be identified; binarizing the image to be identified to obtain a text image; extracting a text texture image from the text image; determining connected domains in the text texture image; and determining the text sequence image according to the connected domain.
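A minimal sketch of this acquisition pipeline, assuming a grayscale NumPy image with dark text on a light background and using 4-connected components (the patent's "connected domains") to locate candidate text regions; the threshold choice and the texture-extraction step are simplified away:

```python
import numpy as np
from collections import deque

def text_region_boxes(gray, thresh=128):
    """Binarize the image to be recognized (text pixels = gray < thresh),
    then find 4-connected components of text pixels and return each
    component's bounding box (top, left, bottom, right) as a candidate
    text sequence region."""
    binary = gray < thresh
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y, x] and not seen[y, x]:
                seen[y, x] = True          # BFS flood fill of one component
                q = deque([(y, x)])
                ys, xs = [y], [x]
                while q:
                    cy, cx = q.popleft()
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and binary[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            ys.append(ny); xs.append(nx)
                            q.append((ny, nx))
                boxes.append((min(ys), min(xs), max(ys), max(xs)))
    return boxes
```

Each bounding box can then be cropped out as a text sequence image for the per-type recognition that follows.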
In one embodiment, performing character recognition on the text sequence image according to the recognition manner corresponding to each character type to obtain the text sequence corresponding to each character type comprises: recognizing, according to the recognition manner corresponding to each character type, the characters belonging to that character type from the text sequence image, and recognizing the characters not belonging to that character type as general characters; and combining, in order, the characters recognized for each character type to obtain the text sequence corresponding to each character type.
In one embodiment, recognizing, according to the recognition manner corresponding to each character type, the characters belonging to that character type and the general characters not belonging to it from the text sequence image comprises: segmenting single-word images from the text sequence image; and performing character recognition on the single-word images through the machine learning model corresponding to each character type to obtain the characters belonging to that character type and general characters for those not belonging to it.
In one embodiment, segmenting the single-word images from the text sequence image comprises: selecting candidate segmentation points in the text sequence image along its long side, at intervals shorter than the short side of the text sequence image; obtaining a segmentation confidence for each candidate segmentation point; determining the segmentation points according to the segmentation confidences; and segmenting the single-word images from the text sequence image according to the determined segmentation points.
In one embodiment, the computer readable instructions further cause the processor to perform the steps of: acquiring a character image sample set; adding labels of corresponding characters for character images belonging to corresponding character types in a character image sample set according to the character types, and adding labels of general characters for character images not belonging to the corresponding character types in the character image sample set; and training the machine learning model corresponding to each character type according to the character images in the character image sample set and the labels added according to the character types.
In one embodiment, correcting the characters at the positions in the selected text sequence according to the acquired characters to obtain the recognition result comprises: when the number of acquired characters is consistent with the number of characters at the positions in the selected text sequence, replacing the characters at those positions one by one with the correspondingly ordered acquired characters; and when the number of acquired characters is inconsistent with the number of characters at the positions and the number of characters at the positions exceeds one, replacing the characters at those positions as a whole with the acquired characters.
With the storage medium above, after the text sequence image is acquired, character recognition is performed separately for each character type, obtaining the text sequence corresponding to each character type. When recognition is performed in the manner corresponding to a given character type, the characters in the text sequence image that do not belong to that type are recognized as general characters. A text sequence is then selected, the positions of the general characters in it are determined, and the characters at those positions are corrected using the characters at the same positions in the remaining text sequences, yielding the recognition result. Because the text sequence image is recognized separately per character type, the recognition accuracy for the characters of each type is ensured, and the various character types contained in the image can all be handled even when the text content is complex and varied; using the characters that each per-type recognition does identify to correct the characters at the corresponding positions in the other text sequences then produces the recognition result and improves text recognition accuracy.
In one embodiment, a computer device is provided that includes a memory and a processor, the memory having stored therein computer-readable instructions that, when executed by the processor, cause the processor to perform the steps of: acquiring a text sequence image; performing character recognition on the text sequence image according to the recognition manner corresponding to each character type, recognizing the characters in the text sequence image that do not belong to that character type as general characters, and obtaining the text sequence corresponding to each character type; selecting a text sequence from the text sequences corresponding to the character types; determining the positions of the general characters, which do not belong to the corresponding character type, in the selected text sequence; acquiring the characters belonging to the corresponding character types at those positions in the remaining, unselected text sequences; and correcting the characters at those positions in the selected text sequence according to the acquired characters to obtain a recognition result.
In one embodiment, acquiring a text sequence image includes: acquiring an image to be identified; binarizing the image to be identified to obtain a text image; extracting a text texture image from the text image; determining connected domains in the text texture image; and determining the text sequence image according to the connected domain.
In one embodiment, performing character recognition on the text sequence image according to the recognition manner corresponding to each character type to obtain the text sequence corresponding to each character type comprises: recognizing, according to the recognition manner corresponding to each character type, the characters belonging to that character type from the text sequence image, and recognizing the characters not belonging to that character type as general characters; and combining, in order, the characters recognized for each character type to obtain the text sequence corresponding to each character type.
In one embodiment, recognizing, according to the recognition manner corresponding to each character type, the characters belonging to that character type and the general characters not belonging to it from the text sequence image comprises: segmenting single-word images from the text sequence image; and performing character recognition on the single-word images through the machine learning model corresponding to each character type to obtain the characters belonging to that character type and general characters for those not belonging to it.
In one embodiment, segmenting the single-word images from the text sequence image comprises: selecting candidate segmentation points in the text sequence image along its long side, at intervals shorter than the short side of the text sequence image; obtaining a segmentation confidence for each candidate segmentation point; determining the segmentation points according to the segmentation confidences; and segmenting the single-word images from the text sequence image according to the determined segmentation points.
In one embodiment, the computer readable instructions further cause the processor to perform the steps of: acquiring a character image sample set; adding labels of corresponding characters for character images belonging to corresponding character types in a character image sample set according to the character types, and adding labels of general characters for character images not belonging to the corresponding character types in the character image sample set; and training the machine learning model corresponding to each character type according to the character images in the character image sample set and the labels added according to the character types.
In one embodiment, correcting the characters at the positions in the selected text sequence according to the acquired characters to obtain the recognition result comprises: when the number of acquired characters is consistent with the number of characters at the positions in the selected text sequence, replacing the characters at those positions one by one with the correspondingly ordered acquired characters; and when the number of acquired characters is inconsistent with the number of characters at the positions and the number of characters at the positions exceeds one, replacing the characters at those positions as a whole with the acquired characters.
After the computer device acquires the text sequence image, character recognition is performed separately for each character type, obtaining the text sequence corresponding to each character type. When recognition is performed in the manner corresponding to a given character type, the characters in the text sequence image that do not belong to that type are recognized as general characters. A text sequence is then selected, the positions of the general characters in it are determined, and the characters at those positions are corrected using the characters at the same positions in the remaining text sequences, yielding the recognition result. Because the text sequence image is recognized separately per character type, the recognition accuracy for the characters of each type is ensured, and the various character types contained in the image can all be handled even when the text content is complex and varied; using the characters that each per-type recognition does identify to correct the characters at the corresponding positions in the other text sequences then produces the recognition result and improves text recognition accuracy.
Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware. The program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or the like.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, any combination of them that contains no contradiction should be considered within the scope of this description.
The foregoing examples represent only a few embodiments of the invention and are described in specific detail, but this should not be construed as limiting the scope of the invention. It should be noted that those skilled in the art can make several variations and improvements without departing from the concept of the invention, all of which fall within the scope of the invention. Accordingly, the scope of protection of the invention shall be determined by the appended claims.

Claims (10)

1. A method of text recognition, the method comprising:
acquiring a text sequence image; selecting candidate segmentation points in the text sequence image along the long side of the text sequence image at intervals shorter than the short side of the text sequence image; obtaining a segmentation confidence of each candidate segmentation point; determining segmentation points according to the segmentation confidences; and segmenting single-word images from the text sequence image according to the determined segmentation points;
performing character recognition on the single-word images through the machine learning model corresponding to each character type respectively, to obtain characters belonging to the corresponding character type and universal characters not belonging to the corresponding character type, and combining the characters recognized for each character type in order, to obtain a text sequence corresponding to each character type;
selecting a text sequence from text sequences corresponding to each character type;
determining the position of a universal character which does not belong to the corresponding character type in the selected text sequence;
acquiring characters belonging to the corresponding character types at the positions in the remaining, unselected text sequences;
and correcting the characters at the positions in the selected text sequence according to the acquired characters to obtain a recognition result.
2. The method of claim 1, wherein the acquiring a text sequence image comprises:
acquiring an image to be identified;
performing binarization processing on the image to be identified to obtain a text image;
extracting a text texture image from the text image;
determining connected domains in the text texture image;
and determining a text sequence image according to the connected domain.
3. The method according to claim 1, wherein the method further comprises:
acquiring a character image sample set;
adding labels of corresponding characters for the character images belonging to the corresponding character types in the character image sample set according to the character types, and adding labels of general characters for the character images not belonging to the corresponding character types in the character image sample set;
and training the machine learning model corresponding to each character type according to the character images in the character image sample set and the labels added according to the character types.
4. A method according to any one of claims 1 to 3, wherein correcting the characters at the positions in the selected text sequence according to the acquired characters to obtain a recognition result comprises:
when the number of acquired characters is consistent with the number of characters at the positions in the selected text sequence, replacing the characters at the positions in the selected text sequence one by one with the correspondingly ordered characters among the acquired characters; and
when the number of acquired characters is inconsistent with the number of characters at the positions in the selected text sequence and the number of characters at the positions exceeds one, replacing the characters at the positions in the selected text sequence as a whole with the acquired characters.
5. A text recognition device, the device comprising:
the first acquisition module is used for acquiring the text sequence image;
the recognition module is used for selecting candidate segmentation points in the text sequence image along the long side of the text sequence image at intervals shorter than the short side of the text sequence image; obtaining a segmentation confidence of each candidate segmentation point; determining segmentation points according to the segmentation confidences; segmenting single-word images from the text sequence image according to the determined segmentation points; performing character recognition on the single-word images through the machine learning model corresponding to each character type respectively, to obtain characters belonging to the corresponding character type and universal characters not belonging to the corresponding character type; and combining the characters recognized for each character type in order, to obtain a text sequence corresponding to each character type;
The selecting module is used for selecting a text sequence from text sequences corresponding to each character type;
the determining module is used for determining the position of the universal character which does not belong to the corresponding character type in the selected text sequence;
the second acquisition module is used for acquiring characters belonging to the corresponding character types at the positions in the remaining, unselected text sequences;
and the correction module is used for correcting the characters at the positions in the selected text sequence according to the acquired characters to obtain a recognition result.
6. The apparatus of claim 5, wherein the first acquisition module is further configured to acquire an image to be identified; performing binarization processing on the image to be identified to obtain a text image; extracting a text texture image from the text image; determining connected domains in the text texture image; and determining a text sequence image according to the connected domain.
7. The apparatus of claim 5, wherein the apparatus further comprises:
the training module is used for acquiring a character image sample set; adding labels of corresponding characters for the character images belonging to the corresponding character types in the character image sample set according to the character types, and adding labels of general characters for the character images not belonging to the corresponding character types in the character image sample set; and training the machine learning model corresponding to each character type according to the character images in the character image sample set and the labels added according to the character types.
8. The apparatus of any one of claims 5 to 7, wherein the correction module is further configured to:
when the number of acquired characters is consistent with the number of characters at the positions in the selected text sequence, replace the characters at the positions in the selected text sequence one by one with the correspondingly ordered characters among the acquired characters; and
when the number of acquired characters is inconsistent with the number of characters at the positions in the selected text sequence and the number of characters at the positions exceeds one, replace the characters at the positions in the selected text sequence as a whole with the acquired characters.
9. One or more non-transitory computer-readable storage media storing computer-executable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the method of any of claims 1-4.
10. A computer device comprising a memory and a processor, the memory having stored therein computer readable instructions which, when executed by the processor, cause the processor to perform the steps of the method of any of claims 1 to 4.
CN201710687380.4A 2017-08-11 2017-08-11 Text recognition method, device, storage medium and computer equipment Active CN109389115B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710687380.4A CN109389115B (en) 2017-08-11 2017-08-11 Text recognition method, device, storage medium and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710687380.4A CN109389115B (en) 2017-08-11 2017-08-11 Text recognition method, device, storage medium and computer equipment

Publications (2)

Publication Number Publication Date
CN109389115A CN109389115A (en) 2019-02-26
CN109389115B true CN109389115B (en) 2023-05-23

Family

ID=65413997

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710687380.4A Active CN109389115B (en) 2017-08-11 2017-08-11 Text recognition method, device, storage medium and computer equipment

Country Status (1)

Country Link
CN (1) CN109389115B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110210469A (en) * 2019-05-31 2019-09-06 中科软科技股份有限公司 A kind of method and system identifying picture character languages
CN110674876A (en) * 2019-09-25 2020-01-10 北京猎户星空科技有限公司 Character detection method and device, electronic equipment and computer readable medium
CN110969161B (en) * 2019-12-02 2023-11-07 上海肇观电子科技有限公司 Image processing method, circuit, vision-impaired assisting device, electronic device, and medium
CN111339910B (en) * 2020-02-24 2023-11-28 支付宝实验室(新加坡)有限公司 Text processing and text classification model training method and device
CN111797922B (en) * 2020-07-03 2023-11-28 泰康保险集团股份有限公司 Text image classification method and device

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11272799A (en) * 1998-03-20 1999-10-08 Canon Inc Method and device for character recognition processing and storage medium
CN101777124A (en) * 2010-01-29 2010-07-14 北京新岸线网络技术有限公司 Method for extracting video text message and device thereof
CN102156865A (en) * 2010-12-14 2011-08-17 上海合合信息科技发展有限公司 Handwritten text line character segmentation method and identification method
CN102332096A (en) * 2011-10-17 2012-01-25 中国科学院自动化研究所 Video caption text extraction and identification method
WO2013097072A1 (en) * 2011-12-26 2013-07-04 华为技术有限公司 Method and apparatus for recognizing a character of a video
WO2014131339A1 (en) * 2013-02-26 2014-09-04 山东新北洋信息技术股份有限公司 Character identification method and character identification apparatus
CN104268603A (en) * 2014-09-16 2015-01-07 科大讯飞股份有限公司 Intelligent marking method and system for text objective questions
CN106056114A (en) * 2016-05-24 2016-10-26 腾讯科技(深圳)有限公司 Business card content identification method and business card content identification device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Fu Qiang; Ding Xiaoqing; Jiang Yan. Segmentation and Recognition of Chinese Handwritten Address Strings Based on Multi-Information Fusion. Journal of Electronics &amp; Information Technology. 2008, (12), full text. *
Yang Wuyi; Zhang Shuwu. An Integrated Segmentation and Recognition Algorithm for Characters in Video. Acta Automatica Sinica. 2010, (10), full text. *

Also Published As

Publication number Publication date
CN109389115A (en) 2019-02-26

Similar Documents

Publication Publication Date Title
CN109389115B (en) Text recognition method, device, storage medium and computer equipment
CA3027038C (en) Document field detection and parsing
Lu et al. Scene text extraction based on edges and support vector regression
CN106156766B (en) Method and device for generating text line classifier
US8744196B2 (en) Automatic recognition of images
CN107133622B (en) Word segmentation method and device
US8965127B2 (en) Method for segmenting text words in document images
US9053361B2 (en) Identifying regions of text to merge in a natural image or video frame
US8494273B2 (en) Adaptive optical character recognition on a document with distorted characters
WO2017202232A1 (en) Business card content identification method, electronic device and storage medium
US10643094B2 (en) Method for line and word segmentation for handwritten text images
Fabrizio et al. Text detection in street level images
JP5492205B2 (en) Segment print pages into articles
CN106203539B (en) Method and device for identifying container number
CN109447080B (en) Character recognition method and device
Chandio et al. Character classification and recognition for Urdu texts in natural scene images
Slavin Using special text points in the recognition of documents
He et al. Aggregating local context for accurate scene text detection
CN113158895A (en) Bill identification method and device, electronic equipment and storage medium
Wu et al. Contour restoration of text components for recognition in video/scene images
Šarić Scene text segmentation using low variation extremal regions and sorting based character grouping
CN110796145A (en) Multi-certificate segmentation association method based on intelligent decision and related equipment
Karanje et al. Survey on text detection, segmentation and recognition from a natural scene images
Lue et al. A novel character segmentation method for text images captured by cameras
Chatbri et al. An application-independent and segmentation-free approach for spotting queries in document images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant