CN116246278A

CN116246278A - Character recognition method and device, storage medium and electronic equipment

Info

Publication number: CN116246278A
Application number: CN202211637081.7A
Authority: CN
Inventors: 吴嘉嘉; 张建树; 蒋磊; 殷兵; 胡金水; 刘聪
Original assignee: iFlytek Co Ltd
Current assignee: iFlytek Co Ltd
Priority date: 2022-12-16
Filing date: 2022-12-16
Publication date: 2023-06-09

Abstract

The application provides a character recognition method, a character recognition device, a storage medium and electronic equipment, and relates to the technical field of character processing. The character recognition method comprises: disassembling a word to be recognized to obtain a radical sequence of the word to be recognized, wherein the radical sequence comprises at least one radical element, and the at least one radical element combines to form the word to be recognized; if the at least one radical element each has a corresponding writing template, determining feature data of each of the at least one radical element; determining feature data of the writing template corresponding to each of the at least one radical element; and determining an erroneous-word recognition result corresponding to the word to be recognized based on the feature data of each of the at least one radical element and the feature data of the corresponding writing template. With this scheme, not only wrong words but also misused words and correct words can be recognized, and, because the comparison is based on the feature data of the writing templates, the recognition accuracy for the word to be recognized is effectively improved.

Description

Character recognition method and device, storage medium and electronic equipment

Technical Field

The present application relates to the technical field of character processing, and in particular to a character recognition method and device, a storage medium, and electronic equipment.

Background

Students complete a large amount of handwritten homework. After the homework is finished, what the students wrote is compared with standard answers to judge whether the written words are correct, but this approach can generally recognize only misused words and cannot recognize wrong words.

In related wrong-word recognition methods, a wrong word is judged by acquiring the stroke structure of the word written by the student. Because the stroke structure is relatively simple, the recognition accuracy for wrong words is low, and the accuracy of the resulting judgment is low as well.

Disclosure of Invention

The present application has been made in order to solve the above technical problems. The embodiment of the application provides a character recognition method, a character recognition device, a storage medium and electronic equipment.

In a first aspect, an embodiment of the present application provides a text recognition method, including: disassembling a word to be recognized to obtain a radical sequence of the word to be recognized, wherein the radical sequence comprises at least one radical element, and the at least one radical element combines to form the word to be recognized; if the at least one radical element each has a corresponding writing template, determining feature data of each of the at least one radical element; determining feature data of the writing template corresponding to each of the at least one radical element; and determining an erroneous-word recognition result corresponding to the word to be recognized based on the feature data of each of the at least one radical element and the feature data of the writing template corresponding to each of the at least one radical element.

With reference to the first aspect, in some implementations of the first aspect, determining the erroneous-word recognition result corresponding to the word to be recognized based on the feature data of each of the at least one radical element and the feature data of the writing template corresponding to each of the at least one radical element includes: if the feature data of each radical element in the at least one radical element is consistent with the feature data of the writing template corresponding to that radical element, acquiring a stroke recognition sequence corresponding to each of the at least one radical element; and determining the erroneous-word recognition result corresponding to the word to be recognized based on the stroke recognition sequence corresponding to each of the at least one radical element.

With reference to the first aspect, in some implementations of the first aspect, determining the erroneous-word recognition result corresponding to the word to be recognized based on the stroke recognition sequence corresponding to each of the at least one radical element includes: acquiring a standard stroke sequence corresponding to each of the at least one radical element; if the stroke recognition sequence corresponding to each radical element in the at least one radical element is consistent with the standard stroke sequence corresponding to that radical element, acquiring M dictation words, where M is a positive integer; and determining the erroneous-word recognition result corresponding to the word to be recognized based on the M dictation words.

With reference to the first aspect, in some implementations of the first aspect, determining the erroneous-word recognition result corresponding to the word to be recognized based on the M dictation words includes: if the word to be recognized is the same as one of the M dictation words, determining that the recognition result of the word to be recognized is a correct word recognition result; and if the word to be recognized is different from all of the M dictation words, determining that the recognition result of the word to be recognized is a misused word recognition result.

With reference to the first aspect, in some implementations of the first aspect, determining the erroneous-word recognition result corresponding to the word to be recognized based on the stroke recognition sequence corresponding to each of the at least one radical element further includes: judging whether the at least one radical element contains a radical element whose stroke recognition sequence is inconsistent with its standard stroke sequence; and if so, determining that the recognition result corresponding to the word to be recognized is a wrong word recognition result.

With reference to the first aspect, in some implementations of the first aspect, determining the erroneous-word recognition result corresponding to the word to be recognized based on the feature data of each of the at least one radical element and the feature data of the writing template corresponding to each of the at least one radical element further includes: judging whether the at least one radical element contains a radical element whose feature data is inconsistent with the feature data of its writing template; and if so, determining that the recognition result corresponding to the word to be recognized is a wrong word recognition result.

With reference to the first aspect, in some implementations of the first aspect, disassembling the word to be recognized to obtain the radical sequence of the word to be recognized includes: disassembling the word to be recognized by using an encoder-decoder model to obtain the radical sequence of the word to be recognized, wherein the encoder-decoder model includes an attention mechanism capable of extracting features.

In a second aspect, an embodiment of the present application provides a text recognition device, including: a first determining module, configured to disassemble a word to be recognized to obtain a radical sequence of the word to be recognized, wherein the radical sequence comprises at least one radical element, and the at least one radical element combines to form the word to be recognized; a second determining module, configured to determine feature data of each of the at least one radical element if the at least one radical element each has a corresponding writing template; a third determining module, configured to determine feature data of the writing template corresponding to each of the at least one radical element; and a fourth determining module, configured to determine an erroneous-word recognition result corresponding to the word to be recognized based on the feature data of each of the at least one radical element and the feature data of the writing template corresponding to each of the at least one radical element.

In a third aspect, an embodiment of the present application provides a computer readable storage medium, where a computer program is stored, where the computer program is configured to perform the text recognition method according to the first aspect.

In a fourth aspect, an embodiment of the present application provides an electronic device, including: a processor; a memory for storing processor-executable instructions; the processor is configured to perform the text recognition method according to the first aspect.

The character recognition method provided by the application has the following beneficial effects.

First, the radical sequence characterizes the structural features of a word more accurately than the stroke sequence. Therefore, determining the recognition result of the word to be recognized on the basis of its radical sequence can improve the recognition accuracy. In addition, the radical sequence has lower complexity than the stroke sequence, which reduces the amount of calculation needed to determine the recognition result and thereby improves the recognition speed.

Secondly, after the radical sequence is obtained and it is judged that each radical element in the radical sequence has a corresponding writing template, the feature data of the radical elements and the feature data of the writing templates are further determined, and the erroneous-word recognition result of the word to be recognized is determined according to these two sets of feature data. That is, the writing template serves as the standard for recognizing wrong words, misused words and correct words, and the recognition results of the word to be recognized can be determined simply and accurately by comparing the similarity of the two sets of feature data, so the method is highly operable.

Drawings

The foregoing and other objects, features and advantages of the present application will become more apparent from the following more detailed description of embodiments of the present application, as illustrated in the accompanying drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the application and constitute a part of this specification; together with the embodiments they serve to explain the application and do not limit it. In the drawings, like reference numerals generally refer to like parts or steps.

Fig. 1 is a schematic diagram of an implementation environment of a text recognition method according to an embodiment of the present application.

Fig. 2 is a schematic diagram of an application scenario provided in an exemplary embodiment of the present application.

Fig. 3 is a flow chart illustrating a text recognition method according to an exemplary embodiment of the present application.

Fig. 4 is a flowchart illustrating a process of determining a recognition result corresponding to a word to be recognized according to another exemplary embodiment of the present application.

Fig. 5 is a schematic diagram illustrating modeling of radical sequences and stroke recognition sequences provided in an exemplary embodiment of the present application.

Fig. 6 is a schematic structural diagram of a text recognition device according to an exemplary embodiment of the present application.

Fig. 7 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present application.

Detailed Description

The following description of the technical solutions in the embodiments of the present application will be made clearly and completely with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.

Exemplary scenario

Fig. 1 is a schematic diagram of an implementation environment of a text recognition method according to an embodiment of the present application. As shown in Fig. 1, the implementation environment includes a user terminal 11 and a server 12. The user terminal 11 and the server 12 are communicatively connected, and an application for preprocessing the word to be recognized is installed in the user terminal 11. The server 12 may be an independent physical server, a server cluster composed of a plurality of servers, or a cloud server capable of cloud computing, and may be dedicated to a specific service (for example, word recognition or financial transactions). In addition, the server may be a physical machine or a virtual machine, and there may be one or more servers, which is not limited in the embodiments of the present application.

Specifically, the user terminal 11 may first obtain an image to be recognized that contains the word to be recognized, and detect the word to be recognized in the image by using a classical object detection scheme. Then, the word to be recognized is disassembled by the preprocessing application installed in the user terminal 11 to obtain the radical sequence of the word to be recognized. Further, in order to increase the processing speed, the user terminal 11 sends the radical sequence of the word to be recognized to the server 12. According to the received radical sequence, the server 12 searches a relevant database to check whether each radical element in the radical sequence has a writing template. If so, the writing template corresponding to each radical element is retrieved; a pre-trained feature comparison model deployed in the server 12 first determines the feature data of the radical element and the feature data of the writing template and then calculates the similarity between them; and finally the recognition result corresponding to the word to be recognized is determined according to the calculation result.

Fig. 2 is a schematic diagram of an application scenario provided in an exemplary embodiment of the present application, where the application scenario is a dictation homework grading scenario. Specifically, a student writes the words dictated by the teacher on homework paper. After the teacher obtains the homework paper, the homework paper is photographed with the user terminal 11, and the user terminal 11 detects the dictated words to be recognized on the homework paper and parses the radical sequence of each word to be recognized. Further, the user terminal 11 sends the radical sequences to the server 12, and the server 12 calculates the similarity between the feature data of each radical element of the word to be recognized and the feature data of the writing template by using the method described with reference to Fig. 1, and determines the recognition result of the word to be recognized according to the similarity. Finally, after all the dictated words on the homework paper have been graded, the server 12 sends the recognition results corresponding to each student's answers to the user terminal 11.

Exemplary method

Fig. 3 is a flow chart illustrating a text recognition method according to an exemplary embodiment of the present application. As shown in fig. 3, the text recognition method provided in the embodiment of the present application includes the following steps.

Step S310, disassembling the word to be recognized to obtain the radical sequence of the word to be recognized.

The radical sequence comprises at least one radical element, and at least one radical element is combined to form a word to be recognized. In addition, the radical sequence also includes positional information of the radical elements in the word to be recognized, i.e., each radical element in the radical sequence is also attached with an attribute feature of the positional information.

Specifically, radicals are the basic structural units of Chinese character forms and are the constituent parts of a compound character, so the word to be recognized can be disassembled according to the structure of its character form to obtain its radical sequence. Here, a compound character refers to a Chinese character consisting of two or more single characters. For example, a part of a word that can stand independently at the top, bottom, left or right is used as a radical element. For the word to be recognized "river" (江), the corresponding radical elements include the three-dot water radical (氵) and "worker" (工), and the radical sequence corresponding to "river" comprises the three-dot water radical together with its position information in "river", and "worker" together with its position information in "river". For the word to be recognized "handle" (把), the corresponding radical elements include the hand radical (扌) and "bar" (巴), and the radical sequence corresponding to "handle" comprises the hand radical together with its position information in "handle", and "bar" together with its position information in "handle".
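Illustratively, a radical sequence in which each radical element carries its position information can be represented as in the following minimal sketch. The class name `RadicalElement`, the helper `radical_sequence`, and the coarse position labels ("left", "right", and so on) are assumptions made for illustration; the application does not specify how position information is encoded.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class RadicalElement:
    """One radical element plus its position within the word (hypothetical encoding)."""
    name: str       # e.g. "three-dot water"
    position: str   # assumed coarse label: "left", "right", "top" or "bottom"


def radical_sequence(parts: List[Tuple[str, str]]) -> List[RadicalElement]:
    """Build a radical sequence from (name, position) pairs.

    The sequence must contain at least one radical element.
    """
    if not parts:
        raise ValueError("a radical sequence contains at least one radical element")
    return [RadicalElement(name, position) for name, position in parts]


# The "river" example from the description: three-dot water plus "worker".
river = radical_sequence([("three-dot water", "left"), ("worker", "right")])
```

A finer encoding, such as a bounding box per radical element, could replace the coarse labels without changing the rest of the flow.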

Step S320, if the at least one radical element each has a corresponding writing template, determining the feature data of each of the at least one radical element.

The writing template refers to a standard reference template corresponding to a radical element. The writing template may be in any font, for example the Song typeface or regular script.

Before the feature data of each of the at least one radical element is determined, the method further includes: acquiring the set of all radical elements included in the Chinese characters of a Chinese character library. If any radical element of the word to be recognized is not in the radical element set, the recognition result of the word to be recognized is considered to be a wrong word recognition result. If the radical elements of the word to be recognized are all in the radical element set, each of the at least one radical element has a corresponding writing template, and the feature data of each of the at least one radical element is then determined.

In another embodiment, if there is a radical element that has no corresponding writing template, the recognition result of the word to be recognized that contains this radical element is considered to be a wrong word recognition result.
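Illustratively, the template check described above can be sketched as a simple lookup. This is a sketch under assumptions: the function name `check_templates`, the dictionary-based template store, and the string label for the result are hypothetical and are not taken from the application.

```python
from typing import Dict, Iterable, Optional

WRONG_WORD = "wrong word"  # hypothetical label for the wrong word recognition result


def check_templates(radicals: Iterable[str],
                    templates: Dict[str, object]) -> Optional[str]:
    """Return WRONG_WORD if any radical element lacks a writing template.

    Returns None when every radical element has a template, meaning the
    flow continues to feature extraction and comparison.
    """
    for radical in radicals:
        if radical not in templates:
            return WRONG_WORD
    return None


# Toy template store keyed by radical element name (contents are placeholders).
toy_templates = {"three-dot water": "template-a", "worker": "template-b"}
```

With this store, a word whose radical elements are "three-dot water" and "worker" passes the check, while a word containing an unknown radical element is reported as a wrong word immediately.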

Step S330, determining the feature data of the writing template corresponding to each of the at least one radical element.

The feature data of the writing template corresponding to each of the at least one radical element may be determined by a feature extraction model, or may be determined according to a feature extraction algorithm.

Step S340, determining the erroneous-word recognition result corresponding to the word to be recognized based on the feature data of each of the at least one radical element and the feature data of the writing template corresponding to each of the at least one radical element.

Specifically, the erroneous-word recognition result includes a wrong word recognition result, a misused word recognition result, and a correct word recognition result. The similarity between the feature data corresponding to a radical element and the feature data corresponding to its writing template can be compared, and the recognition result corresponding to the word to be recognized is determined based on this similarity.

Illustratively, the feature data is a feature vector, and the similarity between a radical element and its writing template is determined by calculating the cosine similarity between the feature vector corresponding to the radical element and the feature vector corresponding to the writing template.
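As a minimal sketch of this comparison, the following function computes the cosine similarity of two feature vectors, that is, the dot product divided by the product of the vector norms. The function name and the plain-list vector representation are assumptions for illustration; the application does not specify how the feature vectors are stored.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)
```

A value close to 1 indicates that the radical element and the writing template point in nearly the same direction in feature space, while a value close to 0 indicates little similarity.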

In the embodiment of the application, first, the radical sequence characterizes the structural features of a word more accurately than the stroke sequence. Therefore, determining the recognition result of the word to be recognized on the basis of its radical sequence can improve the recognition accuracy. In addition, the radical sequence has lower complexity than the stroke sequence, which reduces the amount of calculation needed to determine the recognition result and thereby improves the recognition speed. Secondly, after the radical sequence is obtained and it is judged that each radical element in the radical sequence has a corresponding writing template, the feature data of the radical elements and the feature data of the writing templates are further determined, and the erroneous-word recognition result of the word to be recognized is determined according to these two sets of feature data. That is, the writing template serves as the standard for recognizing wrong words, misused words and correct words, and the recognition results of the word to be recognized can be determined simply and accurately by comparing the similarity of the two sets of feature data, so the method is highly operable.

Fig. 4 is a flowchart illustrating a process of determining a recognition result corresponding to a word to be recognized according to another exemplary embodiment of the present application. The embodiment shown in Fig. 4 extends the embodiment shown in Fig. 3. The following description focuses on the differences between the two embodiments, and the parts they have in common are not repeated.

As shown in Fig. 4, in the embodiment of the present application, determining the erroneous-word recognition result corresponding to the word to be recognized based on the feature data of each of the at least one radical element and the feature data of the writing template corresponding to each of the at least one radical element includes the following steps.

Step S410, judging whether the feature data of each radical element is consistent with the feature data of the writing template corresponding to that radical element.

Specifically, the method of the embodiment shown in Fig. 3 is used to judge whether the feature data of each radical element is consistent with the feature data of its writing template.

Illustratively, in actual application, if the judgment result in step S410 is no, that is, there is a radical element whose feature data is inconsistent with the feature data of its writing template, step S420 is executed; if the judgment result in step S410 is yes, that is, the feature data of each radical element is consistent with the feature data of the writing template corresponding to that radical element, step S430 and step S440 are executed.

Specifically, a consistency condition is preset. If the similarity value between the feature data of a radical element and the feature data of its writing template meets the consistency condition, the two sets of feature data are considered to be consistent. If the similarity value does not meet the consistency condition, the two sets of feature data are considered to be inconsistent.

Illustratively, the writing template and the radical element can be input into a comparison model. The comparison model extracts the feature data of the writing template and the feature data of the radical element, and then inputs both into a fully connected layer in the comparison model, so that the comparison model outputs a similarity value between 0 and 1.
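The final stage of such a comparison model can be sketched as follows, purely for illustration. The sketch assumes that the two feature vectors are concatenated, passed through a single fully connected layer, and squashed by a sigmoid so that the output lies between 0 and 1. The concatenation, the single layer, the sigmoid, and the parameter names `weights` and `bias` are all assumptions; the application only states that a fully connected layer produces a similarity value between 0 and 1.

```python
import math
from typing import Sequence


def fc_similarity(template_feat: Sequence[float],
                  radical_feat: Sequence[float],
                  weights: Sequence[float],
                  bias: float) -> float:
    """Map two feature vectors to a similarity score between 0 and 1.

    Assumed architecture: concatenate, apply one linear layer, apply a sigmoid.
    In practice the weights and bias would come from training.
    """
    x = list(template_feat) + list(radical_feat)
    if len(x) != len(weights):
        raise ValueError("weights must match the concatenated feature length")
    logit = sum(w * xi for w, xi in zip(weights, x)) + bias
    # Numerically stable sigmoid: never exponentiates a large positive number.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)
```

For example, with all weights and the bias set to zero the function returns 0.5, which corresponds to an untrained layer that expresses no preference.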

Illustratively, the consistency condition is that the cosine similarity between the feature data of the radical element and the feature data of the writing template is greater than 0.7. For example, if the actually calculated value is equal to 0.65, the value does not exceed 0.7, and the feature data of the radical element and the feature data of the writing template are considered to be inconsistent; if the calculated value is greater than 0.7, the two are considered to be consistent.
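Taking the stated condition literally (a similarity value greater than 0.7), the consistency check over all radical elements of a word can be sketched as below. The function names and the convention of returning -1 when no radical element fails are assumptions for illustration.

```python
from typing import Iterable

DEFAULT_THRESHOLD = 0.7  # example threshold given in the description


def meets_consistency_condition(similarity: float,
                                threshold: float = DEFAULT_THRESHOLD) -> bool:
    """True when the similarity value is strictly greater than the threshold."""
    return similarity > threshold


def first_inconsistent(similarities: Iterable[float],
                       threshold: float = DEFAULT_THRESHOLD) -> int:
    """Index of the first radical element that fails the condition, or -1 if none fails."""
    for index, value in enumerate(similarities):
        if not meets_consistency_condition(value, threshold):
            return index
    return -1
```

If `first_inconsistent` returns a non-negative index, the flow proceeds to step S420; if it returns -1, the flow proceeds to steps S430 and S440.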

Step S420, determining that the recognition result corresponding to the word to be recognized is a wrong word recognition result.

Specifically, a wrong word is a word that is itself written incorrectly. For example, if the three horizontal strokes of the Chinese character "Feng" are written as four horizontal strokes, the written form is a nonexistent character, namely a wrong word.

Further, the feature data of a radical element reflects, as a whole, the written structural features of the radical element in a low-dimensional or high-dimensional space. If the feature data of a radical element is inconsistent with the feature data of its writing template, the recognition result of the word to be recognized is considered to be a wrong word recognition result.

Step S430, acquiring the stroke recognition sequence corresponding to each of the at least one radical element.

Illustratively, the strokes may be horizontal, vertical, left-falling, right-falling, hook, vertical-rising, horizontal-turning, and the like. Like the radical sequence, the stroke recognition sequence includes the strokes and the position data of each stroke within the whole radical element. For example, for the radical element "day", the corresponding stroke sequence includes strokes such as vertical and horizontal, together with the position information of these strokes in "day", from which the relative position between every two strokes can be determined.

Step S440, determining a recognition result corresponding to the word to be recognized based on the stroke recognition sequences corresponding to the at least one radical element.

Specifically, first, the standard stroke sequence corresponding to each of the at least one radical element is acquired. If the at least one radical element contains a radical element whose stroke recognition sequence is inconsistent with its standard stroke sequence, the recognition result corresponding to the word to be recognized is determined to be a wrong word recognition result.

Further, the standard stroke sequence refers to a correct stroke writing template corresponding to the stroke recognition sequence. The disagreement of the stroke recognition sequence and the standard stroke sequence means that: at least one stroke in the stroke recognition sequence is inconsistent with a stroke in the standard stroke sequence, and/or at least one stroke in the stroke recognition sequence is inconsistent with the position information of a corresponding stroke in the standard stroke sequence.

Taking the example in step S430, for the radical element "day", the third stroke "horizontal" in the stroke recognition sequence of "day" is determined first. The position data of the third stroke "horizontal" within "day" in the word to be recognized is a, and the position data of the third stroke "horizontal" within "day" in the standard stroke sequence corresponding to "day" is b. If the difference between the position data a and the position data b is greater than a preset threshold, the stroke recognition sequence of the radical element "day" of the word to be recognized is considered to be inconsistent with the standard stroke sequence, and the recognition result of the word to be recognized that contains the radical element "day" is determined to be a wrong word recognition result.
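Illustratively, the stroke-level comparison can be sketched as follows. This is a minimal sketch under assumptions: each stroke's position is encoded as a normalized (x, y) point within the radical element, the difference between positions is measured as Euclidean distance, a differing number of strokes is treated as an inconsistency, and the threshold value 0.15 is arbitrary. None of these choices is specified in the application.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Stroke:
    kind: str                       # e.g. "horizontal", "vertical"
    position: Tuple[float, float]   # assumed: normalized (x, y) within the radical element


def strokes_consistent(recognized: List[Stroke],
                       standard: List[Stroke],
                       max_offset: float = 0.15) -> bool:
    """Compare a stroke recognition sequence with the standard stroke sequence.

    The sequences are inconsistent if the stroke counts differ, if any stroke
    kind differs, or if any stroke position deviates by more than max_offset.
    """
    if len(recognized) != len(standard):
        return False
    for written, reference in zip(recognized, standard):
        if written.kind != reference.kind:
            return False
        if math.dist(written.position, reference.position) > max_offset:
            return False
    return True
```

In the "day" example, a corresponds to `written.position` of the third stroke and b to `reference.position`; if their distance exceeds the threshold, the function returns False and the word is treated as a wrong word.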

In another possible implementation, if, in the at least one radical element, the stroke recognition sequence corresponding to each radical element is consistent with the standard stroke sequence corresponding to that radical element, M dictation words are acquired, where M is a positive integer.

Specifically, the fact that the stroke recognition sequence corresponding to the radical element is consistent with the standard stroke sequence corresponding to the radical element means that: all strokes present in the stroke recognition sequence are consistent with the strokes in the standard stroke sequence, and the positional information of each stroke in the stroke recognition sequence is consistent with the positional information of the corresponding stroke in the standard stroke sequence.

Further, if it is determined that the stroke recognition sequence corresponding to each radical element is consistent with the standard stroke sequence corresponding to that radical element, the M dictation words are acquired. The M dictation words may be dictated by a person or by a system.

If the word to be recognized is the same as one of the M dictation words, the recognition result of the word to be recognized is determined to be a correct word recognition result; if the word to be recognized is different from all of the M dictation words, the recognition result of the word to be recognized is determined to be a misused word recognition result.

That is, in the case where the stroke recognition sequence corresponding to each radical element is consistent with the standard stroke sequence corresponding to that radical element, it can be further determined whether or not the word to be recognized is a dictated word. A misused word is a word that is itself written without error but is the wrong word for the term or sentence in which it appears, as in a miswritten form of "protection dart".

For example, the previously determined radical sequence and stroke recognition sequence of the word to be recognized may be compared with the radical sequence and stroke recognition sequence of each dictation word to determine whether the word to be recognized is identical to one of the M dictation words. Alternatively, the deep feature data of the word to be recognized and the deep feature data of the M dictation words can be extracted directly and compared to determine whether the word to be recognized is a dictated word.

Illustratively, the M dictation words include "decline", "weekend", "angle", "wait for kill" and the like. If the word to be recognized is "seat", and the stroke recognition sequence corresponding to each radical element in "seat" is consistent with the standard stroke sequence corresponding to that radical element, but "seat" is not found among the dictation words, the word to be recognized "seat" is a misused word. On the other hand, if the word to be recognized is "sitting" and it matches one of the dictation words, it may be determined that "sitting" is a correct word.

In the embodiment of the application, a preliminary judgment of the recognition result of the word to be recognized is made through the radical sequence. On the basis of the radical sequence, a deeper judgment of the recognition result is made through the stroke recognition sequences. On the one hand, this double judgment through the radical sequence and the stroke recognition sequences allows the recognition result of the word to be recognized to be determined accurately and efficiently. On the other hand, comparing the word to be recognized with the M dictation words allows wrong words, misused words (correctly formed words that are not the intended ones) and correct words to be distinguished in the dictation scene.
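The three-stage judgment described above can be sketched roughly as follows. This is a minimal Python illustration, not the application's implementation: the names `RadicalObservation` and `judge_word`, the result labels, and the representation of the dictation words as a set of strings are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Sequence, Set


@dataclass
class RadicalObservation:
    """Hypothetical per-radical record produced by earlier steps."""
    name: str
    matches_template: bool            # result of the writing-template consistency check
    recognized_strokes: Sequence[str]  # stroke recognition sequence
    standard_strokes: Sequence[str]    # standard stroke sequence


def judge_word(word: str,
               radicals: List[RadicalObservation],
               dictation_words: Set[str]) -> str:
    # Stage 1: any radical inconsistent with its writing template -> wrong word.
    if any(not r.matches_template for r in radicals):
        return "wrong word"
    # Stage 2: any radical whose strokes differ from the standard -> wrong word.
    if any(list(r.recognized_strokes) != list(r.standard_strokes) for r in radicals):
        return "wrong word"
    # Stage 3: a well-formed word is correct only if it is a dictation word.
    return "correct word" if word in dictation_words else "misused word"
```

The stages are ordered from coarse to fine, so a word that fails the template check is never passed on to the stroke comparison or the dictation lookup.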

In an embodiment of the present application, disassembling the word to be recognized to obtain the radical sequence of the word to be recognized includes: disassembling the word to be recognized by using an encoder-decoder model to obtain the radical sequence of the word to be recognized, wherein the encoder-decoder model includes an attention mechanism capable of extracting features.

Specifically, an encoder-decoder model including an attention mechanism is first trained. The model is capable of hierarchically modeling the radical sequence of the word to be recognized and, in addition, of hierarchically modeling the stroke sequence contained in each radical element of the radical sequence.

Illustratively, an image containing the word to be recognized, or the body of the word to be recognized, is input to the encoder-decoder model, which decodes the radical sequence of the word to be recognized; this is the first level. On the basis of the first level, when the feature data of a radical element is determined to be consistent with the feature data of its writing template, the stroke recognition sequence of the radical element is then decoded; this is the second level. In the training phase of the encoder-decoder model, in order to improve the accuracy of the model, the training data set includes not only correctly written Chinese characters but also incorrectly written Chinese characters.

In the embodiment of the application, the encoder-decoder model extracts the radical sequence at the first level and the stroke recognition sequence at the second level, and the extraction results of the two levels characterize the structural features of the word to be recognized accurately and comprehensively in different dimensions. In addition, joint training on correctly written samples and incorrectly written samples improves the robustness of the encoder-decoder model.

FIG. 5 is a schematic diagram illustrating modeling of radical sequences and stroke recognition sequences provided in an exemplary embodiment of the present application. As shown in fig. 5, for the word "f" to be recognized, an image containing the word "f", or the body of the word "f", is first input into the encoder-decoder model, which uses the attention mechanism to decode the radical sequence of "f", comprising the radical element "two" and the radical element "person". The feature data of the radical element "person" and the feature data of the writing template corresponding to the radical element "person" are then input into a binary classifier, which judges whether the feature data of the radical element "person" of the word to be recognized is consistent with the feature data of the writing template.

Illustratively, the classifier outputs 0 or 1 according to the feature data of the input radical element and the feature data of the writing template of the radical element. Wherein 0 represents that the radical element is inconsistent with the writing template corresponding to the radical element; and 1 represents that the component element is consistent with the writing template corresponding to the component element.
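The 0/1 output described above can be illustrated with a small stand-in. The text does not specify how the classifier is built, so this minimal Python sketch substitutes a cosine-similarity threshold for the trained classifier; the function names, the threshold value 0.9, and the plain-list feature vectors are assumptions made only for illustration.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def template_consistency(radical_features: Sequence[float],
                         template_features: Sequence[float],
                         threshold: float = 0.9) -> int:
    """Return 1 if the radical is judged consistent with its template, else 0.

    Stand-in for the trained classifier: the threshold is a hypothetical value.
    """
    similarity = cosine_similarity(radical_features, template_features)
    return 1 if similarity >= threshold else 0
```

A learned classifier would replace the fixed threshold with a decision boundary fitted on correctly and incorrectly written samples; only the 0/1 output convention is taken from the text.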

Further, when the feature data of the two radical elements are consistent with the feature data of their writing templates, the stroke sequence of the radical element "two" is decoded as "horizontal, horizontal", together with the position information corresponding to each of the two horizontal strokes; the stroke sequence of the radical element "person" is decoded as "left-falling, right-falling", together with the position information corresponding to each of those strokes.
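One way to picture the two-level result for the FIG. 5 example is a nested structure of radicals and strokes. The Python sketch below is only illustrative: the class names, the English stroke labels ("horizontal", "left-falling", "right-falling"), the bounding-box form of the position information, and the coordinate values are all assumptions, since the text does not define how positions are encoded.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Stroke:
    name: str
    # Hypothetical position encoding: (x_min, y_min, x_max, y_max) in image units.
    box: Tuple[float, float, float, float]


@dataclass(frozen=True)
class Radical:
    name: str
    strokes: Tuple[Stroke, ...]


# Placeholder coordinates; they are not taken from the application.
decoded_word: List[Radical] = [
    Radical("two", (
        Stroke("horizontal", (0.2, 0.2, 0.8, 0.25)),
        Stroke("horizontal", (0.1, 0.5, 0.9, 0.55)),
    )),
    Radical("person", (
        Stroke("left-falling", (0.1, 0.1, 0.5, 0.9)),
        Stroke("right-falling", (0.5, 0.5, 0.9, 0.9)),
    )),
]


def radical_sequence(word: List[Radical]) -> List[str]:
    """First level: the names of the radical elements, in decoding order."""
    return [radical.name for radical in word]


def stroke_sequence(radical: Radical) -> List[str]:
    """Second level: the stroke names of one radical element."""
    return [stroke.name for stroke in radical.strokes]
```

Keeping the strokes nested under their radical mirrors the hierarchy in the text: the second level is only meaningful once the first level has been decoded.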

The method embodiments of the text recognition method of the present application have been described in detail above with reference to fig. 3 to 5; the device embodiments of the text recognition device of the present application are described in detail below with reference to fig. 6. It should be understood that the description of the method embodiments corresponds to the description of the device embodiments, and thus, for parts not described in detail, reference may be made to the previous method embodiments.

Fig. 6 is a schematic structural diagram of a text recognition device according to an exemplary embodiment of the present application. As shown in fig. 6, the text recognition device provided in the embodiment of the present application includes:

a first determining module 610, configured to disassemble a word to be identified to obtain a radical sequence of the word to be identified, where the radical sequence includes at least one radical element, and at least one radical element is combined to form the word to be identified;

a second determining module 620, configured to determine respective feature data of at least one component element if the at least one component element corresponds to a writing template;

a third determining module 630, configured to determine feature data of a writing template corresponding to each of the at least one component element;

and a fourth determining module 640, configured to determine a wrongly written word recognition result corresponding to the word to be recognized based on the feature data of each of the at least one radical element and the feature data of the writing template corresponding to each of the at least one radical element.
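The division of work among the four modules can be sketched as a small composition of callables. This Python outline is an assumption-laden illustration rather than the device itself: the class name `TextRecognitionDevice`, the field names, and the callable signatures are hypothetical, and the condition that each radical element has a writing template is assumed to hold.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Features = Sequence[float]


@dataclass
class TextRecognitionDevice:
    # Module 610: word (or word image) -> radical sequence.
    disassemble: Callable[[object], List[str]]
    # Module 620: (word, radical) -> feature data of that radical as written.
    radical_features: Callable[[object, str], Features]
    # Module 630: radical -> feature data of its writing template.
    template_features: Callable[[str], Features]
    # Module 640: (observed features, template features) -> recognition result.
    decide: Callable[[List[Features], List[Features]], str]

    def recognize(self, word: object) -> str:
        radicals = self.disassemble(word)
        observed = [self.radical_features(word, r) for r in radicals]
        templates = [self.template_features(r) for r in radicals]
        return self.decide(observed, templates)
```

Passing the modules in as callables keeps the sketch independent of any particular model, so an encoder-decoder or a classifier can be substituted without changing the wiring.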

In an embodiment of the present application, the fourth determining module 640 is further configured to obtain a stroke recognition sequence corresponding to each of the at least one radical element if, in the at least one radical element, the feature data of each radical element is consistent with the feature data of the writing template corresponding to that radical element; and determine the wrongly written word recognition result corresponding to the word to be recognized based on the stroke recognition sequence corresponding to each of the at least one radical element.

In an embodiment of the present application, the fourth determining module 640 is further configured to obtain a standard stroke sequence corresponding to each of the at least one component element; if the stroke recognition sequence corresponding to each component element in at least one component element is consistent with the standard stroke sequence corresponding to the component element, M dictation words are obtained, and M is a positive integer; and determining the recognition result of the wrongly written word corresponding to the word to be recognized based on the M dictation words.

In an embodiment of the present application, the fourth determining module 640 is further configured to determine that the word to be recognized is a correct word recognition result if the word to be recognized is the same as one of the M dictation words; and determine that the word to be recognized is a misused word recognition result (a correctly formed word that is not the intended one) if the word to be recognized is different from all of the M dictation words.

In an embodiment of the present application, the fourth determining module 640 is further configured to determine whether the at least one radical element includes a radical element whose stroke recognition sequence is inconsistent with its standard stroke sequence; and, if such a radical element exists, determine that the word to be recognized is a wrong word recognition result.

In an embodiment of the present application, the fourth determining module 640 is further configured to determine whether the at least one radical element includes a radical element whose feature data is inconsistent with the feature data of its writing template; and, if such a radical element exists, determine that the word to be recognized is a wrong word recognition result.

In an embodiment of the present application, the fourth determining module 640 is further configured to disassemble the word to be identified by using a codec model, where the codec model includes an attention mechanism capable of extracting features, to obtain the radical sequence of the word to be identified.

Next, an electronic device according to an embodiment of the present application is described with reference to fig. 7. Fig. 7 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present application.

As shown in fig. 7, the electronic device 70 includes one or more processors 701 and memory 702.

The processor 701 may be a Central Processing Unit (CPU) or other form of processing unit having data processing and/or instruction execution capabilities, and may control other components in the electronic device 70 to perform the desired functions.

Memory 702 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory (cache), and the like. The non-volatile memory may include, for example, Read-Only Memory (ROM), hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 701 to implement the text recognition methods and/or other desired functions of the various embodiments of the present application described above. Various contents such as text to be recognized, a radical sequence, feature data corresponding to radical elements, feature data corresponding to writing templates, recognition results, and the like may also be stored in the computer-readable storage medium.

In one example, the electronic device 70 may further include: input device 703 and output device 704, which are interconnected by a bus system and/or other form of connection mechanism (not shown).

The input device 703 may include, for example, a keyboard, a mouse, and the like.

The output device 704 may output various information to the outside, including characters to be recognized, component sequences, feature data corresponding to component elements, feature data corresponding to writing templates, recognition results, and the like. The output device 704 may include, for example, a display, speakers, a printer, and a communication network and remote output apparatus connected thereto, etc.

Of course, for simplicity, only some of the components of the electronic device 70 that are relevant to the present application are shown in fig. 7; components such as buses, input/output interfaces, etc. are omitted. In addition, the electronic device 70 may include any other suitable components depending on the particular application.

In addition to the methods and apparatus described above, embodiments of the present application may also be a computer program product comprising computer program instructions which, when executed by a processor, cause the processor to perform the steps in the word recognition method according to various embodiments of the present application described above in the present specification.

Program code of the computer program product for performing the operations of embodiments of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, C++ or the like and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's computing device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.

Furthermore, embodiments of the present application may also be a computer-readable storage medium, having stored thereon computer program instructions, which when executed by a processor, cause the processor to perform the steps in the text recognition method according to various embodiments of the present application described above in the present specification.

The computer-readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include the following: an electrical connection having one or more wires, a portable disk, a hard disk, Random Access Memory (RAM), Read-Only Memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

The basic principles of the present application have been described above in connection with specific embodiments, however, it should be noted that the advantages, benefits, effects, etc. mentioned in the present application are merely examples and not limiting, and these advantages, benefits, effects, etc. are not to be considered as necessarily possessed by the various embodiments of the present application. Furthermore, the specific details disclosed herein are for purposes of illustration and understanding only, and are not intended to be limiting, as the application is not intended to be limited to the details disclosed herein as such.

The block diagrams of the devices, apparatuses, equipment and systems referred to in this application are only illustrative examples and are not intended to require or imply that the connections, arrangements and configurations must be made in the manner shown in the block diagrams. As will be appreciated by one of skill in the art, the devices, apparatuses, equipment and systems may be connected, arranged and configured in any manner. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including but not limited to," and are used interchangeably therewith. The terms "or" and "and" as used herein refer to, and are used interchangeably with, the term "and/or" unless the context clearly indicates otherwise. The term "such as" as used herein refers to, and is used interchangeably with, the phrase "such as, but not limited to".

It is also noted that in the apparatus, devices and methods of the present application, the components or steps may be decomposed and/or recombined. Such decomposition and/or recombination should be considered as equivalent to the present application.

The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the application. Thus, the present application is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit the embodiments of the application to the form disclosed herein. Although a number of example aspects and embodiments have been discussed above, a person of ordinary skill in the art will recognize certain variations, modifications, alterations, additions, and subcombinations thereof.

Claims

1. A method of text recognition, comprising:

disassembling a word to be identified to obtain a radical sequence of the word to be identified, wherein the radical sequence comprises at least one radical element, and the at least one radical element is combined to form the word to be identified;

if the at least one component element corresponds to the writing template, determining the characteristic data of the at least one component element;

determining characteristic data of a writing template corresponding to each of the at least one component element;

and determining the wrongly written word recognition result corresponding to the word to be recognized based on the characteristic data of each of the at least one component element and the characteristic data of the writing template corresponding to each of the at least one component element.

2. The text recognition method according to claim 1, wherein the determining the wrongly written word recognition result corresponding to the word to be recognized based on the feature data of each of the at least one radical element and the feature data of the writing template corresponding to each of the at least one radical element includes:

if the characteristic data of each component element in the at least one component element is consistent with the characteristic data of the writing template corresponding to the component element, acquiring a stroke recognition sequence corresponding to each component element;

and determining the wrongly written word recognition result corresponding to the word to be recognized based on the stroke recognition sequence corresponding to each of the at least one radical element.

3. The text recognition method according to claim 2, wherein the determining the wrongly written word recognition result corresponding to the word to be recognized based on the stroke recognition sequence corresponding to each of the at least one radical element comprises:

acquiring standard stroke sequences corresponding to the at least one component element respectively;

if the stroke recognition sequence corresponding to each component element in the at least one component element is consistent with the standard stroke sequence corresponding to the component element, M dictation words are obtained, and M is a positive integer;

and determining the wrongly written word recognition result corresponding to the word to be recognized based on the M dictation words.

4. The text recognition method of claim 3, wherein the determining, based on the M dictation words, the wrongly written word recognition result corresponding to the word to be recognized includes:

if the word to be identified is the same as one of the M dictation words, determining that the word to be identified is a correct word identification result;

and if the word to be recognized is different from all of the M dictation words, determining that the word to be recognized is a misused word recognition result.

5. The text recognition method according to claim 3, wherein the determining the wrongly written word recognition result corresponding to the word to be recognized based on the stroke recognition sequence corresponding to each of the at least one radical element further comprises:

judging whether the at least one radical element includes a radical element whose stroke recognition sequence is inconsistent with its standard stroke sequence;

and if the at least one radical element includes a radical element whose stroke recognition sequence is inconsistent with its standard stroke sequence, determining that the word to be recognized is a wrong word recognition result.

6. The text recognition method according to claim 2, wherein the determining the wrongly written word recognition result corresponding to the word to be recognized based on the feature data of each of the at least one radical element and the feature data of the writing template corresponding to each of the at least one radical element further comprises:

judging whether the at least one radical element includes a radical element whose feature data is inconsistent with the feature data of its writing template;

and if the at least one radical element includes a radical element whose feature data is inconsistent with the feature data of its writing template, determining that the word to be recognized is a wrong word recognition result.

7. The character recognition method according to any one of claims 1 to 6, wherein the disassembling the character to be recognized to obtain the radical sequence of the character to be recognized includes:

disassembling the word to be identified by using an encoder-decoder model to obtain a radical sequence of the word to be identified, wherein the encoder-decoder model comprises an attention mechanism capable of extracting features.

8. A character recognition device, comprising:

the first determining module is used for disassembling the word to be identified to obtain a radical sequence of the word to be identified, wherein the radical sequence comprises at least one radical element, and the at least one radical element is combined to form the word to be identified;

the second determining module is used for determining the characteristic data of each of the at least one component element if each of the at least one component element corresponds to a writing template;

a third determining module, configured to determine feature data of a writing template corresponding to each of the at least one component element;

and the fourth determining module is used for determining the wrongly written word recognition result corresponding to the word to be recognized based on the characteristic data of each of the at least one component element and the characteristic data of the writing template corresponding to each of the at least one component element.

9. A computer-readable storage medium, characterized in that the storage medium stores a computer program for executing the text recognition method according to any one of the preceding claims 1 to 7.

10. An electronic device, comprising:

a processor;

a memory for storing the processor-executable instructions;

the processor is configured to perform the text recognition method of any one of claims 1 to 7.