CN112863516A

CN112863516A - Text error correction method and system and electronic equipment

Info

Publication number: CN112863516A
Application number: CN202011641951.9A
Authority: CN
Inventors: 简仁贤; 佘昌宪; 李佳纯
Original assignee: Emotibot Technologies Ltd
Current assignee: Emotibot Technologies Ltd
Priority date: 2020-12-31
Filing date: 2020-12-31
Publication date: 2021-05-28

Abstract

The invention discloses a text error correction method, a text error correction system, and an electronic device. The method comprises the following steps: receiving a text to be corrected and obtaining its pinyin, and obtaining user words and their pinyin from a user lexicon; directly comparing the pinyin of the text to be corrected with the pinyin of each user word, and selecting an error correction word from the user lexicon according to a preset algorithm; and deriving the replacement word (the span to be replaced) in the text to be corrected by tracing back along the scoring path of the selected error correction word, replacing the replacement word with the error correction word, and obtaining the corrected text. The disclosed method and system do not depend on an entity word model, and can quickly compare every position of the text to find the position to be replaced.

Description

Text error correction method and system and electronic equipment

Technical Field

The invention relates to the technical field of intelligent recognition, and in particular to natural language processing technology.

Background

With the spread of deep learning, major breakthroughs have been made in computer vision, speech recognition, natural language processing, and related fields. Taking speech recognition as an example, recognition accuracy has now reached 97%. These advances keep widening the range of applications for speech recognition. Compared with other modes of human-computer interaction, voice interaction is closer to people's everyday habits and is more efficient. Speech recognition technology can be expected to be widely applied in fields such as smart homes, industrial production, communications, medical care, and autonomous driving. In actual voice interaction, however, the recognition error rate is still high because of factors such as nonstandard pronunciation by the user and noise. The prior art focuses on improving the accuracy of speech recognition itself but lacks a means of correcting errors in the recognition result. This has greatly hindered the adoption of voice interaction products.

In the prior art, errors in the recognition result are usually corrected by finding the entity words in the speech recognition text with a pre-trained entity word model and comparing them with a user lexicon by pinyin similarity. Such a method generally comprises the following steps: analyzing and preprocessing the text data obtained from speech conversion to get a sample data set; training an entity recognition model with the sample data; constructing an entity correction data set; and, for the text data produced by speech recognition, using the entity recognition model for prediction, entity verification, and so on.

Technical solutions of this type have the following drawback: before the speech recognition text can be corrected, an entity word model must be labeled and trained in advance, and different types of entity words may require additional training, so that both running time and accuracy are heavily constrained by the entity word model.

Disclosure of Invention

In order to solve the problems in the prior art, the invention provides a text error correction method, a text error correction system and electronic equipment.

In a first aspect, the present invention provides a text error correction method, including the following steps: acquiring a text to be corrected, identifying the pinyin of the text to be corrected, and extracting user words and their pinyin from a user lexicon; directly comparing the pinyin of the text to be corrected with the pinyin of each user word, and selecting an error correction word from the user lexicon according to a preset algorithm; and deriving the replacement word (the span to be replaced) in the text to be corrected by tracing back along the scoring path of the selected error correction word, replacing the replacement word with the error correction word, and obtaining the corrected text.

With reference to the embodiment of the first aspect, in a possible implementation manner, the text to be corrected is a speech recognition text.

With reference to the embodiment of the first aspect, in a possible implementation manner, the preset algorithm includes a pronunciation score algorithm, which comprises: calculating a pronunciation similarity score for every pair of pinyin syllables between the pinyin of the text to be corrected and the pinyin of each user word; accumulating these pronunciation similarity scores with a Longest Common Subsequence (LCS) algorithm to obtain an accumulated pronunciation score for each user word, and recording the scoring path of the pronunciation score; and dividing the accumulated pronunciation score of each user word by the number of pinyin syllables of that user word to obtain its pronunciation score, and selecting the user word with the highest pronunciation score as the error correction word.

With reference to the embodiment of the first aspect, in a possible implementation manner, the preset algorithm includes a pinyin score algorithm and a pronunciation score algorithm. The pinyin score algorithm comprises: calculating the LCS of the pinyin of the text to be corrected and the pinyin of each user word to obtain an accumulated pinyin score for each user word, dividing the accumulated pinyin score by the number of pinyin syllables of the user word to obtain its pinyin score, and screening out the user words whose pinyin scores rank in the top K as candidate words, where K is less than or equal to the total number of user words. The pronunciation score algorithm comprises: calculating a pronunciation similarity score for every pair of pinyin syllables between the pinyin of the text to be corrected and the pinyin of each candidate word; accumulating these pronunciation similarity scores with a Longest Common Subsequence (LCS) algorithm to obtain an accumulated pronunciation score for each candidate word, and recording the scoring path of the pronunciation score; and dividing the accumulated pronunciation score of each candidate word by the number of pinyin syllables of that candidate word to obtain its pronunciation score, and selecting the candidate word with the highest pronunciation score as the error correction word.

With reference to the embodiment of the first aspect, in a possible implementation manner, if several candidate words have the same pronunciation score, the error correction word is selected by applying the following criteria in order: the candidate word with the highest pinyin score is the error correction word; the candidate word with the same number of words as the text to be corrected and the maximum number of words is the error correction word; the candidate word whose length is closest to the word count of the text to be corrected is the error correction word; and a candidate word is selected at random as the error correction word.

With reference to the first aspect, in a possible implementation manner, the pronunciation similarity score of the pinyin is obtained directly from a pronunciation similarity algorithm, where the pronunciation similarity algorithm includes the DIMSIM algorithm.

With reference to the first aspect, in a possible implementation manner, the pronunciation score of the error correction word is compared with a preset threshold: if the pronunciation score of the error correction word is lower than the preset threshold, the step of replacing the replacement word with the error correction word is not executed, and the text to be corrected is output as is; if the pronunciation score of the error correction word is higher than or equal to the preset threshold, the replacement word is replaced with the error correction word and the corrected text is output.
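One way to read the two scores described above is as a single recurrence. The notation is introduced here for illustration and does not appear in the original text: t_1 … t_n are the pinyin syllables of the text to be corrected, w_1 … w_m are those of one user word, sim is a similarity between two syllables, and S is the accumulated score. The exact accumulation rule is an assumption about how the LCS algorithm is applied.

```latex
\begin{aligned}
S(i,0) &= S(0,j) = 0, \\
S(i,j) &= \max\bigl\{\, S(i-1,j),\; S(i,j-1),\; S(i-1,j-1) + \operatorname{sim}(t_i, w_j) \,\bigr\}, \\
\operatorname{score}(w) &= \frac{S(n,m)}{m}.
\end{aligned}
```

Under this reading, taking sim(t_i, w_j) = 1 for identical syllables and 0 otherwise reduces S(n, m) to the ordinary LCS length and gives the pinyin score, while taking sim to be a pronunciation similarity between 0 and 1 gives the pronunciation score. Each syllable w_j contributes at most 1, so both scores lie between 0 and 1.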

With reference to the embodiment of the first aspect, in a possible implementation manner, a value range of the preset threshold is 0.5 to 1.

In a second aspect, the present invention provides a text error correction system comprising the following modules: a speech recognition module, configured to receive the user's voice input and obtain the speech recognition text and the pinyin of the text to be corrected; a user lexicon storage module, configured to store and extract user words and their pinyin, and to count the user word pinyin and the user words; a pronunciation score calculation module, configured to calculate the pronunciation score of the pinyin of each candidate word and the scoring path of the accumulated pronunciation score; and a text replacement module, configured to perform text replacement.

With reference to the second aspect, in a possible implementation manner, the system further includes a pinyin score calculation module, configured to calculate a pinyin score for each user word.

In a third aspect, the present invention provides an electronic device comprising a memory and a processor, the memory and the processor being connected; the memory is used for storing programs; the processor calls a program stored in the memory to perform the method of the first aspect embodiment and/or any possible implementation manner of the first aspect embodiment.

The beneficial effects of the invention include: the method does not depend on an entity word model, and no entity word model needs to be labeled and trained in advance, which reduces workload and improves working efficiency and accuracy; every position of the text can be compared quickly, and the position to be replaced and the best-matching error correction word are found at the same time, which shortens running time and improves accuracy; and the invention considers not only the literal similarity of the pinyin but also the similarity of pronunciation, which further improves accuracy.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate exemplary embodiments of the invention and, together with the description, serve to explain the invention; they are not intended to limit it. In the drawings:

FIG. 1 is a flow chart of a text error correction method according to the present invention;

FIG. 2 is a chart of the cumulative pronunciation scores for "the license plate number is Ku ABC966" and "Hu ADD966";

FIG. 3 is a chart of the cumulative pronunciation score path for "the license plate number is Ku ABC966" and "Hu ADD966";

FIG. 4 is a chart of the cumulative pronunciation scores for "call a car to Weijia (tail) Hospital, OK?" and "Weijia (micro) Hospital";

FIG. 5 is a chart of the cumulative pronunciation score path for "call a car to Weijia (tail) Hospital, OK?" and "Weijia (micro) Hospital";

FIG. 6 is a diagram of a text correction system according to the present invention.

Detailed Description

In order to make the technical problems to be solved, the technical solutions, and the technical effects of the present invention clearer, the technical solutions of the embodiments of the present invention are described in detail below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person skilled in the art based on the embodiments of the present invention without any inventive step are also within the scope of the present invention.

The embodiments of the invention overcome the drawback of the prior art that an entity word model must be labeled and trained in advance before the speech recognition text is corrected, and that different types of entity words may require additional training, so that running time and accuracy are heavily constrained by the entity word model.

The text error correction method provided by the invention comprises the following steps: acquiring a text to be corrected, identifying the pinyin of the text to be corrected, and extracting user words and their pinyin from a user lexicon; directly comparing the pinyin of the text to be corrected with the pinyin of each user word, and selecting an error correction word from the user lexicon according to a preset algorithm; and deriving the replacement word (the span to be replaced) in the text to be corrected by tracing back along the scoring path of the selected error correction word, replacing the replacement word with the error correction word, and obtaining the corrected text.

Further, the text to be corrected is a speech recognition text.

Further, the preset algorithm may adopt any one of the following two schemes:

First, the preset algorithm includes a pronunciation score algorithm, which comprises: calculating a pronunciation similarity score for every pair of pinyin syllables between the pinyin of the text to be corrected and the pinyin of each user word; accumulating these pronunciation similarity scores with a Longest Common Subsequence (LCS) algorithm to obtain an accumulated pronunciation score for each user word, and recording the scoring path of the pronunciation score; and dividing the accumulated pronunciation score of each user word by the number of pinyin syllables of that user word to obtain its pronunciation score, and selecting the user word with the highest pronunciation score as the error correction word.

Second, in order to further improve error correction efficiency, the preset algorithm includes a pinyin score algorithm and a pronunciation score algorithm. The pinyin score algorithm comprises: calculating the LCS of the pinyin of the text to be corrected and the pinyin of each user word to obtain an accumulated pinyin score for each user word, dividing the accumulated pinyin score by the number of pinyin syllables of the user word to obtain its pinyin score, and screening out the user words whose pinyin scores rank in the top K as candidate words, where K is less than or equal to the total number of user words. The pronunciation score algorithm comprises: calculating a pronunciation similarity score for every pair of pinyin syllables between the pinyin of the text to be corrected and the pinyin of each candidate word; accumulating these pronunciation similarity scores with a Longest Common Subsequence (LCS) algorithm to obtain an accumulated pronunciation score for each candidate word, and recording the scoring path of the pronunciation score; and dividing the accumulated pronunciation score of each candidate word by the number of pinyin syllables of that candidate word to obtain its pronunciation score, and selecting the candidate word with the highest pronunciation score as the error correction word.

Further, if several candidate words have the same pronunciation score, the error correction word is selected by applying the following criteria in order: the candidate word with the highest pinyin score is the error correction word; the candidate word with the same number of words as the text to be corrected and the maximum number of words is the error correction word; the candidate word whose length is closest to the word count of the text to be corrected is the error correction word; and a candidate word is selected at random as the error correction word.
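The two schemes can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the recurrence used in `accumulate` is one plausible reading of "accumulating similarity scores with an LCS algorithm", and `toy_sim`, the function names, and the toy lexicon are all invented for the example (a real system would plug in a pronunciation metric such as DIMSIM).

```python
from typing import Callable, Dict, List, Sequence, Tuple

Sim = Callable[[str, str], float]
Path = List[Tuple[int, int]]


def exact(a: str, b: str) -> float:
    """Literal pinyin match: 1.0 for identical syllables, else 0.0."""
    return 1.0 if a == b else 0.0


def accumulate(text: Sequence[str], word: Sequence[str], sim: Sim) -> Tuple[float, Path]:
    """LCS-style accumulation of sim(); returns (accumulated score, scoring path)."""
    n, m = len(text), len(word)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(dp[i - 1][j], dp[i][j - 1],
                           dp[i - 1][j - 1] + sim(text[i - 1], word[j - 1]))
    path: Path = []
    i, j = n, m
    while i > 0 and j > 0:  # walk back from the bottom-right cell
        if dp[i][j] == dp[i - 1][j]:
            i -= 1
        elif dp[i][j] == dp[i][j - 1]:
            j -= 1
        else:  # the pair (text[i-1], word[j-1]) added to the score
            path.append((i - 1, j - 1))
            i, j = i - 1, j - 1
    return dp[n][m], path[::-1]


def select(text: Sequence[str], lexicon: Dict[str, List[str]], sim: Sim, k: int):
    """Screen the top-K words by pinyin score, then rank them by pronunciation score."""
    def pinyin_score(w: str) -> float:
        return accumulate(text, lexicon[w], exact)[0] / len(lexicon[w])

    candidates = sorted(lexicon, key=pinyin_score, reverse=True)[:k]
    best = None
    for w in candidates:
        total, path = accumulate(text, lexicon[w], sim)
        score = total / len(lexicon[w])
        if best is None or score > best[1]:
            best = (w, score, path)
    return best


def toy_sim(a: str, b: str) -> float:
    """Invented similarity table standing in for a real pronunciation metric."""
    if a == b:
        return 1.0
    return 0.9 if {a, b} == {"jin", "jing"} else 0.0


# Toy data, invented for illustration only.
text = ["wo", "yao", "qu", "bei", "jin"]
lexicon = {"Beijing": ["bei", "jing"], "Nanjing": ["nan", "jing"], "Tianjin": ["tian", "jin"]}
word, score, path = select(text, lexicon, toy_sim, k=2)
print(word, round(score, 2), path)
```

With this toy data the sketch prints `Beijing 0.95 [(3, 0), (4, 1)]`. Setting `k` to the size of the lexicon skips the screening and corresponds to the first scheme; a smaller `k` trades a cheap exact-match pass for fewer calls to the more expensive similarity function.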
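The ordered tie-break can be sketched as a cascade of filters. The wording of the second and third criteria is open to interpretation, so the reading used here is an assumption: "same number of words" is taken to mean that the candidate is as long as the span it would replace, and "closest length" is the smallest absolute length difference. `Candidate`, `break_tie`, and `span_len` are hypothetical names.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    word: str
    length: int          # number of characters (one pinyin syllable each)
    pinyin_score: float


def break_tie(tied: List[Candidate], span_len: int,
              rng: Optional[random.Random] = None) -> Candidate:
    """Choose among candidates that share the highest pronunciation score."""
    rng = rng or random.Random()
    # 1. Highest pinyin score.
    top = max(c.pinyin_score for c in tied)
    pool = [c for c in tied if c.pinyin_score == top]
    if len(pool) == 1:
        return pool[0]
    # 2. Same length as the span to be replaced (assumed reading).
    same = [c for c in pool if c.length == span_len]
    if len(same) == 1:
        return same[0]
    pool = same or pool
    # 3. Length closest to the span to be replaced.
    gap = min(abs(c.length - span_len) for c in pool)
    pool = [c for c in pool if abs(c.length - span_len) == gap]
    if len(pool) == 1:
        return pool[0]
    # 4. Random choice among whatever is still tied.
    return rng.choice(pool)
```

Passing a seeded `random.Random` makes the last step reproducible, which is convenient when testing.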

Furthermore, the pronunciation similarity score of the pinyin is directly obtained by the calculation of a pronunciation similarity algorithm such as a DIMSIM algorithm.

Further, to improve error correction precision, a threshold is preset and the pronunciation score of the error correction word is compared with it: if the pronunciation score of the error correction word is lower than the preset threshold, the step of replacing the replacement word with the error correction word is not executed, and the text to be corrected is output as is; if the pronunciation score of the error correction word is higher than or equal to the preset threshold, the replacement word is replaced with the error correction word and the corrected text is output.

Further, according to the requirements of different error correction precisions, the value range of the preset threshold is 0.5-1.

In addition, the invention also provides a text error correction system comprising the following modules: a speech recognition module, configured to receive the user's voice input and obtain the speech recognition text and the pinyin of the text to be corrected; a user lexicon storage module, configured to store and extract user words and their pinyin, and to count the user word pinyin and the user words; a pronunciation score calculation module, configured to calculate the pronunciation score of the pinyin of each candidate word and the scoring path of the accumulated pronunciation score; and a text replacement module, configured to perform text replacement.

Further, the system also comprises a pinyin score calculation module which is used for calculating the pinyin score of each user vocabulary.

Furthermore, the invention provides an electronic device comprising a memory and a processor, the memory and the processor being connected; the memory is used for storing programs; the processor calls a program stored in the memory to perform the method of the first aspect embodiment and/or any possible implementation manner of the first aspect embodiment.

The technical solution of the present application is further illustrated by the following examples.

Example one

Step 1:

acquiring the text to be corrected by speech recognition: "the license plate number is Ku ABC966", whose pinyin is: ['che', 'pai', 'hao', 'ma', 'shi', 'ku', 'ei', 'bi', 'xi', 'jiu', 'liu', 'liu'];

extracting the user words in the user lexicon: "Gan A7C976", "Hu ADD966", "Wan DPC966", "Zhe EY7966" ……, whose pinyin are respectively: ['gan', 'ei', 'qi', 'xi', 'jiu', 'qi', 'liu'], ['hu', 'ei', 'di', 'di', 'jiu', 'liu', 'liu'], ['wan', 'di', 'pi', 'xi', 'jiu', 'liu', 'liu'], ['zhe', 'yi', 'wai', 'qi', 'jiu', 'liu', 'liu'] ……

Step 2:

calculating the LCS of the pinyin of the text to be corrected and the pinyin of each user word, and dividing it by the number of pinyin syllables of the user word, to obtain the pinyin score of each user word. The calculation is as follows:

LCS(pinyin of "the license plate number is Ku ABC966", pinyin of "Gan A7C976") / 7 = LCS(['che', 'pai', 'hao', 'ma', 'shi', 'ku', 'ei', 'bi', 'xi', 'jiu', 'liu', 'liu'], ['gan', 'ei', 'qi', 'xi', 'jiu', 'qi', 'liu']) / 7 = 4/7 = 0.57

LCS(pinyin of "the license plate number is Ku ABC966", pinyin of "Hu ADD966") / 7 = LCS(['che', 'pai', 'hao', 'ma', 'shi', 'ku', 'ei', 'bi', 'xi', 'jiu', 'liu', 'liu'], ['hu', 'ei', 'di', 'di', 'jiu', 'liu', 'liu']) / 7 = 4/7 = 0.57

LCS(pinyin of "the license plate number is Ku ABC966", pinyin of "Wan DPC966") / 7 = LCS(['che', 'pai', 'hao', 'ma', 'shi', 'ku', 'ei', 'bi', 'xi', 'jiu', 'liu', 'liu'], ['wan', 'di', 'pi', 'xi', 'jiu', 'liu', 'liu']) / 7 = 4/7 = 0.57

LCS(pinyin of "the license plate number is Ku ABC966", pinyin of "Zhe EY7966") / 7 = LCS(['che', 'pai', 'hao', 'ma', 'shi', 'ku', 'ei', 'bi', 'xi', 'jiu', 'liu', 'liu'], ['zhe', 'yi', 'wai', 'qi', 'jiu', 'liu', 'liu']) / 7 = 3/7 = 0.43

……

When K is set to 3, the user words with the top 3 pinyin scores are screened out as candidate words: "Gan A7C976", "Hu ADD966", and "Wan DPC966".

Step 3:

calculating, with a pronunciation similarity method such as the DIMSIM algorithm, the pronunciation similarity score for every pair of syllables between the text to be corrected and each candidate word; for example, the pronunciation similarity of "Ku (ku)" and "Hu (hu)" is 0.583, that of "Ku (ku)" and "Gan (gan)" is 0.013, that of "9 (jiu)" and "9 (jiu)" is 1, and so on. The pronunciation similarity scores between the pinyin of the text to be corrected and the pinyin of each candidate word are accumulated with the Longest Common Subsequence (LCS) algorithm to obtain the accumulated pronunciation score of each candidate word, and the scoring path of the pronunciation score is recorded. The accumulated pronunciation score of each candidate word is divided by the number of pinyin syllables of the candidate word to obtain its pronunciation score. The calculation is as follows:
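The pinyin scores of Step 2 can be checked with a few lines of Python. This is only a verification sketch: `lcs_len` is an ordinary longest-common-subsequence routine written for the example, and the syllable list for "Hu ADD966" is assumed to have seven syllables ('hu', 'ei', 'di', 'di', 'jiu', 'liu', 'liu'), consistent with the division by 7.

```python
def lcs_len(a, b):
    """Classic longest-common-subsequence length over pinyin syllables."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            if x == y:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]


text = "che pai hao ma shi ku ei bi xi jiu liu liu".split()
lexicon = {
    "Gan A7C976": "gan ei qi xi jiu qi liu".split(),
    "Hu ADD966": "hu ei di di jiu liu liu".split(),   # seven syllables assumed
    "Wan DPC966": "wan di pi xi jiu liu liu".split(),
    "Zhe EY7966": "zhe yi wai qi jiu liu liu".split(),
}

# Pinyin score = LCS length divided by the number of syllables of the user word.
pinyin_scores = {w: round(lcs_len(text, p) / len(p), 2) for w, p in lexicon.items()}
top_k = sorted(pinyin_scores, key=pinyin_scores.get, reverse=True)[:3]  # K = 3

print(pinyin_scores)
print(top_k)
```

The scores come out as 0.57, 0.57, 0.57, and 0.43, and the three entries kept for K = 3 are "Gan A7C976", "Hu ADD966", and "Wan DPC966". Because the first three scores are equal, the order among them here simply follows the order of the lexicon.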

LCS(pinyin of "the license plate number is Ku ABC966", pinyin of "Gan A7C976" | DIMSIM pronunciation similarity) / 7 = 0.59

LCS(pinyin of "the license plate number is Ku ABC966", pinyin of "Hu ADD966" | DIMSIM pronunciation similarity) / 7 = 0.69

LCS(pinyin of "the license plate number is Ku ABC966", pinyin of "Wan DPC966" | DIMSIM pronunciation similarity) / 7 = 0.67

FIG. 2 shows the cumulative pronunciation score chart for "Hu ADD966", and FIG. 3 shows the corresponding cumulative pronunciation score path.

Step 4:

selecting "Hu ADD966", which has the highest pronunciation score, as the error correction word, and tracing back along the scoring path of its accumulated pronunciation score, as shown in FIG. 3, to derive the replacement word in the text: "Ku ABC966".
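A small sketch of how a replacement span might be read off a scoring path. The path below is hypothetical, since FIG. 3 is not reproduced here: each pair is (index in the text, index in the error correction word) for a syllable pair that contributed to the accumulated score. Taking the span from the first to the last matched text position is an assumed reading of the back-derivation step, and `span_from_path` is an illustrative name.

```python
def span_from_path(path):
    """Return the inclusive (start, end) text indices covered by a scoring path."""
    rows = [i for i, _ in path]
    return min(rows), max(rows)


# One token per pinyin syllable of the recognized text; the first five tokens
# stand for "the license plate number is".
tokens = ["che", "pai", "hao", "ma", "shi", "Ku", "A", "B", "C", "9", "6", "6"]

# Hypothetical path: ku~hu, ei=ei, jiu=jiu, liu=liu, liu=liu.
path = [(5, 0), (6, 1), (9, 4), (10, 5), (11, 6)]

start, end = span_from_path(path)
replaced = tokens[start:end + 1]
print(start, end, "".join(replaced))
```

Here the span runs from position 5 to position 11, which corresponds to "Ku ABC966". Syllables inside the span that are not on the path ('bi', 'xi') are still replaced, because they lie between matched positions.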

Step 5-1:

when the preset threshold is set to 0.8, the pronunciation score of the error correction word "Hu ADD966", 0.69, is lower than the preset threshold, so the step of replacing the replacement word with the error correction word is not executed and the text to be corrected is output as is; that is, the corrected text is still "the license plate number is Ku ABC966".

Step 5-2:

when the preset threshold is set to 0.6, the pronunciation score of the error correction word "Hu ADD966", 0.69, is higher than the preset threshold, so the error correction word "Hu ADD966" replaces the replacement word "Ku ABC966"; that is, the text to be corrected, "the license plate number is Ku ABC966", is changed to "the license plate number is Hu ADD966".
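Steps 5-1 and 5-2 differ only in the threshold, which a short sketch makes explicit. The function name `apply_correction` and the two-token representation of the sentence are illustrative assumptions; the score 0.69 and the thresholds 0.8 and 0.6 are the values from the example.

```python
def apply_correction(tokens, span, correction, score, threshold):
    """Replace tokens[start..end] with the correction only if score >= threshold."""
    if score < threshold:
        return list(tokens)  # below the threshold: output the text unchanged
    start, end = span
    return tokens[:start] + [correction] + tokens[end + 1:]


tokens = ["the license plate number is", "Ku ABC966"]
span = (1, 1)  # the replacement word occupies token 1

strict = apply_correction(tokens, span, "Hu ADD966", score=0.69, threshold=0.8)
lenient = apply_correction(tokens, span, "Hu ADD966", score=0.69, threshold=0.6)

print(" ".join(strict))   # the license plate number is Ku ABC966
print(" ".join(lenient))  # the license plate number is Hu ADD966
```

A higher threshold favors leaving the recognized text alone, while a lower one accepts more corrections at the risk of replacing text that was recognized correctly.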

Example two

Step 1:

acquiring the text to be corrected by speech recognition: "call a car to Weijia Hospital, OK?", in which "Wei" was recognized as the character meaning "tail" (written below as "Weijia (tail) Hospital"); the corresponding pinyin is: ['jiao', 'che', 'dao', 'wei', 'jia', 'yi', 'yuan', 'hao', 'ma'];

extracting the user words in the user lexicon: "Weijia (micro) Hospital" (in which "Wei" is the character meaning "micro"), "Shiyan Hospital", "Minghao International Hotel" ……, whose pinyin are respectively: ['wei', 'jia', 'yi', 'yuan'], ['shi', 'yan', 'yi', 'yuan'], ['ming', 'hao', 'guo', 'ji', 'jiu', 'dian'] ……

Step 2:

LCS(pinyin of "call a car to Weijia (tail) Hospital, OK?", pinyin of "Weijia (micro) Hospital") / 4 = LCS(['jiao', 'che', 'dao', 'wei', 'jia', 'yi', 'yuan', 'hao', 'ma'], ['wei', 'jia', 'yi', 'yuan']) / 4 = 4/4 = 1

LCS(pinyin of "call a car to Weijia (tail) Hospital, OK?", pinyin of "Shiyan Hospital") / 4 = LCS(['jiao', 'che', 'dao', 'wei', 'jia', 'yi', 'yuan', 'hao', 'ma'], ['shi', 'yan', 'yi', 'yuan']) / 4 = 2/4 = 0.5

LCS(pinyin of "call a car to Weijia (tail) Hospital, OK?", pinyin of "Minghao International Hotel") / 6 = LCS(['jiao', 'che', 'dao', 'wei', 'jia', 'yi', 'yuan', 'hao', 'ma'], ['ming', 'hao', 'guo', 'ji', 'jiu', 'dian']) / 6 = 1/6 = 0.17

……

When K is set to 2, the user words with the top 2 pinyin scores are screened out as candidate words: "Weijia (micro) Hospital" and "Shiyan Hospital".

Step 3:

calculating, with a pronunciation similarity method such as the DIMSIM algorithm, the pronunciation similarity score for every pair of syllables between the text to be corrected and each candidate word; for example, the pronunciation similarity of "tail (wei)" and "micro (wei)" is 1. The pronunciation similarity scores between the pinyin of the text to be corrected and the pinyin of each candidate word are accumulated with the Longest Common Subsequence (LCS) algorithm to obtain the accumulated pronunciation score of each candidate word, and the scoring path of the pronunciation score is recorded. The accumulated pronunciation score of each candidate word is divided by the number of pinyin syllables of the candidate word to obtain its pronunciation score. The calculation is as follows:

LCS(pinyin of "call a car to Weijia (tail) Hospital, OK?", pinyin of "Weijia (micro) Hospital" | DIMSIM pronunciation similarity) / 4 = 1

LCS(pinyin of "call a car to Weijia (tail) Hospital, OK?", pinyin of "Shiyan Hospital" | DIMSIM pronunciation similarity) / 4 = 0.52

FIG. 4 shows the cumulative pronunciation score chart for "Weijia (micro) Hospital", and FIG. 5 shows the corresponding cumulative pronunciation score path.

Step 4:

selecting "Weijia (micro) Hospital", which has the highest pronunciation score, as the error correction word, and tracing back along the scoring path of its accumulated pronunciation score, as shown in FIG. 5, to derive the replacement word in the text: "Weijia (tail) Hospital".

Step 5:

when the threshold is set to 0.8, the pronunciation score of the error correction word "Weijia (micro) Hospital", 1, is higher than the threshold, so the error correction word "Weijia (micro) Hospital" replaces the replacement word "Weijia (tail) Hospital"; that is, the text to be corrected, "call a car to Weijia (tail) Hospital, OK?", is changed to "call a car to Weijia (micro) Hospital, OK?".
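To give a feel for what a cumulative score chart contains, the sketch below fills the accumulation table for this example using exact-match similarity only (identical syllables score 1, all other pairs are assumed to score 0). It does not reproduce FIG. 4: a real pronunciation metric would give non-zero values for some non-identical pairs and change the intermediate cells. The final cell would stay at 4 as long as no similarity exceeds 1, because each of the four syllables of the candidate can contribute at most 1. `score_table` is an illustrative name.

```python
def score_table(text, word, sim):
    """Fill the LCS-style table of accumulated similarity scores."""
    dp = [[0.0] * (len(word) + 1) for _ in range(len(text) + 1)]
    for i in range(1, len(text) + 1):
        for j in range(1, len(word) + 1):
            dp[i][j] = max(dp[i - 1][j], dp[i][j - 1],
                           dp[i - 1][j - 1] + sim(text[i - 1], word[j - 1]))
    return dp


def exact(a, b):
    """Assumed stand-in similarity: 1.0 for identical syllables, else 0.0."""
    return 1.0 if a == b else 0.0


text = "jiao che dao wei jia yi yuan hao ma".split()
word = "wei jia yi yuan".split()

table = score_table(text, word, exact)
print(" " * 5, " ".join(f"{s:>4}" for s in word))
for syllable, row in zip(text, table[1:]):
    print(f"{syllable:>5}", " ".join(f"{v:>4.0f}" for v in row[1:]))

pronunciation_score = table[-1][-1] / len(word)
print(pronunciation_score)
```

The bottom-right cell is 4, so the pronunciation score is 4/4 = 1, in agreement with the value given for "Weijia (micro) Hospital".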

The above is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the examples, and all technical solutions belonging to the idea of the present invention belong to the protection scope of the present invention. It should be noted that modifications and variations can be made by persons skilled in the art without departing from the principles of the invention and should be considered as within the scope of the invention.

Claims

1. A text error correction method, comprising the steps of:

acquiring a text to be corrected, identifying pinyin of the text to be corrected, and extracting user words and user word pinyin from a user word bank;

comparing the pinyin of the text to be corrected with the pinyin of each user word, and selecting an error correction word from the user lexicon according to a preset algorithm;

and deriving the replacement word in the text to be corrected by tracing back along the scoring path of the selected error correction word, replacing the replacement word with the error correction word, and obtaining the corrected text.

2. The text correction method of claim 1 wherein the preset algorithm comprises a pronunciation score algorithm comprising:

calculating a pronunciation similarity score for every pair of pinyin syllables between the pinyin of the text to be corrected and the pinyin of each user word, accumulating these pronunciation similarity scores with a Longest Common Subsequence (LCS) algorithm to obtain an accumulated pronunciation score for each user word, and recording the scoring path of the pronunciation score;

dividing the accumulated pronunciation score of each user word by the number of pinyin syllables of that user word to obtain the pronunciation score of the user word, and selecting the user word with the highest pronunciation score as the error correction word.

3. The text correction method of claim 1, wherein the preset algorithm comprises a pinyin score algorithm and a pronunciation score algorithm, the pinyin score algorithm comprising:

calculating the LCS of the pinyin of the text to be corrected and the pinyin of each user word to obtain an accumulated pinyin score for each user word, dividing the accumulated pinyin score by the number of pinyin syllables of the user word to obtain the pinyin score of the user word, and screening out the user words whose pinyin scores rank in the top K as candidate words, wherein K is less than or equal to the total number of user words.

The pronunciation score algorithm comprises:

calculating a pronunciation similarity score for every pair of pinyin syllables between the pinyin of the text to be corrected and the pinyin of each candidate word, accumulating these pronunciation similarity scores with a Longest Common Subsequence (LCS) algorithm to obtain an accumulated pronunciation score for each candidate word, and recording the scoring path of the pronunciation score;

and dividing the accumulated pronunciation score of each candidate word by the number of pinyin syllables of that candidate word to obtain the pronunciation score of the candidate word, and selecting the candidate word with the highest pronunciation score as the error correction word.

4. The text error correction method according to claim 3, wherein if there are a plurality of candidate words having the same pronunciation score, the error correction words are selected in the following order:

the candidate word with the highest pinyin score is an error correction word;

the candidate word with the same number of words as the text to be corrected and the maximum number of words is the error correction word;

the candidate word with the length closest to the word number of the text to be corrected is the error correction word;

and randomly selecting the candidate words as error correction words.

5. The text correction method of claim 2 or 3, wherein the pronunciation similarity score of the pinyin is directly calculated by a pronunciation similarity algorithm, the pronunciation similarity algorithm comprising a DIMSIM algorithm.

6. The text error correction method according to claim 2 or 3, wherein the pronunciation score of the error correction word is compared with a preset threshold:

if the pronunciation score of the error correction word is lower than the preset threshold, the step of replacing the replacement word with the error correction word is not executed, and the text to be corrected is output directly;

if the pronunciation score of the error correction word is higher than or equal to the preset threshold, the replacement word is replaced with the error correction word and the corrected text is output.

7. The text error correction method according to claim 6, wherein the preset threshold value ranges from 0.5 to 1.

8. A text correction system, comprising the following modules:

a speech recognition module, configured to receive the user's voice input and obtain the speech recognition text and the pinyin of the text to be corrected;

a user lexicon storage module, configured to store and extract user words and their pinyin, and to count the user word pinyin and the user words;

the pronunciation score calculation module is used for calculating the pronunciation score of the pinyin of the candidate word and the accumulated pronunciation score scoring path;

and the text replacement module is used for text replacement.

9. The text correction system of claim 8, further comprising a pinyin-score computation module for computing a pinyin score for each user vocabulary.

10. An electronic device comprising a memory and a processor, the memory and the processor being connected;

the memory is used for storing programs;

the processor calls a program stored in the memory to perform the method of any of claims 1-4.