Summary of the invention
In view of this, embodiments of the present invention provide a speech recognition result error correction method, an apparatus, a storage medium, and an electronic device, which automatically correct speech recognition results and thereby improve the accuracy of speech recognition.
In a first aspect, an embodiment of the present invention provides a speech recognition result error correction method, the method comprising:
annotating a speech recognition result with pinyin to determine a first pinyin sequence corresponding to the speech recognition result;
determining a plurality of candidate second pinyin sequences;
calculating an edit distance between the first pinyin sequence and each second pinyin sequence;
determining an aligned first pinyin sequence and each aligned second pinyin sequence according to the corresponding edit distance;
determining a first text corresponding to the aligned first pinyin sequence and each second text corresponding to each aligned second pinyin sequence;
in response to the number of identical characters between the first text and each second text being greater than a first threshold, calculating a similarity between the aligned first pinyin sequence and each aligned second pinyin sequence; and
in response to the maximum of the similarities being greater than a second threshold, replacing the speech recognition result with the text of the second pinyin sequence corresponding to the maximum similarity.
Preferably, annotating the speech recognition result with pinyin to determine the first pinyin sequence corresponding to the speech recognition result comprises:
determining the pinyin combination corresponding to each character in the speech recognition result; and
inserting a predetermined separator between the pinyin combinations of adjacent characters to obtain the first pinyin sequence;
or,
annotating the speech recognition result with pinyin to determine the first pinyin sequence corresponding to the speech recognition result comprises:
determining the pinyin combination corresponding to each character in the speech recognition result; and
inserting a predetermined separator between the pinyin combinations of adjacent characters and between the initial and the final of each pinyin combination to obtain the first pinyin sequence.
Preferably, determining the plurality of candidate second pinyin sequences comprises:
obtaining candidate text sequences; and
determining the corresponding plurality of candidate second pinyin sequences according to the candidate text sequences.
Preferably, the candidate text sequences include text sequences of the user's ID card number, birthday, address, hospitals near the address, and supermarkets near the address.
Preferably, determining the corresponding plurality of candidate second pinyin sequences according to the candidate text sequences comprises:
in response to a polyphonic character being present in a text sequence, determining the plurality of pinyin combinations corresponding to the polyphonic character; and
determining a plurality of corresponding second pinyin sequences, one for each pinyin combination of the polyphonic character.
Preferably, determining the aligned first pinyin sequence and each aligned second pinyin sequence according to the corresponding edit distance comprises:
marking, according to the corresponding edit distance, the parts of the first pinyin sequence and of each second pinyin sequence that need to be inserted or deleted; and
deleting the leftmost and rightmost parts marked with the insertion or deletion symbol from the marked first pinyin sequence and each marked second pinyin sequence to determine the aligned first pinyin sequence and each aligned second pinyin sequence.
Preferably, the method further comprises:
in response to the maximum of the similarities being not greater than the second threshold, keeping the speech recognition result unchanged.
In a second aspect, an embodiment of the present invention provides a speech recognition result error correction apparatus, the apparatus comprising:
a pinyin annotation unit configured to annotate a speech recognition result with pinyin to determine a first pinyin sequence corresponding to the speech recognition result;
a first determination unit configured to determine a plurality of candidate second pinyin sequences;
a first calculation unit configured to calculate an edit distance between the first pinyin sequence and each second pinyin sequence;
a second determination unit configured to determine an aligned first pinyin sequence and each aligned second pinyin sequence according to the corresponding edit distance;
a third determination unit configured to determine a first text corresponding to the aligned first pinyin sequence and each second text corresponding to each aligned second pinyin sequence;
a second calculation unit configured to calculate, in response to the number of identical characters between the first text and each second text being greater than a first threshold, a similarity between the aligned first pinyin sequence and each aligned second pinyin sequence; and
a processing unit configured to replace, in response to the maximum of the similarities being greater than a second threshold, the speech recognition result with the text of the second pinyin sequence corresponding to the maximum similarity.
In a third aspect, an embodiment of the present invention provides a computer-readable storage medium storing computer program instructions which, when executed by a processor, implement the method of the first aspect.
In a fourth aspect, an embodiment of the present invention provides an electronic device comprising a memory and a processor, wherein the memory stores one or more computer program instructions, and the one or more computer program instructions are executed by the processor to implement the method of the first aspect.
Embodiments of the present invention calculate the edit distance between the first pinyin sequence corresponding to the speech recognition result and each second pinyin sequence corresponding to the plurality of candidate text sequences, and obtain an aligned first pinyin sequence and each aligned second pinyin sequence. In response to the number of identical characters between the first text corresponding to the aligned first pinyin sequence and each second text corresponding to each aligned second pinyin sequence being greater than a first threshold, the similarity between the aligned first pinyin sequence and each aligned second pinyin sequence is calculated. When the maximum of the similarities is greater than a second threshold, the speech recognition result is replaced with the text of the second pinyin sequence corresponding to the maximum similarity. Speech recognition results are thereby corrected automatically, and the accuracy of speech recognition is improved.
Detailed description
The present invention is described below on the basis of embodiments, but it is not limited to these embodiments. Some specific details are set out in the following detailed description. A person skilled in the art can fully understand the present invention even without these details. To avoid obscuring the essence of the invention, well-known methods, procedures, processes, components, and circuits are not described in detail.
In addition, a person of ordinary skill in the art should understand that the drawings provided here are for illustration only and are not necessarily drawn to scale.
Unless the context clearly requires otherwise, words such as "include" and "comprise" throughout the application documents are to be construed inclusively rather than exclusively or exhaustively, that is, in the sense of "including but not limited to".
In the description of the present invention, it should be understood that the terms "first", "second", and the like are used for description only and are not to be understood as indicating or implying relative importance. In addition, unless otherwise stated, "a plurality of" means two or more.
Speech recognition takes speech as its object of study and, through speech signal processing and pattern recognition, enables a machine to automatically recognize and understand spoken human language. Speech recognition technology lets a machine convert a speech signal into the corresponding text or command through a process of recognition and understanding.
Fig. 1 is a schematic diagram of an automatic telephone customer service scenario according to an embodiment of the present invention. As shown in Fig. 1, in the automatic voice customer service scenario, the user's telephone or mobile phone 11 connects, through a telephone network or the Internet 12, to a system composed of an automatic voice server 13 and a computer device 14. The user's voice information is transmitted to the system, which recognizes it and responds according to the recognition result. The system can likewise transmit voice information to the user, and the user replies according to the system's voice information. Automatic voice customer service is thereby realized. In this process, the system first verifies the user's identity and provides further information services only if the verified identity is consistent with the real user information. How to use user information to automatically correct speech recognition results, and thereby improve the accuracy of speech recognition, is therefore a problem that urgently needs to be solved.
Fig. 2 is a flowchart of the speech recognition result error correction method according to an embodiment of the present invention. As shown in Fig. 2, the method of this embodiment includes the following steps:
Step S110: annotate the speech recognition result with pinyin to determine the first pinyin sequence corresponding to the speech recognition result.
Because the speech recognition result is text, it can be annotated in the same way that text is annotated with pinyin, which yields the pinyin corresponding to the speech recognition result.
Specifically, the speech recognition result may be a sentence composed of Chinese characters, and the corresponding pinyin is a Chinese pinyin sequence without tones.
In an optional implementation, step S110 includes the following steps:
Step S111: determine the pinyin combination corresponding to each character in the speech recognition result.
Step S112: insert a predetermined separator between the pinyin combinations of adjacent characters to obtain the first pinyin sequence.
For example, if the speech recognition result is "有家里福超市", the pinyin combination corresponding to each character is:
有: you; 家: jia; 里: li; 福: fu; 超: chao; 市: shi
Inserting the predetermined space separator between the pinyin combinations of adjacent characters gives:
you jia li fu chao shi
That is, the first pinyin sequence is:
you jia li fu chao shi
In another optional implementation, step S110 includes the following steps:
Step S113: determine the pinyin combination corresponding to each character in the speech recognition result.
Step S114: insert a predetermined separator between the pinyin combinations of adjacent characters and between the initial and the final of each pinyin combination to obtain the first pinyin sequence.
For example, if the speech recognition result is "有家里福超市", the pinyin combination corresponding to each character is:
有: you; 家: jia; 里: li; 福: fu; 超: chao; 市: shi
Inserting the predetermined space separator between the pinyin combinations of adjacent characters and between the initial and the final of each pinyin combination gives:
y ou j ia l i f u ch ao sh i
That is, the first pinyin sequence is:
y ou j ia l i f u ch ao sh i
It should be understood that the separator can be replaced as needed with another character or symbol that is rarely used in pinyin. The separator used between adjacent characters and the separator used between an initial and a final may be the same or different.
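The two annotation modes of step S110 can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the `PINYIN` table, the `INITIALS` list, and the function names are hypothetical, the table covers only the characters of the example (inferred from the example's pinyin), and a real system would use a complete pinyin dictionary.

```python
# Hypothetical toy lexicon covering only the example; not a real dictionary.
PINYIN = {"有": "you", "家": "jia", "里": "li", "福": "fu", "超": "chao", "市": "shi"}

# Pinyin initials, two-letter ones first so "zh"/"ch"/"sh" win over "z"/"c"/"s".
# "y" and "w" are treated as initials, as in the example "y ou".
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l", "g", "k",
            "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]


def split_syllable(syllable):
    """Split a toneless syllable into [initial, final]; a syllable with no initial stays whole."""
    for initial in INITIALS:
        if syllable.startswith(initial) and len(syllable) > len(initial):
            return [initial, syllable[len(initial):]]
    return [syllable]


def annotate(text, split=False, sep=" "):
    """Return the pinyin sequence of text, separated per character or per initial/final."""
    tokens = []
    for ch in text:
        syllable = PINYIN[ch]
        tokens.extend(split_syllable(syllable) if split else [syllable])
    return sep.join(tokens)


print(annotate("有家里福超市"))              # you jia li fu chao shi
print(annotate("有家里福超市", split=True))  # y ou j ia l i f u ch ao sh i
```

Passing a different `sep` corresponds to replacing the space with another separator.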
Step S120: determine a plurality of candidate second pinyin sequences.
Specifically, step S120 may include the following steps:
Step S121: obtain candidate text sequences.
According to the type of question the system asks the user, candidate text sequences for that question type are obtained from the user information. When address-type answers are recognized, recognition rates are low because addresses span a wide range and involve many abbreviations and homophonic place names. During the earlier collection of user information, easily recognized information with high recognition accuracy, such as supermarkets and hospitals near the user's address, can therefore be collected. This improves recognition accuracy for address-type questions and makes the user information easier to confirm.
For example, when the system asks about supermarkets near the user's home, the supermarkets near the user's home can be obtained from the user information. If the supermarkets near the user's home are Milan Life Supermarket, China Resources Supermarket, Weihai Sincere Supermarket, Love Convenience Store, and Carrefour Supermarket, then these five names are taken as the candidate text sequences.
Of course, besides address-type questions, other information confirmation questions with low recognition accuracy can also be handled by collecting easily recognized information with high recognition accuracy for further confirmation of user information.
Optionally, in this embodiment, the candidate text sequences include text sequences of the user's ID card number, birthday, address, hospitals near the address, and supermarkets near the address.
Step S122: determine the corresponding plurality of candidate second pinyin sequences according to the candidate text sequences.
Specifically, step S122 is similar to step S110, except that step S122 further includes:
In response to a polyphonic character being present in a candidate text sequence, determine the plurality of pinyin combinations corresponding to the polyphonic character.
Determine a plurality of corresponding second pinyin sequences, one for each pinyin combination of the polyphonic character.
For example, the candidate text sequence "家乐福超市" (Carrefour Supermarket) contains the polyphonic character "乐", whose pinyin combinations are le and yue.
The second pinyin sequences corresponding to this candidate text sequence are therefore:
jia le fu chao shi
jia yue fu chao shi
or
j ia l e f u ch ao sh i
j ia y ue f u ch ao sh i
In this embodiment, polyphonic characters in the candidate text sequences are taken into account: for text containing a polyphonic character, multiple corresponding second pinyin sequences are determined, which improves the accuracy of speech recognition error correction.
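The polyphonic expansion of step S122 can be sketched as a Cartesian product over the readings of each character. The `READINGS` table and the function name below are hypothetical and cover only the Carrefour example (characters inferred from the example's pinyin); a real system would look readings up in a full dictionary.

```python
from itertools import product

# Hypothetical readings table; a character with several entries is polyphonic.
READINGS = {"家": ["jia"], "乐": ["le", "yue"], "福": ["fu"],
            "超": ["chao"], "市": ["shi"]}


def candidate_pinyin_sequences(text, sep=" "):
    """Return one pinyin sequence per combination of readings of the characters in text."""
    per_char = [READINGS[ch] for ch in text]
    return [sep.join(combo) for combo in product(*per_char)]


for seq in candidate_pinyin_sequences("家乐福超市"):
    print(seq)
# jia le fu chao shi
# jia yue fu chao shi
```

A text with several polyphonic characters yields the product of their reading counts, so the number of second pinyin sequences can grow quickly for long candidates.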
In addition, the first pinyin sequence in step S110 and the plurality of candidate second pinyin sequences in step S120 must be determined in the same manner.
Step S130: calculate the edit distance between the first pinyin sequence and each second pinyin sequence.
Edit distance is the minimum number of edit operations required to change one string into another. The permitted edit operations are substituting one character for another, inserting a character, and deleting a character.
In step S130, a new edit distance is defined: here, edit distance refers to the minimum edit operations required to change one pinyin sequence into another.
If the pinyin sequences are obtained by inserting the predetermined separator between the pinyin combinations of adjacent characters, the pinyin of a single character is treated as one unit for insertion, deletion, or substitution when the edit distance is calculated.
For example, if the first pinyin sequence is you jia li fu chao shi and the second pinyin sequence is jia le fu chao shi, the edit operations that change the first pinyin sequence into the second are:
jia li fu chao shi (delete you)
jia le fu chao shi (replace li with le)
Correspondingly, the edit operations that change the second pinyin sequence into the first are:
you jia le fu chao shi (insert you)
you jia li fu chao shi (replace le with li)
If the pinyin sequences are obtained by inserting the predetermined separator between the pinyin combinations of adjacent characters and between the initial and the final of each pinyin combination, each initial and each final is treated as one unit for insertion, deletion, or substitution when the edit distance is calculated.
For example, if the first pinyin sequence is y ou j ia l i f u ch ao sh i and the second pinyin sequence is j ia l e f u ch ao sh i, the edit operations that change the first pinyin sequence into the second are:
ou j ia l i f u ch ao sh i (delete y)
j ia l i f u ch ao sh i (delete ou)
j ia l e f u ch ao sh i (replace i with e)
Correspondingly, the edit operations that change the second pinyin sequence into the first are:
y j ia l e f u ch ao sh i (insert y)
y ou j ia l e f u ch ao sh i (insert ou)
y ou j ia l i f u ch ao sh i (replace e with i)
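The token-level edit distance described in step S130 can be sketched with the standard Levenshtein dynamic program, applied to lists of pinyin units rather than to individual letters. The function name is an assumption for illustration; splitting on the space separator is assumed to match the annotation mode in use.

```python
def edit_distance(a, b):
    """Minimum number of unit insertions, deletions, and substitutions turning list a into list b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, unit_a in enumerate(a, start=1):
        cur = [i]
        for j, unit_b in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                        # delete unit_a
                           cur[j - 1] + 1,                     # insert unit_b
                           prev[j - 1] + (unit_a != unit_b)))  # substitute, or match at no cost
        prev = cur
    return prev[-1]


# Per-character units: delete "you", replace "li" with "le".
print(edit_distance("you jia li fu chao shi".split(), "jia le fu chao shi".split()))  # 2
# Initial/final units: delete "y", delete "ou", replace "i" with "e".
print(edit_distance("y ou j ia l i f u ch ao sh i".split(),
                    "j ia l e f u ch ao sh i".split()))  # 3
```

Because whole syllables (or initials and finals) are the units, replacing li with le costs one operation, whereas a letter-level distance would score the same change differently.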
Step S140: determine the aligned first pinyin sequence and each aligned second pinyin sequence according to the corresponding edit distance.
In step S140, only the edit operations other than substitution are considered, that is, only deletion and insertion.
Specifically, step S140 includes the following steps:
Step S141: mark, according to the corresponding edit distance, the parts of the first pinyin sequence and of each second pinyin sequence that need to be inserted or deleted.
Step S142: delete the leftmost and rightmost parts marked with the insertion or deletion symbol from the marked first pinyin sequence and each marked second pinyin sequence to determine the aligned first pinyin sequence and each aligned second pinyin sequence.
In step S142, the insertion and deletion operations of the edit distance correspond to each other, as the examples in step S130 show. Deleting the leftmost and rightmost marked parts from the marked first pinyin sequence and each marked second pinyin sequence therefore yields the aligned sequences, namely the aligned first pinyin sequence and each aligned second pinyin sequence.
Taking the first example in step S130, marking with the insertion or deletion symbol "_" the part of the first pinyin sequence that needs to be inserted or deleted according to the corresponding edit distance gives:
you jia li fu chao shi (with you marked for deletion)
Marking with the insertion or deletion symbol "_" the part of the second pinyin sequence that needs to be inserted or deleted gives:
_ _ jia le fu chao shi
Deleting the leftmost and rightmost marked parts from the marked first pinyin sequence (you jia li fu chao shi) and the marked second pinyin sequence (_ _ jia le fu chao shi) gives:
jia li fu chao shi and jia le fu chao shi. The aligned first pinyin sequence is thus jia li fu chao shi, and the aligned second pinyin sequence is jia le fu chao shi.
In addition, it should be noted that in this embodiment the parts of the first pinyin sequence that need to be replaced are not marked.
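One way to realize steps S141 and S142 is to backtrace the edit-distance table into aligned pairs and then drop the pairs at either end that contain a gap. The sketch below makes that assumption; the function name and the use of `None` for the insertion or deletion mark are illustrative choices, not details taken from the embodiment.

```python
def align(first, second):
    """Trim units at either end that the edit script inserts or deletes; return both aligned lists."""
    n, m = len(first), len(second)
    # dp[i][j] is the edit distance between first[:i] and second[:j].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i
    for j in range(m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = min(dp[i - 1][j] + 1, dp[i][j - 1] + 1,
                           dp[i - 1][j - 1] + (first[i - 1] != second[j - 1]))
    # Backtrace into (first_unit, second_unit) pairs; None stands for the gap mark.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + (first[i - 1] != second[j - 1]):
            pairs.append((first[i - 1], second[j - 1]))  # match or substitution: not marked
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            pairs.append((first[i - 1], None))           # unit present only in first
            i -= 1
        else:
            pairs.append((None, second[j - 1]))          # unit present only in second
            j -= 1
    pairs.reverse()
    # Step S142: delete marked parts at the leftmost and rightmost ends only.
    while pairs and None in pairs[0]:
        pairs.pop(0)
    while pairs and None in pairs[-1]:
        pairs.pop()
    aligned_first = [a for a, _ in pairs if a is not None]
    aligned_second = [b for _, b in pairs if b is not None]
    return aligned_first, aligned_second


a, b = align("you jia li fu chao shi".split(), "jia le fu chao shi".split())
print(" ".join(a))  # jia li fu chao shi
print(" ".join(b))  # jia le fu chao shi
```

Substituted units such as li and le stay in place, consistent with the note that parts needing replacement are not marked.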
Step S150: determine the first text corresponding to the aligned first pinyin sequence and each second text corresponding to each aligned second pinyin sequence.
Step S160: in response to the number of identical characters between the first text and each second text being greater than a first threshold, calculate the similarity between the aligned first pinyin sequence and each aligned second pinyin sequence.
Specifically, the number of identical characters between the first text and each second text is the number of characters of the aligned candidate text sequence that are hit by the aligned speech recognition result, and the first threshold is a preset number of hit characters. Because the user's utterance may be very short and happen to hit a character in a candidate text sequence, the similarity is calculated only if the number of hit characters is greater than the first threshold.
For example, suppose the first pinyin sequence corresponding to the speech recognition result is shi and a second pinyin sequence is jia le fu chao shi. If the similarity were calculated directly, without first checking against the first threshold, the similarity would be 100%, which does not reflect reality.
In this embodiment, before the similarity is calculated, it is determined whether the number of identical characters between the first text and each second text is greater than the first threshold, and the similarity is calculated only if it is. This avoids erroneous corrections and improves the accuracy of speech recognition result error correction.
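The first-threshold check of step S160 can be sketched as below. How identical characters are counted is not spelled out in the text, so this sketch assumes a position-by-position comparison of the two aligned texts; the function names and the threshold value 2 are hypothetical.

```python
def hit_count(first_text, second_text):
    """Count positions at which the two aligned texts have the same character (assumed definition)."""
    return sum(a == b for a, b in zip(first_text, second_text))


def passes_first_threshold(first_text, second_text, first_threshold):
    """Similarity is computed only when the hit count is strictly greater than the threshold."""
    return hit_count(first_text, second_text) > first_threshold


# Aligned texts of the running example: four of five characters are identical.
print(hit_count("家里福超市", "家乐福超市"))               # 4
print(passes_first_threshold("家里福超市", "家乐福超市", 2))  # True
# A one-character utterance that happens to hit a candidate is filtered out.
print(passes_first_threshold("市", "市", 2))                 # False
```

The last call corresponds to the short-utterance case above: without the check, the aligned sequences would be identical and would score as a perfect match.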
Optionally, the similarity may be calculated by dividing the number of substitution operations in the edit distance of step S130 by the total number of letters in each aligned second pinyin sequence or in the aligned first pinyin sequence.
For the above example, the similarity is the number of substitution operations in the edit distance, 1, divided by the total number of letters in the second pinyin sequence, 14.
Optionally, the similarity may also be calculated as the reciprocal of the edit distance plus 1.
In this embodiment, the similarity is the reciprocal of the sum of 1 and the edit distance between the aligned first pinyin sequence and each aligned second pinyin sequence. Here, edit distance refers to the minimum number of edit operations.
Optionally, the similarity may also be calculated by cosine distance.
In this embodiment, the aligned first pinyin sequence and each aligned second pinyin sequence are vectorized to obtain their corresponding vectors. The cosine of the angle between the vector of each aligned second pinyin sequence and the vector of the corresponding aligned first pinyin sequence is then calculated, and this cosine value is the similarity. The method of vectorizing the aligned first pinyin sequence and each aligned second pinyin sequence is not described here.
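The reciprocal and cosine variants can be sketched as follows. The text leaves the vectorization method open, so the bag-of-units count vector used here is purely an assumption for illustration, as are the function names.

```python
import math
from collections import Counter


def inverse_distance_similarity(distance):
    """Similarity as 1 / (edit distance + 1): 1.0 for identical sequences, smaller as distance grows."""
    return 1.0 / (distance + 1)


def cosine_similarity(first_units, second_units):
    """Cosine of the angle between bag-of-units count vectors (assumed vectorization)."""
    va, vb = Counter(first_units), Counter(second_units)
    dot = sum(count * vb[unit] for unit, count in va.items())
    norm = (math.sqrt(sum(c * c for c in va.values())) *
            math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


aligned_first = "jia li fu chao shi".split()
aligned_second = "jia le fu chao shi".split()
# The aligned sequences differ by one substitution, so the edit distance is 1.
print(inverse_distance_similarity(1))                    # 0.5
print(cosine_similarity(aligned_first, aligned_second))  # approximately 0.8
```

A count vector ignores the order of the units; an order-sensitive vectorization would be needed if two candidates contained the same syllables in different orders.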
Step S170: in response to the maximum of the similarities being greater than a second threshold, replace the speech recognition result with the text of the second pinyin sequence corresponding to the maximum similarity.
Because there are multiple candidate texts, there are correspondingly multiple second pinyin sequences, multiple edit distances, multiple aligned first pinyin sequences, multiple aligned second pinyin sequences, and multiple similarities.
In step S170, the second threshold characterizes a preset degree of similarity. If the maximum similarity is greater than the second threshold, the text of the second pinyin sequence corresponding to the maximum similarity is considered sufficiently similar to the speech recognition result, so the speech recognition result can be determined to be that text. The speech recognition result is therefore replaced with the text of the second pinyin sequence corresponding to the maximum similarity.
In addition, the method may further include a step of keeping the speech recognition result unchanged in response to the maximum of the similarities being not greater than the second threshold.
That is, when the maximum similarity does not exceed the second threshold, which is the threshold for deciding that the speech recognition result is the text of the second pinyin sequence corresponding to the maximum similarity, the speech recognition result is left unprocessed. This avoids mishandling the speech recognition result in uncertain cases.
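The decision of step S170, together with the fallback of keeping the result unchanged, can be sketched as follows. The function name, the `(candidate_text, similarity)` pair format, and the sample similarity values are hypothetical; the similarities are assumed to have been computed already for the candidates that passed the first-threshold check.

```python
def correct(recognized_text, scored_candidates, second_threshold):
    """Return the best candidate text if its similarity exceeds the threshold, else the original."""
    if not scored_candidates:
        return recognized_text  # no candidate passed the first-threshold check
    best_text, best_similarity = max(scored_candidates, key=lambda pair: pair[1])
    if best_similarity > second_threshold:
        return best_text
    return recognized_text


# Illustrative values only: two readings of the same candidate text.
scored = [("家乐福超市", 0.5), ("家乐福超市", 0.33)]
print(correct("有家里福超市", scored, second_threshold=0.4))  # 家乐福超市
print(correct("有家里福超市", scored, second_threshold=0.6))  # 有家里福超市
```

Using a strict comparison mirrors the wording "greater than the second threshold"; a maximum similarity equal to the threshold leaves the recognition result unchanged.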
Fig. 3 is a data flow diagram of the speech recognition result error correction method according to an embodiment of the present invention. As shown in Fig. 3, and with reference to Fig. 2, the data flow of this embodiment is as follows:
Step S310: annotate the speech recognition result 31 with pinyin to determine the first pinyin sequence 32 corresponding to the speech recognition result.
In an optional implementation, step S310 includes the following steps:
Step S311: determine the pinyin combination corresponding to each character in the speech recognition result.
Step S312: insert a predetermined separator between the pinyin combinations of adjacent characters to obtain the first pinyin sequence.
In another optional implementation, step S310 includes the following steps:
Step S313: determine the pinyin combination corresponding to each character in the speech recognition result.
Step S314: insert a predetermined separator between the pinyin combinations of adjacent characters and between the initial and the final of each pinyin combination to obtain the first pinyin sequence.
Step S320: determine a plurality of candidate second pinyin sequences 33.
Specifically, step S320 includes the following steps:
Step S321: obtain candidate text sequences.
Optionally, in this embodiment, the candidate text sequences include text sequences of the user's ID card number, birthday, address, hospitals near the address, and supermarkets near the address.
Step S322: determine the corresponding plurality of candidate second pinyin sequences according to the candidate text sequences.
Specifically, step S322 is similar to step S310, except that step S322 further includes:
In response to a polyphonic character being present in a candidate text sequence, determine the plurality of pinyin combinations corresponding to the polyphonic character.
Determine a plurality of corresponding second pinyin sequences, one for each pinyin combination of the polyphonic character.
Step S330: calculate the edit distance 34 between the first pinyin sequence 32 and each second pinyin sequence 33.
In step S330, a new edit distance is defined; here, edit distance 34 refers to the minimum edit operations. The permitted edit operations are substituting one character for another, inserting a character, and deleting a character.
Step S340: determine the aligned first pinyin sequence 36 and each aligned second pinyin sequence 37 according to the corresponding edit distance 35.
Specifically, step S340 includes the following steps:
Step S341: mark, according to the corresponding edit distance 35, the parts of the first pinyin sequence 32 and of each second pinyin sequence 33 that need to be inserted or deleted.
Step S342: delete the leftmost and rightmost parts marked with the insertion or deletion symbol from the marked first pinyin sequence 32 and each marked second pinyin sequence 33 to determine the aligned first pinyin sequence 36 and each aligned second pinyin sequence 37.
In step S341, the parts of the first pinyin sequence that need to be replaced are not marked.
Step S350: determine the first text 38 corresponding to the aligned first pinyin sequence and each second text 39 corresponding to each aligned second pinyin sequence.
Step S360: determine whether the number of identical characters between the first text and each second text is greater than the first threshold 40. If the number of identical characters is greater than the first threshold 40, execute steps S370 and S390; otherwise, execute step S380.
Step S370: calculate the similarity 41 between the aligned first pinyin sequence 36 and each aligned second pinyin sequence 37.
Optionally, the similarity 41 may be calculated by dividing the number of substitution operations in the edit distance of step S330 by the total number of letters in each second pinyin sequence.
Optionally, the similarity 41 may also be calculated as the reciprocal of the edit distance plus 1.
Optionally, the similarity 41 may also be calculated by cosine distance.
Step S380: keep the speech recognition result 31 unchanged.
Step S390: determine whether the maximum similarity 42 among the similarities 41 is greater than the second threshold 43. If the maximum similarity is greater than the second threshold 43, execute step S400; otherwise, execute step S410.
Step S400: replace the speech recognition result 31 with the text 44 of the second pinyin sequence corresponding to the maximum similarity 39.
Step S410: keep the speech recognition result 31 unchanged.
In this embodiment, the second threshold characterizes a preset degree of similarity. If the maximum similarity is greater than the second threshold, the text of the second pinyin sequence corresponding to the maximum similarity is considered sufficiently similar to the speech recognition result, so the speech recognition result can be determined to be that text. The speech recognition result is therefore replaced with the text of the second pinyin sequence corresponding to the maximum similarity. Otherwise, the speech recognition result is left unprocessed, which avoids mishandling the speech recognition result in uncertain cases and improves the accuracy of speech recognition.
Thus, the embodiment of the present invention calculates the edit distance between the first pinyin
sequence corresponding to the speech recognition result and each second pinyin sequence corresponding
to the text sequences of multiple candidates, and obtains the aligned first pinyin sequence and each
aligned second pinyin sequence. In response to the number of identical characters between the first
text corresponding to the aligned first pinyin sequence and each second text corresponding to each
aligned second pinyin sequence being greater than the first threshold, the similarity between the
aligned first pinyin sequence and each aligned second pinyin sequence is calculated. When the maximum
similarity among the similarities is greater than the second threshold, the speech recognition result
is replaced with the text of the second pinyin sequence corresponding to the maximum similarity. This
realizes automatic error correction of the speech recognition result and improves the recognition
accuracy of speech recognition.
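The overall flow summarized above can be sketched end to end as follows. This is a deliberately
simplified illustration, not the implementation of the embodiment. It omits the alignment step and
instead assumes that the recognition result and each candidate have the same number of characters, so
that identical characters can be counted position by position. The table PINYIN is a hypothetical toy
lexicon, all function names are hypothetical, and the reciprocal of the edit distance plus one is
chosen as the similarity.

```python
# Hypothetical toy pinyin table; a real system would use a full lexicon.
PINYIN = {"张": "zhang", "三": "san", "山": "shan", "李": "li", "四": "si"}


def annotate(text: str, separator: str = " ") -> str:
    """Build a pinyin sequence with a separator between adjacent characters."""
    return separator.join(PINYIN[ch] for ch in text)


def levenshtein(a: str, b: str) -> int:
    """Classic edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def identical_characters(a: str, b: str) -> int:
    """Count identical characters position by position (stands in for alignment)."""
    return sum(x == y for x, y in zip(a, b))


def correct(result: str, candidates: list[str],
            first_threshold: int, second_threshold: float) -> str:
    first = annotate(result)
    best_text, best_similarity = None, 0.0
    for candidate in candidates:
        # First threshold: skip candidates sharing too few characters.
        if identical_characters(result, candidate) <= first_threshold:
            continue
        similarity = 1.0 / (levenshtein(first, annotate(candidate)) + 1)
        if similarity > best_similarity:
            best_text, best_similarity = candidate, similarity
    # Second threshold: replace only when the best candidate is similar enough.
    if best_text is not None and best_similarity > second_threshold:
        return best_text
    return result
```

With these toy values, correct("张山", ["张三", "李四"], 0, 0.4) returns "张三": the candidate "李四"
shares no character with the result and is skipped, while "zhang shan" and "zhang san" differ by one
edit, giving a similarity of 0.5.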
Fig. 4 is a schematic diagram of the speech recognition result error correction device of an
embodiment of the present invention. As shown in Fig. 4, the device of this embodiment includes a
phonetic annotation unit 41, a first determination unit 42, a first calculation unit 43, a second
determination unit 44, a third determination unit 45, a second calculation unit 46, and a processing
unit 47.
The phonetic annotation unit 41 is configured to phonetically annotate the speech recognition result
to determine the first pinyin sequence corresponding to the speech recognition result. The first
determination unit 42 is configured to determine the second pinyin sequences of multiple candidates.
The first calculation unit 43 is configured to calculate the edit distance between the first pinyin
sequence and each second pinyin sequence. The second determination unit 44 is configured to determine
the aligned first pinyin sequence and each aligned second pinyin sequence according to the
corresponding edit distance. The third determination unit 45 is configured to determine the first
text corresponding to the aligned first pinyin sequence and each second text corresponding to each
aligned second pinyin sequence. The second calculation unit 46 is configured to calculate, in
response to the number of identical characters between the first text and each second text being
greater than the first threshold, the similarity between the aligned first pinyin sequence and each
aligned second pinyin sequence. The processing unit 47 is configured to replace, in response to the
maximum similarity among the similarities being greater than the second threshold, the speech
recognition result with the text of the second pinyin sequence corresponding to the maximum
similarity.
The embodiment of the present invention proposes a speech recognition result error correction device
that phonetically annotates the speech recognition result to determine the corresponding first pinyin
sequence, determines the second pinyin sequences of multiple candidates, calculates the edit distance
between the first pinyin sequence and each second pinyin sequence, determines the aligned first
pinyin sequence and each aligned second pinyin sequence according to the corresponding edit distance,
and determines the first text corresponding to the aligned first pinyin sequence and each second text
corresponding to each aligned second pinyin sequence. In response to the number of identical
characters between the first text and each second text being greater than the first threshold, the
device calculates the similarity between the aligned first pinyin sequence and each aligned second
pinyin sequence. In response to the maximum similarity among the similarities being greater than the
second threshold, the device replaces the speech recognition result with the text of the second
pinyin sequence corresponding to the maximum similarity. This realizes automatic error correction of
the speech recognition result and improves the accuracy of speech recognition.
Fig. 5 is a schematic diagram of the electronic device of an embodiment of the present invention. The
electronic device shown in Fig. 5 is a general-purpose data processing apparatus with a general
computer hardware structure that includes at least a processor 51 and a memory 52. The processor 51
and the memory 52 are connected by a bus 53. The memory 52 is adapted to store instructions or
programs executable by the processor 51. The processor 51 may be a single microprocessor or a set of
one or more microprocessors. The processor 51 thus executes the instructions stored in the memory 52
to carry out the method flow of the embodiments of the present invention described above, thereby
processing data and controlling other devices. The bus 53 connects the above components together and
also connects them to a display controller 54, a display device, and input/output (I/O) devices 55.
The input/output (I/O) devices 55 may be a mouse, a keyboard, a modem, a network interface, a touch
input device, a motion-sensing input device, a printer, or other devices known in the art. Typically,
the input/output (I/O) devices 55 are connected to the system through an input/output (I/O)
controller 56.
The memory 52 may store software components such as an operating system, a communication module, an
interaction module, and application programs. Each of the modules and application programs described
above corresponds to a set of executable program instructions that accomplishes one or more functions
and the methods described in the embodiments of the invention.
Aspects of the present invention are described above with reference to flowcharts and/or block
diagrams of methods, devices (systems), and computer program products according to embodiments of the
present invention. It should be understood that each block of the flowcharts and/or block diagrams,
and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented
by computer program instructions. These computer program instructions may be provided to a processor
of a general-purpose computer, a special-purpose computer, or other programmable data processing
apparatus to produce a machine, such that the instructions (executed via the processor of the
computer or other programmable data processing apparatus) create means for implementing the
functions/acts specified in the flowchart and/or block diagram block or blocks.
Meanwhile as skilled in the art will be aware of, the various aspects of the embodiment of the present invention may be implemented as be
System, method or computer program product.Therefore, the various aspects of the embodiment of the present invention can take following form: complete hardware
Embodiment, complete software embodiment (including firmware, resident software, microcode etc.) usually can all claim herein
For the embodiment for combining software aspects with hardware aspect of circuit, " module " or " system ".In addition, side of the invention
Face can take following form: the computer program product realized in one or more computer-readable medium, computer can
Reading medium has the computer readable program code realized on it.
Any combination of one or more computer-readable media may be used. A computer-readable medium may be
a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage
medium may be, for example (but not limited to), an electronic, magnetic, optical, electromagnetic,
infrared, or semiconductor system, apparatus, or device, or any suitable combination of the
foregoing. More specific examples (a non-exhaustive list) of computer-readable storage media include
the following: an electrical connection having one or more wires, a portable computer diskette, a
hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory
(EPROM or flash memory), an optical fiber, portable compact disc read-only memory (CD-ROM), an
optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In
the context of the embodiments of the present invention, a computer-readable storage medium may be
any tangible medium that can contain or store a program for use by or in connection with an
instruction execution system, apparatus, or device.
A computer-readable signal medium may include a propagated data signal with computer-readable program
code embodied therein, for example in baseband or as part of a carrier wave. Such a propagated signal
may take any of a variety of forms, including but not limited to electromagnetic, optical, or any
suitable combination thereof. A computer-readable signal medium may be any computer-readable medium
that is not a computer-readable storage medium and that can communicate, propagate, or transport a
program for use by or in connection with an instruction execution system, apparatus, or device.
Computer program code for carrying out operations for aspects of the present invention may be written
in any combination of one or more programming languages, including object-oriented programming
languages such as Java, Smalltalk, C++, PHP, and Python, and conventional procedural programming
languages such as the "C" programming language or similar programming languages. The program code may
execute entirely on the user's computer, partly on the user's computer as a stand-alone software
package, partly on the user's computer and partly on a remote computer, or entirely on a remote
computer or server. In the latter case, the remote computer may be connected to the user's computer
through any type of network, including a local area network (LAN) or a wide area network (WAN), or
the connection may be made to an external computer (for example, through the Internet using an
Internet service provider).
The above description is only a preferred embodiment of the present invention and is not intended to
limit the invention. For those skilled in the art, the invention may be subject to various
modifications and changes. Any modification, equivalent replacement, improvement, and the like made
within the spirit and principles of the present invention shall be included in the protection scope
of the present invention.