CN105469801A

CN105469801A - Input speech restoring method and device

Info

Publication number: CN105469801A
Application number: CN201410462543.5A
Authority: CN
Inventors: 陈紫微
Original assignee: Alibaba Group Holding Ltd
Current assignee: Alibaba China Network Technology Co Ltd
Priority date: 2014-09-11
Filing date: 2014-09-11
Publication date: 2016-04-06
Anticipated expiration: 2034-09-11
Also published as: CN105469801B

Abstract

The invention provides an input speech restoring method and device. In the method, speech fields in received input speech are recognized according to a preset speech recognition base, and it is determined whether the input speech contains speech fields to be restored; if it does, corrective speech fields matching the speech fields to be restored are obtained from the preset speech recognition base; and the corrective speech fields replace the speech fields to be restored in the input speech to obtain the corrected input speech. The method and device restore the input speech and ensure the completeness of the input speech.

Description

A method and device for repairing input voice

Technical field

The present application relates to the technical field of voice input, and in particular to a method and device for repairing input voice.

Background art

With the development of Internet technology, voice technology has become widely used as a convenient and direct means of communication. For example, users can carry out instant messaging by voice or publish voice messages (such as voice microblogs). In voice instant messaging, a user inputs voice through a terminal device, and the voice data is transmitted over the Internet to achieve instant communication.

Therefore, in voice instant messaging, the integrity of the voice that the user inputs through the terminal device strongly affects the quality of the communication. With common voice instant messaging applications, such as WeChat, Laiwang and Yixin, one or more fields at the end of the voice are easily lost when the user inputs voice through the terminal device. In addition, owing to environmental noise, other fields of the voice input through the terminal device are also easily lost. Such missing fields impair the information integrity of the output voice, make the meaning of the whole sentence unclear, and degrade voice instant messaging. The user usually has to input the voice again, and this repeated operation harms the user experience and costs additional time.

Therefore, how to repair input voice and ensure its integrity has become a technical problem that urgently needs to be solved.

Summary of the invention

In view of this, the present application provides a method and device for repairing input voice, which repair the input voice and ensure its integrity.

The present application provides a method for repairing input voice, comprising:

identifying the voice fields in received input voice according to a preset speech recognition library, and determining whether a voice field to be repaired exists in the input voice;

if a voice field to be repaired exists in the input voice, obtaining, from the preset speech recognition library, a correction voice field that matches the voice field to be repaired; and

replacing the voice field to be repaired in the input voice with the correction voice field to obtain the repaired input voice.

The present application also provides a device for repairing input voice, comprising:

a retrieval module, configured to identify the voice fields in received input voice according to a preset speech recognition library and to determine whether a voice field to be repaired exists in the input voice;

a repair module, configured to obtain, when a voice field to be repaired exists in the input voice, a correction voice field that matches the voice field to be repaired from the preset speech recognition library; and

a replacement module, configured to replace the voice field to be repaired in the input voice with the correction voice field to obtain the repaired input voice.

As can be seen from the above technical solution, the present application identifies the voice fields in the received input voice to determine whether a voice field to be repaired exists in the input voice. According to the speech recognition library, the application obtains a correction voice field that matches the voice field to be repaired. The correction voice field replaces the voice field to be repaired in the input voice, yielding the repaired input voice. The application thus identifies and repairs the input voice, ensures its integrity, and improves the user experience.

Brief description of the drawings

Fig. 1 is a schematic diagram of the communication between the server and the terminal device of the present application;

Fig. 2 is a flowchart of the method for repairing input voice of the present application;

Fig. 3 is a structural diagram of the device for repairing input voice of the present application;

Fig. 4 is a structural diagram of one embodiment of the present application;

Fig. 5 is a structural diagram of the replacement module in the device for repairing input voice of the present application;

Fig. 6 is a schematic diagram of the user interface of the terminal device of the present application.

Detailed description of the embodiments

The present application identifies the voice fields in the received input voice to determine whether a voice field to be repaired exists in the input voice. According to the speech recognition library, the application obtains a correction voice field that matches the voice field to be repaired. The correction voice field replaces the voice field to be repaired in the input voice, yielding the repaired input voice. The application thus identifies and repairs the input voice, ensures its integrity, and improves the user experience.

The specific implementation of the present application is further described below with reference to the accompanying drawings.

Referring to Fig. 1, the present application provides a method for repairing input voice, which is applied to a server 11 that performs audio parsing. A user inputs voice through a terminal device 12, which is connected to the server 11 through a network (wired, wireless, or a combination of the two). The terminal device 12 is typically a mobile phone, a tablet computer, a smart wearable device, a PC, or the like. The terminal device 12 sends the user's input voice to the server 11 over the network, and the server 11 performs the method for repairing input voice provided by the present application to repair the user's input voice. The server 11 sends the repaired input voice to the terminal device 12, and the user chooses, through the terminal device 12, whether to send the original input voice or the repaired input voice for communication. In other embodiments, the server 11 may also send the repaired input voice directly to another terminal device.

Specifically, the terminal device 12 may use APP (application) software to send the user's input voice to the server 11 and to receive the repaired input voice sent by the server 11. Through the interface provided by the APP software, the user chooses whether to send the original input voice or the repaired input voice for communication.

Referring to Fig. 2, the method 2 of the present application comprises:

S1. Identify the voice fields in the received input voice according to a preset speech recognition library, and determine whether a voice field to be repaired exists in the input voice.

In one specific implementation of the present application, the server 11 receives the input voice sent by the user through the terminal device 12. Before identifying the voice fields in the received input voice, the server 11 first performs front-end processing on the original input voice to partially eliminate noise and the effects of different speakers, so that the processed signal better reflects the essential characteristics of the speech. The most common front-end processing steps are endpoint detection and speech enhancement. Endpoint detection distinguishes the speech periods from the non-speech periods in the voice signal and accurately determines the start point of the speech. After endpoint detection, subsequent processing can be applied to the speech signal only. Speech enhancement eliminates the influence of environmental noise on the speech.
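The endpoint detection step described above can be illustrated with a minimal sketch. The text does not say which detection technique is used; short-time energy thresholding is substituted here as a common, simple choice. The function name `detect_endpoints`, the frame length and the energy threshold are all assumptions made for illustration.

```python
def detect_endpoints(samples, frame_len=4, energy_threshold=0.01):
    """Return (start, end) sample indices of the speech region, or None.

    A frame counts as speech when its mean squared amplitude exceeds
    energy_threshold. Both parameters are hypothetical tuning values.
    """
    speech_starts = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / len(frame)
        if energy > energy_threshold:
            speech_starts.append(start)
    if not speech_starts:
        return None  # the whole signal is treated as non-speech
    first, last = speech_starts[0], speech_starts[-1]
    return first, min(last + frame_len, len(samples))


# Silence, then two frames of speech-like samples, then silence again.
signal = [0.0] * 8 + [0.5, -0.4, 0.6, -0.5] * 2 + [0.0] * 8
print(detect_endpoints(signal))
```

Only the samples between the returned indices would then be passed on to field identification, which matches the statement that subsequent processing is applied to the speech signal only.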

After front-end processing, each voice field in the processed input voice is identified according to the preset speech recognition library to determine whether a voice field to be repaired exists in the input voice. Because one or more fields are easily lost at the end of the input voice, the application may also identify only the voice fields at the end of the input voice to determine whether a voice field to be repaired exists in the input voice.

In another specific implementation of the present application, after front-end processing the input voice is split into at least one sentence according to a preset splitting rule. Specifically, the splitting rule comprises at least one of speech rate, interval, and key voice fields. For example, when a key voice field such as "but", "then" or "and" appears in a segment of input voice, the input voice is split at that field. Alternatively, when an interval longer than an interval threshold appears in a segment of input voice, the input voice is split there. Alternatively, when clearly different speech rates appear in a segment of input voice, the input voice is split there. Of course, the input voice may also be split according to any two or all three of the above splitting rules at the same time.

For example, the user's input voice is "I first go shopping today // then go to the park // finally go to eat a meal // very tired // get along well you * #", where // denotes an interval of 2 ms and the interval threshold is 1 ms. The preset splitting rules are interval and key voice fields. Splitting yields 5 sentences: "I first go shopping today", "then go to the park", "finally go to eat a meal", "very tired", and "get along well you * #", where * and # are indistinct pronunciations.
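The interval rule and the key-voice-field rule from this example can be sketched as follows. This is an illustration under assumptions, not the application's implementation: the input is modelled as a list of (field, pause before the field in ms) pairs, and the names `KEY_FIELDS`, `INTERVAL_THRESHOLD_MS` and `split_input` are invented here. The speech-rate rule is omitted.

```python
KEY_FIELDS = {"but", "then", "and"}  # key voice fields (assumed set)
INTERVAL_THRESHOLD_MS = 1            # interval threshold from the example


def split_input(fields):
    """Split a list of (text, pause_before_ms) pairs into sentences.

    A new sentence starts when the pause before a field exceeds the
    threshold, or when the field is a key voice field.
    """
    sentences, current = [], []
    for text, pause_ms in fields:
        boundary = pause_ms > INTERVAL_THRESHOLD_MS or text in KEY_FIELDS
        if boundary and current:
            sentences.append(current)
            current = []
        current.append(text)
    if current:
        sentences.append(current)
    return sentences


fields = [("I", 0), ("first", 0), ("go", 0), ("shopping", 0),
          ("then", 2), ("go", 0), ("park", 0),
          ("very", 2), ("tired", 0)]
print(split_input(fields))
```

Each resulting sentence can then be processed independently, which is what allows recognition to run on short segments rather than on the whole input.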

After the input voice is split, field cutting is performed on each resulting sentence, cutting each sentence into multiple voice fields. Specifically, a voice field is a character, a word, or a morpheme. For example, "get along well you * #" is cut into "no", "with", "you", "*", "#".

By splitting the input voice and performing speech recognition on each resulting sentence separately, the application greatly reduces the computation required by the speech recognition algorithms, so that the server 11 uses less memory and fewer CPU resources.

In addition, specifically, the speech recognition library pre-stored in the server 11 can yield a correction voice field for a voice field to be repaired through the repair model of a recognition algorithm. The speech recognition library uses the stored voice fields to be repaired as a search index; if a queried voice field hits a voice field stored in the search index, the queried voice field is a voice field to be repaired. The speech recognition library is built by manual or machine collection according to the pronunciation characteristics of different languages, for example the MIT Media Lab Speech Dataset, Pitch and Voicing Estimates for Aurora 2, Congressional speech data, Mandarin Speech Frame Data, and the like.

It should be noted that the voice fields to be repaired and the correction voice fields in the speech recognition library are not in one-to-one correspondence: different voice fields to be repaired may correspond to the same correction voice field under different recognition algorithms, and the same voice field to be repaired may correspond to several different correction voice fields under different recognition algorithms. In speech recognition, different recognition algorithms are usually used together to improve recognition accuracy; common speech recognition algorithms include the HMM speech recognition model and algorithm, the BMM speech recognition model and algorithm, and the like.
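One way to picture a library with a search index and a many-to-many relation between fields to be repaired and correction fields is sketched below. The layout is an assumption: each recognition algorithm is reduced to a plain lookup table, the names `TABLES`, `SEARCH_INDEX`, `is_to_be_repaired` and `corrections_for` are invented, and the field "~" is a made-up entry added only to show two different fields sharing one correction.

```python
# Hypothetical per-algorithm repair tables: field to be repaired -> correction.
TABLES = {
    "algorithm_a": {"*": "say", "~": "say"},
    "algorithm_b": {"*": "institute"},
}

# The search index is the set of all stored fields to be repaired.
SEARCH_INDEX = set()
for table in TABLES.values():
    SEARCH_INDEX.update(table)


def is_to_be_repaired(field):
    """A queried field that hits the search index needs repair."""
    return field in SEARCH_INDEX


def corrections_for(field):
    """Return {algorithm name: correction} for one field to be repaired."""
    return {name: table[field] for name, table in TABLES.items() if field in table}


print(is_to_be_repaired("*"), is_to_be_repaired("you"))
print(corrections_for("*"))
```

Here "*" maps to two corrections (one per algorithm), while "*" and "~" both map to "say" under the first algorithm, so the relation is neither one-to-one nor many-to-one.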

For example, the speech recognition library stores the voice field to be repaired "*", for which a recognition algorithm yields the corresponding correction voice field "say"; the library also stores the voice field to be repaired "#", for which a recognition algorithm yields the corresponding correction voice field " ". Of course, another recognition algorithm may yield a different correction voice field "institute" for the voice field to be repaired "*", and another recognition algorithm may yield a different correction voice field " " for the voice field to be repaired "#". The speech recognition library uses the voice fields to be repaired "*" and "#" as the search index.

After "get along well you * #" is cut into "no", "with", "you", "*", "#", these fields are identified against the speech recognition library. Among the queried voice fields "no", "with", "you", "*", "#", the fields "*" and "#" hit the voice fields "*" and "#" in the search index, so the queried voice fields "*" and "#" are voice fields to be repaired.

S2. If a voice field to be repaired exists in the input voice, obtain, from the preset speech recognition library, a correction voice field that matches the voice field to be repaired.

In one specific implementation of the present application, if querying the search index of the speech recognition library shows that a voice field of the input voice hits the search index, that voice field is a voice field to be repaired. Different recognition algorithms are then used to identify the voice field to be repaired against the speech recognition library, obtaining the correction voice field corresponding to it.

For example, among the queried voice fields "no", "with", "you", "*", "#" of the input voice "get along well you * #", the voice fields "*" and "#" are to be repaired; the correction voice fields "say" and " ", corresponding to "*" and "#" respectively, are then identified from the speech recognition library.
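Steps S1 and S2 together can be sketched as a single pass over the cut fields. This is a simplified illustration: recognition algorithms are again stood in for by lookup tables, `plan_repairs` and its parameters are hypothetical names, and the tables deliberately contain no entry for "#" so that the case of a field with no available correction is visible.

```python
def plan_repairs(fields, search_index, recognizers):
    """Map each position to be repaired to its candidate corrections.

    fields:       voice fields of one sentence, in order.
    search_index: set of stored fields to be repaired.
    recognizers:  list of dicts (field to be repaired -> correction),
                  each standing in for one recognition algorithm.
    """
    plan = {}
    for position, field in enumerate(fields):
        if field not in search_index:
            continue  # field does not hit the index, nothing to repair
        candidates = []
        for table in recognizers:
            correction = table.get(field)
            if correction is not None and correction not in candidates:
                candidates.append(correction)
        plan[position] = candidates  # may be empty: no correction found
    return plan


fields = ["no", "with", "you", "*", "#"]
plan = plan_repairs(fields, {"*", "#"}, [{"*": "say"}, {"*": "institute"}])
print(plan)
```

An empty candidate list for a position signals that no algorithm produced a correction, which is the situation in which the repair is either abandoned or handed to a fallback.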

It may happen that none of the speech recognition algorithms can obtain a correction voice field for the voice field to be repaired from the preset speech recognition library. In that case, although the input voice contains a voice field to be repaired, no matching correction voice field can be obtained from the speech recognition library by the speech recognition algorithms, and the repair of the input voice may be abandoned.

In another specific implementation of the present application, none of the speech recognition algorithms can obtain a correction voice field for the voice field to be repaired from the preset speech recognition library. In that case, although the input voice contains a voice field to be repaired, no matching correction voice field can be obtained from the speech recognition library by the speech recognition algorithms, and a voice field is instead selected as the correction voice field according to a preset fuzzy-sound preferred-word filling algorithm. The fuzzy-sound preferred-word filling algorithm uses an existing fuzzy control table lookup or a fuzzy calculation formula. Its principle is to find, in the speech recognition library, a voice field close to the queried voice field to be repaired, and to use the correction voice field corresponding to that close voice field as the correction voice field for the queried voice field to be repaired.

For example, none of the speech recognition algorithms can obtain correction voice fields for the voice fields to be repaired "*" and "#" from the preset speech recognition library. Although the input voice contains the voice fields to be repaired "*" and "#", no matching correction voice fields can be obtained from the speech recognition library by the speech recognition algorithms. The fuzzy-sound preferred-word filling algorithm is therefore used to find, in the speech recognition library, the voice fields "*'" and "#'" that are close to the queried voice fields to be repaired "*" and "#", and their correction voice fields serve as the correction voice fields for "*" and "#". The speech recognition library stores "say" and " " as the correction voice fields corresponding to "*'" and "#'", so "say" and " " are selected as the correction voice fields for the voice fields to be repaired "*" and "#".
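The fallback principle (borrow the correction of the closest stored field) can be sketched with a nearest-neighbour search. This is not the fuzzy control table or fuzzy calculation formula mentioned in the text; Euclidean distance over made-up two-dimensional feature vectors is substituted purely for illustration. The entry "~'", all feature values, `max_distance` and the name `fuzzy_correction` are assumptions.

```python
import math

# Hypothetical library entries: stored field -> (feature vector, correction).
LIBRARY = {
    "*'": ((0.9, 0.1), "say"),
    "~'": ((0.1, 0.8), "then"),
}


def fuzzy_correction(features, max_distance=0.5):
    """Return the correction of the stored field closest to `features`.

    Returns None when even the closest stored field is farther away
    than max_distance, i.e. nothing in the library is close enough.
    """
    closest = min(LIBRARY.values(), key=lambda entry: math.dist(features, entry[0]))
    if math.dist(features, closest[0]) > max_distance:
        return None
    return closest[1]


print(fuzzy_correction((0.8, 0.2)))  # features of an unmatched field such as "*"
```

The distance cap is a design choice added here so that a field far from every stored entry is left unrepaired instead of receiving an arbitrary correction.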

In yet another specific implementation of the present application, the speech recognition algorithms obtain, from the preset speech recognition library, at least two correction voice fields corresponding to the voice field to be repaired.

For example, the input voice "get along well you * #" contains the voice fields to be repaired "*" and "#". Different speech recognition algorithms then obtain, from the preset speech recognition library, the correction voice fields "say" and " " corresponding to "*" and "#", as well as the correction voice fields "institute" and " " corresponding to "*" and "#".

S3. Replace the voice field to be repaired in the input voice with the correction voice field to obtain the repaired input voice.

In one specific implementation of the present application, if the speech recognition algorithms obtain a single correction voice field for the voice field to be repaired from the preset speech recognition library, that correction voice field replaces the voice field to be repaired in the input voice, yielding the repaired input voice.

For example, in the input voice "get along well you * #", the correction voice fields corresponding to the voice fields to be repaired "*" and "#" are "say" and " ", so "*" and "#" are replaced with "say" and " ". The input voice "get along well you * #" is repaired into "getting along well, you have said".

In another specific implementation of the present application, if the speech recognition algorithms obtain at least two correction voice fields for the voice field to be repaired from the preset speech recognition library, each correction voice field in turn replaces the voice field to be repaired in the input voice, yielding multiple repaired input voices. A sentence fluency assessment is performed on each repaired input voice, and the final repaired input voice is determined from the results of the assessment.

The sentence fluency assessment applies rules preset according to the language features of voice input, such as the features of sentence-ending words, of adversatives, and of conjunctions.

For example, in the input voice "get along well you * #", the correction voice fields corresponding to the voice fields to be repaired "*" and "#" are "say" and " ", or "institute" and " ". The voice fields to be repaired "*" and "#" are replaced with the corresponding correction voice fields "say" and " ", or "institute" and " ", so the input voice "get along well you * #" is repaired into "getting along well, you have said" and "get along well you place". The sentence fluency assessment is performed on "getting along well, you have said" and "get along well you place", and "getting along well, you have said" is obtained as the final repaired input voice.
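Generating one repaired sentence per set of corrections and ranking the results by fluency can be sketched as below. The scoring rule is a toy stand-in for the preset rules: it only checks whether the last field is a typical sentence-ending field. For brevity the sentence has a single field to be repaired. The names `apply_corrections`, `fluency_score`, `rank_repairs`, `typical_endings` and `top_n` are assumptions.

```python
def apply_corrections(fields, corrections):
    """Return a copy of `fields` with {position: correction} applied."""
    repaired = list(fields)
    for position, correction in corrections.items():
        repaired[position] = correction
    return repaired


def fluency_score(sentence, typical_endings):
    """Toy rule: 1 if the sentence ends with a typical ending field, else 0."""
    return 1 if sentence and sentence[-1] in typical_endings else 0


def rank_repairs(fields, correction_sets, typical_endings, top_n=1):
    """Build one candidate per correction set and keep the top_n by score."""
    candidates = [apply_corrections(fields, c) for c in correction_sets]
    candidates.sort(key=lambda s: fluency_score(s, typical_endings), reverse=True)
    return candidates[:top_n]


fields = ["no", "with", "you", "*"]
correction_sets = [{3: "institute"}, {3: "say"}]  # one set per algorithm
best = rank_repairs(fields, correction_sets, typical_endings={"say"})
print(best)
```

Calling `rank_repairs` with a larger `top_n` returns several ranked candidates instead of a single final result.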

The server 11 sends the final repaired input voice to the terminal device 12, and the user chooses through the terminal device 12 whether to send the original input voice or the repaired input voice for communication. Of course, the server 11 may also select, from all the repaired input voices, the several (for example, three) that rank highest in the sentence fluency assessment and send them to the terminal device 12, and the user chooses to send either the original input voice or any one of the repaired input voices for communication. Specifically, the server 11 also sends the repaired voice fields to the user for reference when choosing between the original input voice and the repaired input voice.

The present application identifies the voice fields in the received input voice to determine whether a voice field to be repaired exists in the input voice, obtains a correction voice field that matches the voice field to be repaired, and replaces the voice field to be repaired in the input voice with the correction voice field to obtain the repaired input voice. The application thus identifies and repairs the input voice, ensures its integrity, and improves the user experience.

Corresponding to the above method, the present application also provides a device for repairing input voice, which is applied to the server 11 that performs audio parsing. The server 11 generally includes a CPU, an input/output module, a memory and other hardware modules. Referring to Fig. 3, the device 3 of the present application logically comprises:

a retrieval module 31, configured to identify the voice fields in received input voice according to a preset speech recognition library and to determine whether a voice field to be repaired exists in the input voice;

a repair module 32, configured to obtain, when a voice field to be repaired exists in the input voice, a correction voice field that matches the voice field to be repaired from the preset speech recognition library; and

a replacement module 33, configured to replace the voice field to be repaired in the input voice with the correction voice field to obtain the repaired input voice.
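The logical division into three modules can be sketched as three small classes wired together. This is a structural illustration only: the class and method names are invented, the speech recognition library is reduced to a set and a dictionary, and returning the input unchanged when a correction is missing is an assumption that mirrors the option of abandoning the repair.

```python
class RetrievalModule:
    """Module 31: finds the positions of voice fields to be repaired."""

    def __init__(self, repair_index):
        self.repair_index = set(repair_index)

    def find(self, fields):
        return [i for i, field in enumerate(fields) if field in self.repair_index]


class RepairModule:
    """Module 32: looks up a correction voice field for one field."""

    def __init__(self, corrections):
        self.corrections = dict(corrections)

    def correct(self, field):
        return self.corrections.get(field)  # None when no match exists


class ReplacementModule:
    """Module 33: substitutes corrections into a copy of the input."""

    def replace(self, fields, fixes):
        repaired = list(fields)
        for index, correction in fixes.items():
            repaired[index] = correction
        return repaired


class RepairDevice:
    """Device 3: chains retrieval, repair and replacement."""

    def __init__(self, repair_index, corrections):
        self.retrieval = RetrievalModule(repair_index)
        self.repair_module = RepairModule(corrections)
        self.replacement = ReplacementModule()

    def repair(self, fields):
        hits = self.retrieval.find(fields)
        fixes = {i: self.repair_module.correct(fields[i]) for i in hits}
        if any(c is None for c in fixes.values()):
            return list(fields)  # a correction is missing: abandon the repair
        return self.replacement.replace(fields, fixes)


device = RepairDevice(repair_index={"*", "#"}, corrections={"*": "say"})
print(device.repair(["no", "with", "you", "*"]))
```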

The retrieval module 31 determines whether a voice field to be repaired exists in the input voice, the repair module 32 obtains a correction voice field that matches the voice field to be repaired, and the replacement module 33 replaces the voice field to be repaired in the input voice with the correction voice field to obtain the repaired input voice. The application thus identifies and repairs the input voice, ensures its integrity, and improves the user experience.

To obtain a better speech recognition result, referring to Fig. 4, the device of the present application further comprises a processing module 34, configured to perform front-end processing on the original input voice before the voice fields in the received input voice are identified, partially eliminating noise and the effects of different speakers so that the processed signal better reflects the essential characteristics of the speech. The most common front-end processing steps are endpoint detection and speech enhancement. Endpoint detection distinguishes the speech periods from the non-speech periods in the voice signal and accurately determines the start point of the speech. After endpoint detection, subsequent processing can be applied to the speech signal only. Speech enhancement eliminates the influence of environmental noise on the speech.

After front-end processing, the retrieval module 31 identifies each voice field in the processed input voice according to the preset speech recognition library to determine whether a voice field to be repaired exists in the input voice. Because one or more fields are easily lost at the end of the input voice, the retrieval module 31 may also identify only the voice fields at the end of the input voice to determine whether a voice field to be repaired exists in the input voice.

The retrieval module 31 splits the front-end-processed input voice into at least one sentence according to a preset splitting rule. Specifically, the splitting rule comprises at least one of speech rate, interval, and key voice fields. For example, when a key voice field such as "but", "then" or "and" appears in a segment of input voice, the input voice is split at that field. Alternatively, when an interval longer than an interval threshold appears in a segment of input voice, the input voice is split there. Alternatively, when clearly different speech rates appear in a segment of input voice, the input voice is split there. Of course, the input voice may also be split according to any two or all three of the above splitting rules at the same time.

After splitting the input voice, the retrieval module 31 performs field cutting on each resulting sentence, cutting each sentence into multiple voice fields. Specifically, a voice field is a character, a word, or a morpheme.

For example, the user's input voice is "I first go shopping today // then go to the park // finally go to eat a meal // very tired // get along well you * #", where // denotes an interval of 2 ms and the interval threshold is 1 ms. The preset splitting rules are interval and key voice fields. The retrieval module 31 splits the input voice into 5 sentences: "I first go shopping today", "then go to the park", "finally go to eat a meal", "very tired", and "get along well you * #", where * and # are indistinct pronunciations. The retrieval module 31 then performs field cutting on each of the 5 sentences; for example, "get along well you * #" is cut into "no", "with", "you", "*", "#".

By splitting the input voice and performing speech recognition on each resulting sentence separately, the retrieval module 31 greatly reduces the computation required by the speech recognition algorithms, so that the server 11 uses less memory and fewer CPU resources.

In one specific implementation of the present application, if the retrieval module 31 determines that a voice field to be repaired exists, the repair module 32 uses different recognition algorithms to identify the voice field to be repaired against the speech recognition library, obtaining the correction voice field corresponding to it. The replacement module 33 replaces the voice field to be repaired in the input voice with this correction voice field, yielding the repaired input voice.

For example, among the queried voice fields "no", "with", "you", "*", "#" of the input voice "get along well you * #", the voice fields "*" and "#" are to be repaired; the correction voice fields "say" and " ", corresponding to "*" and "#" respectively, are then identified from the speech recognition library. The voice fields to be repaired "*" and "#" are replaced with the corresponding correction voice fields "say" and " ". The input voice "get along well you * #" is repaired into "getting along well, you have said".

It may happen that the repair module 32 cannot obtain a correction voice field for the voice field to be repaired from the preset speech recognition library with any of the speech recognition algorithms. In that case, although the input voice contains a voice field to be repaired, no matching correction voice field can be obtained from the speech recognition library by the speech recognition algorithms, and the repair of the input voice may be abandoned.

In another specific implementation of the present application, the retrieval module 31 determines that a voice field to be repaired exists, but the repair module 32 cannot obtain a correction voice field for it from the preset speech recognition library with any of the speech recognition algorithms. In that case, although the input voice contains a voice field to be repaired, no matching correction voice field can be obtained from the speech recognition library by the speech recognition algorithms, and a voice field is instead selected as the correction voice field according to a preset fuzzy-sound preferred-word filling algorithm. The fuzzy-sound preferred-word filling algorithm uses an existing fuzzy control table lookup or a fuzzy calculation formula. Its principle is to find, in the speech recognition library, a voice field close to the queried voice field to be repaired, and to use the correction voice field corresponding to that close voice field as the correction voice field for the queried voice field to be repaired.

In yet another specific implementation of the present application, if the retrieval module 31 determines that a voice field to be repaired exists, the repair module 32 uses the speech recognition algorithms to obtain, from the preset speech recognition library, at least two correction voice fields corresponding to the voice field to be repaired.

Referring to Fig. 5, the replacement module 33 comprises a repair replacement unit 331 and a fluency assessment unit 332. The repair replacement unit 331 replaces the voice field to be repaired in the input voice with each correction voice field in turn, yielding multiple repaired input voices. The fluency assessment unit 332 performs a sentence fluency assessment on each repaired input voice and determines the final repaired input voice from the results of the assessment.

For example, the input voice "get along well you * #" contains the voice fields to be repaired "*" and "#". Different speech recognition algorithms then obtain, from the preset speech recognition library, the correction voice fields "say" and " " corresponding to "*" and "#", as well as the correction voice fields "institute" and " " corresponding to "*" and "#". The voice fields to be repaired "*" and "#" are replaced with the corresponding correction voice fields "say" and " ", or "institute" and " ", so the input voice "get along well you * #" is repaired into "getting along well, you have said" and "get along well you place". The sentence fluency assessment is performed on "getting along well, you have said" and "get along well you place", and "getting along well, you have said" is obtained as the final repaired input voice.

The application's principle is further illustrated below with a specific implementation.

The user inputs the voice "I first go shopping // not with you * #" through the terminal device 12. Endpoint detection and speech enhancement are first performed on the input voice. The processed input voice is then split into two sentences, "I first go shopping" and "not with you * #", where * and # are fuzzy pronunciations. Field segmentation is performed on each of the two sentences: "I first go shopping" is cut into "I", "first", "go", "shopping", and "not with you * #" is cut into "not", "with", "you", "*", "#".

Each of the voice fields "I", "first", "go", "shopping" and "not", "with", "you", "*", "#" is identified against the speech recognition library preset in the application. The fields "*" and "#" hit the search index of the speech recognition library and are therefore confirmed as voice fields to be repaired. Since the voice fields to be repaired "*", "#" exist, different speech recognition algorithms are used to obtain, from the preset speech recognition library, the correction voice fields "said", "" and the correction voice fields "place", "" corresponding to the voice fields to be repaired "*", "#". The voice fields to be repaired "*", "#" are replaced with the corresponding correction voice fields "said", "" or "place", "", so that the input voice "not with you * #" is repaired into "not with you said" and "not with you place". A sentence fluency assessment is performed on "not with you said" and "not with you place", and "not with you said" is obtained as the final repaired input voice.

The server 11 sends the final repaired input voice to the terminal device 12. Referring to Fig. 6, the terminal device 12 receives the repaired input voice and the repaired voice fields, and the user, with reference to the repaired voice fields, chooses whether to send the original input voice or the repaired input voice for communication.
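The repair flow of this example can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the implementation of the application: the fuzzy index `FUZZY_INDEX`, the two candidate tables standing in for "different speech recognition algorithms", and the bigram set used as a toy fluency score are all hypothetical, and the fields are plain text glosses ("not", "with", "you", "*", "#") rather than audio.

```python
from typing import Dict, List

# Hypothetical search index of fuzzy pronunciations in the preset library.
FUZZY_INDEX = {"*", "#"}

# Hypothetical outputs of two different recognition algorithms: each table
# maps a fuzzy field to a correction field ("" means no visible gloss).
CANDIDATE_TABLES: List[Dict[str, str]] = [
    {"*": "said", "#": ""},
    {"*": "place", "#": ""},
]

# Toy stand-in for a sentence fluency model: adjacent pairs deemed fluent.
FLUENT_BIGRAMS = {("with", "you"), ("you", "said")}


def find_fields_to_repair(fields: List[str]) -> List[int]:
    """Return the positions of fields that hit the fuzzy index."""
    return [i for i, field in enumerate(fields) if field in FUZZY_INDEX]


def apply_corrections(fields: List[str], table: Dict[str, str]) -> List[str]:
    """Replace each fuzzy field with its correction and drop empty glosses."""
    repaired = [table.get(f, f) if f in FUZZY_INDEX else f for f in fields]
    return [f for f in repaired if f]


def fluency_score(fields: List[str]) -> int:
    """Count adjacent field pairs that appear in the toy bigram set."""
    return sum((a, b) in FLUENT_BIGRAMS for a, b in zip(fields, fields[1:]))


def repair_sentence(fields: List[str]) -> List[str]:
    """Repair one sentence; sentences without fuzzy fields are returned as-is."""
    if not find_fields_to_repair(fields):
        return fields
    candidates = [apply_corrections(fields, t) for t in CANDIDATE_TABLES]
    return max(candidates, key=fluency_score)


sentences = [["I", "first", "go", "shopping"], ["not", "with", "you", "*", "#"]]
repaired = [" ".join(repair_sentence(s)) for s in sentences]
```

With these assumed tables, the first sentence passes through unchanged and the second is repaired with the candidate that scores higher on the toy fluency measure. A real system would replace the bigram set with an actual language model and operate on recognized speech segments.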

The application identifies and repairs the input voice, which ensures the integrity of the input voice and improves the user experience.

The foregoing is only the preferred embodiment of the application and is not intended to limit the application. Any modification, equivalent replacement, or improvement made within the spirit and principle of the application shall fall within the scope of protection of the application.

Claims

1. A method for repairing input voice, characterized by comprising:

2. The method according to claim 1, characterized in that identifying the voice fields in the received input voice according to the preset speech recognition library comprises:

splitting the received input voice into at least one sentence according to a preset splitting rule;

identifying the voice fields in each sentence respectively according to the preset speech recognition library.

3. The method according to claim 2, characterized in that the splitting rule comprises at least one of speech rate, interval, and key voice fields.

4. The method according to claim 3, characterized in that, if a voice field to be repaired exists in the input voice, obtaining from the preset speech recognition library the correction voice field matching the voice field to be repaired further comprises:

if no correction voice field matching the voice field to be repaired can be obtained from the preset speech recognition library, selecting a voice field as the correction voice field according to a preset fuzzy-sound preferred-word filling algorithm.

5. The method according to claim 4, characterized in that there are at least two correction voice fields matching the voice field to be repaired;

replacing the voice field to be repaired in the input voice with the correction voice field to obtain the repaired input voice comprises:

replacing the voice field to be repaired in the input voice with each correction voice field respectively, to obtain a plurality of repaired input voices;

performing a sentence fluency assessment on each repaired input voice, and determining the final repaired input voice according to the result of the sentence fluency assessment.

6. A device for repairing input voice, characterized by comprising:

7. The device according to claim 6, characterized in that identifying, in the retrieval module, the voice fields in the received input voice according to the preset speech recognition library comprises:

8. The device according to claim 7, characterized in that the splitting rule comprises at least one of speech rate, interval, and key voice fields.

9. The device according to claim 8, characterized in that the repair module is further configured to, when no correction voice field matching the voice field to be repaired can be obtained from the preset speech recognition library, select a voice field as the correction voice field according to a preset fuzzy-sound preferred-word filling algorithm.

10. The device according to claim 9, characterized in that there are at least two correction voice fields matching the voice field to be repaired;

the replacement module comprises:

a repair replacement unit, configured to replace the voice field to be repaired in the input voice with each correction voice field respectively, to obtain a plurality of repaired input voices;

a fluency assessment unit, configured to perform a sentence fluency assessment on each repaired input voice and to determine the final repaired input voice according to the result of the sentence fluency assessment.
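The splitting step recited in claims 2, 3, 7 and 8 can be illustrated with a short Python sketch. It covers only the "interval" rule and is a sketch under assumptions rather than the claimed implementation: the `Field` tuple layout, the function name `split_by_interval`, and the 0.5 second threshold are hypothetical choices made for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical representation of a recognized field:
# (text, start time in seconds, end time in seconds).
Field = Tuple[str, float, float]


def split_by_interval(fields: List[Field], max_gap: float = 0.5) -> List[List[str]]:
    """Start a new sentence whenever the pause before a field exceeds max_gap."""
    sentences: List[List[str]] = []
    previous_end: Optional[float] = None
    for text, start, end in fields:
        if previous_end is None or start - previous_end > max_gap:
            sentences.append([])  # the pause is long enough to open a new sentence
        sentences[-1].append(text)
        previous_end = end
    return sentences


timed_fields: List[Field] = [
    ("I", 0.0, 0.2), ("first", 0.25, 0.5), ("go", 0.55, 0.7), ("shopping", 0.75, 1.2),
    ("not", 2.0, 2.2), ("with", 2.25, 2.4), ("you", 2.45, 2.6),
]
split_result = split_by_interval(timed_fields)
```

A rule based on speech rate or key voice fields could be added in the same loop as a further condition for opening a new sentence.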