CN103106061A - Voice input method and device - Google Patents

Voice input method and device

Info

Publication number: CN103106061A
Application number: CN2013100699755A
Authority: CN (China)
Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Prior art keywords: recognition result, voice information, speech, result
Other languages: Chinese (zh)
Inventors: 张然 (Zhang Ran), 邵颖 (Shao Ying), 王力劭 (Wang Lishao)
Current assignee: BEIJING VCYBER TECHNOLOGY Co Ltd
Original assignee: BEIJING VCYBER TECHNOLOGY Co Ltd
Priority date / filing date: 2013-03-05
Publication date: 2013-05-15
Application filed by BEIJING VCYBER TECHNOLOGY Co Ltd

Abstract

Embodiments of the invention provide a voice input method and device, relating to the field of speech signal processing. The technical solution comprises: performing speech recognition on initial voice information input by a user and displaying the resulting initial recognition result; receiving secondary voice information input by the user after the initial voice information; determining whether the secondary voice information indicates a modification; and, if it does, modifying the initial recognition result according to the secondary voice information and displaying the modified result. The solution can be applied to user terminals such as computers and mobile phones.

Description

Voice input method and device
Technical field
The present invention relates to the field of speech signal processing, and in particular to a voice input method and device.
Background art
In recent years, with the development of speech recognition technology, users can control mobile devices by voice commands and can also edit and input text by voice. A system performs speech recognition on the voice signal input by the user and displays the recognition result, thereby realizing text editing and input.
However, when the voice signal input by the user contains homophones or is disturbed by noise, all or part of the recognition result may be wrong. The user then has to delete the erroneous part manually and re-enter it, which makes the operation cumbersome.
Summary of the invention
Embodiments of the invention provide a voice input method and device that can simplify the user's operation.
In one aspect, a voice input method is provided, comprising: performing speech recognition on initial voice information input by a user to obtain and display an initial recognition result; receiving secondary voice information input by the user after the initial voice information; determining whether the secondary voice information indicates a modification; and, if a modification is indicated, modifying the initial recognition result according to the secondary voice information and displaying the modified result.
In another aspect, a voice input device is provided, comprising:
a first display unit, configured to perform speech recognition on initial voice information input by a user and to obtain and display an initial recognition result;
a voice receiving unit, configured to receive secondary voice information input by the user after the initial voice information;
an indication confirmation unit, configured to determine whether the secondary voice information indicates a modification; and
a modification display unit, configured to, if a modification is indicated, modify the initial recognition result according to the secondary voice information and display the modified result.
With the voice input method and device provided by the embodiments of the invention, when the secondary voice information that the user inputs after the initial voice information indicates a modification, the initial recognition result can be modified and displayed directly according to that secondary voice information, thereby realizing voice input. The technical solution solves the prior-art problem that the user has to delete the erroneous part manually and re-enter it, which makes the operation cumbersome, and can improve the efficiency of voice input.
Brief description of the drawings
To describe the technical solutions in the embodiments of the invention or in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of the voice input method provided by Embodiment 1 of the invention;
Fig. 2 is a flowchart of the voice input method provided by Embodiment 2 of the invention;
Fig. 3 is a first schematic diagram of the voice input method provided by Embodiment 2 of the invention;
Fig. 4 is a second schematic diagram of the voice input method provided by Embodiment 2 of the invention;
Fig. 5 is a first schematic structural diagram of the voice input device provided by Embodiment 3 of the invention;
Fig. 6 is a first schematic structural diagram of the indication confirmation unit of the voice input device shown in Fig. 5;
Fig. 7 is a second schematic structural diagram of the indication confirmation unit of the voice input device shown in Fig. 5;
Fig. 8 is a second schematic structural diagram of the voice input device provided by Embodiment 3 of the invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the invention without creative effort fall within the protection scope of the invention.
The embodiments of the invention provide a voice input method and device that can solve the problem that voice input in the prior art is cumbersome.
Embodiment 1:
As shown in Fig. 1, the voice input method provided by this embodiment of the invention comprises:
Step 101: perform speech recognition on the initial voice information input by the user, and obtain and display an initial recognition result.
In this embodiment, when the user needs to input text by voice, the user can press the start button on the voice input device so that the device receives the user's voice information through a microphone. When the initial voice information input by the user is first received, speech recognition can be performed on it to obtain the initial recognition result. To give the voice input method provided by this embodiment a wide range of application, so that it can recognize user voice information from different domains and with different accents, step 101 may use a speaker-independent speech recognition technique to recognize and parse the initial voice information input by the user and obtain the initial recognition result.
In this embodiment, step 101 may display the initial recognition result in a normal state; for the user's convenience, it may also display the result in a to-be-confirmed state, which is not limited here. Displaying the initial recognition result in the to-be-confirmed state may be done by showing it under a floating overlay or by making it blink; showing it under a floating overlay is similar to highlighting it and is not described in detail here.
In this embodiment, when the initial recognition result is displayed in the to-be-confirmed state, the user can modify the text in that state. To avoid erroneous corrections by the voice input device when the user actually needs to input homophones, if there is no new voice input within a preset time after the initial voice information, the text in the to-be-confirmed state is marked as confirmed, for example by removing the overlay or stopping the blinking.
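As an illustration of the to-be-confirmed state described above, the following minimal sketch (in Python, with hypothetical names and an arbitrary timeout) holds a result as editable and marks it confirmed once the preset time elapses without new voice input; it is a sketch under those assumptions, not the patent's implementation.

```python
import time

# Minimal sketch (hypothetical names, arbitrary 3 s timeout): hold the initial
# recognition result in a "to be confirmed" state and promote it to "confirmed"
# if no new voice input arrives within the preset time.
class PendingResult:
    def __init__(self, text, timeout_s=3.0):
        self.text = text
        self.confirmed = False
        self._deadline = time.monotonic() + timeout_s

    def on_new_voice_input(self):
        """New audio arrived within the preset time: the result stays editable."""
        return not self.confirmed

    def poll(self):
        """Mark the result confirmed once the preset time has elapsed,
        e.g. remove the floating overlay or stop the blinking."""
        if not self.confirmed and time.monotonic() >= self._deadline:
            self.confirmed = True
        return self.confirmed
```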
Step 102: receive the secondary voice information that the user inputs after the initial voice information.
In this embodiment, after the voice input device displays the initial recognition result in step 101, if the user needs to modify some or all of the text in the initial recognition result, or needs to continue inputting other text, the user can press the start button on the voice input device again so that the device receives the user's secondary voice information through the microphone.
Step 103: determine whether the secondary voice information indicates a modification.
In this embodiment, after the voice input device receives the secondary voice information input by the user in step 102, it first needs to determine in step 103 whether the secondary voice information is input in order to modify the initial recognition result or in order to continue inputting other text.
Specifically, determining in step 103 whether the secondary voice information indicates a modification may comprise: performing an audio comparison between the secondary voice information and the initial voice information to obtain a similarity value, and determining, according to the relation between the similarity value and a preset threshold, whether the secondary voice information indicates a modification. The audio comparison between the secondary voice information and the initial voice information may be implemented by extracting audio feature parameters. The process of extracting the audio feature parameters may comprise: first compressing the initial voice information and the secondary voice information respectively by a wavelet transform to obtain initial compressed speech and secondary compressed speech, where the wavelet transform is preferably the Haar wavelet transform but may also be another method, which is not limited here; then extracting the audio feature parameters of the initial compressed speech and the secondary compressed speech respectively on a frame-by-frame basis to obtain initial audio parameters and secondary audio parameters, where the audio feature parameters are preferably the spectral centroid, the root mean square, Mel-frequency cepstral coefficients and the like; and finally performing a Euclidean distance calculation between the initial audio parameters and the secondary audio parameters to obtain a similarity distance and determining the similarity value from the similarity distance. Alternatively, the audio of the initial voice information and of the secondary voice information may be converted onto the same time axis and the audio comparison realized by pattern recognition; the audio comparison between the secondary voice information and the initial voice information may also be realized in other ways, which are not described one by one here.
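The audio-comparison route can be illustrated with the following hedged sketch: it uses a one-level Haar-style approximation for compression, per-frame RMS and spectral-centroid features instead of a full MFCC front end, and a length-truncating alignment instead of dynamic time warping, so it is a simplified stand-in for the procedure above rather than the embodiment itself; the function names and the threshold are illustrative assumptions.

```python
import numpy as np

# Simplified stand-in for the audio comparison above (not the embodiment itself):
# Haar-style compression, per-frame RMS and spectral-centroid features, and a
# Euclidean distance mapped to a similarity value in (0, 1].
def haar_approx(x):
    x = np.asarray(x, dtype=float)
    x = x[: len(x) // 2 * 2]
    return (x[0::2] + x[1::2]) / np.sqrt(2)          # Haar low-pass (approximation) branch

def frame_features(x, frame_len=256, hop=128):
    feats = []
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = x[start:start + frame_len]
        rms = np.sqrt(np.mean(frame ** 2))
        spec = np.abs(np.fft.rfft(frame))
        centroid = np.sum(np.arange(len(spec)) * spec) / (np.sum(spec) + 1e-12)
        feats.append([rms, centroid])
    return np.asarray(feats)

def similarity_value(initial_audio, secondary_audio):
    fa = frame_features(haar_approx(initial_audio))
    fb = frame_features(haar_approx(secondary_audio))
    n = min(len(fa), len(fb))                        # crude alignment; DTW would be better
    dist = np.linalg.norm(fa[:n] - fb[:n]) / max(n, 1)
    return 1.0 / (1.0 + dist)

def indicates_modification(secondary_audio, initial_audio, threshold=0.6):
    # High similarity to the initial utterance suggests the user is repeating
    # a fragment in order to correct it (the threshold is illustrative).
    return similarity_value(initial_audio, secondary_audio) >= threshold
```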
Determining in step 103 whether the secondary voice information indicates a modification may also comprise: first performing semantic analysis on the secondary voice information to obtain an analysis result, and then determining from the analysis result whether the secondary voice information indicates a modification. The semantic analysis of the secondary voice information may, for example, check whether it contains phrases such as "replace ... with ..." or "add ... at position ..."; the semantic analysis may also be performed in other ways, which are not described one by one here.
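A minimal sketch of the semantic-analysis route follows. The English trigger phrases, the pattern list and the returned triple are illustrative assumptions; the embodiment only names example phrases of the form "replace ... with ..." and "add ... at position ...".

```python
import re

# Illustrative edit-intent patterns (English stand-ins for the example phrases
# named above); a real system would cover many more phrasings and languages.
EDIT_PATTERNS = [
    (re.compile(r"replace (?P<old>.+?) with (?P<new>.+)", re.IGNORECASE), "replace"),
    (re.compile(r"add (?P<new>.+?) after (?P<old>.+)", re.IGNORECASE), "insert"),
]

def analyze(secondary_text):
    """Return (action, anchor_text, target_text), or None if no edit intent is found."""
    for pattern, action in EDIT_PATTERNS:
        m = pattern.search(secondary_text)
        if m:
            return action, m.group("old").strip(), m.group("new").strip()
    return None

# Example: analyze("replace imput with input") -> ("replace", "imput", "input")
```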
In this embodiment, whether the secondary voice information indicates a modification is determined by audio comparison or by semantic analysis; the voice input device may select either of these methods according to the user's needs, or combine them, which is convenient for the user. When the user needs to modify text that has already been input, the modification can be made either by repeating the part that needs to be corrected or by inputting voice that carries modification semantics (for example "replace x with y", or "add y after x"), without the user having to perform manual operations such as deletion; this is convenient for the user and can improve the efficiency of voice input.
Step 104: if a modification is indicated, modify the initial recognition result according to the secondary voice information and display the modified result.
In this embodiment, if step 103 determines whether a modification is indicated by audio comparison, modifying the initial recognition result according to the secondary voice information in step 104 may comprise: first performing speech recognition on the secondary voice information to obtain at least one secondary recognition result; then obtaining a target recognition result from the at least one secondary recognition result; and finally modifying the initial recognition result according to the target recognition result and displaying the modified result. If step 103 determines whether a modification is indicated by semantic analysis, modifying the initial recognition result according to the secondary voice information in step 104 may comprise: first obtaining a modification position and target voice information from the analysis result (for example, the part of the secondary voice information after "replace with" may serve as the target voice information); performing speech recognition on the target voice information to obtain at least one secondary recognition result; obtaining a target recognition result from the at least one secondary recognition result; and modifying the initial recognition result according to the target recognition result and the modification position and displaying the modified result. Obtaining the target recognition result from the at least one secondary recognition result may be done according to the usage frequency of the at least one secondary recognition result, or according to the degree of association between the at least one secondary recognition result and the initial recognition result.
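The selection of a target recognition result from several candidates can be sketched as below; the frequency table and the character-overlap association measure are placeholders for the usage-frequency and degree-of-association criteria mentioned above, not the embodiment's actual scoring, and all names are hypothetical.

```python
# Placeholder ranking for the two criteria named above (usage frequency and
# degree of association with the initial recognition result). The scoring
# functions are toys; a real system would use corpus statistics or a language model.
def pick_target(candidates, initial_result, freq_table=None, weight=0.5):
    freq_table = freq_table or {}

    def frequency(c):
        return freq_table.get(c, 0)

    def association(c):
        # Toy measure: character overlap with the already-displayed initial result.
        return len(set(c) & set(initial_result)) / max(len(set(c)), 1)

    return max(candidates, key=lambda c: weight * frequency(c) + (1 - weight) * association(c))

# Example:
# pick_target(["impute", "input"], "voice method", freq_table={"input": 120, "impute": 3}) -> "input"
```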
In this embodiment, modifying the initial recognition result in step 104 may comprise: first modifying the initial recognition result to obtain a modified recognition result; then automatically deleting the displayed initial recognition result; and finally displaying the modified recognition result at the position where the initial recognition result was displayed. Modifying the initial recognition result may comprise: first determining the modification position in the initial recognition result, and then modifying the initial recognition result at that position.
Preferably, when the indication is to replace all or part of the initial recognition result, modifying the initial recognition result in step 104 may also comprise: first automatically deleting the part to be replaced in the initial recognition result, and then inserting the replacement part at the position corresponding to the part to be replaced and displaying it. When the indication is to add content to the initial recognition result, modifying the initial recognition result in step 104 may display the corresponding result after adding it to the initial recognition result.
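A string-level sketch of applying the correction (deleting the part to be replaced and inserting the target at the corresponding position) is given below; display handling, overlays and position bookkeeping are omitted, and the function and parameter names are hypothetical.

```python
# Hypothetical string-level helper: delete the part to be replaced and insert
# the target recognition result at the corresponding position. Display handling
# (overlay removal, re-rendering) is omitted.
def apply_edit(initial_result, action, anchor_text, target_text):
    pos = initial_result.find(anchor_text)
    if pos < 0:
        return initial_result                          # anchor not found: leave the text unchanged
    if action == "replace":
        return initial_result[:pos] + target_text + initial_result[pos + len(anchor_text):]
    if action == "insert":                             # add after the anchor text
        end = pos + len(anchor_text)
        return initial_result[:end] + target_text + initial_result[end:]
    return initial_result

# Example: apply_edit("voice imput method", "replace", "imput", "input") -> "voice input method"
```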
With the voice input method provided by this embodiment of the invention, when the secondary voice information that the user inputs after the initial voice information indicates a modification, the initial recognition result can be modified and displayed directly according to that secondary voice information, thereby realizing voice input. The technical solution solves the prior-art problem that the user has to delete the erroneous part manually and re-enter it, which makes the operation cumbersome, and can improve the efficiency of voice input.
Embodiment 2:
As shown in Fig. 2, the voice input method provided by this embodiment of the invention is similar to the method shown in Fig. 1; the difference is that, if step 103 determines that the secondary voice information does not indicate a modification, the method provided by this embodiment further comprises:
Step 105: perform speech recognition on the secondary voice information to obtain a secondary recognition result.
In this embodiment, if step 103 determines that the secondary voice information does not indicate a modification, the user needs to continue inputting after the initial voice information, so speech recognition can be performed directly on the secondary voice information to obtain the secondary recognition result. To give the voice input method provided by this embodiment a wide range of application, so that it can recognize user voice information from different domains and with different accents, step 105 may likewise use a speaker-independent speech recognition technique to recognize and parse the secondary voice information and obtain the secondary recognition result.
Step 106: display the secondary recognition result after the initial recognition result.
In this embodiment, after the secondary recognition result is obtained in step 105, it can be displayed directly after the initial recognition result.
To help those skilled in the art understand the technical solution provided by the embodiments of the invention, an example is described in which the user wants to input the phrase "tan xi feng yun duo bian huan" ("sigh at the ever-changing wind and cloud") by voice. Suppose the initial recognition result renders the final two characters ("bian huan") with the wrong homophone, meaning "transform" rather than "changeable", and is displayed in the to-be-confirmed state under a floating overlay. Because these two characters are wrong and the result is still to be confirmed, the user can, within the preset time, input the audio "bian huan". The voice input device compares this audio with the audio of "tan xi feng yun duo bian huan" and determines that the input audio "bian huan" indicates a modification of the initial recognition result. The device then performs speech recognition on "bian huan" and obtains at least one secondary recognition result, for example homophone candidates meaning "transform", "changeable", "slow down", "trouble on the frontier" and so on. From the degree of association between these candidates and the rest of the initial recognition result ("sigh at the wind and cloud"), the candidate meaning "changeable" is determined to be the target recognition result. The device then automatically deletes the wrong characters corresponding to the audio "bian huan" from the initial recognition result, leaving the remaining text displayed under the floating overlay, inserts the target recognition result at the corresponding position so that the correct phrase is displayed under the floating overlay, and marks the initial recognition result as confirmed, as shown in Fig. 3. In particular, if the user does not input any audio within the preset time, the initial recognition result displayed under the floating overlay is marked as confirmed, as shown in Fig. 4, so that when the user inputs voice again after the preset time, the input continues after the initial recognition result; this avoids erroneous corrections by the voice input device when the user needs to input homophones. If the voice the user inputs again does not indicate a modification of the initial recognition result, the input likewise continues after the initial recognition result.
With the voice input method provided by this embodiment of the invention, when the secondary voice information that the user inputs after the initial voice information indicates a modification, the initial recognition result can be modified and displayed directly according to that secondary voice information, thereby realizing voice input. The technical solution solves the prior-art problem that the user has to delete the erroneous part manually and re-enter it, which makes the operation cumbersome, and can improve the efficiency of voice input.
Embodiment 3:
As shown in Fig. 5, the voice input device provided by this embodiment of the invention comprises:
a first display unit 501, configured to perform speech recognition on the initial voice information input by the user and to obtain and display an initial recognition result;
a voice receiving unit 502, configured to receive the secondary voice information input by the user after the initial voice information;
an indication confirmation unit 503, configured to determine whether the secondary voice information indicates a modification; and
a modification display unit 504, configured to, if a modification is indicated, modify the initial recognition result according to the secondary voice information and display the modified result.
In this embodiment, the process of realizing voice input with the first display unit 501, the voice receiving unit 502, the indication confirmation unit 503 and the modification display unit 504 is similar to the process provided in Embodiment 1 and is not described again here.
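A hedged structural sketch of how the four units could cooperate is given below; the recognizer, the modification check, the correction routine and the display are injected callables, so the sketch stays independent of any concrete speech engine and is not the embodiment's actual implementation.

```python
# Structural sketch only: four cooperating roles behind two entry points.
# The injected callables stand in for the first display unit's recognizer,
# the indication confirmation unit, the modification display unit and the screen.
class VoiceInputDevice:
    def __init__(self, recognize, indicates_modification, apply_correction, display):
        self.recognize = recognize                      # audio -> text
        self.indicates_modification = indicates_modification  # (secondary, initial_audio, initial_text) -> bool
        self.apply_correction = apply_correction        # (initial_text, secondary_audio) -> corrected text
        self.display = display                          # text -> None
        self.initial_audio = None
        self.initial_result = None

    def on_initial_voice(self, audio):
        """First display unit: recognize the initial voice information and show it."""
        self.initial_audio = audio
        self.initial_result = self.recognize(audio)
        self.display(self.initial_result)

    def on_secondary_voice(self, audio):
        """Voice receiving unit: route the secondary voice information."""
        if self.indicates_modification(audio, self.initial_audio, self.initial_result):
            self.initial_result = self.apply_correction(self.initial_result, audio)
        else:                                           # no modification indicated: append (Embodiment 2)
            self.initial_result += self.recognize(audio)
        self.display(self.initial_result)
```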
Further, as shown in Fig. 6, the indication confirmation unit 503 of this embodiment comprises:
an audio comparison module 5031, configured to perform an audio comparison between the secondary voice information and the initial voice information to obtain a similarity value; and
a first confirmation module 5032, configured to determine, according to the relation between the similarity value and a preset threshold, whether the secondary voice information indicates a modification.
As shown in Fig. 7, the indication confirmation unit 503 may also comprise:
a semantic analysis module 5033, configured to perform semantic analysis on the secondary voice information to obtain an analysis result; and
a second confirmation module 5034, configured to determine from the analysis result whether the secondary voice information indicates a modification.
In this embodiment, the indication confirmation unit 503 may include only the audio comparison module 5031 and the first confirmation module 5032, as shown in Fig. 6; it may include only the semantic analysis module 5033 and the second confirmation module 5034, as shown in Fig. 7; or it may include the audio comparison module 5031 and the first confirmation module 5032 as well as the semantic analysis module 5033 and the second confirmation module 5034, which is not described further here.
In this embodiment, when the indication confirmation unit 503 comprises the audio comparison module 5031 and the first confirmation module 5032, the audio comparison module 5031 may comprise: an audio compression submodule, configured to compress the initial voice information and the secondary voice information respectively to obtain initial compressed speech and secondary compressed speech; a parameter extraction submodule, configured to extract the audio feature parameters of the initial compressed speech and the secondary compressed speech respectively to obtain initial audio parameters and secondary audio parameters; a distance operation submodule, configured to perform a Euclidean distance operation on the initial audio parameters and the secondary audio parameters to obtain a similarity distance; and a similarity obtaining submodule, configured to determine the similarity value from the similarity distance. In this case the modification display unit may comprise: a first recognition module, configured to perform speech recognition on the secondary voice information to obtain at least one secondary recognition result; a first result obtaining module, configured to obtain a target recognition result from the at least one secondary recognition result; and a first modification module, configured to modify the initial recognition result according to the target recognition result and display the modified result.
In this embodiment, when the indication confirmation unit 503 comprises the semantic analysis module 5033 and the second confirmation module 5034, the modification display unit may comprise: a position obtaining module, configured to obtain a modification position and target voice information from the analysis result; a second recognition module, configured to perform speech recognition on the target voice information to obtain at least one secondary recognition result; a second result obtaining module, configured to obtain a target recognition result from the at least one secondary recognition result; and a second modification module, configured to modify the initial recognition result according to the target recognition result and the modification position and display the modified result.
In this embodiment, the first or second result obtaining module may comprise a frequency obtaining submodule or an association obtaining submodule. The frequency obtaining submodule is configured to obtain the target recognition result according to the usage frequency of the at least one secondary recognition result; the association obtaining submodule is configured to obtain the target recognition result according to the degree of association between the at least one secondary recognition result and the initial recognition result.
Further, as shown in Fig. 8, if the indication confirmation unit determines that no modification is indicated, the voice input device provided by this embodiment may further comprise:
a recognition unit 505, configured to perform speech recognition on the secondary voice information to obtain a secondary recognition result; and
a second display unit 506, configured to display the secondary recognition result after the initial recognition result.
In this embodiment, the process of realizing voice input with the recognition unit 505 and the second display unit 506 is similar to that provided in Embodiment 2 and is not described again here.
With the voice input device provided by this embodiment of the invention, when the secondary voice information that the user inputs after the initial voice information indicates a modification, the initial recognition result can be modified and displayed directly according to that secondary voice information, thereby realizing voice input. The technical solution solves the prior-art problem that the user has to delete the erroneous part manually and re-enter it, which makes the operation cumbersome, and can improve the efficiency of voice input.
The voice input method and device provided by the embodiments of the invention can be applied to user terminals such as computers and mobile phones.
The above is only a specific embodiment of the invention, but the protection scope of the invention is not limited thereto. Any change or replacement that a person skilled in the art can readily conceive of within the technical scope disclosed by the invention shall fall within the protection scope of the invention. Therefore, the protection scope of the invention shall be subject to the protection scope of the claims.

Claims (18)

1. A voice input method, characterized by comprising:
performing speech recognition on initial voice information input by a user to obtain and display an initial recognition result;
receiving secondary voice information input by the user after the initial voice information;
determining whether the secondary voice information indicates a modification; and
if a modification is indicated, modifying the initial recognition result according to the secondary voice information and displaying the modified result.
2. The voice input method according to claim 1, characterized in that determining whether the secondary voice information indicates a modification comprises:
performing an audio comparison between the secondary voice information and the initial voice information to obtain a similarity value; and
determining, according to the relation between the similarity value and a preset threshold, whether the secondary voice information indicates a modification.
3. The voice input method according to claim 2, characterized in that the step of performing an audio comparison between the secondary voice information and the initial voice information to obtain a similarity value comprises:
compressing the initial voice information and the secondary voice information respectively to obtain initial compressed speech and secondary compressed speech;
extracting audio feature parameters of the initial compressed speech and the secondary compressed speech respectively to obtain initial audio parameters and secondary audio parameters;
performing a Euclidean distance operation on the initial audio parameters and the secondary audio parameters to obtain a similarity distance; and
determining the similarity value from the similarity distance.
4. The voice input method according to claim 2, characterized in that modifying the initial recognition result according to the secondary voice information and displaying the modified result comprises:
performing speech recognition on the secondary voice information to obtain at least one secondary recognition result;
obtaining a target recognition result from the at least one secondary recognition result; and
modifying the initial recognition result according to the target recognition result and displaying the modified result.
5. The voice input method according to claim 1, characterized in that determining whether the secondary voice information indicates a modification comprises:
performing semantic analysis on the secondary voice information to obtain an analysis result; and
determining from the analysis result whether the secondary voice information indicates a modification.
6. The voice input method according to claim 5, characterized in that modifying the initial recognition result according to the secondary voice information and displaying the modified result comprises:
obtaining a modification position and target voice information from the analysis result;
performing speech recognition on the target voice information to obtain at least one secondary recognition result;
obtaining a target recognition result from the at least one secondary recognition result; and
modifying the initial recognition result according to the target recognition result and the modification position and displaying the modified result.
7. The voice input method according to claim 4 or 6, characterized in that obtaining a target recognition result from the at least one secondary recognition result comprises:
obtaining the target recognition result according to the usage frequency of the at least one secondary recognition result; or
obtaining the target recognition result according to the degree of association between the at least one secondary recognition result and the initial recognition result.
8. The voice input method according to claim 1, characterized in that obtaining and displaying an initial recognition result comprises:
obtaining the initial recognition result and displaying it under a floating overlay; or
obtaining the initial recognition result and displaying it in a blinking manner.
9. The voice input method according to claim 1, characterized in that, if no modification is indicated, the method further comprises:
performing speech recognition on the secondary voice information to obtain a secondary recognition result; and
displaying the secondary recognition result after the initial recognition result.
10. A voice input device, characterized by comprising:
a first display unit, configured to perform speech recognition on initial voice information input by a user and to obtain and display an initial recognition result;
a voice receiving unit, configured to receive secondary voice information input by the user after the initial voice information;
an indication confirmation unit, configured to determine whether the secondary voice information indicates a modification; and
a modification display unit, configured to, if a modification is indicated, modify the initial recognition result according to the secondary voice information and display the modified result.
11. The voice input device according to claim 10, characterized in that the indication confirmation unit comprises:
an audio comparison module, configured to perform an audio comparison between the secondary voice information and the initial voice information to obtain a similarity value; and
a first confirmation module, configured to determine, according to the relation between the similarity value and a preset threshold, whether the secondary voice information indicates a modification.
12. The voice input device according to claim 11, characterized in that the audio comparison module comprises:
an audio compression submodule, configured to compress the initial voice information and the secondary voice information respectively to obtain initial compressed speech and secondary compressed speech;
a parameter extraction submodule, configured to extract audio feature parameters of the initial compressed speech and the secondary compressed speech respectively to obtain initial audio parameters and secondary audio parameters;
a distance operation submodule, configured to perform a Euclidean distance operation on the initial audio parameters and the secondary audio parameters to obtain a similarity distance; and
a similarity obtaining submodule, configured to determine the similarity value from the similarity distance.
13. The voice input device according to claim 11, characterized in that the modification display unit comprises:
a first recognition module, configured to perform speech recognition on the secondary voice information to obtain at least one secondary recognition result;
a first result obtaining module, configured to obtain a target recognition result from the at least one secondary recognition result; and
a first modification module, configured to modify the initial recognition result according to the target recognition result and display the modified result.
14. The voice input device according to claim 10, characterized in that the indication confirmation unit comprises:
a semantic analysis module, configured to perform semantic analysis on the secondary voice information to obtain an analysis result; and
a second confirmation module, configured to determine from the analysis result whether the secondary voice information indicates a modification.
15. The voice input device according to claim 14, characterized in that the modification display unit comprises:
a position obtaining module, configured to obtain a modification position and target voice information from the analysis result;
a second recognition module, configured to perform speech recognition on the target voice information to obtain at least one secondary recognition result;
a second result obtaining module, configured to obtain a target recognition result from the at least one secondary recognition result; and
a second modification module, configured to modify the initial recognition result according to the target recognition result and the modification position and display the modified result.
16. The voice input device according to claim 13 or 15, characterized in that the first or second result obtaining module comprises a frequency obtaining submodule or an association obtaining submodule;
the frequency obtaining submodule is configured to obtain the target recognition result according to the usage frequency of the at least one secondary recognition result; and
the association obtaining submodule is configured to obtain the target recognition result according to the degree of association between the at least one secondary recognition result and the initial recognition result.
17. The voice input device according to claim 10, characterized in that obtaining and displaying an initial recognition result comprises:
obtaining the initial recognition result and displaying it under a floating overlay; or
obtaining the initial recognition result and displaying it in a blinking manner.
18. The voice input device according to claim 10, characterized in that, if no modification is indicated, the device further comprises:
a recognition unit, configured to perform speech recognition on the secondary voice information to obtain a secondary recognition result; and
a second display unit, configured to display the secondary recognition result after the initial recognition result.
CN2013100699755A 2013-03-05 2013-03-05 Voice input method and device Pending CN103106061A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2013100699755A CN103106061A (en) 2013-03-05 2013-03-05 Voice input method and device

Publications (1)

Publication Number Publication Date
CN103106061A (en) 2013-05-15

Family

ID=48313953

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2013100699755A Pending CN103106061A (en) 2013-03-05 2013-03-05 Voice input method and device

Country Status (1)

Country Link
CN (1) CN103106061A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1494299A (en) * 2002-10-30 2004-05-05 英华达(上海)电子有限公司 Device and method for converting speech sound input into characters on handset
CN1629934A (en) * 2004-02-06 2005-06-22 刘新斌 Building and using method of virtual speech keyboard for interactive control
JP2010257065A (en) * 2009-04-22 2010-11-11 Sanyo Electric Co Ltd Input device
CN101807399A (en) * 2010-02-02 2010-08-18 华为终端有限公司 Voice recognition method and device
CN102682763A (en) * 2011-03-10 2012-09-19 北京三星通信技术研究有限公司 Method, device and terminal for correcting named entity vocabularies in voice input text

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105810188A (en) * 2014-12-30 2016-07-27 联想(北京)有限公司 Information processing method and electronic equipment
CN105810188B (en) * 2014-12-30 2020-02-21 联想(北京)有限公司 Information processing method and electronic equipment
CN105869632A (en) * 2015-01-22 2016-08-17 北京三星通信技术研究有限公司 Speech recognition-based text revision method and device
CN105070297B (en) * 2015-07-16 2018-10-23 宁波大学 A kind of MP3 audio compressions history detection method
CN105070297A (en) * 2015-07-16 2015-11-18 宁波大学 MP3 audio compression history detection method
CN107068144A (en) * 2016-01-08 2017-08-18 王道平 It is easy to the method for manual amendment's word in a kind of speech recognition
CN106406807A (en) * 2016-09-19 2017-02-15 北京云知声信息技术有限公司 A method and a device for voice correction of characters
CN106601254A (en) * 2016-12-08 2017-04-26 广州神马移动信息科技有限公司 Information inputting method, information inputting device and calculation equipment
US10796699B2 (en) 2016-12-08 2020-10-06 Guangzhou Shenma Mobile Information Technology Co., Ltd. Method, apparatus, and computing device for revision of speech recognition results
CN106648531A (en) * 2016-12-21 2017-05-10 惠州Tcl移动通信有限公司 Method and system for automatically matching different audio parameters based on mobile terminal
CN110603901A (en) * 2017-05-08 2019-12-20 昕诺飞控股有限公司 Voice control
CN110603901B (en) * 2017-05-08 2022-01-25 昕诺飞控股有限公司 Method and control system for controlling utility using speech recognition
CN107480118A (en) * 2017-08-16 2017-12-15 科大讯飞股份有限公司 Method for editing text and device
CN109994105A (en) * 2017-12-29 2019-07-09 宝马股份公司 Data inputting method, device, system, vehicle and readable storage medium storing program for executing
CN112331194A (en) * 2019-07-31 2021-02-05 北京搜狗科技发展有限公司 Input method and device and electronic equipment
CN113177114A (en) * 2021-05-28 2021-07-27 重庆电子工程职业学院 Natural language semantic understanding method based on deep learning
CN113611284A (en) * 2021-08-06 2021-11-05 工银科技有限公司 Voice library construction method, recognition method, construction system and recognition system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20130515