CN1864204A - Methods, systems and programming for performing speech recognition - Google Patents

Methods, systems and programming for performing speech recognition

Publication number
CN1864204A
CN1864204A CNA028298519A CN02829851A CN1864204A CN 1864204 A CN1864204 A CN 1864204A CN A028298519 A CNA028298519 A CN A028298519A CN 02829851 A CN02829851 A CN 02829851A CN 1864204 A CN1864204 A CN 1864204A
Authority
CN
China
Prior art keywords
word
identification
user
option
speech recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA028298519A
Other languages
Chinese (zh)
Inventor
丹尼尔·L·罗思
乔丹·R·科亨
戴维·F·约翰逊
曼弗雷德·G·格雷伯赫尔
保罗·A·弗兰佐萨
爱德华·W·波特
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Voice Signal Technologies Inc
Original Assignee
Voice Signal Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Voice Signal Technologies Inc
Priority claimed from PCT/US2002/028590 (WO2004023455A2)
Publication of CN1864204A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/32 Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems

Abstract

The present invention relates to: speech recognition using selectable recognition modes; using choice lists in large-vocabulary speech recognition; enabling users to select word transformations; speech recognition that automatically turns recognition off in one or more specified ways; phone key control of large-vocabulary speech recognition; speech recognition using phone key alphabetic filtering and spelling; speech recognition that enables a user to perform re-utterance recognition; the combination of speech recognition and text-to-speech (TTS) generation; the combination of speech recognition with handwriting and/or character recognition; and the combination of large-vocabulary speech recognition with audio recording and playback.

Description

Methods, systems and programming for performing speech recognition
Field of the invention
The present invention relates to methods, systems and programming for performing speech recognition.
Background of the invention
Discrete large-vocabulary speech recognition systems had been available for use on desktop personal computers for approximately ten years at the time this patent application was written. Continuous large-vocabulary speech recognition systems had been available for use on such computers for approximately five years by that time. Such speech recognition systems have proven to be of considerable value. In fact, much of the text of the present patent application was prepared using a continuous large-vocabulary speech recognition system.
As used in this specification and the claims, a large-vocabulary speech recognition system means one that is able to recognize a given utterance as any one of at least two thousand different vocabulary words, depending on which of those words has corresponding phonetic models that most closely match the given spoken word.
As indicated in FIG. 1, large-vocabulary speech recognition typically works by having a user 100 speak into a microphone 102, which in the example of FIG. 1 is the microphone of a mobile phone 104. The microphone converts the variation in air pressure over time caused by the utterance of words into a corresponding waveform represented by an electronic signal 106. In many speech recognition systems this waveform signal is converted into a time domain representation by digital signal processing performed either by a computer processor or by a special-purpose digital signal processor 108. The time domain representation often comprises a plurality of parameter frames 112, each of which represents properties of the sound represented by the waveform 106 during one of a plurality of successive time periods (for example, every one-hundredth of a second).
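The conversion of a waveform into successive parameter frames can be illustrated with a minimal sketch. It assumes a waveform given as a list of samples, a frame period of one-hundredth of a second, and two stand-in parameters per frame (log energy and zero-crossing count). The function name, the parameters chosen and the 8000 Hz sampling rate are assumptions made for illustration; the description does not say which parameters a frame holds.

```python
import math

def frame_waveform(samples, sample_rate, frame_seconds=0.01):
    """Split a waveform into back-to-back frames and compute two toy parameters per frame."""
    frame_len = int(round(sample_rate * frame_seconds))
    frames = []
    # Step through the signal one whole frame at a time; a trailing partial frame is dropped.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        chunk = samples[start:start + frame_len]
        energy = sum(s * s for s in chunk) / frame_len
        zero_crossings = sum(
            1 for a, b in zip(chunk, chunk[1:]) if (a < 0) != (b < 0)
        )
        frames.append({
            "log_energy": math.log(energy + 1e-10),  # small offset avoids log(0) on silence
            "zero_crossings": zero_crossings,
        })
    return frames

sample_rate = 8000  # assumed sampling rate, not taken from the description
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
frames = frame_waveform(tone, sample_rate)
print(len(frames))  # 100 frames for one second of audio
```

A practical recognizer would likely compute spectral parameters over overlapping windows; the non-overlapping frames here only show the one-frame-per-time-period structure.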
As indicated in FIG. 2, the time domain (or frame) representation of an utterance to be recognized is then matched against a plurality of possible sequences of phonetic models 200 corresponding to different words in a large vocabulary. In most large-vocabulary speech recognition systems, each individual word 202 is represented by a corresponding phonetic spelling 204, similar to the phonetic spellings found in most dictionaries. Each phoneme in a phonetic spelling has one or more phonetic models 200 associated with it. In many systems the models 200 are phoneme-in-context models, which model the sound of their associated phoneme when it occurs in the context of the preceding and following phonemes in a given word's phonetic spelling. The phonetic models are commonly composed of a sequence of one or more probability models, each of which represents the probability of different parameter values for each of the parameters used in the frames of the time domain representation 110 of an utterance to be recognized.
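The two ideas in this paragraph, phoneme-in-context models and per-parameter probability models, can be sketched as follows. The sketch assumes that a phonetic spelling is a list of phoneme symbols, that "#" marks a word boundary, and that each probability model is an independent Gaussian per parameter. All of these are illustrative assumptions; the description only states that each model represents the probability of different parameter values.

```python
import math

def phonemes_in_context(spelling):
    """Return (previous, phoneme, next) triples for a phonetic spelling."""
    padded = ["#"] + list(spelling) + ["#"]  # "#" is an assumed word-boundary symbol
    return [tuple(padded[i - 1:i + 2]) for i in range(1, len(padded) - 1)]

def log_gaussian(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def frame_log_prob(frame, model):
    """Score one frame against one probability model.

    model maps each parameter name to an assumed (mean, variance) pair.
    """
    return sum(
        log_gaussian(frame[name], mean, var) for name, (mean, var) in model.items()
    )

print(phonemes_in_context(["k", "ae", "t"]))
# [('#', 'k', 'ae'), ('k', 'ae', 't'), ('ae', 't', '#')]
```

Each triple would index one phoneme-in-context model, and matching an utterance amounts to finding the model sequence whose summed frame scores are highest.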
One of the major trends in personal computing in recent years has been the increased use of smaller and often more portable computing devices.
Originally most personal computing was performed on desktop computers of the general type represented in FIG. 3. Then there was an increase in the use of smaller personal computers in the form of laptop computers, which are not shown in the drawings because laptop computers have roughly the same type of computational power and user interface as desktop computers. Most current large-vocabulary speech recognition systems have been designed for use on such systems.
Recently there has been an increase in the use of new types of computers, such as the tablet computer shown in FIG. 4, the personal digital assistant shown in FIG. 5, the mobile phone with increased computing power shown in FIG. 6, the wrist phone computer shown in FIG. 7, and the wearable computer shown in FIG. 8, which provides a user interface through a screen and eye tracking provided by a head-worn device and/or through audio output.
Because of recent increases in computing power, such new types of devices can have computational power equal to that of the first desktop computers on which discrete large-vocabulary recognition systems were provided and, in some cases, as much computational power as the desktop computers that first ran continuous large-vocabulary speech recognition. The computational capacity of such smaller and/or more portable personal computers will only grow as time goes by.
One of the more important challenges in providing effective large-vocabulary speech recognition on ever more portable computers is that of providing a user interface that makes it easier and faster to create, edit, and use speech recognition on such devices.
Summary of the invention
One aspect of the present invention relates to speech recognition using selectable recognition modes. This includes innovations such as: allowing a user to select between recognition modes with and without language context; allowing a user to select between continuous and discrete large-vocabulary speech recognition modes; allowing a user to select between at least two different alphabetic entry speech recognition modes; and allowing a user to select from among four or more of the following recognition modes when creating text: a large-vocabulary mode, a letter recognition mode, a number recognition mode, and a punctuation recognition mode.
Another aspect of the present invention relates to using choice lists in large-vocabulary speech recognition. This includes innovations such as: providing alphabetically ordered choice lists; providing vertically scrollable choice lists; providing horizontally scrollable choice lists; and providing choice lists for the characters of an alphabetic filter used to limit recognition candidates.
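As a rough illustration of an alphabetically ordered choice list combined with an alphabetic filter, the sketch below keeps the best-scoring candidates that start with the filter string and then presents them in alphabetical order. The function name, the (word, score) input format and the list size are assumptions made for this example, not details taken from the description.

```python
def build_choice_list(candidates, filter_prefix="", max_choices=6):
    """Build an alphabetically ordered choice list.

    candidates: list of (word, score) pairs, where a higher score is better.
    filter_prefix: letters the chosen words must start with (may be empty).
    """
    prefix = filter_prefix.lower()
    matching = [(w, s) for w, s in candidates if w.lower().startswith(prefix)]
    # Keep only the best-scoring candidates, then show them in alphabetical order.
    best = sorted(matching, key=lambda ws: ws[1], reverse=True)[:max_choices]
    return sorted((w for w, _ in best), key=str.lower)

candidates = [("their", 0.9), ("there", 0.8), ("the", 0.7), ("bear", 0.5)]
print(build_choice_list(candidates, max_choices=3))   # ['the', 'their', 'there']
print(build_choice_list(candidates, "th", 2))         # ['their', 'there']
```

Ordering the displayed list alphabetically, rather than by score, lets a user scan for a word by its spelling; the score is used only to decide which candidates make the list.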
Another aspect of the present invention relates to enabling users to select word transformations. This includes innovations such as enabling a user to choose one of a plurality of transformations to perform on a recognized word so as to change it in a desired way, for example to change it from singular to plural or to give the word a gerund form. It also includes innovations such as enabling a user to transform a selected word between an alphabetic and a non-alphabetic form. It also includes innovations such as providing a user with a choice list of transformed words corresponding to a recognized word and allowing the user to select one of the transformed words as output.
Another aspect of the present invention relates to speech recognition that automatically turns recognition off in one or more specified ways. This includes innovations such as a large-vocabulary speech recognition command that turns recognition on and then automatically turns it off until another command to turn recognition back on is received. It also includes the innovation of speech recognition in which pressing a button causes recognition for a duration determined by the length of the press, and in which clicking the same button causes recognition for a length of time independent of the length of the click.
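The press-versus-click behavior can be sketched as a small decision function. The sketch assumes that a button held for less than a threshold counts as a click, and that a click keeps recognition on until the end of the next utterance. Both the threshold value and the end-of-utterance rule are assumptions for illustration; the description only requires that the click duration be independent of the click length.

```python
CLICK_THRESHOLD_S = 0.25  # assumed boundary between a click and a press

def recognition_window(press_time, release_time, utterance_end_time,
                       click_threshold=CLICK_THRESHOLD_S):
    """Return the (start, end) times during which recognition is on."""
    held_for = release_time - press_time
    if held_for >= click_threshold:
        # Press: recognition lasts exactly as long as the button is held.
        return (press_time, release_time)
    # Click: recognition runs until the utterance ends, however short the click was.
    return (press_time, utterance_end_time)

print(recognition_window(0.0, 2.0, 3.5))  # (0.0, 2.0)
print(recognition_window(0.0, 0.1, 3.5))  # (0.0, 3.5)
```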
Another aspect of the present invention relates to phone key control of large-vocabulary speech recognition. This includes the innovations of using phone keys to select a word from a choice list, of using them to select a help mode that provides an explanation of a subsequently pressed key, and of using them to select a list of the functions currently associated with the phone keys. It also includes the innovation of speech recognition having a text navigation mode in which several numbered phone keys each have a different key mapping associated with them at the same time, and in which pressing such a key causes the functions associated with the numbered phone keys to change to the mapping associated with the pressed key.
Another aspect of the present invention relates to speech recognition using phone key alphabetic filtering and spelling. By alphabetic filtering, we mean favoring the speech recognition of words that include a sequence of letters, normally an initial sequence of letters, corresponding to a sequence of letters indicated by user input. This aspect of the invention includes the innovation of using phone key presses as filtering input, where each key press is ambiguous in that it indicates that the corresponding character position in the desired word corresponds to one of the several letters identified with that phone key. This aspect of the invention also includes the innovation of using as filtering input a sequence of phone key presses in which the number of zero or more repeated presses of a given key provides a non-ambiguous indication of which of the multiple letters associated with that key is intended for use in the filter. This aspect of the invention also includes the innovation of using such ambiguous and non-ambiguous phone key input to spell text that can be used in addition to text produced by speech recognition.
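The difference between ambiguous and non-ambiguous phone key filtering can be shown with a short sketch. It assumes the conventional letter assignment on a numeric keypad and represents non-ambiguous input as groups of repeated presses such as "22". The function names and input formats are hypothetical.

```python
# Conventional keypad letter assignment, assumed here for illustration.
KEY_LETTERS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

def matches_ambiguous(word, keys):
    """True if each initial letter of word is one of the letters on the corresponding key."""
    if len(word) < len(keys):
        return False
    return all(ch in KEY_LETTERS.get(k, "") for ch, k in zip(word.lower(), keys))

def decode_multitap(groups):
    """Decode non-ambiguous input: one press picks the key's first letter,
    each repeated press moves to the next letter on that key."""
    letters = []
    for group in groups:
        options = KEY_LETTERS[group[0]]
        letters.append(options[(len(group) - 1) % len(options)])
    return "".join(letters)

vocabulary = ["cat", "bat", "act", "dog", "cattle"]
print([w for w in vocabulary if matches_ambiguous(w, "228")])
# ['cat', 'bat', 'act', 'cattle']
print(decode_multitap(["22", "2", "8"]))  # bat
```

The ambiguous filter needs one press per letter but leaves several candidates for the recognizer to choose among, while the repeated-press form pins down each letter exactly at the cost of more key presses.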
Another aspect of the present invention relates to speech recognition that enables a user to perform re-utterance recognition, in which speech recognition is performed on both a second saying of a sequence of one or more words and an earlier saying of the same sequence, to help the recognizer better select one or more best-scoring text sequences for those utterances.
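One simple way to picture re-utterance recognition is to add the scores that each saying gives to every candidate text and rank the candidates by the total. The sketch below assumes log-probability scores and a fixed floor score for candidates proposed by only one of the two recognitions. The function name, the floor value and the additive combination are all assumptions; the description does not specify how the two recognitions are combined.

```python
def combine_utterance_scores(first, second, floor=-20.0):
    """Rank candidate texts by the sum of their scores from two sayings.

    first, second: dicts mapping candidate text to a log score (higher is better).
    floor: assumed score for a candidate missing from one of the recognitions.
    """
    texts = set(first) | set(second)
    combined = {
        text: first.get(text, floor) + second.get(text, floor) for text in texts
    }
    return sorted(combined, key=combined.get, reverse=True)

first = {"recognize speech": -2.0, "wreck a nice beach": -1.5}
second = {"recognize speech": -1.0, "wreck a nice beach": -3.0}
print(combine_utterance_scores(first, second)[0])  # recognize speech
```

In this example the first saying alone would have favored the wrong text, and the second saying tips the combined score toward the intended one.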
Another aspect of the present invention relates to the combination of speech recognition and text-to-speech (TTS) generation. This includes the innovation of having speech recognition and TTS software share resources such as phonetic spellings and letter-to-sound rules. It also includes the innovation of a large-vocabulary speech recognition system that has at least one mode which automatically uses TTS to say recognized text after its recognition and uses TTS or recorded audio to say the names of recognized commands after their recognition. This aspect of the invention also includes the innovation of a large-vocabulary system that automatically repeats recognized text using TTS after each utterance. This aspect also includes the innovation of a large-vocabulary system that enables a user to move backward or forward in recognized text, with one or more words at the current location being said by TTS after each such move. This aspect also includes the innovation of a large-vocabulary system that uses speech recognition to produce a choice list and provides TTS output of one or more of that list's choices.
Another aspect of the present invention relates to the combination of speech recognition with handwriting and/or character recognition. This includes the innovation of selecting one or more best-scoring recognition candidates as a function of the recognition of both handwritten and spoken representations of a sequence of one or more words to be recognized. It also includes the innovation of using character or handwriting recognition of one or more letters to alphabetically filter the speech recognition of one or more words. It also includes the innovations of using speech recognition of one or more letter-identifying words to alphabetically filter handwriting recognition, and of using speech recognition to correct the handwriting recognition of one or more words.
Another aspect of the present invention relates to the combination of large-vocabulary speech recognition with audio recording and playback. It includes the innovation of a handheld device with both large-vocabulary speech recognition and audio recording, in which the user can switch between at least two of the following modes of recording sound input: a mode that records audio without corresponding speech recognition output; a mode that records audio with corresponding speech recognition output; and a mode that records the audio's speech recognition output without the corresponding audio. This aspect of the invention also includes the innovation of a handheld device that has both large-vocabulary speech recognition and audio recording capability and that enables a user to select a portion of previously recorded sound and have speech recognition performed on it. It also includes the innovation of a large-vocabulary speech recognition system that enables a user to use large-vocabulary speech recognition to provide a text label for a portion of sound that is recorded without corresponding speech recognition output, and the innovation of a system that enables a user to search for a text label associated with portions of unrecognized recorded sound by uttering the label's words, recognizing the utterance, and searching for text containing those words. This aspect of the invention also includes the innovation of a large-vocabulary system that allows a user to switch, with a single input, between playing back previously recorded audio and performing speech recognition, in which each successive audio playback automatically starts slightly before the end of the prior playback. This aspect of the invention also includes the innovation of a mobile phone that has both large-vocabulary speech recognition and audio recording and playback capabilities.
Brief description of the drawings
These and other aspects of the present invention will become more evident from the following detailed description of the preferred embodiments, read in conjunction with the accompanying drawings:
FIG. 1 is a schematic illustration of how spoken sound can be converted into acoustic parameter frames for use by speech recognition software.
FIG. 2 is a schematic illustration of how speech recognition, using phonetic spellings, can recognize words represented by a sequence of parameter frames such as those shown in FIG. 1, and how the time alignment between the phonetic models of a word can be used to time-align those words against the original acoustic signal from which the parameter frames were derived.
FIGS. 3 through 8 show a progression of different types of computing platforms on which many aspects of the present invention can be used, and illustrate the trend toward smaller and/or more portable computing devices.
FIG. 9 illustrates a personal digital assistant, or PDA, having a touch screen that displays a software input panel, or SIP, embodying many aspects of the present invention, which allows text to be entered by speech recognition into application programs running on such a device.
FIG. 10 is a highly schematic illustration of many of the hardware and software components that can be found in a PDA of the type shown in FIG. 9.
FIG. 11 is an enlargement of the screen image shown in FIG. 9, used to point out many of the specific elements of the speech recognition SIP shown in FIG. 9.
FIG. 12 is similar to FIG. 11, except that it also illustrates a correction window produced by the speech recognition SIP and many of its graphical user interface elements.
FIGS. 13 through 17 provide a highly simplified pseudocode description of the responses that the speech recognition SIP makes to various inputs, particularly inputs received from its graphical user interface.
FIG. 18 is a highly simplified pseudocode description of the recognition duration logic used to determine the length of time for which speech recognition is turned on in response to the pressing of one or more user interface buttons, either in the speech recognition SIP shown in FIG. 9 or in the mobile phone embodiment shown in FIG. 59.
FIG. 19 is a highly simplified pseudocode description of a help mode that enables a user to see a description of the function of each element of the speech recognition SIP shown in FIG. 9 by touching it.
FIGS. 20 and 21 are screen images produced by the help mode described in FIG. 19.
FIG. 22 is a highly simplified pseudocode description of the displayChoiceList routine, various forms of which are used in both the speech recognition SIP of FIG. 9 and the mobile phone embodiment of FIG. 59 to display correction windows.
FIG. 23 is a highly simplified pseudocode description of the getChoices routine, various forms of which are used in both the speech recognition SIP and the mobile phone embodiment to generate one or more choice lists for use by the displayChoiceList routine of FIG. 22.
FIGS. 24 and 25 illustrate the utterance list data structure used by the getChoices routine of FIG. 23.
FIG. 26 is a highly simplified pseudocode description of the filterMatch routine, which is used by the getChoices routine to limit correction window choices to those that match filtering input entered by the user.
FIG. 27 is a highly simplified pseudocode description of the wordFormList routine, various forms of which are used in both the speech recognition SIP and the mobile phone embodiment to generate a word form correction list that displays alternate forms of a given word or selection.
FIGS. 28 and 29 provide a highly simplified pseudocode description of the filterEdit routine, various forms of which are used in both the speech recognition SIP and the mobile phone embodiment to edit the filter string used by the filterMatch routine of FIG. 26 in response to alphabetic filtering information from the user.
FIG. 30 provides a highly simplified pseudocode description of the filterCharacterChoice routine, various forms of which are used in both the speech recognition SIP and the mobile phone embodiment to display choice lists for individual characters of a filter string.
FIGS. 31 through 35 illustrate a sequence of interactions between a user and the speech recognition SIP, in which the user enters and corrects the recognition of words using a one-at-a-time discrete speech recognition method.
FIG. 36 shows how a user of the SIP can correct the misrecognition shown at the end of FIG. 35 by scrolling through the choice list provided in the correction window until the desired word is found, and then using the capitalization button to capitalize the word before it is entered into the text.
FIG. 37 shows how a user of the SIP can correct such a misrecognition by selecting part of an alternate choice in the correction window and using it as a filter for selecting the desired speech recognition output.
FIG. 38 shows how a user of the SIP can select two successive alphabetically ordered alternate choices in the correction window, which causes the speech recognizer's output to be limited to output starting with a sequence of characters located between the two selected choices in the alphabet.
FIG. 39 illustrates how a user of the SIP can use the speech recognition of letter names to input filtering characters, and how a filter character choice list can be used to correct errors in the recognition of such filtering characters.
FIG. 40 illustrates how a user of the SIP can enter one or more characters of a filter string using the international communication alphabet, and how the SIP interface can show the user the words of that alphabet.
FIG. 41 shows how a user can select an initial sequence of characters from an alternate choice in the correction window and then use the international communication alphabet to add characters to that sequence so as to complete the spelling of the desired output.
FIGS. 42 through 43 illustrate a sequence of user interactions in which the user enters and edits text in the SIP using continuous speech recognition.
FIG. 45 illustrates how a user can spell all or part of the desired output using continuous letter-name recognition as an ambiguous (or multivalued) filter, and how the user can use filter character choice lists to rapidly correct errors produced in such continuous letter-name recognition.
FIG. 46 illustrates how the speech recognition SIP also enables a user to input characters by drawing them for character recognition.
FIG. 47 is a highly simplified pseudocode description of the character recognition mode used by the SIP when performing drawn character recognition of the type shown in FIG. 46.
FIG. 48 illustrates how the speech recognition SIP enables a user to input text by handwriting recognition.
FIG. 49 is a highly simplified pseudocode description of the handwriting recognition mode used by the SIP when performing handwriting recognition of the type shown in FIG. 48.
FIG. 50 illustrates how the speech recognition system enables a user to input text with a software keyboard.
FIG. 51 illustrates a filter entry mode menu that can be used to select among different methods of entering filtering information, including speech recognition, character recognition, handwriting recognition, and software keyboard input.
FIGS. 52 through 54 illustrate how character recognition, handwriting recognition, or software keyboard input can be used to filter the speech recognition choices produced in the SIP's correction window.
FIGS. 55 and 56 illustrate how the SIP allows speech recognition of words or filtering characters to be used to correct handwriting recognition input.
FIG. 58 is a highly simplified pseudocode description of an alternate embodiment of the displayChoiceList routine of FIG. 22, in which the choice list produced orders choices only by recognition score, rather than alphabetically as in FIG. 22.
FIG. 59 illustrates a mobile phone that embodies many aspects of the present invention.
FIG. 60 provides a highly simplified block diagram of the major components of a typical mobile phone such as that shown in FIG. 59.
FIG. 61 is a highly simplified block diagram of the various programming and data structures contained in one or more mass storage devices of the mobile phone shown in FIG. 59.
FIG. 62 illustrates that the mobile phone shown in FIG. 59 allows traditional phone dialing by the pressing of its numbered phone keys.
FIG. 63 is a highly simplified pseudocode description of the command structure of the mobile phone shown in FIG. 59 when it is in its top-level phone mode, as illustrated by the screen shown at the top of FIG. 62.
FIG. 64 illustrates how a user of the mobile phone shown in FIG. 59 can access and quickly view the commands of a main menu by pressing the menu key on the mobile phone.
FIGS. 65 and 66 provide a highly simplified pseudocode description of the operation of the main menu shown in FIG. 64.
FIGS. 67 through 74 illustrate the command mappings of the mobile phone's numbered keys in each of several important modes and menus associated with the speech recognition text editor that runs on the mobile phone shown in FIG. 59.
FIG. 75 illustrates how a user of the mobile phone's text editing software can quickly see the functions associated with one or more keys in a non-menu mode by pressing the menu button and scrolling through a command list, which can be used in substantially the same manner as a menu of the type shown in FIG. 64.
FIGS. 76 through 78 provide a highly simplified pseudocode description of the responses of the mobile phone's speech recognition software when it is in its text window, editor mode.
FIGS. 79 and 80 provide a highly simplified pseudocode description of an entry mode menu, which can be accessed from various speech recognition modes to select among different methods of entering text.
FIGS. 81 through 83 provide a highly simplified pseudocode description of the correctionWindow routine, which is used by the mobile phone to display a correction window and to respond to user input while such a correction window is shown.
FIG. 84 is a highly simplified pseudocode description of an edit navigation menu that allows a user to select among different methods of navigating with the mobile phone's navigation keys when the edit mode's text window is displayed.
FIG. 85 is a highly simplified pseudocode description of a correction window navigation menu that allows a user to select among different methods of navigating with the mobile phone's navigation keys when in a correction window, and also to select among different ways in which the correction window can respond to the selection of an alternate choice in the correction window.
FIGS. 86 through 88 provide highly simplified pseudocode descriptions of three slightly different embodiments of the key Alpha mode, which enables a user to enter a letter by saying a word that starts with that letter, and which responds to the pressing of a phone key by substantially limiting recognition to words starting with one of the three or four letters associated with the pressed key.
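A minimal sketch of the key Alpha idea follows. It assumes the conventional keypad letter assignment and a list of (word, score) recognition results, and it returns the initial letter of the best-scoring word that starts with a letter on the pressed key. The function name and the input format are hypothetical, and the three embodiments in the figures may differ in detail.

```python
# Conventional keypad letter assignment, assumed here for illustration.
KEY_LETTERS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

def key_alpha_letter(pressed_key, scored_words):
    """Return the letter entered by saying a word while a phone key is pressed.

    scored_words: list of (word, score) recognition results, higher is better.
    Only words starting with a letter on the pressed key are considered.
    """
    allowed = KEY_LETTERS.get(pressed_key, "")
    eligible = [(w, s) for w, s in scored_words if w and w[0].lower() in allowed]
    if not eligible:
        return None
    best_word, _ = max(eligible, key=lambda ws: ws[1])
    return best_word[0].lower()

results = [("delta", 0.9), ("bravo", 0.8), ("alpha", 0.3)]
print(key_alpha_letter("2", results))  # b
```

Here "delta" has the highest raw score, but pressing the 2 key rules it out, so the letter "b" from "bravo" is entered.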
Accompanying drawing 89 to 90 provides the identification of the code that the height of some the available options under the editing options menu simplifies, this menu of many mode access that can relate to from the mobile phone speech recognition program.
Accompanying drawing 91 and 92 provides the description of the code that the height of word types menu simplifies, and the word types menu can be used to slurry identification option and be limited in the certain words type, for example the specific syntax type of word.
The code description that accompanying drawing 93 provides the height of typing preference menu to simplify, this menu can be used for being provided with the acquiescence identification setting of different phonetic recognition function, or the setting of identification duration is set.
The code identification that accompanying drawing 94 provides the height of the Text To Speech playback operation that uses on the phone in action to simplify.
Accompanying drawing 95 provides the mobile phone Text To Speech to produce how service routine designs the code description of simplifying with the height of the data structure that is used for the mobile phone speech recognition equally.
FIG. 96 is a highly simplified pseudocode description of the cellphone's transcription mode, which makes it easier for the user to transcribe audio recorded on the cellphone using the device's speech recognition capabilities.
FIG. 97 is a highly simplified pseudocode description of programming that enables the cellphone's speech recognition editor to be used to enter and edit text in dialog boxes presented on the cellphone, and to change the state of controls such as list boxes, check boxes, and radio buttons in such dialog boxes.
FIG. 98 is a highly simplified pseudocode description of a help routine available on the cellphone, which enables the user to quickly find descriptions of various locations in the cellphone's command structure.
FIGS. 99 and 100 illustrate examples of the type of help menu displayed by the programming of FIG. 98.
FIGS. 101 and 102 illustrate how a user can use the help programming of FIG. 98 to quickly search for, and receive descriptions of, the functions associated with various portions of the cellphone's command structure.
FIGS. 103 and 104 illustrate an interaction between a user and the user interface of the cellphone's speech recognition editor, in which the user enters and corrects text using continuous speech recognition.
FIG. 105 illustrates how a user can scroll horizontally in a correction window displayed on the cellphone.
FIG. 107 illustrates the operation of the key Alpha mode shown in FIG. 86.
FIGS. 108 and 109 illustrate how the cellphone's speech recognition editor allows the user to address, enter, and edit text in e-mail messages that can be sent using the cellphone's wireless communication capabilities.
FIG. 110 illustrates how the cellphone's speech recognition can combine scores from the discrete recognition of one or more words with scores from a prior continuous recognition of those words to help produce the desired output.
FIG. 111 illustrates how the cellphone's speech recognition software can be used to enter a URL in order to access a World Wide Web site using the cellphone's wireless communication capabilities.
FIGS. 112 and 113 illustrate how elements of the cellphone's speech recognition user interface can be used to navigate World Wide Web pages, to select items, and to enter and edit text in the fields of such web pages.
FIG. 114 illustrates how elements of the cellphone's speech recognition user interface can be used to let a user more easily read text strings that are too large to be seen all at once in a text field displayed on the cellphone screen, such as a text field of a web page or dialog box.
FIG. 115 illustrates the cellphone's find dialog box, how a user can enter a search string into that dialog box by speech recognition, how the find function then searches for the entered string, and how the found text can be used to label audio recorded on the cellphone.
FIG. 116 illustrates how the dialog box editor programming shown in FIG. 97 enables speech recognition to be used to select from among the possible values associated with a list box.
FIG. 117 illustrates how speech recognition can be used to dial a person by name, and how the cellphone's audio playback and recording capabilities can be used during such a telephone call.
FIG. 118 illustrates how speech recognition can be turned on and off while the cellphone is recording audio, so as to insert text labels or text comments into the recorded audio.
FIG. 119 illustrates how the cellphone enables a user to run speech recognition on portions of previously recorded audio.
FIG. 120 illustrates how the cellphone enables a user to strip the text recognized for a given segment of sound from the audio recording of that sound.
FIG. 121 illustrates how the cellphone enables the user to turn on or off an indication of which portions of a selected segment of text have associated audio recordings.
FIGS. 122 through 125 illustrate how the cellphone's speech recognition software allows the user to enter telephone numbers by speech recognition and to correct the recognition of such numbers when it is wrong.
FIG. 126 illustrates how many aspects of the cellphone embodiment shown in FIGS. 59 through 125 can be used in an automotive environment, including the TTS and duration logic aspects of the cellphone embodiment.
FIGS. 127 and 128 illustrate that most aspects of the cellphone embodiment shown in FIGS. 59 through 125 can be used on either cordless phones or landline phones.
FIG. 129 provides a highly simplified pseudocode description of the name dialing programming of the cellphone embodiment, which is partially illustrated in FIG. 117.
FIG. 130 provides a highly simplified pseudocode description of the cellphone's digit dialing programming illustrated in FIGS. 122 through 125.
Detailed Description of the Invention
FIG. 9 shows a personal digital assistant (PDA) 900 on which many aspects of the present invention can be used. The PDA shown is similar to devices currently sold as the Compaq iPAQ™ H3650 Pocket PC, the Casio Cassiopeia™, and the Hewlett-Packard™ Jornada 525.
The PDA 900 includes a relatively high-resolution touch screen 902, which lets the user select software buttons and portions of text by touching the screen, for example with a stylus 904 or a finger. The PDA also includes a set of input buttons 906 and a two-dimensional navigational control 908.
In this specification and in the claims, a navigational input device that allows a user to select discrete units of motion in one or more dimensions will often be considered to fall within the definition of a button. This is particularly true of telephone interfaces, in which the up, down, left, and right inputs of a navigational device are treated as phone keys or phone buttons.
FIG. 10 provides a schematic system diagram of the PDA 900. It shows the touch screen 902 and the input buttons 906 (which include the navigational input 908). It also shows that the device has a central processing unit, such as a microprocessor 1002. The microprocessor 1002 is connected over one or more electronic communication buses 1004 to read-only memory 1006 (usually flash ROM); random access memory 1008; one or more I/O devices 1010; a video controller 1012 for controlling the display on the touch screen 902; and an audio device 1014 that receives input from a microphone 1015 and supplies audio output to a speaker 1016.
The PDA also includes a battery 1018 that provides it with portable power; a headphone-in and headphone-out jack 1020, which is connected to the audio circuitry 1014; a docking connector 1022 for connecting the PDA to another computer, such as a desktop computer; and an add-on connector 1024 that lets the user add devices to the PDA, such as additional flash ROM, a modem, a wireless transceiver 1025, or a mass storage device.
FIG. 10 shows a mass storage device 1017. In practice, this could be any type of mass storage device, including all or part of the flash ROM 1006 or a miniature hard disk. In such a mass storage device the PDA would normally store an operating system 1026, which provides much of the basic functionality of the device. In addition to the operating system and the speech recognition functionality described below, it would commonly include one or more application programs, such as a word processor, a spreadsheet, a web browser, or a personal information management system.
When the PDA 900 is used with the present invention, it will normally include speech recognition programming 1030. This includes programming for performing word matching of the general type described above with regard to FIGS. 1 and 2. The speech recognition programming will also normally include one or more vocabularies or vocabulary groupings 1032, including a large vocabulary of at least two thousand words. Many large vocabulary systems have vocabularies of fifty thousand to several hundred thousand words. For each vocabulary word there is normally a text spelling 1034 and one or more vocabulary groupings 1036 to which the word belongs (for example, in some systems the text output "." might belong to the large vocabulary recognition vocabulary, a spelling vocabulary, and a punctuation vocabulary grouping). Each vocabulary word will also normally have an indication of the one or more parts of speech 1038 under which the word can be classified, and a phonetic spelling 1040 of the word for each of those parts of speech.
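The per-word data just described can be pictured with a small record type. The following is only a sketch under stated assumptions: the class name `VocabularyWord`, the field names, the grouping labels, and the phoneme symbols are all hypothetical and are not taken from the specification; only the kinds of data (spelling 1034, groupings 1036, parts of speech 1038, phonetic spellings 1040) come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class VocabularyWord:
    """Hypothetical record for one vocabulary word."""
    spelling: str                                   # text spelling (1034)
    groupings: set = field(default_factory=set)     # vocabulary groupings (1036)
    # part of speech (1038) -> phonetic spelling (1040) as a list of phonemes
    pronunciations: dict = field(default_factory=dict)


def words_in_grouping(vocabulary, grouping):
    """Return the spellings of all words that belong to one grouping."""
    return [w.spelling for w in vocabulary if grouping in w.groupings]


# Illustrative entries; the phoneme symbols are made up for the example.
period = VocabularyWord(
    ".", {"large", "spelling", "punctuation"},
    {"punctuation": ["p", "ih", "r", "iy", "ah", "d"]},
)
record = VocabularyWord(
    "record", {"large"},
    {"noun": ["r", "eh", "k", "er", "d"], "verb": ["r", "ih", "k", "ao", "r", "d"]},
)
```

A word such as "record" shows why the phonetic spelling is keyed by part of speech: the noun and the verb are spelled alike but pronounced differently.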
The speech recognition programming commonly includes a pronunciation guesser 1042, which guesses the pronunciation of words that are newly added to the system and therefore have no predefined phonetic spelling. The speech recognition programming also commonly includes one or more phonetic lexical trees 1044. A phonetic lexical tree is a tree-shaped data structure that groups together, in a common path from the root of the tree, all phonetic spellings that start with the same sequence of phonemes. Using such lexical trees improves recognition performance, because all portions of different words that share the same initial phonetic spelling can be scored together.
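A phonetic lexical tree of this kind is essentially a prefix tree keyed on phonemes. The sketch below is illustrative only: the names `LexNode`, `build_lexical_tree`, and `count_nodes`, and the phoneme symbols, are assumptions introduced for the example, and no acoustic scoring is shown.

```python
class LexNode:
    """One node of a hypothetical phonetic lexical tree."""

    def __init__(self):
        self.children = {}  # phoneme -> LexNode
        self.words = []     # words whose phonetic spelling ends at this node


def build_lexical_tree(pronunciations):
    """Build a tree in which shared initial phonemes share a single path."""
    root = LexNode()
    for word, phonemes in pronunciations.items():
        node = root
        for phoneme in phonemes:
            node = node.children.setdefault(phoneme, LexNode())
        node.words.append(word)
    return root


def count_nodes(node):
    """Count the nodes in the tree, including the root."""
    return 1 + sum(count_nodes(child) for child in node.children.values())


tree = build_lexical_tree({
    "cat": ["k", "ae", "t"],
    "cab": ["k", "ae", "b"],
    "can": ["k", "ae", "n"],
})
```

The three words contain nine phonemes in total, but the tree holds only five phoneme nodes below the root, because the common prefix "k ae" is stored, and would be scored, once.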
Preferably the speech recognition programming also includes a PolyGram language model 1045, which indicates the probability of occurrence of different words in text, including the probability of a word occurring in text given one or more preceding and/or following words.
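The specification does not give the internal form of the PolyGram language model. As a rough illustration of what "the probability of a word given a preceding word" means, the following sketch estimates plain bigram probabilities from relative frequencies; the function names, the `<s>` start marker, and the training sentences are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = ["<s>"] + sentence.split()  # "<s>" marks the sentence start
        for prev, cur in zip(words, words[1:]):
            counts[prev][cur] += 1
    return counts


def probability(counts, prev, word):
    """Relative-frequency estimate of P(word | prev); 0.0 if prev is unseen."""
    total = sum(counts[prev].values())
    return counts[prev][word] / total if total else 0.0


counts = train_bigrams(["call home now", "call home", "call me"])
```

A practical model would add smoothing and back off to shorter contexts; this sketch omits both for brevity.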
Commonly the speech recognition programming will store language model update data 1046, which includes information that can be used to update the PolyGram language model 1045 just described. This language model update data will commonly consist of or contain statistical information derived from text that the user has created, or from text that the user has indicated is similar to the text he or she wishes to generate. In FIG. 10 the speech recognition programming is shown storing contact information 1048, which includes names, addresses, phone numbers, e-mail addresses, and phonetic spellings for some or all of this information. This data helps the speech recognition programming recognize the speaking of such contact information. In many embodiments such contact information will be held in an external program, such as one of the application programs 1028 or an accessory to the operating system 1026; even in such cases, the speech recognition programming would normally need access to these names, addresses, phone numbers, and e-mail addresses, and to phonetic representations of them.
The speech recognition programming will also normally include phonetic acoustic models 1050, which can be similar to the phonetic models 200 shown in FIG. 2. Commonly the speech recognition programming also stores acoustic model update data 1052, which includes information from acoustic signals that the system has previously recognized. Such acoustic model update data is commonly in the form of parameter frames, such as those shown in FIGS. 1 and 2, or in the form of statistical data abstracted from such frames.
FIG. 11 provides a close-up view of the user interface provided by the touch screen 902 shown in FIG. 9, with the PDA using a software input panel (or SIP) 1100 that embodies many aspects of the present invention.
FIG. 12 is similar to FIG. 11, except that it shows the interface when the speech recognition SIP is displaying a correction window 1200.
FIGS. 13 through 17 are successive pages of a pseudocode description of how the speech recognition SIP responds to various inputs on its graphical user interface. For simplicity, this pseudocode is presented as one main event loop 1300 in the SIP program, which responds to user input.
In FIGS. 13 through 17 this event loop is described as having two major switch statements: a switch statement 1301 in FIG. 13, which responds to inputs on the user interface that can be generated whether or not the correction window 1200 is displayed; and a switch statement 1542 in FIG. 15, which responds to user inputs that can be generated only when the correction window 1200 is displayed.
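The two-switch structure can be sketched as two dispatch tables, one consulted always and one consulted only while the correction window is shown. This is a minimal illustration, not the pseudocode of FIGS. 13 through 17: the event names, the handler bodies, and the function `make_sip_event_loop` are hypothetical.

```python
def make_sip_event_loop():
    """Return a handler function and the log it writes to."""
    log = []

    # Counterpart of switch 1301: inputs handled whether or not the
    # correction window is displayed.
    general = {
        "talk": lambda: log.append("start new dictation"),
        "clear": lambda: log.append("clear SIP buffer"),
        "backspace": lambda: log.append("backspace"),
    }
    # Counterpart of switch 1542: inputs that can only be generated
    # while the correction window is displayed.
    correction_only = {
        "choice_selected": lambda: log.append("accept choice"),
    }

    def handle(event, correction_window_shown):
        """Dispatch one event; return True if some handler ran."""
        if event in general:
            general[event]()
            return True
        if correction_window_shown and event in correction_only:
            correction_only[event]()
            return True
        return False

    return handle, log


handle, log = make_sip_event_loop()
```

Calling `handle("choice_selected", False)` does nothing, since that input cannot arise without a correction window on screen.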
If the user presses the Talk button 1102 shown in FIG. 11, function 1302 of FIG. 13 causes functions 1304 through 1308 to be performed. Function 1304 tests whether there is any text in the SIP buffer, which is shown in the window 1104 of FIG. 11. In the SIP embodiment shown in the figures, the SIP buffer is designed to hold a relatively small number of lines of text, for which the SIP software keeps track of the acoustic input and best choices associated with the recognition of each word, and of the linguistic context created by that text. Such a text buffer is used because the speech recognition SIP often has no knowledge of the text in the remote application shown in the window 1106 of FIG. 11, into which the SIP outputs text at the location of the application's current cursor 1108. In other embodiments of the invention a much larger SIP buffer could be used. In still other embodiments, many aspects of the present invention can be used as part of an independent speech recognition text creation application that does not require a SIP for entering text. The major advantage of a speech recognizer that functions as a SIP is that it can provide input to almost any application program that runs on the PDA.
Returning to FIG. 13, function 1304 clears any text from the SIP buffer 1104, because the Talk button 1102 is provided as a way for the user to indicate to the SIP that he or she is dictating text in a new context. Thus, if the user of the SIP has moved the cursor 1108 in the application window 1106 of FIG. 11, he or she should start the next dictation by pressing the Talk button 1102.
Function 1306 of FIG. 13 responds to the pressing of the Talk button by testing whether the speech recognition system is currently in correction mode. If so, it exits that mode and removes any correction window 1200 of the type shown in FIG. 12 that may be displayed.
The SIP shown in the figures is not in correction mode when a correction window is displayed but has not been selected to receive input from most buttons of the main SIP interface; it is in correction mode when the correction window is displayed and has been selected to receive input from many of those buttons. This distinction is desirable because the particular SIP shown can be set to operate in a one-at-a-time mode, in which words are spoken and recognized discretely and a correction window is displayed for each word as it is recognized, so that the user can quickly see the choice list or provide correction input. In one-at-a-time mode, most forms of user input that are not specifically related to making corrections perform the additional function of confirming that the first choice displayed in the current choice list is the desired word. When the system is not in one-at-a-time mode, the correction window is usually displayed only when the user has provided input indicating a desire to correct previous input. In that case the correction window is displayed in correction mode, because it is assumed that, since the user has chosen to make a correction, most forms of input should be directed to the correction window.
It should be appreciated that in systems that use only one-at-a-time recognition, or that do not use it at all, there would be no need for the added complication of switching into and out of correction mode.
Returning to function 1306, it removes any current correction window because the pressing of the Talk button 1102 indicates a desire to start a new dictation rather than an interest in correcting previous dictation.
Function 1308 of FIG. 13 responds to the pressing of the Talk button by starting SIP buffer recognition according to the previously selected current recognition duration mode. This recognition starts without any prior language model context for the first word. Preferably, language model context is derived from the words recognized in response to the pressing of the Talk button and is used to provide language context for recognizing the second and subsequent words of that recognition.
FIG. 18 is a schematic representation of the recognition duration programming 1800, which lets the user select different modes of activating speech recognition in response to the pressing or clicking of any button in the SIP interface that can be used to start speech recognition. In the embodiment shown there are a number of such buttons, including the Talk button. This lets the user both select a given mode of recognition and start recognition in that mode with a single push of a button.
Function 1802 determines which of the functions of FIG. 18 are performed, depending on the current recognition duration mode. This mode can be set in several different ways, including by default and by selection under the Entry Preference option in the function menu shown in FIG. 46.
If the Press Only recognition duration type has been selected, function 1804 causes functions 1806 and 1808 to recognize speech uttered during the pressing of a speech button. This recognition duration type is both simple and flexible, because it lets the user control the length of recognition by one simple rule: recognition occurs during, and only during, the pressing of a speech button. Preferably, utterance detection and/or end-of-utterance detection is used in every recognition mode, to reduce the likelihood that background noise will be recognized as an utterance.
If the current recognition duration type is Press And Click To Utterance End, function 1810 causes functions 1812 and 1814 to respond to the pressing of a speech button by recognizing speech during that press. In this case a "press" of a speech button is defined as pushing the button for longer than a given duration, for example longer than one quarter or one third of a second. If the user pushes a speech button for a shorter period, the push is treated as a "click" rather than a "press", and functions 1816 and 1818 start recognition at the time of the click and continue it until the next end of utterance is detected.
The Press And Click To Utterance End recognition duration type has the benefit that a single button can be used to select quickly and easily between a mode that lets the user choose an extended recognition of variable length and a mode that recognizes only a single utterance.
If the current recognition duration type is Press Continuous, Click Discrete To Utterance End, function 1820 causes functions 1822 through 1828 to be performed. If the speech button is clicked, as just defined, functions 1822 and 1824 perform discrete recognition until the next end of utterance. If, on the other hand, the speech button is pressed, as just defined, functions 1826 and 1828 perform continuous recognition for as long as the speech button remains pressed.
This recognition duration type has the benefit of letting the user switch quickly between continuous and discrete recognition merely by using different types of push on a given speech button. In the SIP embodiment shown, the other recognition duration types do not switch between continuous and discrete recognition.
If the current recognition duration type is Click To Timeout, function 1830 causes functions 1832 through 1840 to be performed. If the speech button is clicked, functions 1833 through 1836 normally toggle recognition between off and on. Function 1834 responds to a click by testing whether speech recognition is currently on. If it is, and if the speech button being clicked is not one that changes vocabulary, it responds to the click by turning speech recognition off. If, on the other hand, speech recognition is off when the speech button is clicked, function 1836 turns speech recognition on until a timeout duration has elapsed. The length of this timeout duration can be set by the user under the Entry Preference option in the function menu 4602 shown in FIG. 46. If the speech button is pressed for longer than a given duration, as described above, functions 1838 and 1840 cause recognition to be on during the press and to be turned off when the press ends.
This recognition duration type gives the user a quick and easy way to choose, with a single button, between toggling speech recognition on and off and having speech recognition on only during an extended press of a speech button.
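The four recognition duration types described above can be summarized as a single decision function. The following is a sketch only: the mode identifiers, the returned action strings, the function names, and the 0.25 second threshold are assumptions chosen for illustration (the text gives one quarter or one third of a second merely as examples). The text does not say what happens in Click To Timeout mode when recognition is already on and a vocabulary-changing button is clicked, so that branch is marked as an assumption.

```python
PRESS_THRESHOLD_S = 0.25  # assumed value; the text offers 1/4 s or 1/3 s as examples


def classify_push(hold_seconds):
    """A push held longer than the threshold is a 'press', otherwise a 'click'."""
    return "press" if hold_seconds > PRESS_THRESHOLD_S else "click"


def recognition_plan(mode, hold_seconds, recognizing=False, changes_vocabulary=False):
    """Return a label for how recognition should run for one push of a speech button."""
    gesture = classify_push(hold_seconds)
    if mode == "press_only":
        return "recognize while held"
    if mode == "press_and_click_to_utterance_end":
        if gesture == "press":
            return "recognize while held"
        return "recognize until end of utterance"
    if mode == "press_continuous_click_discrete":
        if gesture == "press":
            return "continuous while held"
        return "discrete until end of utterance"
    if mode == "click_to_timeout":
        if gesture == "press":
            return "recognize while held"
        if recognizing and not changes_vocabulary:
            return "turn recognition off"
        if recognizing:
            # Assumption: not specified in the text for this case.
            return "continue recognition"
        return "recognize until timeout"
    raise ValueError(f"unknown recognition duration mode: {mode}")
```

For example, `recognition_plan("press_continuous_click_discrete", 0.1)` selects discrete recognition of one utterance, while the same call with a hold of one second selects continuous recognition for the length of the press.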
Returning to function 1308 of FIG. 13, it can be seen that selecting different recognition duration types lets the user choose how the Talk button and the other speech buttons initiate recognition.
If the user selects the Clear button 1112 shown in FIG. 11, functions 1309 through 1314 remove any correction window that may be displayed and clear the contents of the SIP buffer without sending any deletions to the operating system's text input. As stated above, in the speech SIP shown, the SIP text window shown in FIG. 11 is designed to hold a relatively small body of text. As text is entered or edited in the SIP buffer, characters are supplied to the PDA's operating system, causing corresponding changes to the text in the application window 1106 shown in FIG. 11. The Clear button lets the user clear text from the SIP buffer, to prevent it from being overloaded, without causing corresponding deletions in the text of the application window.
The Continue button 1114 shown in FIG. 11 is intended for use when the user wants to dictate a continuation of the previously dictated text, or to dictate text to be inserted at the current location in the SIP buffer window 1104 of FIG. 11. When the Continue button 1114 is pressed, function 1316 causes functions 1318 through 1330 to be performed. Function 1318 removes any correction window, because pressing the Continue button indicates that the user has no interest in using the correction window. Next, function 1322 tests whether the current cursor in the SIP buffer window has a prior language context that can be used to help predict the probability of the first word or words of any utterance recognized as a result of pressing the Continue button. If so, it causes that language context to be used. If not, and if there is currently no text in the SIP buffer, function 1326 uses the last one or more words previously entered in the SIP buffer as the language context at the start of the recognition initiated by the Continue button. Next, function 1330 starts SIP buffer recognition, that is, recognition of text to be output at the cursor in the SIP buffer, using the current recognition duration mode.
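The choice of language context made when the Continue button is pressed can be sketched as follows. The function name `continue_context`, its parameters, and the limit of two context words are hypothetical; the text says only that the context before the cursor is used if it exists and that otherwise, when the buffer is empty, the last one or more previously entered words are used.

```python
def continue_context(words_before_cursor, buffer_words, previously_entered, max_words=2):
    """Pick the language context for the first word after Continue is pressed."""
    if words_before_cursor:
        # A prior context exists at the cursor: use the words just before it.
        return words_before_cursor[-max_words:]
    if not buffer_words:
        # Empty buffer: fall back on the words last entered into the buffer.
        return previously_entered[-max_words:]
    # Cursor at the start of a non-empty buffer: no prior context is available.
    return []
```

With the cursor after "please call me", the context is the last two of those words; with an empty buffer that previously held "see you soon", the context is "you soon".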
If the user selects the Backspace button 1116 shown in FIG. 11, functions 1332 through 1336 are performed. Function 1334 tests whether the SIP is currently in correction mode. If so, it enters the backspace into the filter editor of the correction window. The correction window 1200 shown in FIG. 12 includes a first choice window 1202. As will be described in more detail below, the correction window interface allows the user to select and edit one or more characters in the first choice window as part of a filter string, which identifies a sequence of initial characters belonging to the desired recognized word or words. If the SIP is in correction mode, pressing Backspace deletes from the filter string any characters currently selected in the first choice window and, if no characters are selected, deletes the character to the left of the filter cursor 1204.
If the SIP is not currently in correction mode, function 1336 responds to the pressing of the Backspace button by entering a backspace character into the SIP buffer and outputting the same character to the operating system, so that the same change is made to the corresponding text in the application window 1106 of FIG. 11.
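The two Backspace behaviors can be sketched as below. This is an illustration under assumptions: the SIP state is modeled as a plain dictionary with hypothetical keys, the selection is a `(start, end)` pair of character offsets, and output to the operating system is modeled as appending to a list.

```python
BACKSPACE = "\x08"  # the backspace control character


def backspace_filter(filter_string, cursor, selection=None):
    """Apply Backspace to the filter string; return (new_string, new_cursor)."""
    if selection:
        start, end = selection  # delete the selected characters
        return filter_string[:start] + filter_string[end:], start
    if cursor == 0:
        return filter_string, 0  # nothing to the left of the cursor
    return filter_string[:cursor - 1] + filter_string[cursor:], cursor - 1


def press_backspace(sip):
    """Route Backspace to the filter editor or to the SIP buffer and the OS."""
    if sip["correction_mode"]:
        sip["filter"], sip["cursor"] = backspace_filter(
            sip["filter"], sip["cursor"], sip.get("selection"))
        sip["selection"] = None
    else:
        sip["buffer"] = sip["buffer"][:-1]
        sip["os_output"].append(BACKSPACE)  # mirror the change in the application
```

In correction mode nothing is sent to the operating system, since only the filter string changes; outside correction mode the buffer and the application text are kept in step.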
If the user selects the New Paragraph button 1118 shown in FIG. 11, functions 1338 through 1342 of FIG. 13 exit correction mode, if the SIP is currently in it, enter a new paragraph character into the SIP buffer, and provide corresponding output to the operating system.
As indicated by functions 1344 through 1348, the SIP responds to user selection of the Space button 1120 in substantially the same manner as it responds to a backspace: it enters the space into the filter editor if the SIP is in correction mode, and otherwise outputs it to the SIP buffer and the operating system.
If the user selects one of the Vocabulary Selection buttons 1122 through 1132 shown in FIG. 11, functions 1350 through 1370 of FIG. 13 and functions 1402 through 1416 of FIG. 14 set the vocabulary of the appropriate recognition mode to the vocabulary corresponding to the selected button, and start speech recognition in that mode according to the current recognition duration mode and the other settings for that recognition mode.
If the user selects the Name Recognition button 1122, functions 1350 and 1356 set the current mode's recognition vocabulary to the name vocabulary and start recognition according to the current recognition duration settings and other appropriate speech settings. For all vocabulary buttons other than the Name and Large Vocabulary buttons, these functions treat the current recognition mode as either filter recognition or SIP buffer recognition, depending on whether the SIP is in correction mode. This is because those other vocabulary buttons are associated with vocabularies used for entering character sequences that are suitable for defining a filter string or for direct entry into the SIP buffer. The large vocabulary and the name vocabulary, however, are considered unsuitable for filter string editing. In the disclosed embodiment, therefore, the current recognition mode for them is considered to be either re-utterance recognition or SIP buffer recognition, depending on whether the SIP is in correction mode. In other embodiments, name and large vocabulary recognition could be used to edit a multiword filter.
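The rule for deciding where the output of a vocabulary button is directed can be written as a short function. The function name, the vocabulary identifiers, and the returned labels are hypothetical; only the rule itself (name and large vocabularies map to re-utterance recognition in correction mode, other vocabularies to filter recognition, and everything to SIP buffer recognition outside correction mode) is taken from the text.

```python
# Vocabularies the text treats as unsuitable for filter string editing.
NOT_FOR_FILTER = {"name", "large_vocabulary"}


def recognition_target(vocabulary, in_correction_mode):
    """Return which kind of recognition a vocabulary button starts."""
    if not in_correction_mode:
        return "sip_buffer"
    if vocabulary in NOT_FOR_FILTER:
        return "re-utterance"
    return "filter"
```

For instance, pressing the AlphaBravo button in correction mode edits the filter string, whereas pressing the Name button in correction mode is treated as a re-utterance of the word being corrected.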
Except the standard relevant with pushing of a vocabulary button responded, if AlphaBravo vocabulary button is pressed, functional module 1404 to 1406 cause a series of all be demonstrated by the word that international communication alphabet (or ICA) uses, as what in the numeral 4002 of accompanying drawing 40, show.
If the user is chosen in the continuous/discontinuous recognition button 1134 of showing in the accompanying drawing 11, the functional module 1418 to 1422 in the accompanying drawing 14 is moved.These switchings between continuous recognition mode are used the continuous speech acoustic model and are allowed multiword identification candidate item and given single sounding is complementary; And a discontinuous recognition mode uses discontinuous identification acoustic model and only allows for single sounding and discerns single speech identification candidate item.This functional module is also used discontinuous or continuous identification opening voice identification, as just selected by pushing continuous/discontinuous button.
If the user selects the function key 1110 by pressing it, functions 1424 and 1426 call the function menu 4602 shown in FIG. 46. This function menu lets the user select from options other than those directly available from the buttons shown in FIGS. 11 and 12.
If the user selects the help button 1136 shown in FIG. 11, functions 1432 and 1434 of FIG. 14 call the help mode.
As shown in FIG. 19, when the help mode is entered in response to an initial press of the help button, a function 1902 displays a help window 2000 that provides information about using the help mode, as illustrated in FIG. 20. During subsequent operation of the help mode, if the user touches a portion of the software input panel interface, functions 1904 and 1906 display a help window with information about the touched portion of the interface, and this help window remains displayed for as long as the user continues the touch. This is illustrated in FIG. 21, in which the user presses the filter button 1218 of the correction window with the stylus 904. In response, a help window 2100 is displayed that explains the function of the filter button. If the user double-clicks on a portion of the display during the help mode, functions 1908 and 1910 display a help window that remains displayed until the user presses another portion of the interface. This enables the user to use the scroll bar shown in the help window of FIG. 21 to scroll through and read help information that is too long to be displayed in the help window 2102 at one time.
Although not shown in FIG. 19, a help window can also have a keep-up button 2100. The user can drag to this button from an initial press on the portion of the software input panel user interface that is of interest, which likewise selects to keep the help window displayed until another portion of the software input panel user interface is touched.
When, after the initial entry into the help mode, the user again touches the help button 1136 shown in FIGS. 11, 20 and 21, functions 1912 and 1914 remove any help windows and exit the help mode, turning off the highlighting of the help button.
If the user taps a word in the software input panel buffer, functions 1436 to 1438 of FIG. 14 make the selected word the current selection and call the display-choice-list routine illustrated in FIG. 22, with the tapped word as the current selection and with the acoustic data associated with the tapped word.
As shown in FIG. 22, the display-choice-list routine is called with the following parameters: a selection parameter; a filter string parameter; a filter range parameter; a word type parameter; and a not-choice list. The selection parameter indicates the text in the software input panel buffer for which the routine has been called. The filter string indicates a sequence of one or more characters or elements that define a set of one or more possible spellings with which the desired recognition output begins. The filter range parameter defines two character sequences that bound a section of the alphabet within which the desired output falls. The word type parameter indicates that the desired recognition output is of a certain type, such as a desired grammatical type. The not-choice list indicates a list of one or more words that the user's actions have indicated are not the desired word.
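The five parameters just listed can be pictured as one request record. The following is only an illustrative sketch: the class name `ChoiceListRequest` and its field names are hypothetical, and the filter string is simplified to a plain string even though the text allows ambiguous elements.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ChoiceListRequest:
    """Hypothetical container for the parameters named in the text."""
    selection: Optional[str]                         # text in the buffer being corrected
    filter_string: str = ""                          # leading characters of the desired output
    filter_range: Optional[Tuple[str, str]] = None   # alphabetical lower and upper bounds
    word_type: Optional[str] = None                  # e.g. a grammatical type
    not_choice_list: List[str] = field(default_factory=list)  # words the user rejected


# Example: the user rejected "embedded" but confirmed that the word starts with "emb".
request = ChoiceListRequest(
    selection="embedded",
    filter_string="emb",
    not_choice_list=["embedded"],
)
```

Keeping the rejected words separate from the selection lets a later step drop them from the candidate list without losing track of which text is being replaced.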
Function 2202 of the display-choice-list routine calls the get-choices routine shown in FIG. 23. It passes the filter string and filter range parameters with which the display-choice-list routine was itself called, together with the utterance list associated with the selection parameter.
As shown in FIGS. 24 and 25, the utterance list 2404 stores sound representations of one or more utterances that have been spoken as part of the desired sequence of one or more words associated with the current selection. As stated above, when function 2202 of FIG. 22 calls the get-choices routine, it places in the list a representation 2400, shown in FIG. 24, of the portion of the sound 2402 from which the words of the current selection were recognized. As indicated in FIG. 2, the speech recognition process time-aligns acoustic models against representations of the audio signal. The recognition system preferably stores these time alignments so that, when correction or playback of selected text is desired, it can find the corresponding audio representation from them.
In FIG. 24, the first entry 2400 in the utterance list is part of a continuous utterance 2402. The present invention enables the user to add one or more additional utterances of a desired sequence of one or more words to the utterance list of a selection, and recognition can be performed on all of these utterances together to increase the chance of correctly recognizing the desired output. As shown in FIG. 24, such additional utterances can include discrete utterances, such as entry 2400A, as well as continuous utterances, such as entry 2400B. Each additional utterance carries information, indicated by numerals 2406 and 2408, that states whether it is a continuous or a discrete utterance and the vocabulary mode in which it was dictated.
In FIGS. 24 and 25, the acoustic representations of the utterances in the utterance list are shown as waveforms. It should be appreciated that in many embodiments other forms of acoustic representation will be used, including parameter frame representations such as the representation 110 shown in FIGS. 1 and 2.
FIG. 25 is similar to FIG. 24, except that the original utterance list entry in FIG. 25 is a sequence of discrete utterances. FIG. 25 shows that the additional utterance entries used to help correct the recognition of an initial sequence of one or more discrete utterances can likewise include either discrete or continuous utterances, 2500A and 2500B respectively.
As shown in FIG. 23, the get-choices routine 2300 includes a function 2302 that tests whether a previous recognition has already been performed for the selection for which the routine was called, using the current utterance list and the current filter values (that is, the filter string and filter range values). If so, it causes function 2304 to return the choices from that previous recognition, since none of the recognition parameters has changed since that recognition was made.
If the test of function 2302 is not met, function 2306 tests whether the filter range parameter is null. If it is not null, function 2308 tests whether the filter range is more specific than the current filter string and, if so, changes the filter string to the letters common to both ends of the filter range. If not, function 2312 nulls the filter range, because the filter string then contains more detailed information than the range does.
As will be explained below, a filter range is selected when the user selects two choices in a choice list as an indication that the desired recognition output falls alphabetically between them. When the user selects two choices that share initial letters, function 2310 causes the filter string to correspond to those shared letters. This is done so that, when the choice list is displayed, the shared letters are indicated to the user as having been confirmed as the initial characters of the desired output.
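The reconciliation of filter range and filter string described for functions 2306 to 2312 can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the function names are hypothetical, the filter string is treated as a plain string, and "more specific" is read as "the range's shared prefix is longer than the filter string".

```python
import os
from typing import Optional, Tuple


def shared_prefix(low: str, high: str) -> str:
    """Letters common to the start of both range bounds (case-insensitive)."""
    return os.path.commonprefix([low.lower(), high.lower()])


def reconcile_filters(
    filter_string: str, filter_range: Optional[Tuple[str, str]]
) -> Tuple[str, Optional[Tuple[str, str]]]:
    """Return the (filter_string, filter_range) pair after reconciliation."""
    if filter_range is None:
        return filter_string, None
    prefix = shared_prefix(*filter_range)
    if len(prefix) > len(filter_string):
        # Assumption: a longer shared prefix means the range is more specific,
        # so the filter string is replaced by the shared letters.
        return prefix, filter_range
    # Otherwise the filter string carries more detail and the range is nulled.
    return filter_string, None
```

For example, selecting the choices "embark" and "embody" with no filter string would yield the filter string "emb", which is what lets the shared letters be shown to the user as confirmed.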
It should be appreciated that when the user performs a command that selects a new filter range or a new filter string, and the value of the newly selected parameter contradicts the value of the other, the value of the older of the two parameters is nulled.
If there are any candidates from a previous recognition of the current utterance list, function 2316 causes functions 2318 and 2320 to be performed. Function 2318 calls the filter-match routine shown in FIG. 26 for each such previous recognition candidate, passing the candidate's previous recognition score and the current filter definitions. Function 2320 then deletes those candidates whose score, as returned by these calls, is below a certain threshold.
As indicated in FIG. 26, the filter-match routine 2600 performs filtering on word candidates. In the embodiment shown, this filtering is extremely flexible, because it allows a filter to be defined by a filter string, a filter range, or a word type. It is also flexible because it allows a word type to be combined with either a filter string or a filter range specification, and because it allows ambiguous filtering. This includes ambiguous filters in which the elements of a filter string are ambiguous not only as to the values of their associated characters but also as to the number of characters in their associated character sequences.
When we say that a filter string, or a portion of a filter string, is ambiguous, we mean that a plurality of possible character sequences can be considered to match it. Ambiguous filtering is valuable for filter string input that, although reliably recognized, does not uniquely define a single character. An example is the ambiguous telephone key filtering of the type described below with regard to the telephone embodiment of many aspects of the present invention.
Ambiguous filtering is equally valuable for filter string input that cannot be recognized with a high degree of certainty, such as the recognition of letter names, particularly when that recognition is performed continuously. In such cases, not only is it highly likely that the best choice for the recognized character sequence contains one or more errors, but there is also a reasonable probability that the number of characters recognized in the best-scoring choice differs from the number actually spoken. Nevertheless, spelling all or the initial characters of the desired output is a very fast and intuitive way to input filtering information, even though the best choice from such recognition will often be incorrect, particularly when dictating under adverse conditions.
The filter-match routine is called for each individual word candidate. It is called with that candidate's previous recognition score, if any, or otherwise with a score of 1. It returns a recognition score equal to the score with which it was called multiplied by the probability that the candidate matches the current filter values.
Functions 2602 to 2606 of the filter-match routine test whether the word type parameter has been defined. If it has, and the word candidate is not of the defined word type, the routine returns with a score of 0, indicating that the word candidate is clearly incompatible with the current filter values.
Functions 2608 to 2614 test whether a current value is defined for the filter range. If so, and if the current word candidate falls alphabetically between the starting and ending words of the filter range, they return with the score unchanged. Otherwise they return with a score of 0.
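The word type and filter range tests act as simple gates on a candidate's score. A minimal sketch follows, with several assumptions: the function and parameter names are hypothetical, each candidate is assumed to carry a single word type label, and the range bounds are treated as inclusive and case-insensitive, which the text does not specify.

```python
from typing import Optional, Tuple


def type_and_range_gate(
    word: str,
    candidate_type: str,
    score: float,
    word_type: Optional[str] = None,
    filter_range: Optional[Tuple[str, str]] = None,
) -> float:
    """Return score unchanged if the candidate passes both gates, else 0."""
    # Word type test (functions 2602 to 2606 in the text).
    if word_type is not None and candidate_type != word_type:
        return 0.0
    # Filter range test (functions 2608 to 2614 in the text).
    if filter_range is not None:
        low, high = filter_range
        if not (low.lower() <= word.lower() <= high.lower()):
            return 0.0
    return score
```

A candidate that fails either gate is scored 0 outright rather than merely penalized, so it can be removed by any positive pruning threshold.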
Function 2616 determines whether a filter string has been defined. If so, it causes functions 2618 to 2653 to be performed. Function 2618 sets the current candidate character, a variable used in the loop that follows, to the first character of the word candidate for which the filter-match routine was called. A loop 2620 is then performed until its iterations reach the end of the filter string. This loop comprises functions 2622 to 2651.
The first function in each iteration of this loop is the test of step 2622, which determines the nature of the next element of the filter string. In the embodiment shown, three types of filter string element are allowed: an unambiguous character; an ambiguous character; and an ambiguous-length element that represents a set of ambiguous character sequences, which may be of different lengths.
An unambiguous character uniquely identifies a letter of the alphabet or another character, such as a space. It can be produced by the unambiguous recognition of any form of alphabetic input, but it is most commonly associated with letter or ICA word recognition, keyboard input, or unambiguous telephone key input in telephone implementations. The recognition of any alphabetic input can be treated as unambiguous simply by accepting the single best-scoring spelling output by the recognition as an unambiguous character sequence.
An ambiguous character can have multiple letter values but has a definite length of one character. As stated above, it can be produced by the ambiguous pressing of a key in a telephone embodiment, or by speech or character recognition of letters. It can also be produced by continuous recognition of letter names in which all of the best-scoring character sequences have the same character length.
An ambiguous-length element is commonly associated with the output of continuous letter name recognition or of handwriting recognition. It represents multiple best-scoring character sequences for a handwritten or spoken input, and some of these sequences can have different lengths.
If the next element of the filter string is an unambiguous character, function 2624 causes functions 2626 to 2630 to be performed. Function 2626 tests whether the current candidate character matches the current unambiguous character. If it does not, the call to the filter-match routine returns a score of 0 for the current word candidate. If it does, function 2630 increments the position of the current candidate character.
If the next element of the filter string is an ambiguous character, function 2632 causes functions 2634 to 2642 to be performed. Function 2634 tests whether the current candidate character fails to match any of the recognized values of the ambiguous character. If it fails to match, function 2636 returns from the call to the filter-match routine with a score of 0. Otherwise, functions 2638 to 2642 alter the score of the current word candidate as a function of the probability of the value of the ambiguous character that matches the current candidate character, and then increment the position of the current candidate character.
If the next element of the filter string is an ambiguous-length element, function 2644 causes a loop 2646 to be performed for each character sequence represented by the ambiguous-length element. This loop comprises functions 2648 to 2652. Function 2648 tests whether there is a sequence of characters, starting at the current candidate character position, that matches the current character sequence of loop 2646. If there is, function 2649 alters the score of the word candidate as a function of the probability of the matching sequence in the recognition represented by the ambiguous-length element, and function 2650 advances the current candidate character position by the number of characters in the matching sequence. If no sequence of characters starting at the current candidate character position matches any of the character sequences associated with the ambiguous-length element, functions 2651 and 2652 return from the call to the filter-match routine with a score of 0.
If loop 2620 runs to completion, the current word candidate has matched the entire filter string. In this case, function 2653 returns from the filter-match routine with the score for the current word that loop 2620 has produced.
If the test of step 2616 finds that no filter string is defined, step 2654 simply returns from the filter-match routine with the score of the current word candidate unchanged.
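The filter string matching loop described above can be sketched in a few lines. This is a simplified illustration, not the patent's code. The element encoding is an assumption: a one-character `str` stands for an unambiguous character, a `dict` mapping characters to probabilities for an ambiguous character, and a `list` of `(sequence, probability)` pairs for an ambiguous-length element. The sketch also takes the first matching sequence of an ambiguous-length element, so callers would list sequences in descending probability.

```python
from typing import Dict, List, Tuple, Union

Unambiguous = str                          # exactly one character
AmbiguousChar = Dict[str, float]           # character -> probability
AmbiguousLength = List[Tuple[str, float]]  # (character sequence, probability)
Element = Union[Unambiguous, AmbiguousChar, AmbiguousLength]


def filter_string_match(word: str, elements: List[Element], score: float = 1.0) -> float:
    """Scale score by the probability that word begins with the filter string."""
    pos = 0  # current candidate character position
    for element in elements:
        if isinstance(element, str):
            # Unambiguous character: must match exactly.
            if word[pos:pos + 1] != element:
                return 0.0
            pos += 1
        elif isinstance(element, dict):
            # Ambiguous character: one position, several possible values.
            prob = element.get(word[pos:pos + 1])
            if not prob:
                return 0.0
            score *= prob
            pos += 1
        else:
            # Ambiguous-length element: try each candidate sequence in turn.
            for sequence, prob in element:
                if sequence and word.startswith(sequence, pos):
                    score *= prob
                    pos += len(sequence)
                    break
            else:
                return 0.0
    return score


# Hypothetical filter: "e", then "m" or "n", then "be" or "b".
example_filter = ["e", {"m": 0.7, "n": 0.3}, [("be", 0.6), ("b", 0.4)]]
```

With this filter, "embed" keeps a score of roughly 0.7 times 0.6, while "enrich" and "amber" are rejected with a score of 0. Taking the first matching sequence is a simplification: a fuller version would explore every matching sequence and keep the best overall score.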
Returning now to function 2318 of FIG. 23, it can be seen that the call to the filter-match routine for each word candidate returns a score for that candidate. These scores are used in function 2320 to decide which word candidates to delete.
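The rescoring and deletion of prior candidates can be illustrated as below. The names `prune_candidates` and `starts_with_em`, the threshold value, and the use of a dictionary from word to score are all assumptions made for the example; the stand-in filter simply checks a fixed prefix.

```python
from typing import Callable, Dict


def prune_candidates(
    prior: Dict[str, float],
    filter_match: Callable[[str, float], float],
    threshold: float,
) -> Dict[str, float]:
    """Rescore each prior candidate with the filter, then drop low scorers."""
    rescored = {word: filter_match(word, score) for word, score in prior.items()}
    return {word: score for word, score in rescored.items() if score >= threshold}


def starts_with_em(word: str, score: float) -> float:
    """Stand-in for the filter-match routine: keep words that begin with 'em'."""
    return score if word.startswith("em") else 0.0


kept = prune_candidates(
    {"embed": 0.5, "imbed": 0.4, "ember": 0.01}, starts_with_em, threshold=0.05
)
```

Here "imbed" is removed because the filter scores it 0, and "ember" is removed because its prior score was already below the threshold.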
Once these deletions have been made, function 2322 tests whether the number of previous recognition candidates remaining after the deletions, if any, of function 2320 is below a desired number of candidates. Normally this desired number represents the number of choices wanted for the choice list. If the number of previous recognition candidates is below the desired number, functions 2324 to 2336 are performed. Function 2324 performs speech recognition on each of the one or more entries in the utterance list 2400 shown in FIGS. 24 and 25. As indicated by functions 2326 and 2328, this recognition includes a test to determine whether the utterance list contains both continuous and discrete entries and, if so, limits the number of possible word candidates in the recognition of the continuous entries to a number corresponding to the number of individual utterances detected in one or more of the discrete entries. The recognition of function 2324 also recognizes each entry in the utterance list with either continuous or discrete recognition, depending on which mode was in effect when the entry was received, as indicated by the continuous or discrete recognition indication 2406 shown in FIGS. 24 and 25. As indicated at 2332, the recognition of each utterance list entry also uses the filter-match routine described above, together with a language model, in selecting a list of best-scoring acceptable candidates for the recognition of each such utterance. In the filter-match routine, the vocabulary indicator 2408, shown in FIGS. 24 and 25, of the most recent utterance in the utterance list is used as a word type filter, to reflect any indication by the user that the desired word sequence is limited to one or more words from a particular vocabulary. The language model used is a PolyGram language model, such as a bigram or trigram language model, which uses any available prior language context to help select the best-scoring candidates.
After recognition has been performed on the one or more entries in the utterance list, if the list contains more than one entry, functions 2334 and 2336 select a list of best-scoring recognition candidates for the utterance list based on a combination of the scores from the different recognitions. It should be appreciated that in some embodiments of the invention, a combination of scores from the recognition of the different utterances can be used to improve the effectiveness of recognition that uses more than one utterance.
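The text does not say how the scores from different recognitions are combined. One plausible reading, shown here purely as an assumption, treats each utterance's scores as independent probabilities and sums their logarithms, giving a small floor score to words missing from an utterance's candidate list. The function name, the floor value and the example scores are all hypothetical.

```python
import math
from typing import Dict, List


def combine_utterance_scores(
    per_utterance: List[Dict[str, float]], floor: float = 1e-4, top_n: int = 3
) -> List[str]:
    """Rank words by the sum of log scores across all utterances."""
    words = set().union(*per_utterance)
    combined = {
        word: sum(math.log(scores.get(word, floor)) for scores in per_utterance)
        for word in words
    }
    return sorted(combined, key=combined.get, reverse=True)[:top_n]


ranking = combine_utterance_scores(
    [
        {"embed": 0.5, "imbed": 0.4},   # original utterance
        {"embed": 0.6, "in bed": 0.3},  # re-utterance of the same word
    ]
)
```

A word that scores reasonably well in both utterances ("embed") outranks words that appear in only one of them, which is the intended benefit of recognizing several utterances together.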
If the number of recognition candidates produced by functions 2314 to 2336 is less than the desired number, and if a non-null filter string or filter range has been defined, functions 2338 and 2340 use the filter-match routine to select the desired number of additional choices. These are taken from the vocabulary associated with the most recent entry in the utterance list or, if the utterance list has no entries, from the current recognition vocabulary.
If, by the time the get-choices routine of FIG. 23 reaches function 2342, there are no candidates from either recognition or the current vocabulary, function 2344 uses the best-scoring character sequences that match the current filter string as choices, up to the desired number of choices. When the filter string contains only unambiguous characters, only the single character sequence that matches those unambiguous characters is selected as a possible choice. Where the filter string contains ambiguous characters or ambiguous-length elements, however, there will be many such character sequence choices. Where the ambiguous characters or ambiguous-length elements assign different probabilities to different associated sequences of one or more characters, the choices produced by function 2344 are scored accordingly, by a scoring mechanism like that of functions 2616 to 2606 shown in FIG. 26.
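Generating choices directly from the filter string amounts to enumerating the character sequences it allows and ranking them by probability. The sketch below assumes the same hypothetical element encoding as a plain illustration (a `str`, a `dict` of character probabilities, or a `list` of `(sequence, probability)` pairs) and multiplies the element probabilities; it does not merge duplicate strings that different combinations might produce.

```python
from itertools import product
from typing import Dict, List, Tuple, Union

Element = Union[str, Dict[str, float], List[Tuple[str, float]]]


def expand_filter(elements: List[Element], limit: int = 5) -> List[Tuple[str, float]]:
    """Return up to `limit` best-scoring strings that match the filter."""
    options: List[List[Tuple[str, float]]] = []
    for element in elements:
        if isinstance(element, str):
            options.append([(element, 1.0)])      # unambiguous character
        elif isinstance(element, dict):
            options.append(list(element.items())) # ambiguous character
        else:
            options.append(list(element))         # ambiguous-length element
    scored = []
    for combo in product(*options):
        text = "".join(part for part, _ in combo)
        score = 1.0
        for _, prob in combo:
            score *= prob
        scored.append((text, score))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:limit]
```

An all-unambiguous filter yields exactly one choice, while each ambiguous element multiplies the number of possible choices, which matches the behavior described above.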
When the call to the get-choices routine returns, it normally returns a list of choices produced by recognition, by selection from a vocabulary according to the filter, or by selection from a list of possible filter strings.
Returning now to FIG. 22, when the call to the get-choices routine in function 2202 returns to the display-choice-list routine, function 2204 tests whether no filter has been defined for the current selection, whether no utterance has been added to the utterance list of the current selection, and whether the selection for which the display-choice-list routine was called is not in the not-choice list, which contains one or more words that the user's input has indicated are not the desired recognition candidates. If these conditions are met, function 2206 makes that selection the first choice to be displayed in the correction window that the routine is to create. Next, function 2210 removes from the candidates produced by the call to the get-choices routine any other candidates that are contained in the not-choice list. Then, if the first choice has not already been selected by function 2206, function 2212 makes the best-scoring candidate returned by the call to the get-choices routine the first choice for the subsequent correction window display. If there is no single best-scoring recognition candidate, alphabetical order can be used to select the candidate that is to be the first choice. Next, function 2218 selects the characters of the first choice that correspond to the filter string, if any, for special display. As described below, in the preferred embodiment the characters of the first choice that correspond to an unambiguous filter are indicated in one way, and the characters that correspond to an ambiguous filter are indicated in a different way, so that the user can see which portions of the filter string correspond to which type of filter element. Next, function 2220 places a filter cursor before the first character of the first choice that does not correspond to the filter string. When no filter string has been defined, this cursor is placed before the first character of the first choice.
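The selection of the first choice can be sketched as below. This is an interpretation under assumptions: the function name and arguments are hypothetical, candidates are held as a dictionary from word to score, and ties for the best score are broken alphabetically as the text suggests.

```python
from typing import Dict, List, Optional, Tuple


def pick_first_choice(
    selection: Optional[str],
    candidates: Dict[str, float],
    not_choice_list: List[str],
    filter_defined: bool,
    utterance_added: bool,
) -> Tuple[Optional[str], Dict[str, float]]:
    """Return (first_choice, remaining_candidates)."""
    rejected = set(not_choice_list)
    first = None
    # Keep the current selection as first choice when nothing has changed
    # and the user has not rejected it.
    if not filter_defined and not utterance_added and selection not in rejected:
        first = selection
    # Drop rejected words, and the first choice itself, from the other choices.
    remaining = {w: s for w, s in candidates.items() if w not in rejected and w != first}
    if first is None and remaining:
        best = max(remaining.values())
        # Alphabetical order breaks ties between equally scored candidates.
        first = min(w for w, s in remaining.items() if s == best)
        del remaining[first]
    return first, remaining
```

For instance, if the user rejects "embed" and two remaining candidates tie for the best score, the alphabetically earlier one becomes the first choice.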
Next, function 2222 causes steps 2224 to 2228 to be performed if the get-choices routine returned any candidates other than the current first choice. In this case, function 2224 creates a first character-ordered choice list from the set of best-scoring such candidates that will all fit in the correction window at one time. If there are more recognition candidates, functions 2226 and 2228 create a second character-ordered choice list, of up to a preset number of screens, from the remaining best-scoring candidates.
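The two choice lists can be sketched as a split by score followed by a sort within each list. The sketch assumes that "character-ordered" means alphabetical order, and the function name, `per_screen` and `max_screens` are hypothetical parameters standing in for the window size and the preset number of screens.

```python
from typing import Dict, List, Tuple


def build_choice_lists(
    candidates: Dict[str, float], per_screen: int = 6, max_screens: int = 3
) -> Tuple[List[str], List[str]]:
    """Split candidates into a first and a second alphabetically ordered list."""
    ranked = sorted(candidates, key=candidates.get, reverse=True)
    # First list: the best candidates that fit in the window at one time.
    first = sorted(ranked[:per_screen])
    # Second list: the next best candidates, up to a preset number of screens.
    second = sorted(ranked[per_screen:per_screen + per_screen * max_screens])
    return first, second
```

Ordering each list alphabetically rather than by score lets the user scan for a word by its spelling, while the score still decides which list a word lands in.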
When all of this has been done, function 2230 displays the correction window, which shows the current first choice, an indication of which of its characters, if any, are in the filter, an indication of the current filter cursor position, and the first choice list. In FIG. 12, the first choice 1206 is shown in the first choice window 1202, and the filter cursor 1204 is shown before the first character of the first choice, because no filter has yet been defined.
It should be appreciated that the display-choice-list routine can be called with a null value for the current selection, as well as for a text selection that has no associated utterance. In this case it responds to alphabetic input by performing word completion based on the operation of functions 2338 and 2340. The routine thus makes it possible to select choices for the recognition of an utterance by filtering or re-utterance; to use filtering and/or re-utterance to help correct a previous recognition; to perform word completion based on alphabetic input and, if desired, to assist that completion by entering a subsequent utterance; to spell a word that is not in the current vocabulary by alphabetic input; and to mix and match different forms of alphabetic input, including forms that are unambiguous, forms that are ambiguous as to character, and forms that are ambiguous as to length.
Returning now to FIG. 14, we have explained how functions 1436 and 1438 respond to a tap on a word in the software input panel by calling the display-choice-list routine, which in turn causes a correction window, such as the correction window 1200 of FIG. 12, to be displayed. The ability to display a correction window and its associated choice list merely by tapping on a word gives the user a fast and convenient way to correct single-word errors.
If the user double-clicks on a selection in the software input panel buffer, functions 1440 to 1444 escape from any correction window that is currently displayed and start software input panel buffer recognition according to the current recognition duration mode, using the current language context of the current selection. The recognition duration logic responds to the duration of the press associated with such a double-click in determining whether to respond as to a press or as to a click, for the purposes described with regard to FIG. 18. The output of any such recognition replaces the current selection. Although not shown in the figures, if the user double-clicks on a word in the software input panel buffer, that word is treated as the current selection for the purposes of function 1444.
If the user taps on any portion of the software input panel buffer that does not contain text, for example between words or before or after the text in the buffer, function 1446 causes functions 1448 to 1452 to be performed. Function 1448 plants a cursor at the location of the tap. If the tap is at any point in the software input panel buffer after the end of the text, the cursor is placed after the last word in the buffer. If the tap is a double tap, functions 1450 and 1452 start software input panel buffer recognition at the new cursor position according to the current recognition duration mode and other settings, using the duration of the second touch of the double tap to determine whether to respond as to a press or as to a click.
FIG. 15 is a continuation of the code described above with regard to FIGS. 13 and 14.
If the user drags across part of one or more words in the software input panel buffer, functions 1502 and 1504 call the display-choice-list routine described above with regard to FIG. 22, with all of the words that have been wholly or partly dragged across as the current selection and with the acoustic data associated with the recognition of those words, if any, as the first entry in the utterance list.
If the user drags across the initial part of an individual word in the software input panel buffer, functions 1506 and 1508 call the display-choice-list function with that word as the selection, with that word added to the not-choice list, with the dragged initial portion of the word as the filter string, and with the acoustic data associated with that word as the first entry in the utterance list. This programming interprets the fact that the user has dragged across only the initial part of a word as an indication that the whole word is not the desired choice, which is why the word is added to the not-choice list.
If the user drags across the ending of an individual word in the software input panel buffer, functions 1510 and 1512 call the display-choice-list routine with that word as the selection, with the selection added to the not-choice list, with the undragged initial portion of the word as the filter string, and with the acoustic data associated with the selected word as the first entry in the utterance list.
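The way a drag within a single word maps to a filter string and a not-choice list can be sketched as follows. This rests on an interpretation: the third case above is read as a drag across the end of a word, with the undragged initial portion becoming the filter string. The function name and the use of character offsets for the drag are hypothetical.

```python
from typing import List, Tuple


def drag_to_filter(word: str, drag_start: int, drag_end: int) -> Tuple[str, List[str]]:
    """Return (filter_string, not_choice_list) for a drag over word[drag_start:drag_end]."""
    if drag_start == 0 and drag_end < len(word):
        # Initial part dragged: the dragged letters are confirmed, the word is rejected.
        return word[:drag_end], [word]
    if drag_start > 0 and drag_end == len(word):
        # Ending dragged: the undragged initial letters are confirmed, the word is rejected.
        return word[:drag_start], [word]
    # Whole word (or an interior span): treated here as a plain selection.
    return "", []
```

Both partial drags therefore produce the same kind of request: a filter string made of the letters the user implicitly confirmed, plus a rejection of the word as recognized.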
If an indication is received that the software input panel buffer contains more than a certain amount of text, functions 1514 and 1516 display a warning to the user that the buffer is nearly full. In the disclosed embodiment, this warning informs the user that the buffer will be cleared automatically if more than a given additional number of characters is added to it, and asks the user to verify that the text currently in the buffer is correct and then to press the talk key or continue, which clears the buffer.
If an indication is received that the software input panel buffer has received text input, function 1518 causes steps 1520 to 1528 to be performed. Function 1520 tests whether the cursor is currently at the end of the software input panel buffer. If it is not, function 1522 outputs to the operating system a number of backspaces equal to the distance from the last letter of the software input panel buffer to the current cursor position within the buffer. Next, function 1526 causes the text input, which may consist of one or more characters, to be output into the software input panel buffer at its current cursor position. Steps 1527 and 1528 then output the same text sequence, and any text that follows it in the software input panel buffer, to the text input of the operating system.
Functional module 1522 feeds backspaces to the operating system before the recognized text is sent to it, and functional module 1528 likewise feeds the operating system any text that follows the received text. As a result, any change made to text in the software input panel buffer that corresponds to text previously supplied to the application window is also made to that text in the application window.
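By way of illustration only, the buffer synchronization performed by functional modules 1520 to 1528 might be sketched as follows. The names `insert_text` and `apply_keystrokes` are hypothetical, and the sketch assumes that the application window holds a copy of the buffer text with its own cursor at the end of that copy.

```python
BACKSPACE = "\b"

def insert_text(buffer, cursor, text):
    """Insert text at the cursor; return (new_buffer, new_cursor, keystrokes)."""
    tail = buffer[cursor:]
    # Module 1522: one backspace per character between the cursor and the buffer end.
    keystrokes = BACKSPACE * len(tail)
    # Module 1526: place the new text into the buffer at the cursor position.
    new_buffer = buffer[:cursor] + text + tail
    # Modules 1527-1528: resend the new text, then the text that followed it.
    keystrokes += text + tail
    return new_buffer, cursor + len(text), keystrokes

def apply_keystrokes(app_text, keystrokes):
    """Hypothetical model of an application window receiving the keystrokes."""
    for ch in keystrokes:
        app_text = app_text[:-1] if ch == BACKSPACE else app_text + ch
    return app_text

new_buffer, new_cursor, keys = insert_text("the cat", 4, "big ")
print(new_buffer, new_cursor, repr(keys))
```

Under these assumptions the application text ends up identical to the buffer text, because the backspaces remove exactly the tail that is then sent again after the inserted text.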
If the software input panel program is in one-at-a-time mode when an indication of new text input to the software input panel buffer is received, functional module 1536 tests whether the text input was generated in response to speech recognition. If so, functional module 1537 calls the display-choice-list routine for the recognized text, and functional module 1538 turns correction mode off. Normally a call to the display-choice-list routine switches the system to correction mode, but functional module 1538 prevents this from happening when one-at-a-time mode is in use. As described above, this is because in one-at-a-time mode a correction window is displayed automatically each time speech recognition is performed on an utterance of a word, so there is a relatively high likelihood that the user intends input supplied to the non-correction-window parts of the software input panel interface to be used for purposes other than input to the correction window. On the other hand, when the correction window is displayed as a result of specific user input indicating a desire to correct one or more words, correction mode is entered so that certain non-correction-window inputs are directed to the correction window.
Functional module 1539 tests whether the following set of conditions is met: the software input panel is in one-at-a-time mode, a correction window is displayed, and the system is not in correction mode. This is the state that normally exists after each utterance of a word in one-at-a-time mode. If these conditions are met, functional module 1540 responds to any of the inputs described for Figures 13, 14 and 15 by confirming recognition of the first choice in the correction window, for the purpose of causing that choice to be introduced as text output into the software input panel buffer and to the operating system, for the purpose of updating the current language context for the recognition of one or more subsequent words, for the purpose of providing data for use in updating the language model, and for the purpose of providing data for updating the acoustic models. This enables the user to confirm the prior recognition of a word in one-at-a-time mode by any one of a large number of inputs that can also be used to advance the recognition process.
It should be appreciated that if the user is in one-at-a-time mode and generates input indicating a desire to correct the word shown in a choice list, the software input panel is set to correction mode, and subsequent input received while that mode continues does not cause functional module 1540 to operate.
Functional module 1542 in Figure 15 marks the start of the part of the main response loop of the software input panel program that relates to input received while a correction window is displayed. This part extends through the remainder of Figure 15 and all of Figures 16 and 17.
If the Escape button 1210 of the correction window shown in Figure 12 is pressed, functional modules 1544 and 1546 cause the software input panel program to exit the correction window without changing the current selection.
If the Delete button 1212 of the correction window shown in Figure 12 is pressed, functional modules 1548 and 1550 delete the current selection in the software input panel buffer and send output to the operating system that causes a corresponding change to any text in the application window that corresponds to the text in the software input panel buffer.
If the New button 1214 shown in Figure 12 is pressed, functional module 1552 causes functional modules 1553 to 1556 to be performed. Functional module 1553 deletes the current selection in the software input panel buffer that corresponds to the correction window, and causes a corresponding change to the text in the application window. Functional module 1554 sets the recognition mode to the new-utterance default, which is normally the large-vocabulary recognition mode and which the user can set to be either continuous or discrete. Functional module 1556 starts software input panel buffer recognition using the current recognition duration mode and other recognition settings. Software input panel buffer recognition is recognition that supplies input to the software input panel buffer in accordance with the operation of functional modules 1518 to 1538 described above.
Figure 16 continues the illustration of how the main loop of the software input panel program responds to input received while a correction window is displayed.
If the Re-utterance button 1216 of Figure 12 is pressed, functional module 1602 causes functional modules 1603 to 1610 to be performed. Functional module 1603 sets the software input panel program to correction mode if it is not already in that mode. This happens when the correction window has been displayed as a result of word recognition in one-at-a-time mode and the user, by pressing a button in the correction window (in this case the Re-utterance button), indicates an intention to use the correction window for correction purposes. Next, functional module 1604 sets the recognition mode to the current recognition mode associated with re-utterance recognition. Functional module 1606 then receives one or more utterances according to the current re-utterance recognition duration mode and other recognition settings, including vocabulary. Next, functional module 1608 adds the one or more utterances received by functional module 1606 to the utterance list for the correction window's selection, together with an indication of the vocabulary mode in effect at the time of those utterances and of whether continuous or discrete recognition was in effect. This causes the utterance list 2004 shown in Figures 24 and 25 to contain an additional utterance.
Functional module 1610 then calls the display-choice-list routine of Figure 22, described above. This in turn calls the get-choices function described above with regard to Figure 23, and causes functional modules 2306 to 2336 to perform re-utterance recognition using the new utterance list entry.
If the Filter button 1218 shown in Figure 12 is pressed, functional module 1612 of Figure 16 causes functional modules 1613 to 1620 to be performed. Functional module 1613 enters correction mode if the software input panel is not already in that mode, as described above with regard to functional module 1603. Functional module 1614 tests whether the current filter entry mode is a speech recognition mode and, if so, causes functional module 1616 to start filter recognition according to the current filter recognition duration mode and settings. Any input generated by such recognition is directed to the cursor position of the current filter string. If, on the other hand, the current filter entry mode is a non-speech entry window mode, functional modules 1618 and 1620 call the appropriate entry window. As described below, in the illustrated embodiment of the invention these non-speech entry window modes correspond to a character recognition entry mode, a handwriting recognition entry mode, and a keyboard entry mode.
If the user presses the Word Form button 1220 shown in Figure 12, functional modules 1622 to 1624 cause the software input panel program to enter correction mode if it is not already in that mode, and cause the word form list routine of Figure 27 to be called for the current first-choice word. Until the user provides input to the correction window that causes the correction window to be redisplayed, the current first choice is normally the selection for which the correction window was called. This means that by selecting one or more words in the software input panel buffer and pressing the Word Form button in the correction window, the user can quickly obtain a list of alternate forms for any such selection.
Figure 27 illustrates the operation of the word form list routine. If a correction window is already displayed when the routine is called, functional modules 2702 and 2704 treat the current best choice as the selection for which the word form list is to be displayed. If the current selection is a single word, functional module 2706 causes functional modules 2708 to 2714 to be performed. If the current selection has any homonyms, functional module 2708 places them at the start of the word form choice list. Step 2710 then finds the root form of the selected word, and functional module 2712 creates a list of alternate grammatical forms of the word. Functional module 2714 then orders all of these grammatical forms alphabetically in the choice list, after any homonyms that functional module 2708 may have added to the list.
If, on the other hand, the selection consists of multiple words, functional module 2716 causes functional modules 2718 to 2728 to be performed. Functional module 2718 tests whether the selection has any spaces between its words. If so, functional module 2720 adds to the choice list a copy of the selection with no spaces between its words, and functional module 2722 adds a copy of the selection in which the spaces are replaced by hyphens. Although not shown in Figure 27, additional functional modules can be performed to replace hyphens with spaces or with the absence of spaces. If the selection has multiple elements that are subject to the same spelled/non-spelled transformation, functional module 2726 adds to the choice list a copy of the selection, and of all prior choices, so transformed. For example, this transforms a series of number names into their numeric equivalent, or occurrences of the word "period" into the corresponding punctuation mark. Next, functional module 2728 orders the choice list alphabetically.
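As a non-limiting sketch, the alternate forms generated for a multi-word selection by functional modules 2718 to 2728 might be produced as follows. The function name `multiword_forms` and the `DIGIT_NAMES` table are hypothetical, and the digit-name mapping is only one assumed example of a spelled/non-spelled transformation; the transformation of prior choices by functional module 2726 is not modeled.

```python
# Hypothetical table used to illustrate one spelled/non-spelled transformation.
DIGIT_NAMES = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}

def multiword_forms(selection):
    """Return alphabetically ordered alternate forms of a multi-word selection."""
    words = selection.split()
    forms = set()
    if len(words) > 1:
        forms.add("".join(words))    # module 2720: copy with spaces removed
        forms.add("-".join(words))   # module 2722: copy with spaces hyphenated
    if words and all(w.lower() in DIGIT_NAMES for w in words):
        # module 2726 (assumed example): number names to their numeric equivalent
        forms.add("".join(DIGIT_NAMES[w.lower()] for w in words))
    return sorted(forms, key=str.lower)   # module 2728: alphabetical ordering

print(multiword_forms("data base"))
print(multiword_forms("one two"))
```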
Once the choice list has been created for a single-word or multi-word selection, functional module 2730 displays a correction window showing the selection as the first choice, the filter cursor at the start of the first choice, and a scrollable choice list. In some embodiments, where the selection is a single word that has a single sequence of characters occurring in all of its grammatical forms, the filter cursor can be placed after that common sequence, with the common sequence indicated as an unambiguous filter string.
In some embodiments of the invention, the word form list provides a single alphabetically ordered list of optional word forms. In other embodiments, the options can be ordered by frequency of use, or there can be first and second alphabetically ordered choice lists, the first containing a set of the most commonly selected optional forms that fits in the correction window at one time, and the second containing less commonly used word forms.
As will be demonstrated below, the word form list provides a very quick way to correct a very common type of speech recognition error, namely one in which the first choice is a homonym of the intended word or an alternate grammatical form of it.
If the user presses the Capitalization button 1222 shown in Figure 12, functional modules 1626 to 1628 enter correction mode if the system is not already in that mode, and call the capitalization cycle function for the correction window's current first choice. The capitalization cycle causes a sequence of one or more words that do not all have initial capitalization to be given initial capitalization on each word, causes a sequence of one or more words that all have initial capitalization to be changed to all capitals, and causes a sequence of one or more words that are in all capitals to be changed to all lower case. By pressing the Capitalization button repeatedly, the user can quickly select among these forms.
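The three-state cycle described for the Capitalization button could be sketched, purely for illustration, as below. The function name `cycle_capitalization` is hypothetical, and words are assumed to be separated by single spaces.

```python
def cycle_capitalization(text):
    """Advance a word sequence one step through the capitalization cycle."""
    words = text.split(" ")
    if text.isupper():
        # All capitals -> all lower case.
        return text.lower()
    if all(w[:1].isupper() for w in words if w):
        # Every word already has an initial capital -> all capitals.
        return text.upper()
    # Otherwise -> initial capital on each word, rest of each word unchanged.
    return " ".join(w[:1].upper() + w[1:] for w in words)

step = "new york"
for _ in range(3):
    step = cycle_capitalization(step)
    print(step)
```

Calling the function repeatedly corresponds to pressing the button repeatedly, so three calls return the sequence to its lower-case form.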
If the user selects the Play button 1224 shown in Figure 12, functional modules 1630 and 1632 cause audio playback of the first entry in the utterance list associated with the correction window's selection, if any such entry exists. This enables the user to hear exactly what was spoken for a misrecognized sequence of one or more words. Although not shown, the preferred embodiment enables the user to select a setting that causes such audio to be played automatically when a correction window is first displayed.
If the Add Word button 1226 shown in Figure 12 is pressed while it is not displayed in a grayed state, functional modules 1634 and 1636 call a dialog box that allows the user to enter the current first-choice word into either the active vocabulary or the backup vocabulary. In this particular embodiment of the software input panel recognizer, the system uses a subset of its total vocabulary as the active vocabulary for normal recognition in the large-vocabulary modes. Functional module 1636 allows the user to make a word that is normally in the backup vocabulary part of the active vocabulary. It also allows the user to add to either the active or the backup vocabulary a word that is in neither vocabulary but that has been spelled in the first-choice window by use of alphabetic input. It should be appreciated that in other embodiments of the invention having greater hardware resources, no distinction between active and backup vocabularies is needed.
The Add Word button 1226 is displayed in a non-grayed state only when the first-choice word is not currently in the active vocabulary. This indicates to the user that he or she may want to add the first choice to the active or the backup vocabulary.
If the user selects the review button 1228 shown in Figure 12, functional modules 1638 to 1648 remove the current correction window, output its first choice to the software input panel buffer, and feed the operating system the sequence of keystrokes necessary to make the corresponding change in the application window.
If the user taps one of the choices 1230 shown in the correction window of Figure 12, functional modules 1650 to 1653 remove the current correction window, output the selected choice to the software input panel buffer, and feed the operating system the sequence of keystrokes necessary to make the corresponding change in the application window.
If the user taps one of the choice edit buttons 1232 shown in Figure 12, functional module 1654 causes functional modules 1656 to 1658 to be performed. Functional module 1656 changes the system to correction mode if it is not already in that mode. Functional module 1656 also makes the choice associated with the tapped choice edit button the first choice and the current filter string, and functional module 1658 then calls the display-choice-list routine with the new filter string. As described below, this enables the user to select a choice word or word sequence as the current filter string and then to edit that filter string, normally by deleting from its end any characters that disagree with the intended word.
If the user drags across one or more initial characters of any choice, including the first choice, functional modules 1664 to 1666 change the system to correction mode if it is not already in that mode, and call the display-choice-list routine with the dragged choice added to the not-choice list and with the dragged initial part of that choice as the filter string. These functions allow the user to indicate that a current choice is not the desired first choice, but that its dragged initial part should be used as a filter to help find the desired choice.
Figure 17 provides the final continuation of the list of functions that the software input panel recognizer performs in response to correction window input.
If the user drags across the ending of a choice, including the first choice, functional modules 1702 and 1704 enter correction mode if the system is not already in that mode, and call the display-choice-list routine with the partially dragged choice added to the not-choice list and with the undragged initial part of that choice as the filter string.
If the user drags across two choices in the choice list, functional modules 1706 to 1708 enter correction mode if the system is not already in that mode, and call the display-choice-list routine with the two choices added to the not-choice list and with the two choices serving as the beginning and ending words in the definition of the current filter.
If the user taps between characters of the first choice, functional modules 1710 to 1712 enter correction mode if the software input panel is not already in that mode, and move the filter cursor to the tapped position. No call is made to the display-choice-list routine at this time, because the user has not yet made any change to the filter.
If, while in correction mode, the user enters a backspace by pressing the Backspace button 1116, as described above with regard to functional module 1334 of Figure 13, functional module 1714 causes functional modules 1718 to 1720 to be performed. Functional module 1718 calls the filter edit routine of Figures 28 and 29 when a backspace is input.
As illustrated in Figure 28, the filter edit routine 2800 is designed to give the user flexibility in editing a filter made up of a combination of unambiguous, ambiguous, and/or ambiguous-length filter elements.
The routine includes functional module 2802, which tests whether there are any characters in the choice with which it has been called that lie before the current position of the filter cursor. If so, it causes functional module 2804 to define the filter string with which the routine was called as the old filter string, and functional module 2806 to make the characters of that choice before the filter cursor position the new filter string, with all the characters of that string unambiguously defined. This allows any part of the first choice that precedes the position of an edit to be confirmed automatically as correct filter characters.
Next, functional module 2807 tests whether the input with which the filter edit routine was called is a backspace. If so, it causes functional modules 2808 to 2812 to be performed. If the filter cursor is a non-selection cursor, functional modules 2808 and 2810 delete the last character of the new filter string. If the filter cursor corresponds to a selection of one or more characters of the current first choice, those characters have already been left out of the new filter by the operation of functional module 2806 described above. Functional module 2812 then clears the old filter string, because when the input to the filter edit routine is a backspace it is assumed that no part of the prior filter to the right of the backspace position is intended for future inclusion in the filter. This deletes any ambiguous as well as unambiguous elements of the filter string that were previously to the right of the filter cursor position.
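The opening steps of the filter edit routine, as described for functional modules 2802 to 2812, might be sketched as follows. This is a simplified illustration under stated assumptions: the name `begin_filter_edit` is hypothetical, each filter element is modeled as a single unambiguous character, and a selection cursor is modeled by the boolean `has_selection`.

```python
def begin_filter_edit(first_choice, cursor, old_filter, is_backspace,
                      has_selection=False):
    """Return (new_filter, old_filter) after the prefix-confirmation steps."""
    # Modules 2802-2806: characters of the choice before the filter cursor
    # become the new filter string and are treated as unambiguous.
    new_filter = list(first_choice[:cursor])
    if is_backspace:
        # Modules 2808-2810: a non-selection cursor deletes the last character.
        if not has_selection and new_filter:
            new_filter.pop()
        # Module 2812: a backspace discards everything to the right of the edit.
        old_filter = []
    return new_filter, old_filter

print(begin_filter_edit("embedded", 3, list("embe"), is_backspace=True))
print(begin_filter_edit("embedded", 3, list("embe"), is_backspace=False))
```

In the non-backspace case the old filter is kept, which is what later allows the loop over old filter elements to carry forward information to the right of the edit.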
If the input with which the filter edit routine was called is one or more unambiguous characters, functional modules 2814 and 2816 add those unambiguous characters to the end of the new filter string.
If the input to the filter edit routine is a sequence of one or more fixed-length ambiguous characters, functional modules 2818 and 2820 place an element representing each ambiguous character of the sequence at the end of the new filter.
If the input to the filter edit routine is an ambiguous-length element, functional module 2822 causes functional modules 2824 to 2832 to be performed. Functional module 2824 selects the best-scoring letter sequences associated with the ambiguous input that, if added to the prior unambiguous part of the filter, would correspond to all or an initial part of a vocabulary word. It should be remembered that when this functional module is performed, all prior parts of the new filter string have been confirmed by the operation of functional module 2806 described above. Next, functional module 2826 tests whether any of the sequences selected by functional module 2824 scores above a certain minimum. If none does, it causes functional module 2828 to select the best-scoring letter sequences independent of vocabulary. This is done because, if the condition tested by functional module 2826 is not met, the ambiguous filter is probably not being used to spell out a vocabulary word. Next, functional modules 2830 and 2832 associate the character sequences selected by functional modules 2824 to 2828 with a new ambiguous filter element, and add that new ambiguous filter element to the end of the new filter string.
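A minimal sketch of the selection performed by functional modules 2824 to 2828 is given below. It assumes that the vocabulary-independent fallback is taken when no vocabulary-constrained sequence reaches the minimum score; the name `select_sequences`, the `top_n` limit, and the dictionary of scores are all hypothetical.

```python
def select_sequences(scored, prefix, vocabulary, min_score, top_n=3):
    """Pick candidate letter sequences for a new ambiguous-length element.

    scored maps each candidate letter sequence to a recognition score.
    """
    ranked = sorted(scored.items(), key=lambda kv: kv[1], reverse=True)
    # Module 2824: best-scoring sequences that, appended to the confirmed
    # prefix, match all or an initial part of some vocabulary word.
    in_vocab = [(seq, score) for seq, score in ranked
                if any(word.startswith(prefix + seq) for word in vocabulary)]
    selected = in_vocab[:top_n]
    # Module 2826: does any selected sequence reach the minimum score?
    if any(score >= min_score for _, score in selected):
        return [seq for seq, _ in selected]
    # Module 2828 (assumed fallback): best sequences regardless of vocabulary.
    return [seq for seq, _ in ranked[:top_n]]

scores = {"ca": 0.6, "ka": 0.5, "co": 0.3}
print(select_sequences(scores, "", ["cat", "cot"], min_score=0.4))
print(select_sequences(scores, "", ["cat", "cot"], min_score=0.9))
```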
Loop 2834 is then performed for each filter element in the old filter string. This loop comprises functional modules 2836 to 2850, shown in the remainder of Figure 28, and functional modules 2900 to 2922, shown in Figure 29.
If the current old filter string element of loop 2834 is an ambiguous, fixed-length element that extends beyond the new fixed-length elements added to the new filter string by functional modules 2814 to 2820, functional modules 2836 and 2838 add the old element to the end of the new filter string. This is done because editing a filter string other than by use of the Backspace button does not delete previously entered filter information that corresponds to the part of the prior filter to the right of the new edit.
If the current old element of loop 2834 is an ambiguous, fixed-length element that extends beyond some of the sequences of a new ambiguous-length element added to the end of the new filter string by the operation of functional modules 2822 to 2832, functional module 2840 causes functional modules 2842 to 2850 to be performed. Functional module 2842 performs a loop for each character sequence represented by the new ambiguous-length element that has been added to the filter string. The loop performed for each such character sequence includes a loop 2844, which is performed for each character sequence that agrees with the current old ambiguous fixed-length element of loop 2834. Loop 2844 includes functional module 2846, which tests whether the old element matches and extends beyond the current sequence of the new element. If so, functional module 2848 adds to the list of character sequences represented by the new ambiguous-length element a new sequence consisting of the current sequence from the new element followed by the part of the sequence from the old element that extends beyond that current sequence.
If the current old element is an ambiguous-length element containing any character sequences that extend beyond a new fixed-length element that has been added to the new filter, functional module 2900 of Figure 29 causes functional modules 2902 to 2910 to be performed.
Functional module 2902 is a loop performed for each sequence represented by the old ambiguous-length element. It consists of test 2904, which checks whether the current sequence from the old element matches and extends beyond the new fixed-length element. If so, functional module 2906 creates a new character sequence corresponding to the extension of the old element beyond the new element. After this loop has completed, functional module 2908 tests whether any new sequences have been created by functional module 2906 and, if so, causes functional module 2910 to add a new ambiguous-length element to the end of the new filter, after the new element. This new ambiguous-length element represents the possibility of each of the sequences created by functional module 2906. Preferably, a probability score is associated with each such new sequence, based on the relative probability scores of the character sequences found by loop 2902 to match the current new fixed-length element.
If the current old element is an ambiguous-length element that has some character sequences extending beyond a new ambiguous-length element, functional module 2912 causes functional modules 2914 to 2920 to be performed. Functional module 2914 is a loop performed for each character sequence of the new ambiguous-length element. It consists of an inner loop 2916, which is performed for each character sequence of the old ambiguous-length element. This inner loop consists of functional modules 2918 and 2920, which test whether the character sequence from the old element matches and extends beyond the current character sequence from the new element. If so, they associate with the new ambiguous-length element a new character sequence consisting of the current sequence from the new element followed by the extension from the current old element's character sequence.
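The nested loops 2914 and 2916 could be sketched, for illustration only, as follows. The sketch assumes that an old sequence "matches" a new sequence when the new sequence is a prefix of it; the function name `extend_sequences` is hypothetical and probability scores are omitted.

```python
def extend_sequences(new_seqs, old_seqs):
    """Merge old ambiguous-length sequences into a new ambiguous-length element."""
    result = list(new_seqs)
    for new in new_seqs:                      # loop 2914
        for old in old_seqs:                  # inner loop 2916
            # Module 2918: old sequence matches and extends beyond the new one.
            if old.startswith(new) and len(old) > len(new):
                # Module 2920: new sequence plus the extension from the old one.
                combined = new + old[len(new):]
                if combined not in result:
                    result.append(combined)
    return result

print(extend_sequences(["ca", "co"], ["cat", "dog", "cot"]))
```

Old sequences that disagree with every new sequence, such as "dog" in this example, contribute nothing, so the edit overrides conflicting earlier information while preserving compatible information to its right.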
Once all the functions of loop 2834 have been completed, functional module 2924 returns from the call to the filter edit routine with the new filter string created by that call.
It should be appreciated that in many embodiments of various aspects of the invention, a different and often simpler filter-editing scheme can be used. It should also be appreciated, however, that a major advantage of the filter-editing scheme shown in Figures 28 and 29 is that it enables the user to enter an ambiguous filter quickly, for example by continuous letter recognition, and then to edit it by more reliable alphabetic entry modes, or even by subsequent continuous letter recognition. For example, this scheme allows a filter entered by continuous letter recognition to be replaced, wholly or in part, by input from discrete letter recognition, ICA word recognition, or even handwriting recognition. Under this scheme, when the user edits an earlier part of the filter string, the information contained in the later part of the filter string is not destroyed unless the user indicates such an intent, which in the illustrated embodiment is done by use of the backspace character.
Returning now to Figure 17, when the call to the filter edit routine by functional module 1718 returns, functional module 1720 calls the display-choice-list routine for the selection with the new filter string returned by the call to the filter edit routine.
Whenever filter input is received, whether as the result of recognition performed in response to pressing the Filter button, as described above with regard to functional module 1612 of Figure 16, or by any other means, functional modules 1722 to 1738 are performed.
Functional module 1724 tests whether the system is in one-at-a-time recognition mode and whether the filter input was produced by speech recognition. If so, it causes functional modules 1726 to 1730 to be performed. Functional module 1726 tests whether a filter character choice window, such as window 3906 shown in Figure 39, is currently displayed. If so, functional module 1728 closes that filter choice window, and functional module 1730 calls the filter edit routine with the first-choice filter character as input. This causes all previous characters of the filter string to be treated as an unambiguously defined filter sequence. Regardless of the outcome of the test of functional module 1726, functional module 1732 calls the filter edit routine for the new filter input, that is, the filter input that caused functional module 1722 and the functional modules listed below it to be performed. Functional module 1734 then calls the display-choice-list routine for the current selection and the new filter string. Then, if the system is in one-at-a-time mode, functional modules 1736 and 1738 call the filter character choice routine with the filter string returned by the filter edit routine and with the newly recognized filter input character as the selected filter character.
FIG. 30 illustrates the operation of the filter character option subroutine 3000. It includes function 3002, which tests whether the selected filter character with which the routine was called corresponds either to an ambiguous character or to an unambiguous character in the current filter string that has several best-choice characters associated with it. If so, function 3004 sets the filter character option list equal to all the characters associated with that character. If there are more characters than fit in the filter character option list at one time, the option list can have scroll buttons that let the user see the additional characters. Preferably, the options are displayed in alphabetical order, which makes it easier for the user to scan quickly to the desired character. The filter character option routine of FIG. 30 also includes function 3006, which tests whether the selected filter character corresponds to an ambiguous-length filter string element in the current filter string. If so, it causes functions 3008 to 3014 to be performed. Function 3008 tests whether the selected filter character is the first character of the ambiguous-length element. If so, function 3010 sets the filter character option list equal to the first characters of all the character strings associated with that ambiguous element. If the selected filter character does not correspond to the first character of the ambiguous-length element, functions 3012 to 3014 set the filter character option list equal to all the characters, in any character string represented by the ambiguous element, that are preceded by the same characters as the selected filter character in the current first option. Once either functions 3002 and 3004 or functions 3006 to 3014 have produced the filter character option list, function 3016 displays that option list in a window, such as the window 3906 shown in FIG. 39.
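The selection logic of subroutine 3000 can be pictured with a short sketch. The following Python is only an illustration under stated assumptions: the function name `filter_char_choices` and the representation of a filter element as a plain list of alternative strings are hypothetical and are not taken from the described embodiment. A single ambiguous character is modeled as a list of one-character alternatives.

```python
def filter_char_choices(alternatives, offset, shown):
    """Build an alphabetized option list for one filter character.

    alternatives: strings the filter element may stand for (hypothetical model;
                  a single ambiguous character is a list of 1-char strings).
    offset:       index of the selected character inside the element.
    shown:        the element's text as it appears in the current first option.
    """
    # Only alternatives preceded by the same characters as the selected
    # character in the current first option contribute a character.
    prefix = shown[:offset]
    chars = {
        s[offset]
        for s in alternatives
        if len(s) > offset and s[:offset] == prefix
    }
    # Options are listed alphabetically so the user can scan them quickly.
    return sorted(chars)


# Ambiguous-length element, first character selected (offset 0).
alts = ["trouble", "treble", "double", "tremble"]
print(filter_char_choices(alts, 0, "treble"))
# Third character selected: only strings starting with "tr" contribute.
print(filter_char_choices(alts, 2, "treble"))
```

With an offset of zero the prefix is empty, so every alternative contributes its first character; this mirrors the case handled by function 3010, while a non-zero offset corresponds loosely to functions 3012 to 3014.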
If the software input panel program receives the user's selection of a filter character option in a filter character option window, function 1740 causes functions 1742 to 1746 to be performed. Function 1742 closes the filter option window in which the selection was made. Function 1744 calls the filter edit function for the current filter string, with the character selected in the filter option window as the new input. Function 1746 then calls the display-option-list routine with the new filter string returned by the filter edit routine.
If the user drags upward from a character of the filter string, in the manner shown in the correction windows 4526 and 4538 of FIG. 45, function 1747 causes functions 1748 to 1750 to be performed. Function 1748 calls the filter character option routine for the character that was dragged from, which causes a filter character option window to be produced if any other character options are associated with that character. If the drag is released over a filter character option in this window, function 1749 generates a selection of the filter character option over which the release took place. It thereby causes the functions 1740 to 1746 just described to be performed. If the drag is released anywhere other than on an option of the filter character option window, function 1750 closes the filter option window.
If a re-utterance is received other than by pressing the re-utterance button described above with regard to functions 1602 and 1610, for example by pressing the large vocabulary button or the name vocabulary button in correction mode, as described above with regard to functions 1350, 1356 and functions 1414 and 1416 of FIGS. 13 and 14, respectively, function 1752 of FIG. 17 causes functions 1754 and 1756 to be performed. Function 1754 adds any such new utterance to the utterance list of the correction window's selection, and function 1756 calls the display-option-list routine for the selection so that re-recognition is performed using the new utterance.
Turning now to FIGS. 31 to 41, we will illustrate how the user interface just described can be used to dictate a sequence of text. In this particular sequence, the interface is shown in one-at-a-time mode, which is a discrete recognition mode in which a correction window with an option list is displayed each time a discrete utterance is recognized.
In FIG. 31, numeral 3100 points to a screenshot of the PDA screen showing the user tapping the Talk button 1102 to begin dictation in a new linguistic context. The highlighting of the large vocabulary button 1132 shows that the software input panel recognizer is in large vocabulary mode. The sequence of separated dots on the continuous/discrete button 1134 indicates that the recognizer is in a discrete recognition mode. It is assumed that the software input panel is in the press-and-click-to-end-of-utterance recognition duration mode described with regard to numerals 1810 to 1816 of FIG. 18. As a result, the click of the Talk button causes recognition to take place until the end of the next utterance. Numeral 3102 represents the user's utterance of the word "this". Numeral 3104 points to an image of the PDA screen after the system has responded to this utterance by placing the recognized text 3106 in the text window 1104 of the software input panel, outputting this text to the application window 1106, and displaying a correction window 1200 that includes the recognized word in the first option window 1202 and a first option list 1208.
In the example of FIG. 31, the user taps the capitalization button 1222, as pointed to by numeral 3108. This causes the PDA screen to have the appearance pointed to by 3110, in which the current first option and the text output in both the software input panel buffer and the application window have been changed to begin with a capital letter.
In the example, the user taps the Continue button 1104, as pointed to by numeral 3102, and says the word "be", as pointed to by numeral 3114. It is assumed that this utterance is misrecognized as the word "it", which causes the PDA screen to have the appearance pointed to by numeral 3116, in which a new correction window 1200 is displayed with the misrecognized word as its first option 3118 and a new option list 1208 for this recognition.
FIG. 32 shows a continuation of this example, in which the user taps the option word "be" 3200 in the image pointed to by numeral 3202. This causes the PDA screen to have the appearance indicated by numeral 3204, in which the correction window has been removed and the corrected text appears in both the software input panel buffer window and the application window.
In the screenshot pointed to by numeral 3206, the user is shown tapping the letter-name vocabulary button 1130, which changes the current recognition mode to the letter-name vocabulary, as indicated by the highlighting of button 1130. As noted above with regard to functions 1410 and 1412, tapping this button starts speech recognition according to the current recognition duration mode. This causes the system to recognize the subsequent utterance of the letter name "e", as pointed to by numeral 3208.
To emphasize the ability of the present interface to correct recognition errors quickly, the example assumes that the system misrecognizes this letter as the letter "p" 3211, as shown in the correction window that is displayed in one-at-a-time mode in response to the utterance 3208. As can be seen in the correction window pointed to by 3210, however, the correct letter "e" is one of the options shown in the correction window. In the view of the correction window pointed to by numeral 3214, the user taps the option 3212, which causes the PDA screen to have the appearance pointed to by numeral 3216, in which the correct letter is entered in both the software input panel buffer and the application window.
FIG. 33 illustrates a continuation of this example, in which the user taps the punctuation vocabulary button 1124, as shown in the corresponding screenshot. This changes the recognition vocabulary to the punctuation vocabulary, as indicated by the highlighting pointed to by numeral 3302, and starts recognition of the utterance of the word "period" pointed to by numeral 3300. This gives rise to the correction window pointed to by 3304, in which the punctuation mark "." is shown in the first option window followed by the name of that punctuation mark, so that the user can recognize it easily.
Since, in the example, this is the correct recognition, the user confirms it and starts recognition of a new utterance using the letter-name vocabulary by pressing button 1130, as shown in the screenshot pointed to by numeral 3306, and saying the utterance 3308 of the letter "l". This entry process is repeated until the PDA screen has the appearance shown at numeral 3312. At this point, it is assumed that the user drags across the text "e.l.v.i.s.", as shown in screenshot 3314, which causes that text to be selected and causes the correction window 1200 of screenshot 3400, near the left corner of the figure, to be displayed. Since the selected text string is assumed not to be in the current vocabulary, no alternative options are displayed in this option list. In the view of the correction window pointed to by 3402, the user taps the word form button 1220, which calls the word form list routine described above with regard to FIG. 27. Since the selected text string contains spaces, it is treated as a multiple-word selection, which causes the part of the routine shown as functions 2716 to 2728 in FIG. 27 to be performed. This produces an option list, such as that pointed to by 3404, that includes an option 3406 in which the spaces have been removed from the correction window's selection. In the example, the user taps the Edit button 1232 next to the option 3406, as shown in the correction window pointed to by numeral 3410. This causes option 3406 to be selected as the first option, as shown in the correction window pointed to by 3412. The user taps the capitalization button until the first option is entirely in capital letters, at which point the correction window has the appearance pointed to in screenshot 3414. The user then taps the punctuation vocabulary button 1124, as pointed to by 3416, and says the utterance "comma" pointed to by 3418. In the example, it is assumed that this utterance is correctly recognized, which causes the correction window 1200 pointed to by numeral 3420 to be displayed and the former first option "e.l.v.i.s." to be output as text.
FIG. 35 is a continuation of this example. In it, it is assumed that the user taps the large vocabulary button, as indicated by numeral 3500, and then says the utterance "the" 3502. This causes the correction window 3504 to be displayed. The user responds by confirming this recognition, pressing the large vocabulary button again as shown at 3506 and saying the utterance "embedded" pointed to by 3508. In this example, this causes the correction window 3510 to be displayed, in which the utterance has been misrecognized as "imbeded" and the desired word is not displayed in the first option list. Starting from this point, as indicated by the note 3512, a number of different correction options will be illustrated.
FIG. 36 illustrates the correction option of scrolling through the first and second option lists associated with the misrecognition. In the view of the correction window pointed to by 3604, the user is shown tapping the page-down scroll button 3600 in the scroll bar 3602 of the correction window, which causes the first option list 3603 to be replaced by the first screenful of the second option list, as shown in the correction window 3606. As can be seen in this view, the slider 3608 of the correction window has moved down below the horizontal bar 3609, which marks the position in the scroll bar associated with the end of the first option list. In the example, the desired word is not in the portion of the alphabetically ordered second option list shown in view 3606, so the user presses the page-down button of the scroll bar as shown at 3610. This causes the correction window to have the appearance shown in view 3612, in which a new screenful of alphabetically listed options is displayed. As indicated by 3616, the desired word "embedded" appears in this option list. In the example, the user taps the option button 3619 associated with this desired option, as shown in the view of the correction window pointed to by 3618. This causes the correction window to have the appearance pointed to by 3620, in which this option is displayed in the first option window. In the example, the user taps the capitalization button, as pointed to by numeral 3622, which causes this first option to begin with a capital letter, as shown in screenshot 3624.
It can thus be seen that the software input panel interface provides a rapid way for the user to select from among a relatively large number of recognition options. In the embodiment shown, the first option list consists of up to six options, and the second option list can include up to three additional screens holding up to 18 additional options. Since the options are arranged alphabetically and all four screens can be viewed in less than one second, this lets the user select very quickly from among up to 24 options.
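The arithmetic above (six options on the first list plus three screens of six, for 24 in total) can be sketched as follows. This is a minimal illustration, not the embodiment's code: the function name `build_option_lists`, its parameters, and the assumption that the input is already ordered from best to worst score are all hypothetical.

```python
def build_option_lists(ranked, page_size=6, extra_pages=3):
    """Split score-ranked candidates into a first list and paged second list.

    ranked is assumed to be ordered best score first (an assumption of this
    sketch). Each list is shown alphabetically, as in the described embodiment.
    """
    # First option list: the best-scoring page_size candidates.
    first = sorted(ranked[:page_size], key=str.lower)
    # Second option list: up to extra_pages further screens of candidates.
    limit = page_size + page_size * extra_pages
    rest = sorted(ranked[page_size:limit], key=str.lower)
    pages = [rest[i:i + page_size] for i in range(0, len(rest), page_size)]
    return first, pages


candidates = [f"word{i:02d}" for i in range(40)]
first, pages = build_option_lists(candidates)
print(len(first) + sum(len(p) for p in pages))
```

With the default sizes, at most 6 + 3 × 6 = 24 options are ever offered, however many candidates the recognizer returns.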
FIG. 37 illustrates the method of filtering options by dragging across an initial part of an option, as described above with regard to functions 1664 to 1666 of FIG. 16. In the example of this figure, it is assumed that the first option list includes the option 3702 shown in the view of the correction window pointed to by 3700, and that this option contains the first six characters of the desired word "embodded". As shown in the correction window 3704, the user drags across these six initial letters, and the system responds by displaying a new correction window limited to recognition candidates that begin with an unambiguous filter consisting of those six characters, as shown in screenshot 3706. In this screenshot, the desired word is the first option, the first six unambiguously confirmed letters of the first option are shown highlighted by the box 3708, and the filter cursor 3710 is also shown.
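The effect of dragging across the first letters of an option can be approximated by a plain prefix filter. The sketch below is an assumption-laden illustration: the name `filter_by_prefix` is hypothetical, the sample words are invented, and case-insensitive matching is a choice made for this example rather than something stated in the description.

```python
def filter_by_prefix(candidates, prefix):
    """Keep only candidates that begin with the unambiguously confirmed prefix."""
    p = prefix.lower()
    return [w for w in candidates if w.lower().startswith(p)]


# Hypothetical candidate list; the user drags across the first six letters.
words = ["embodiment", "embedded", "embedding", "imbedded", "Embeds"]
dragged = "embedded"[:6]
print(filter_by_prefix(words, dragged))
```

The original order of the candidates is preserved, so if the list arrives ordered by recognition score, the best-scoring survivor stays first.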
FIG. 38 illustrates the method of filtering options by dragging across two options in the option list, as described with regard to functions 1706 to 1708 of FIG. 17. In this example, the desired option "embodded" falls alphabetically between the two options 3802 and 3804 displayed in the correction window 3800. As shown in view 3806, the user indicates that the desired word falls within this range of the alphabet by dragging across these two options. This causes a new correction window to be displayed in which the possible options are limited to words that occur in the selected range of the alphabet, as indicated by screenshot 3808. In this example, it is assumed that the desired word is selected as the first option as a result of the filtering caused by the selection shown at 3806. In this screenshot, the part of the first option that forms the common initial portion of the two options selected in view 3806 is indicated as the unambiguously confirmed portion 3810 of the filter string, and the filter cursor 3812 is placed after that confirmed filter portion.
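Range filtering of this kind can be sketched as an alphabetical comparison plus a common-prefix computation. The following Python is illustrative only: the function name `filter_by_range`, the inclusive treatment of the two bounding options, and the sample words are assumptions of this sketch and are not taken from the description.

```python
import os


def filter_by_range(candidates, option_a, option_b):
    """Limit candidates to the alphabetical range spanned by two options.

    Returns the surviving candidates and the characters shared by the start
    of both bounding options, which play the role of the unambiguously
    confirmed filter portion. Bounds are treated as inclusive (an assumption).
    """
    lo, hi = sorted((option_a.lower(), option_b.lower()))
    kept = [w for w in candidates if lo <= w.lower() <= hi]
    # commonprefix compares character by character, which is what is needed.
    confirmed = os.path.commonprefix([lo, hi])
    return kept, confirmed


words = ["embargo", "embed", "embedded", "ember", "emblem", "empty"]
kept, confirmed = filter_by_range(words, "embed", "ember")
print(kept, confirmed)
```

Every word between the two bounds necessarily starts with their common prefix, which is why that prefix can safely be shown as confirmed.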
FIG. 39 illustrates a method in which alphabetic filtering is used in one-at-a-time mode to help select the desired word option. In this example, the user presses the filter button, as shown in the view of the correction window 3900. It is assumed that the default filter vocabulary is the letter-name vocabulary. Pressing the filter button starts speech recognition for the next utterance, and the user says the letter "e", as indicated at 3902. This causes the correction window 3904 to be displayed, in which it is assumed that the filter character has been misrecognized as "p". In the embodiment shown, in one-at-a-time mode, alphabetic input also has an option list displayed for its recognition. In this case, it is a filter character option list window 3906 of the type described above with regard to the filter character option subroutine of FIG. 30. In the example, the user selects the desired filter character, the letter "e", as shown at 3908, which causes a new correction window 3900 to be displayed. In the example, the user decides to enter an additional filter letter by pressing the filter button again, as shown at 3912, and then saying the utterance "m" 3914. This causes the correction window 3916 to be displayed, together with the filter character option window 3918. In this correction window, the filter character has been correctly recognized, and the user can confirm it either by saying an additional filter character or by selecting the correct letter, as shown in window 3916. This confirmation of the desired filter character causes a new correction window to be displayed, with the filter characters "em" treated as an unambiguously confirmed filter string. In the example shown in screenshot 3920, this causes the desired word to be recognized.
FIG. 40 illustrates a method of alphabetic filtering using AlphaBravo, or ICA word, alphabetic spelling. In screenshot 4000, the user taps the AlphaBravo button 1128. This changes the alphabet to the ICA word alphabet, as described with regard to functions 1402 to 1408 of FIG. 14. In this example, it is assumed that the Display_Alpha_On_Double_Click parameter has not been set. Thus, function 1406 of FIG. 14 displays the list of ICA words 4002 shown in screenshot 4004 while the AlphaBravo button 1128 is pressed. In the example, the user enters the ICA word "echo", which represents the letter "e", followed by a second press of the AlphaBravo button, as shown at 4008, and the utterance of a second ICA word, "Mike", which represents the letter "m". In the example, the input of these two alphabetic filter characters successfully creates an unambiguous filter string consisting of the desired letters "em" and produces recognition of the desired word "embodded".
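The mapping from ICA words to filter letters can be sketched with a small lookup table. This sketch assumes that the ICA word alphabet corresponds to the familiar radio spelling alphabet, in which each word stands for its own first letter; the text above confirms only "echo" for "e" and "Mike" for "m". The names `ICA_WORDS`, `ICA_TO_LETTER` and `spell_from_ica`, and the exact spellings in the word list, are hypothetical.

```python
# Assumed word list: the common radio spelling alphabet (spellings may differ
# from the vocabulary actually used by the described recognizer).
ICA_WORDS = (
    "alpha bravo charlie delta echo foxtrot golf hotel india juliet "
    "kilo lima mike november oscar papa quebec romeo sierra tango "
    "uniform victor whiskey xray yankee zulu"
).split()

# Each word represents its own first letter.
ICA_TO_LETTER = {word: word[0] for word in ICA_WORDS}


def spell_from_ica(recognized_words):
    """Turn a sequence of recognized ICA words into a filter string."""
    return "".join(ICA_TO_LETTER[w.lower()] for w in recognized_words)


print(spell_from_ica(["echo", "Mike"]))
```

Because each spoken word is long and acoustically distinct, the resulting filter letters can reasonably be treated as unambiguous, which is the behavior described for this mode.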
FIG. 41 illustrates a method in which the user selects part of an option as a filter and then uses AlphaBravo spelling to complete the selection of a word that is not in the system's vocabulary, in this case the word "embodded".
In this example, the user is presented with the correction window 4100, which includes an option 4100 containing the first six letters of the desired word. As shown in the correction window 4104, the user drags across these first six letters, causing them to become unambiguously confirmed characters of the current filter string. This results in the correction window 4106. Screenshot 4108 shows the display of this correction window, in which the user drags from the filter button 1218 and releases on the discrete/continuous button 1134, switching from the discrete filter dictation mode to the continuous filter dictation mode, as indicated by the continuous line on the button shown in screenshot 4108. In screenshot 4110, the user presses the alpha button again and says an utterance made up of the ICA words "echo", "delta", "sierra" and "tango", with "echo" spoken twice. This causes the current filter string to correspond to the spelling of the desired word. Since no word in the vocabulary matches this filter string, the filter string itself becomes the first option, as shown in the correction window 4114. In the view of this window shown at 4116, the user taps the check button to indicate selection of the first option, causing the PDA to have the display shown at 4108.
FIGS. 42 to 44 illustrate the dictation, recognition and correction of continuous speech. In screenshot 4200, the user taps the Clear button 1112, described above with regard to functions 1310 to 1314 of FIG. 13. This causes the text in the software input panel buffer 1104 to be cleared without causing any change to the corresponding text in the application window 1106, as indicated by screenshot 4204. In screenshot 4204, the user taps the continuous/discrete button 1134, which causes the button to change from the sequence of dots shown in screenshot 4200, which indicates discrete recognition, to the continuous line that indicates continuous recognition. This starts speech recognition according to the current recognition duration mode and, as indicated by numeral 4206, the user says a continuous utterance of the following words: "large vocabulary interface system from voice signal technologies period". The system responds by recognizing this utterance and placing the recognized text in the software input panel buffer 1104 and, through the operating system, in the application window 1106, as shown in screenshot 4208. Because the recognized text is slightly more than fits in the software input panel window at one time, the user scrolls the software input panel window, as shown at numeral 4210, and taps the word "vocabularies" 4214, which causes functions 1436 to 1438 of FIG. 14 to select that word and generate a correction window for it. The correction window 4216 is displayed in response. In the example, the desired word "vocabulary" 4218 is in the option list of this correction window and, in the view of the correction window 4220, the user taps this word to select it, which causes the selected word to replace the word "vocabularies" in both the software input panel buffer and the application window.
Continuing now with FIG. 43, this correction is shown in screenshot 4300. In the example, the user selects the four misrecognized words "enter faces men rum" by dragging across them, as indicated in view 4302. This causes functions 1502 and 1504 to display an option window with the dragged words as the selection, as shown in view 4304.
FIG. 44 illustrates how the selection in the correction window shown at the bottom of FIG. 43 can be corrected by a combination of horizontal and vertical scrolling of the correction window and the options displayed in it. Numeral 4400 points to a view of the same correction window shown at 4304 in FIG. 43. In this view, not only a vertical scroll bar 4602 but also a horizontal scroll bar 4402 is displayed. The user taps the page-down button 3600 in the vertical scroll bar, which causes the displayed portion of the option list to move from the one-page, alphabetically ordered first option list shown at 4400 to the first page of the alphabetically ordered second option list. In the example, none of the recognition candidates in this portion of the second option list begins with a character string matching the desired recognition output, which is "interface system from", so the user taps the page-down scroll button 3600 again, as indicated by numeral 4408. This causes the correction window to have the appearance shown at 4410, in which two of the displayed options 4412 begin with a character string matching the desired recognition output. To see whether the endings of these recognition candidates match the desired output, the user scrolls on the horizontal scroll bar 4402, as shown at 4414. This lets the user see that option 4418 matches the desired output. As shown at 4420, the user taps this option, which causes it to be inserted into the dictated text, both in the software input panel window 1104 and in the application window 1106, as shown in screenshot 4422.
FIG. 45 illustrates how an ambiguous filter, created by the recognition of continuously spoken letter names and edited through filter character option windows, can be used to correct an erroneous dictation rapidly. In this example, the user presses the Talk button 1102, as shown at 4500, and then says the word "trouble", as indicated at 4502. In the example, it is assumed that this utterance is misrecognized as "treble", as shown at 4504. The user taps the word "treble", as shown at 4506, which causes the correction window shown at 4508 to be displayed. Since the desired word is not displayed as any of the options, the user taps the filter button 1218, as shown at 4510, and makes a continuous utterance 4512 containing the name of each letter of the desired word "trouble". In this example, it is assumed that the filter recognition mode is set to include continuous letter-name recognition.
In the example, the system responds to the recognition of utterance 4512 by displaying the option list 4518. In this example, it is assumed that the result of recognizing this utterance is a filter string consisting of one ambiguous-length element. As described above with regard to functions 2644 to 2652, an ambiguous-length filter element allows any recognition candidate that contains, in the corresponding portion of its initial character sequence, one of the character strings represented by that ambiguous element. In the correction window 4518, the portion of the first option word 4519 that corresponds to the ambiguous filter element is indicated by the ambiguous filter indicator 4520. Since the filter uses an ambiguous element, the displayed option list contains best-scoring recognition candidates that begin with different character sequences, including sequences shorter than the portion of the first option that corresponds to a character string represented by the ambiguous element.
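How an ambiguous-length filter element admits candidates can be sketched as a prefix test against several alternative strings. The Python below is a simplified illustration: the function name `passes_ambiguous_filter`, the split of the filter into a confirmed prefix plus one ambiguous element, and the sample alternatives are assumptions of this sketch, not details of functions 2644 to 2652.

```python
def passes_ambiguous_filter(candidate, confirmed, alternatives):
    """Test a recognition candidate against a filter string.

    confirmed:    unambiguously confirmed leading characters (may be empty).
    alternatives: character strings represented by the single ambiguous-length
                  element that follows the confirmed part (hypothetical model).
    """
    word = candidate.lower()
    if not word.startswith(confirmed):
        return False
    rest = word[len(confirmed):]
    # The candidate passes if what follows the confirmed part begins with
    # any one of the strings the ambiguous element may represent.
    return any(rest.startswith(alt) for alt in alternatives)


alts = ["trouble", "treble", "tr", "double"]
words = ["trouble", "treble", "tree", "double", "bubble"]
print([w for w in words if passes_ambiguous_filter(w, "", alts)])
```

Because the alternatives may differ in length, candidates with quite different openings can all pass, which matches the varied option list described for the correction window 4518.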
In the example, the user drags upward from the first character of the first option, which causes functions 1747 to 1750 of FIG. 17 to be performed. This causes the filter option window 4526 to be displayed. As shown in the correction window 4524, the user drags to the desired character and releases the drag at that position, which causes functions 1749 and 1740 to 1746 to be performed. These close the filter option window, call the filter edit routine with the selected character as an unambiguous correction to the previously ambiguous filter element, and cause a new correction window to be displayed with the new filter, as shown at 4528. As this correction window shows, the first option 4530 is displayed with an unambiguous filter indicator for its first letter "t" and an ambiguous filter indicator 4534 for its remaining characters. Next, as shown in the view of the same correction window at 4536, the user drags upward from the fifth letter "p" of the new first option, which causes a new correction window 4538 to be displayed. When the user releases this drag on the character "p", that character and all the characters preceding it in the first option become unambiguously defined in the current filter. This is indicated in the new correction window 4540 displayed as a result of the selection, in which the first option 4542 is the desired word, the unambiguous portion of the filter is indicated by the unambiguous filter indicator 4544, and the remainder of the ambiguous filter element stays in the filter string through the operation of functions 2900 to 2910, as shown in FIG. 29.
FIG. 46 illustrates that the software input panel recognizer also allows the user to enter text and filter information by using a character recognizer similar to the character recognizer supplied with the Windows CE operating system.
As shown in screenshot 4600 of this figure, if the user drags upward from the function key, functions 1428 and 1430 of FIG. 14 display a function menu 4602, and if the user releases on the character recognition entry 4604 of the menu, the character recognition mode described in FIG. 47 is turned on.
As shown in FIG. 47, this causes function 4702 to display the character recognition window 4608 shown in FIG. 46 and then to enter an input loop 4704, which is repeated until the user chooses to leave the window by selecting another input option on the function menu 4602. While in this loop, if the user touches the character recognition window, function 4706 records "ink" for the duration of the touch, that is, a record of the motion, if any, of the touch across the portion of the display's touch screen that corresponds to the character recognition window. If the user releases a touch in this window, functions 4708 to 4714 are performed. Function 4710 performs character recognition on the "ink" currently in the window. Function 4712 clears the character recognition window, as indicated by numeral 4610 of FIG. 46. Function 4714 supplies the corresponding recognized character to the software input panel buffer and the operating system.
FIG. 48 illustrates that if the user selects the handwriting recognition option of the function menu shown in screenshot 4600, a handwriting recognition entry window 4800 is displayed together with the software input panel, as shown in screenshot 4802.
The operation of the handwriting mode is shown in FIG. 49. When this mode is entered, function 4902 displays the handwriting recognition window, and a loop 4903 is then entered until the user chooses to use another input option. In this loop, if the user touches the handwriting recognition window anywhere other than on the delete button 4804 shown in FIG. 48, function 4904 records as "ink" the motion, if any, that occurs during the touch. If the user touches the REC button area 4806 shown in FIG. 48, function 4905 causes functions 4906 to 4910 to be performed. Function 4906 performs handwriting recognition on any "ink" previously entered in the handwriting recognition window. Function 4908 supplies the recognized output to the software input panel buffer and the operating system, and function 4910 clears the recognition window. If the user presses the delete button 4804 shown in FIG. 48, functions 4912 and 4914 clear any "ink" from the recognition window.
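The loop of FIG. 49 can be summarized as a small event-driven sketch. This is an illustration under stated assumptions: the event names ("stroke", "rec", "delete", "exit"), the function name `run_handwriting_mode`, and the recognizer passed in as a callable are all hypothetical, and skipping recognition when no ink is present is a choice made for this sketch.

```python
def run_handwriting_mode(events, recognize):
    """Process touch events for a handwriting entry window.

    events:    iterable of (kind, payload) tuples (hypothetical event model).
    recognize: callable taking the list of recorded strokes and returning
               text; it stands in for the real handwriting recognizer.
    Returns the list of recognized outputs, in order.
    """
    ink = []      # strokes recorded since the window was last cleared
    output = []   # text handed on to the input panel buffer
    for kind, payload in events:
        if kind == "stroke":        # touch in the writing area: record ink
            ink.append(payload)
        elif kind == "rec":         # REC button: recognize, output, clear
            if ink:
                output.append(recognize(ink))
            ink = []
        elif kind == "delete":      # delete button: discard pending ink
            ink = []
        elif kind == "exit":        # user selected another input option
            break
    return output


def fake_recognizer(strokes):
    # Stand-in recognizer: each stroke payload is already a letter.
    return "".join(strokes)


events = [("stroke", "e"), ("stroke", "m"), ("rec", None),
          ("stroke", "x"), ("delete", None),
          ("stroke", "b"), ("rec", None), ("exit", None)]
print(run_handwriting_mode(events, fake_recognizer))
```

Keeping the recognizer as a parameter makes the loop testable without any real ink data, which is why a trivial stand-in is used here.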
It should be appreciated that the recognition button 4806 allows the user both to instruct the system to recognize the "ink" previously written in the handwriting recognition window and to begin writing a new word to be recognized.
FIG. 50 shows the keypad 5000, which can also be selected from the function menu.
Having character recognition, handwriting recognition and keyboard input methods quickly available as part of the speech recognition software input panel is often extremely advantageous, because it lets the user switch back and forth between these different modes in less than a second, depending on which is most convenient at the moment. It also allows the output of all these modes to be used in editing the text in the software input panel buffer.
As shown in FIG. 51, in one embodiment of the software input panel, if the user drags from the filter button 1218, a window 5100 is displayed that offers the user a choice of filter entry modes. These include letter-name speech recognition, AlphaBravo speech recognition, character recognition, handwriting recognition and the keyboard window as alternative methods of entering filter spellings. The window also lets the user select whether each speech recognition mode is discrete or continuous, and whether letter-name recognition, character recognition and handwriting recognition entries are to be treated as ambiguous in the filter string. This user interface lets the user quickly select the filter entry mode that suits the current time and place. For example, in a quiet location where one need not worry about offending people by speaking, continuous letter-name recognition is often very useful. In a location where there is a lot of noise, but where the user feels that speech would not offend neighbors, AlphaBravo recognition may be more appropriate. In a location such as a library, where speaking may offend others who are being quiet, silent filter entry methods such as character recognition, handwriting recognition or keyboard input may be more appropriate.
FIG. 52 provides an example of how character recognition can be quickly selected to filter a recognition. Numeral 5200 shows a portion of a correction window in which the user has pressed the filter button and dragged, causing the filter entry mode menu 5100 shown in FIG. 51 to be displayed, and has then selected the character recognition option. As shown in screenshot 5202, this causes the character recognition entry window 4608 to be displayed in a location that allows the user to see the entire correction window. In screenshot 5202, the user has written the character "e", and when he lifts his stylus from the drawing of that letter, the letter is entered into the filter string, causing the correction window 5204 to be displayed in the example. The user then similarly enters an additional character "m" into the character recognition window, as indicated at 5206, and when he lifts his stylus from the drawing of this letter, the recognition of the character "m" causes the filter string to contain "em", as shown at 5208.
FIG. 53 starts with a partial screenshot 5300 in which the user has tapped and dragged away from the filter button 1218, causing the filter entry mode menu to be displayed, and has selected the handwriting option. This displays a screen such as 5302, in which a handwriting entry window 4800 is shown at a location that does not block the view of the correction window. In screenshot 5302 the user has handwritten the letters "embed" in continuous cursive script and then presses the "REC" button to cause recognition of those characters. Once he has tapped that button, an ambiguous filter string, indicated by the ambiguous filter indicator 5304, is displayed in the first choice window, corresponding to the recognized characters, as shown in the correction window 5306. FIG. 54 shows how the user can use a keypad window 5000 to enter alphabetic filtering information.
FIG. 55 illustrates how speech recognition can be used to correct handwriting recognition. Screenshot 5500 shows a handwriting entry window 4800 positioned for entering text into the software input panel buffer window 1104. In this screenshot the user has just finished writing a word. Numerals 5502 through 5510 indicate the handwriting of five additional words. The word in each of these views is started by a touch of the "Rec" button, which causes recognition of the previously written word. Numeral 5512 points to a handwriting recognition window in which the user makes a final tap on the "Rec" button to cause recognition of the last handwritten word, "speech". In the example of FIG. 55, after this sequence of handwriting input has been recognized, the software input panel buffer window 1104 and the application window 1106 have the appearance shown in screenshot 5514. As indicated at 5516, the user drags across the misrecognized words "snack shower", which causes the correction window 5518 to be displayed. In the example, the user taps the re-utterance button 1216 and then discretely re-utters the intended words "much ... slower". Through the operation of a slightly modified version of the "get choices" function described with regard to FIG. 23, the recognition scores from recognizing the utterance 5520 are combined with the recognition results from the handwritten input indicated by numerals 5504 and 5506 to select the best-scoring recognition candidate, which in this example is the intended pair of words shown at numeral 5522.
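The combination of handwriting and re-utterance scores can be sketched as follows. This is a minimal illustration and not the modified "get choices" function itself: it assumes that each recognizer returns log-likelihood-style scores (higher is better), that scores are combined by addition, and that a candidate missing from one recognizer's list receives a fixed penalty. The function name, the penalty value, and the sample scores are all hypothetical.

```python
def combine_scores(handwriting, speech, missing_penalty=-10.0):
    """Merge two candidate->score maps and return candidates best-first.

    Assumption: scores are additive log-likelihood-style values. A candidate
    absent from one recognizer is given `missing_penalty` for that recognizer.
    """
    combined = {}
    for candidate in set(handwriting) | set(speech):
        combined[candidate] = (handwriting.get(candidate, missing_penalty)
                               + speech.get(candidate, missing_penalty))
    # Sort by descending combined score; ties are broken alphabetically.
    return sorted(combined, key=lambda c: (-combined[c], c))


# Hypothetical scores: handwriting alone prefers the wrong words.
handwriting_scores = {"snack shower": -1.0, "much shower": -2.0, "much slower": -2.5}
speech_scores = {"much slower": -0.5, "much lower": -1.5, "mulch slower": -3.0}

ranked = combine_scores(handwriting_scores, speech_scores)
best = ranked[0]
```

With these sample values the only candidate supported by both inputs, "much slower", ranks first even though neither the handwriting scores alone nor a single input would be guaranteed to select it.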
It should also be appreciated that the user could have pressed the New button in the correction window 5518 instead of the re-utterance button, in which case the recognition of the utterance 5520 would have been used to replace the handwriting output selected as shown at 5516.
As indicated in FIG. 56, if the user had pressed the filter button 1218 instead of the re-utterance button in the correction window 5518, the user could have used speech recognition of a known word, such as the utterance 5600 shown in FIG. 56, to alphabetically filter the handwriting recognition selected at 5516 in FIG. 55.
FIG. 57 illustrates an alternate embodiment 5700 of the software input panel speech recognition interface, in which two separate top-level buttons 5702 and 5704 select discrete and continuous speech recognition, respectively. It will be appreciated that which buttons are provided at the top level of a speech recognition user interface is a matter of design choice. However, the ability to switch quickly between the faster and more natural continuous speech recognition and the more reliable, although slower and more halting, discrete speech recognition can be very desirable, and in some embodiments it justifies allocating separate top-level buttons to the selection of discrete and continuous recognition.
FIG. 58 shows an alternate embodiment of the display choice list routine shown in FIG. 22. It is the same as that routine except that it creates a single, scrollable, score-ordered choice list rather than the two alphabetically ordered choice lists produced by the routine of FIG. 22. The only portions of its language that differ from the language of FIG. 22 are underlined, apart from the fact that functions 2226 and 2228 have also been deleted in the version of the routine shown in FIG. 58.
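The difference between the two list layouts can be sketched as below. This is an illustrative assumption rather than the routines of FIG. 22 or FIG. 58: in particular, the rule used here to split candidates between the two alphabetical lists (the best-scoring few in the first list, the remainder in the second) and all names and scores are hypothetical.

```python
def score_ordered_list(candidates):
    """Single scrollable list, best score first (the FIG. 58 style)."""
    return [word for word, _ in sorted(candidates, key=lambda p: p[1], reverse=True)]


def two_alphabetical_lists(candidates, first_list_size):
    """Two alphabetically ordered lists (the FIG. 22 style).

    Assumption: the first list holds the `first_list_size` best-scoring
    candidates and the second list holds the rest.
    """
    by_score = sorted(candidates, key=lambda p: p[1], reverse=True)
    first = sorted(word for word, _ in by_score[:first_list_size])
    second = sorted(word for word, _ in by_score[first_list_size:])
    return first, second


# Hypothetical candidates with scores (higher is better).
candidates = [("embed", 0.9), ("ember", 0.7), ("emblem", 0.8), ("embark", 0.4)]

single = score_ordered_list(candidates)
first, second = two_alphabetical_lists(candidates, first_list_size=2)
```

A score-ordered list puts the most likely candidate nearest the top, while alphabetical ordering makes a particular spelling easier to find by eye; which is preferable is a design choice.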
FIGS. 67 through 74 show various mappings of a basic telephone number keypad to the functions used in the various modes or menus of the disclosed cell phone speech recognition editor. The main numbered phone key mapping for the editor mode is shown in FIG. 67. FIG. 68 shows the phone key portion of the entry mode menu, which is selected if the user presses the "1" key when in the editor mode. The entry mode menu is used to select among the various text and alphabetic entry modes available on the system. FIG. 69 shows the functions available on the numbered phone keys when a correction window is displayed, which can be caused by pressing the "2" key from the editor mode. FIG. 70 shows the numbered phone key commands available from the edit navigation menu, which is selected by pressing the "3" key when in the editor mode of FIG. 67. This menu is used to change the navigation functions performed by pressing the navigation keys of the phone keypad. FIG. 71 illustrates a somewhat similar correction navigation menu, which displays the navigation options available by pressing the "3" key when in a correction window. In addition to changing the navigation mode while in a correction window, it also allows the user to change the function that is performed when a choice is selected.
FIG. 72 illustrates the numbered phone key mapping in the key Alpha mode, in which pressing a phone key that has letters associated with it causes a prompt to be shown on the cell phone display, asking the user to say the ICA word associated with the desired one of the set of letters associated with the pressed key. This mode is selected by double-clicking the "3" phone key when in the entry mode menu shown in FIG. 68.
FIG. 73 shows a basic keys menu, which allows the user to quickly select from among a set of the most common punctuation and function keys used in text editing, or, by pressing the "1" key, to see a menu that allows the selection of less commonly used punctuation marks. The basic keys menu is selected by pressing the "9" key in the editor mode illustrated in FIG. 67. FIG. 74 illustrates the edit options menu, which is selected by pressing "0" in the editor mode shown in FIG. 67. This menu allows the user to perform basic tasks associated with use of the editor that are not available in the other modes or menus.
At the top of each of the numbered phone key mappings shown in FIGS. 67 through 74 is the title bar that is shown at the top of the cell phone display when that menu or command list is displayed. As FIGS. 67, 69 and 72 show, the title bars of those figures start with the letters "Cmds" to indicate that the displayed options are part of a command list, whereas the title bars in FIGS. 68, 70, 71, 73 and 74 start with "MENU". This indicates the distinction between the command lists shown in FIGS. 67, 69 and 72 and the menus shown in the other figures. A command list displays commands that are available in a mode even when the command list itself is not displayed. When in the editor mode associated with the command list of FIG. 67, or in the key Alpha mode associated with FIG. 72, the text editor window will normally be displayed even though the phone keys have the function mappings shown in those figures. Normally, when in the correction window mode associated with the command list shown in FIG. 69, a correction window is shown on the cell phone display. In all of these modes, the user can access the command list to see the current phone key mapping, as illustrated in FIG. 75, merely by pressing the Menu key, as indicated by numeral 7500 in that figure. In the example shown in FIG. 75, a display screen 7502 shows a window of the editor mode before the Menu button is pressed. When the user presses the Menu button, the first page of the editor command list is shown, as indicated at 7504. The user then has the option of scrolling up or down in the command list to see not only the commands mapped to the numbered phone keys but also the commands mapped to the "Talk" and "End" keys, as shown in screen 7506, and to the navigation keys "OK" and "Menu", as shown in screens 7508 and 7510. If additional options are associated with the current mode at the time the command list is entered, they can also be selected from the command list by scrolling the highlight 7512 and using the "OK" key. In the example shown in FIG. 75, a phone call indicator 7514, having the general shape of a telephone handset, is shown at the left of each title bar to indicate to the user that the cell phone is currently in a telephone call. In this case, extra functions are available that allow the user to quickly mute the microphone, to record only audio from the user's side of the phone conversation, and to play audio back only to the user's side of the phone conversation.
FIGS. 76 through 78 provide a pseudocode description of the functions of the editor mode that is more detailed than the mere command listings shown in FIGS. 67 and 75. This pseudocode is represented as one input loop 7602, in which the editor responds to various user inputs.
If the user inputs one of the navigation commands indicated by numeral 7603, either by pressing one of the navigation keys or by speaking a corresponding navigation command, the functions indented under it in FIG. 76 are performed.
These include a function 7604 that tests whether the editor is currently in word/line navigation mode. This is the most common navigation mode in the editor, and it can be selected quickly by pressing the "3" key twice from the editor. The first press selects the navigation mode menu shown in FIG. 70, and the second press selects the word/line navigation mode from that menu. If the editor is in word/line mode, functions 7606 through 7624 are performed.
If the navigation input is a word-left or word-right command, function 7606 causes functions 7608 through 7617 to be performed. Functions 7608 and 7610 test whether extended selection is on and, if so, move the cursor one word to the left or right, respectively, and extend the previous selection to that word. If extended selection is not on, function 7612 causes functions 7614 through 7617 to be performed. Functions 7614 and 7615 test whether the prior input was a word-left or word-right command in a different direction from the current command, or whether the current command would place the cursor before the start or after the end of the text. If either of these conditions is true, the cursor is placed to the left or right of the previously selected word, outside it, and that previously selected word is deselected. If the conditions tested by function 7614 are not met, function 7617 moves the cursor one word to the left or right of its current position and makes the word it has moved to the current selection.
The operation of functions 7612 through 7617 enables word-left and word-right navigation not only to move the cursor by one word but also, if desired, to select the current word at each move. It also enables the user to switch quickly between a cursor that corresponds to a selected word and a cursor that represents an insertion point before or after a previously selected word.
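The word-left and word-right behavior with extended selection off can be sketched as follows. This is one possible reading of functions 7612 through 7617, not the pseudocode of FIG. 76: the state representation (a selected word index or an insertion point), the function name, and the handling of a move that starts from an insertion point are assumptions made for illustration.

```python
def word_move(n_words, selected, point, direction, previous_direction):
    """Apply one word-left (-1) or word-right (+1) command.

    State is either a selected word index (`selected`, with `point` None) or an
    insertion point (`point` = index of the word to its right, `selected` None).
    Returns the new (selected, point) pair.
    """
    if selected is not None:
        reversed_direction = (previous_direction is not None
                              and previous_direction != direction)
        target = selected + direction
        past_text_end = target < 0 or target >= n_words
        if reversed_direction or past_text_end:
            # Leave the word: cursor becomes an insertion point beside it.
            return None, (selected if direction < 0 else selected + 1)
        return target, None  # move one word and select it
    # Assumption: from an insertion point, select the adjacent word, if any.
    target = point - 1 if direction < 0 else point
    if 0 <= target < n_words:
        return target, None
    return None, point


# Four-word text; start with word 1 selected and no prior command.
step1 = word_move(4, 1, None, +1, None)   # select word 2
step2 = word_move(4, 3, None, +1, +1)     # would run past the end
step3 = word_move(4, 2, None, +1, -1)     # direction reversed
step4 = word_move(4, None, 3, -1, -1)     # from insertion point, select left word
```

Under this reading, repeating a command walks the selection word by word, while reversing direction once converts the selection into an insertion point next to the word.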
If the user input is a line-up or line-down command, function 7620 moves the cursor to the word on the line above or below that is nearest to the current cursor position, and if extended selection is on, function 7624 extends the current selection to that new current word.
As indicated by numeral 7626, the editor also includes programming for responding to navigation input when the editor is in the other navigation modes that can be selected from the edit navigation menu shown in FIG. 70.
If the user selects "OK", either by pressing the button or by using a voice command, function 7630 tests whether the editor has been called to enter text into another program, such as into a field of a web document or a dialog box. If so, function 7632 enters the current content of the editor into that other program at its current text entry location and returns. If the test of function 7630 is not met, function 7634 exits the editor, saving its current text and state for possible later use.
If the user presses the Menu button when in the editor, function 7638 calls the display menu routine for the editor commands, which causes a command list to be displayed for the editor as shown in FIG. 75. As described above, this allows the user to scroll through all of the current command mappings for the editor mode within a second or two. If the user double-clicks the Menu button when in the editor, functions 7642 through 7646 call the display menu routine to show the editor's command list, set the recognition vocabulary to the editor's command vocabulary, and command speech recognition, using the last press of the double-click to determine the duration of that recognition.
If the user makes a sustained press of the Menu button, function 7650 enters help mode for the editor. This provides a quick explanation of the functions of the editor mode and allows the user to explore the editor's hierarchical command structure by pressing its keys and receiving a brief explanation of the part of that command structure reached by each such key press.
If the user presses the Talk button when in the editor, function 7654 turns on recognition according to the current recognition settings, including the vocabulary and the recognition duration mode. The Talk button will often be used as the main button for initiating speech recognition in the cell phone embodiment.
If the user selects the End button, function 7658 goes to phone mode, for example to quickly make or answer a phone call. It saves the current state of the editor so that the user can return to it when the phone call is over.
As shown in FIG. 77, if the user selects the entry mode menu illustrated in FIG. 68, function 7702 causes that menu to be displayed. As will be described in greater detail below, this menu allows the user to select quickly among dictation modes, somewhat as buttons 1122 through 1134 of FIG. 11 do in the personal digital assistant embodiment. In the embodiment shown, the entry mode menu is associated with the "1" key because of that key's proximity to the Talk button. This allows the user to switch dictation modes quickly and then continue dictation using the Talk button.
If the user selects "choice list", functions 7706 and 7708 set the correction window navigation mode to page/item navigation, which is best for scrolling through and selecting recognition candidates. They then call the correction window routine for the current selection, which causes a correction window somewhat similar to the correction window 1200 shown in FIG. 12 to be displayed on the screen of the cell phone. If there is currently no cursor, the correction window is called with an empty selection. In that case it can be used to select one or more words using alphabetic input, word completion, and/or the addition of one or more utterances. The correction window routine is described in greater detail below.
If the user selects "filter choices", for example by double-clicking the "2" key, functions 7712 through 7716 set the correction window navigation mode to the word/character mode used for navigating in a first choice or filter string. They then call the correction window routine for the current selection and treat the second press of the double-click, if one has been entered, as the speech key for recognition duration purposes.
In most cell phones, the "2" key is located directly below the navigation keys. This enables the user to navigate in the editor to a desired word, or to words that need correction, and then either single-press the nearby "2" key to see a correction window with alternate choices for the selection, or double-click the "2" key and immediately start entering filtering information to help the recognizer select a correct choice.
If the user selects the navigation mode menu shown in FIG. 70, function 7720 causes it to be displayed. As will be described in more detail below, this function enables the user to change the navigation that is performed by pressing the left and right and the up and down navigation buttons. To make such switching easier, the key for the navigation menu has been placed in the top row of the phone's numbered keys.
If the user selects discrete recognition input, function 7724 turns on discrete recognition according to the current vocabulary, using the press-and-click-to-utterance-end duration mode as the current recognition duration setting. This key is provided so that the user can shift quickly to discrete speech recognition, whenever desired, by pressing the "4" key. As stated earlier, discrete recognition tends to be substantially more accurate than continuous recognition, although it is more halting. The location of this command key was chosen to be close to the Talk button and the entry mode menu key. Because the discrete recognition key is available, the recognition mode normally mapped to the Talk button will be continuous. Such a setting allows the user to switch between continuous and discrete recognition by alternating between pressing the Talk button and the "4" key.
If the user selects selection start or selection stop, for example by toggling the "5" key, function 7728 toggles extended selection on or off, depending on whether that mode is currently on or off. Function 7730 then tests whether extended selection has just been turned off, and if so, function 7732 deselects any prior selection other than one, if any, at the current cursor. In the embodiment described, the "5" key was chosen for the extended selection command because of its proximity to the navigation controls and to the "2" key, which is used to bring up correction windows.
If the user chooses the select all command, for example by double-clicking the "5" key, function 7736 selects all of the text in the current document.
If the user selects the "6" key or any of the associated commands that are currently active, which can include play start, play stop, or record stop, function 7740 tests whether the system is currently not recording audio. If so, function 7742 toggles between an audio play mode and a mode in which audio play is off. If the cell phone is currently in a phone call and the play-only-to-me option 7513 shown in FIG. 75 has been set to off, function 7746 sends the audio being played over the phone line to the other side of the phone conversation, as well as to the speaker or headphone of the cell phone itself.
If, on the other hand, the system is recording audio when the "6" key is pressed, function 7750 turns recording off.
If the user double-clicks the "6" key or enters a record command, function 775 turns audio recording on. Function 7756 then tests whether the system is currently in a phone call and whether the record-only-me setting 7511 shown in FIG. 75 is off. If so, function 7758 records audio from the other side of the phone line as well as from the phone's microphone or microphone input jack.
If the user presses the "7" key or otherwise selects the capitalized menu command, function 7762 displays a capitalized menu that lets the user choose among modes that cause all subsequently entered text to be in all lowercase, all initial caps, or all capitals. It also allows the user to change the currently selected word or words, if any, to all lowercase, all initial caps, or all capitals.
If the user double-clicks the "7" key or otherwise selects the capitalized cycle key, the capitalized cycle routine is called. It can be called one or more times to change the current selection to all initial caps, all capitals, or all lowercase.
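A capitalization cycle of this kind can be sketched as below. The function names are hypothetical, and the cycle order (lowercase to initial caps to all capitals and back to lowercase) is an assumption based on the order in which the three forms are listed above.

```python
def initial_caps(text):
    """Capitalize the first letter of each space-separated word."""
    return " ".join(word[:1].upper() + word[1:].lower() for word in text.split(" "))


def cycle_capitalization(text):
    """Return the next capitalization form of `text` in the assumed cycle."""
    if text == text.lower() and text != text.upper():
        return initial_caps(text)      # all lowercase -> all initial caps
    if text == initial_caps(text) and text != text.upper():
        return text.upper()            # all initial caps -> all capitals
    return text.lower()                # anything else -> all lowercase


once = cycle_capitalization("speech recognition")
twice = cycle_capitalization(once)
thrice = cycle_capitalization(twice)
```

Calling the routine repeatedly, as a user would by double-clicking the key again, steps the selection through each form in turn; mixed-case input falls through to all lowercase on the first call.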
If the user presses the "8" key or otherwise selects the word form list, function 7770 calls the word form list routine described above with regard to FIG. 27.
If the user double-clicks the "8" key or selects the word type command, function 7774 displays the word type menu. The word type menu allows the user to place a word type limitation on a selected word, as described above with regard to the filter match routine of FIG. 26. In the embodiment shown, this menu is a hierarchical menu having the general form shown in FIG. 91. It allows the user to specify word ending types, word start types, word tense types, word part-of-speech types, and other word types, such as possessive or non-possessive forms, singular or plural nominative forms, singular or plural verb forms, spelled or non-spelled forms, and homonyms, if any exist.
As shown in FIG. 78, if the user presses the "9" key or selects the basic keys menu command, function 7802 displays the basic keys menu shown in FIG. 73, which allows the user to enter, as text input, one of the punctuation marks or input characters that can be selected from that menu.
If the user double-clicks the "9" key or selects the New Paragraph command, function 7806 enters a new paragraph character into the editor's text.
If the user selects the "*" key or the escape command, functions 7810 through 7824 are performed. Function 7810 tests whether the editor has been called to enter or edit text in another program, in which case function 7812 returns from the call to the editor with the edited text for insertion into that program. If the editor has not been called for such a purpose, function 7820 prompts the user with the choice of exiting the editor, saving its contents, and/or canceling the escape. If the user selects to escape, functions 7822 and 7824 escape to the top level of the phone mode described above with regard to FIG. 63. If the user double-clicks the "*" key or selects the task list function, function 7828 goes to the task list, as such a double-click does in most of the cell phone's operating modes and menus.
If the user presses the "0" key or selects the edit options menu command, function 7832 displays the edit options menu described briefly above with regard to FIG. 74. If the user double-clicks the "0" key or selects the undo command, function 7836 undoes the last command in the editor, if any.
If the user presses the "#" key or selects the backspace command, function 7840 tests whether there is a current selection. If so, function 7842 deletes it. If there is no current selection and the current smallest navigation unit is a character, word, or outline item, functions 7846 and 7848 delete backward by that smallest current navigation unit.
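The editor-mode key handling described with regard to FIGS. 76 through 78 can be summarized, purely as an illustration, by a lookup table from each phone key to its single-press and double-click commands. The table and function names are hypothetical. Where the description does not state a double-click command for a key in the editor mode, the entry is None, and the single-press versus double-click split for the "7" and "9" keys is read from context.

```python
# key -> (single-press command, double-click command or None if not described)
EDITOR_KEY_COMMANDS = {
    "1": ("entry mode menu", None),
    "2": ("choice list", "filter choices"),
    "3": ("navigation mode menu", None),
    "4": ("discrete recognition", None),
    "5": ("selection start/stop", "select all"),
    "6": ("play start/stop or record stop", "record"),
    "7": ("capitalized menu", "capitalized cycle"),
    "8": ("word form list", "word type menu"),
    "9": ("basic keys menu", "new paragraph"),
    "*": ("escape", "task list"),
    "0": ("edit options menu", "undo"),
    "#": ("backspace", None),
}


def command_for(key, double_click=False):
    """Look up the editor command for a key press; None if not described."""
    single, double = EDITOR_KEY_COMMANDS[key]
    return double if double_click else single


select_all = command_for("5", double_click=True)
correction = command_for("2")
```

A table of this kind also makes it straightforward to generate the scrollable command list of FIG. 75 from the same data that drives key handling, although the description does not say the embodiment is built that way.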
FIGS. 79 and 80 illustrate the options provided by the entry mode menu discussed above with regard to FIG. 68.
When in this menu, if the user presses the "1" key or otherwise selects large vocabulary recognition, functions 7906 through 7914 are performed. These set the recognition vocabulary to the large vocabulary. They treat the press of the "1" key as a speech key for recognition duration purposes. They also test whether a correction window is displayed. If so, they set the recognition mode to discrete recognition, on the assumption that in a correction window the user wants the more accurate discrete recognition. They add any new utterance or utterances received in this mode to an utterance list of the type described above, and they call the display choice list routine of FIG. 22 to display a new correction window for any re-utterance received.
In the cell phone embodiment shown, the "1" key was chosen for large vocabulary recognition in the entry mode menu because the large vocabulary is the most commonly used recognition vocabulary, and the user can therefore select it easily by pressing the "1" key twice from the editor. The first press selects the entry mode menu, and the second press selects large vocabulary recognition.
If the user presses the "2" key when in the entry mode menu, the system is set to letter-name recognition of the type described above. If the user double-clicks that key when the entry mode menu has been displayed from within a correction window, function 7926 sets the recognition vocabulary to the letter-name vocabulary and indicates that the output of that recognition is to be treated as an ambiguous filter. In the preferred embodiment, the user can indicate, under the entry preferences option associated with the "9" key of the menu, whether or not such filters are to be treated as ambiguous-length filters. The default setting treats continuous letter-name recognition as an ambiguous-length filter and discrete letter-name recognition as a fixed-length ambiguous filter.
If the user presses the "3" key, recognition is set to the AlphaBravo mode. If the user double-clicks the "3" key, recognition is set to the key Alpha mode described above with regard to FIG. 72. This mode is similar to the AlphaBravo mode, except that pressing one of the number keys "2" through "9" causes the user to be prompted with the ICA words associated with the letters on the pressed key, and recognition favors that limited set of ICA words, so as to provide very reliable alphabetic entry even under relatively extreme noise conditions.
If the user presses the "4" key, the vocabulary is changed to the digit vocabulary. If the user double-clicks the "4" key, the system responds to presses of the numbered phone keys by entering the corresponding numbers into the editor's text.
If the user presses the "5" key, the recognition vocabulary is limited to a punctuation vocabulary.
If the user presses the "6" key, the recognition vocabulary is limited to the contact name vocabulary described above.
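The single-press options of the entry mode menu can be sketched as a mapping from key to recognition vocabulary, together with the rule that selecting the large vocabulary while a correction window is displayed switches to discrete recognition. The class, field, and function names are hypothetical, and double-click behavior and the entry preferences option are deliberately left out of this sketch.

```python
from dataclasses import dataclass

# Single-press entries of the entry mode menu, as described for FIGS. 79 and 80.
ENTRY_MODE_VOCABULARIES = {
    "1": "large vocabulary",
    "2": "letter-name",
    "3": "AlphaBravo",
    "4": "digits",
    "5": "punctuation",
    "6": "contact names",
}


@dataclass(frozen=True)
class RecognitionSettings:
    vocabulary: str = "large vocabulary"
    discrete: bool = False


def select_entry_mode(settings, key, correction_window_shown=False):
    """Return new settings after a single press of `key` in the entry mode menu."""
    vocabulary = ENTRY_MODE_VOCABULARIES.get(key)
    if vocabulary is None:
        return settings  # key not covered by this sketch
    discrete = settings.discrete
    if key == "1" and correction_window_shown:
        # In a correction window the more accurate discrete mode is assumed.
        discrete = True
    return RecognitionSettings(vocabulary, discrete)


start = RecognitionSettings()
punctuation = select_entry_mode(start, "5")
correcting = select_entry_mode(punctuation, "1", correction_window_shown=True)
```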
FIG. 86 illustrates the key Alpha mode, which has been described to some extent above with regard to FIG. 72. As indicated in FIG. 86, when this mode is entered, the navigation mode is set to the word/character navigation mode normally associated with alphabetic entry. Function 8604 then overlays the keys listed below it with the functions indicated for each such key. In this mode, pressing the Talk button turns on recognition with the AlphaBravo vocabulary according to the current recognition settings, responding to the key press according to the current recognition duration setting. The "1" key continues to operate as the entry mode menu key, so that the user can press it to exit the key Alpha mode. Pressing one of the numbered phone keys "2" through "9" causes functions 8618 through 8624 to be performed during that press. These display a prompt of the ICA words corresponding to the phone key's letters, cause recognition to substantially favor one of those three or four ICA words, turn on recognition for the duration of the press, and output the letter corresponding to the recognized ICA word, either into the editor's text if in the editor mode or into the filter string if in the filter edit mode.
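The key Alpha behavior for keys "2" through "9" can be sketched as follows. The letter assignment is the conventional telephone keypad layout, and the word list is the familiar Alpha, Bravo, Charlie spelling alphabet; both are assumed here rather than taken from FIG. 86. The recognizer is represented only by a hypothetical dictionary of word scores, and "substantially favor" is modeled, as an assumption, by adding a fixed bonus to the words belonging to the pressed key.

```python
KEY_LETTERS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

ICA_WORDS = [
    "alpha", "bravo", "charlie", "delta", "echo", "foxtrot", "golf",
    "hotel", "india", "juliet", "kilo", "lima", "mike", "november",
    "oscar", "papa", "quebec", "romeo", "sierra", "tango", "uniform",
    "victor", "whiskey", "xray", "yankee", "zulu",
]
WORD_FOR_LETTER = {word[0]: word for word in ICA_WORDS}


def prompt_words(key):
    """ICA words to show as a prompt while `key` is held down."""
    return [WORD_FOR_LETTER[letter] for letter in KEY_LETTERS[key]]


def recognize_letter(key, scores, bonus=5.0):
    """Pick the best-scoring word, favoring the pressed key's ICA words.

    `scores` maps each candidate word to a hypothetical recognizer score
    (higher is better). Returns the letter for the winning word.
    """
    favored = set(prompt_words(key))
    best = max(scores, key=lambda w: scores[w] + (bonus if w in favored else 0.0))
    return best[0]


prompt = prompt_words("7")
letter = recognize_letter("2", {"bravo": 1.0, "delta": 3.0, "charlie": 0.5})
```

In the second call, noise has made "delta" score highest on its own, but the bonus for the words on the "2" key lets "bravo" win, which is the kind of robustness the mode is intended to provide.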
If the user presses the "0" key, function 8628 enters a key punctuation mode, which responds to the press of any phone key by displaying a scrollable list of all punctuation marks whose names start with one of the letters associated with that key, and which favors recognition of one of those punctuation words.
FIG. 87 shows an alternate embodiment of the key Alpha mode that is identical to that of FIG. 86 except for the underlined portions of the code in FIG. 87. In this mode, if the user presses the Talk button, large vocabulary recognition is turned on, but only the initial letter of each recognized word is output, as indicated by function 8608A. As functions 8618A and 8620A indicate, when the user presses a phone key that has a set of three or four letters associated with it, the user is prompted to say a word starting with the desired letter, and the recognition vocabulary is substantially limited to words that start with one of the key's associated letters. Function 8624A then outputs the initial letter of the recognized word.
In some embodiments of the invention, a third key Alpha mode can be used in which a limited set of words is associated with each letter of the alphabet and, during the pressing of a key, recognition is substantially limited to one of the sets of words associated with that key's letters. In some such embodiments, a set of five or fewer words is associated with each such letter.
FIGS. 89 and 90 show some of the options available in the edit options menu, which is entered by pressing the "0" key in the editor and in the correction window mode. In this menu, if the user presses the "1" key, he gets a menu of file options, as indicated by function 8902. If the user presses the "2" key, he gets a menu of edit options, such as those common in most editing programs, as indicated by function 8904. If the user presses the "3" key, function 8906 displays the same entry preference menu that is reached by pressing "9" in the entry mode menu, as described above with regard to FIGS. 68 and 79.
If the user presses the "4" key in the edit options menu, a text-to-speech, or TTS, menu is displayed. In this menu, the "4" key toggles TTS play on or off. If this key toggles TTS on and there is a current selection, functions 8916 and 8918 cause TTS to say the selection, preferably preceded by a TTS or pre-recorded saying of the word "selection". If there is no selection when TTS is toggled on, TTS starts reading the current text from the current cursor position and continues until the end of the current document or until the user provides input other than cursor movement within the document. As will be explained below with regard to FIG. 99, when TTS mode is on the user is provided with audio prompts and TTS playback of text, so that a substantial portion of the system's functionality can be used without seeing the cell phone's screen.
The TTS submenu also includes an option that lets the user play the current selection whenever he or she wishes, as indicated by functions 8924 and 8926, and an option, indicated by functions 8928 and 8930, that lets the user toggle continuous play on or off, in both cases regardless of whether the machine is in TTS-on or TTS-off mode. As indicated at 8932 among the top-level options of the edit options menu, a double-click of the "4" key toggles TTS on or off, just as if the user had pressed the "4" key, waited for the TTS menu to be displayed, and then pressed the "4" key again.
The "5" key in the edit options menu selects the outline menu, which includes a number of functions that let the user navigate in an outline mode and expand and contract its headings. If the user double-clicks the "5" key, the system toggles between fully expanding and fully contracting the outline element in which the editor's cursor is currently located.
If the user selects the "6" key, an audio menu is displayed as a submenu; some of its options are shown indented under the audio menu item 8938 in the combination of FIGS. 89 and 90. This audio menu includes an item, selected by the "1" key, that gives the user finer control over the audio navigation speed than is provided by the "6" key in the edit navigation menu of FIGS. 84 and 70. If the user selects the "2" key, he or she sees a submenu for controlling audio playback settings, such as volume and speed, and whether the audio to be played is audio associated with recognized words and/or audio recorded without associated recognized words.
FIG. 90 starts with the items selected by the "3", "4", "5", "6" and "7" keys under the audio menu described above, which begins at numeral 8938 in FIG. 89. If the user presses the "3" key, a recognize audio options dialog box 9000 is displayed. As indicated by numerals 9002 through 9014, it gives the user the option to perform speech recognition on any audio contained in the current selection in the editor, to recognize all audio in the current document, to decide whether previously recognized audio is to be recognized again, and to set parameters that determine the quality of such recognition and the time it requires. As indicated by function 9012, the dialog box provides an estimate of the time needed to recognize the current selection at the current quality settings and, if a task of recognizing a selection is currently under way, the status of that job. This dialog box lets the user run recognition on relatively large amounts of audio, for example as a background task or at times when the phone is not being used for other purposes, including times when it is plugged into an auxiliary power supply.
If the user selects the "4" key in the audio menu, the user is given a submenu that lets him choose to delete certain information from the current selection. The choices are to delete all audio that is not associated with recognized words, to delete all audio that is associated with recognized words, to delete all audio, or to delete the text from the desired selection. Deleting the recognition audio from recognized text greatly reduces the memory needed to store that text, and is often useful once the user has decided that he or she does not need the text's associated audio to help determine its intended meaning. Deleting the text but not the audio from a portion of media is often useful when the text was produced by speech recognition from the audio but is so inaccurate as to be almost useless.
In the audio menu, the "5" key lets the user choose whether text that has associated recognition audio is marked, for example by underlining, so that the user knows whether such text has audio that can be played back to help understand it or, in some embodiments, an acoustic representation from which alternate recognition choices can be generated. The "6" key lets the user choose whether recognition audio is to be kept for recognized text. In many embodiments, even if the recording of recognition audio is turned off, such audio is kept for some number of the most recently recognized words so that it is available for correction playback.
In the audio menu, the "7" key selects a transcription mode dialog box. This dialog box lets the user select the settings to be used in a transcription mode that is described below with regard to FIG. 94. This mode is designed to make it easy for the user to transcribe previously recorded audio by speech recognition.
If the user presses the "8" key, function 9036 is performed, which calls a search dialog box with the current selection, if any, as the search string. As will be illustrated below, the speech recognition text editor can be used to enter a different search string if desired. If the user double-clicks the "8" key, this is interpreted as a find-again command, which searches again for the previously entered search string.
If the user selects the "9" key in the edit options menu, a vocabulary menu is displayed that lets the user determine which words are in the current vocabulary, select between different vocabularies, and add words to a given vocabulary. If the user presses or double-clicks the "0" key while in the edit options menu, an undo function is performed. A double-click reaches the undo function from within the edit options menu for consistency with the fact that a double-click on "0" reaches the undo function from the editor or the correction window. In the edit options menu, the pound (#) key operates as a redo key.
FIG. 94 illustrates the TTS play rules. These are the rules that govern TTS operation when TTS has been selected through functions 8908 through 8932 of FIG. 89, described above.
If the TTS keys mode has been turned on by operating the "1" key in the TTS menu, as indicated by function 1909 above, function 9404 causes functions 9406 through 9414 to be performed. These functions let a user select phone keys safely without having to see them, for example when the user is driving a car or is otherwise occupied. Preferably this mode is available in any mode of the phone's operation and is not limited to operation of the speech recognition editor. When any phone key is pressed, function 9408 tests whether the same key has been pressed within a TTS key time, which is a short period such as a quarter or a third of a second. For the purposes of this test, the time is measured from the last release of that key to the moment it is pressed again. If the same key has not been pressed within that short period, functions 9410 and 9412 cause TTS, or in some embodiments playback of recorded audio, to say the number of the key and the name of its current command. This audio response continues only as long as the user keeps pressing the key. If the key has a command associated with its double-click, that command is also said if the user holds the key long enough. If the test of function 9408 finds that the time since the same key was last released is less than the TTS key time, function 9414 causes the phone's software to respond to the key press, including any double-click, just as it would if the TTS keys mode were not on.
It can thus be seen that the TTS keys mode lets the user find a cell phone key by touch, press it to determine whether it is the desired key and, if so, quickly press it again one or more times to obtain the key's desired function. Since a key press that is answered by functions 9410 and 9412 causes no response other than the saying of its associated function, this mode lets the user search for the desired key without causing any undesired consequences.
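The timing test of function 9408 can be sketched as follows. This is only an illustration under stated assumptions: the class name `TtsKeysMode`, its callbacks `speak` and `execute`, and the 0.3 second default are hypothetical, and timestamps are passed in explicitly so that the logic can be followed without a real keypad.

```python
TTS_KEY_TIME = 0.3  # seconds; assumed value in the "quarter or third of a second" range


class TtsKeysMode:
    """Announce a key on a first press; act on it only on a quick re-press."""

    def __init__(self, speak, execute, key_time=TTS_KEY_TIME):
        self.speak = speak          # callback that says the key number and command name
        self.execute = execute      # callback that runs the key's normal command
        self.key_time = key_time
        self.last_release = {}      # key -> time at which that key was last released

    def on_press(self, key, now):
        released = self.last_release.get(key)
        # The interval is measured from the last release of the SAME key.
        if released is not None and now - released < self.key_time:
            self.execute(key)       # behaves as if the TTS keys mode were off
            return "execute"
        self.speak(key)             # no side effect other than the announcement
        return "speak"

    def on_release(self, key, now):
        self.last_release[key] = now
```

With this arrangement a first press of "4" is only announced, while a second press within the key time after its release is passed on to the phone's normal key handling.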
In some cell phone embodiments, the keys are designed so that merely touching them, rather than pushing them, produces audio feedback about which key is being touched and what its current function is, similar to the feedback provided by function 9412. This can be achieved, for example, by making the phone keys of a conductive material, or by having other portions of the phone that are separate from the keys generate a voltage which, if conducted through the user's body to a key, can be detected by circuitry associated with that key. Such a system would give the user an even faster way to find a desired key by touch, since the user could receive feedback on which key he is touching simply by moving a finger over the keypad in the vicinity of the desired key. It would also let a user scan rapidly for a desired command name by moving a finger over successive keys until the desired command is found.
When TTS is on, if the system recognizes or otherwise receives a command input, functions 9416 and 9418 cause TTS or playback of recorded audio to say the name of the recognized command. Preferably such audio confirmations of commands have an associated sound quality, such as a different tone of voice or different associated sounds, that distinguishes the saying of command words from the saying of recognized text.
When TTS is on and a text utterance is recognized, functions 9420 through 9424 detect the end of the utterance and the completion of its recognition, and then use TTS to say the words that have been recognized as the first choice for the utterance.
As indicated by functions 9426 through 9430, TTS responds to the recognition of a filtering utterance in a similar manner.
When in TTS mode, if the user moves the cursor to select a new word or character, functions 9432 through 9438 use TTS to say the newly selected word or character. If such a cursor movement extends an already started selection to a new word or character position, then after saying the new cursor position, functions 9436 and 9438 say the word "selection" in a manner that indicates it is not part of the recognized text, and then say the words of the current selection. If the user moves the cursor so that it is a non-selection cursor, such as is described above with regard to functions 7614 and 7615 of FIG. 76, functions 9440 and 9442 of FIG. 94 use TTS to say the two words on either side of the cursor position.
When in TTS mode, if a new correction window is displayed, functions 9444 and 9446 use TTS to say the first choice in the correction window, to spell out the current filter, if any, indicating which parts of it are unambiguous and which parts are ambiguous, and then to say each candidate in the currently displayed portion of the choice list. For the sake of speed, it is preferable to use differences in tone or sound to indicate which portions of the filter are unambiguous and which are ambiguous.
If the user scrolls by one item in the correction window, functions 9448 and 9450 respond to each such scroll by using TTS to say the currently highlighted choice and its selection number. If the user scrolls by a page in the correction window, functions 9452 and 9454 use TTS to say the newly displayed choices and to indicate the currently highlighted choice.
When in correction mode, if the user enters a menu, functions 9456 and 9458 use TTS or pre-recorded audio to say the name of the current menu and all of the choices in the menu with their associated numbers, indicating the current selection position. Preferably this is done with audio cues that tell the user that the words being said are menu options.
If the user scrolls up or down by one item in a menu, functions 9460 and 9462 use TTS or pre-recorded audio to say the highlighted choice and then, after a brief pause, any following choices on the currently displayed page of the menu.
FIG. 95 illustrates some aspects of the programming used for TTS generation. If the word to be generated by TTS is in the speech recognition programming's vocabulary of phonetically spelled words, function 9502 causes functions 9504 through 9512 to be performed. Function 9504 tests whether the word has multiple phonetic spellings associated with different parts of speech, and whether the word to be said by TTS has a current linguistic context that indicates its current part of speech. If both conditions are met, function 9506 uses the speech recognition programming's part-of-speech indicating code to select, as the phonetic spelling to be used in TTS generation of the current word, the phonetic spelling associated with the part of speech that this code finds most probable. If, on the other hand, only one phonetic spelling is associated with the word, or there is not enough context to identify the most probable part of speech, function 9510 selects the word's single phonetic spelling or its most common phonetic spelling. Once a phonetic spelling has been selected for the word by function 9506 or function 9510, function 9512 uses that spelling for the TTS generation. If, as indicated at 9514, the word to be generated by TTS has no phonetic spelling, functions 9514 and 9516 generate the word with the pronunciation guessing software that the speech recognizer uses to assign phonetic spellings to names and newly entered words.
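The selection logic of functions 9502 through 9516 can be summarized in a short sketch. The data layout is an assumption made for illustration: the lexicon is taken to be a dictionary from a word to a list of (part of speech, phonetic spelling) pairs ordered from most to least common, the phoneme strings are placeholders, and `guess_pronunciation` is a trivial stand-in for the recognizer's pronunciation guessing software.

```python
def guess_pronunciation(word):
    """Placeholder for pronunciation guessing: one symbol per letter."""
    return " ".join(word.upper())


def choose_phonetic_spelling(word, lexicon, likely_part_of_speech=None,
                             guesser=guess_pronunciation):
    """Pick the phonetic spelling that TTS should use for `word`.

    `lexicon` maps a word to [(part_of_speech, spelling), ...], most
    common spelling first. `likely_part_of_speech` is None when the
    context is not sufficient to determine a part of speech.
    """
    entries = lexicon.get(word)
    if not entries:
        # No phonetic spelling in the vocabulary: fall back to guessing.
        return guesser(word)
    if len(entries) > 1 and likely_part_of_speech is not None:
        # Several spellings and usable context: match the part of speech.
        for part_of_speech, spelling in entries:
            if part_of_speech == likely_part_of_speech:
                return spelling
    # Single spelling, no context, or no match: most common spelling.
    return entries[0][1]


# Illustrative lexicon; the phoneme strings are not from the source.
LEXICON = {
    "record": [("noun", "R EH K ER D"), ("verb", "R IH K AO R D")],
    "phone": [("noun", "F OW N")],
}
```

Calling `choose_phonetic_spelling("record", LEXICON, "verb")` selects the verb pronunciation, while the same call without a part of speech falls back to the first, most common entry.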
FIG. 96 describes the operation of the transcription mode. This mode can be selected through the transcription mode dialog box, which is activated by the "7" item of the audio menu under the edit options menu shown in FIGS. 89 and 90, as described above with regard to FIG. 90.
When the transcription mode is entered, function 9602 normally changes the navigation mode to an audio navigation mode that moves forward or backward five seconds in the audio recording in response to left and right navigation key input, and forward or backward one second in response to up and down navigation input. These are default values that can be changed in the transcription mode dialog box. In this mode, if the user clicks the play key, which is the "6" key in the editor, functions 9606 through 9614 are performed. Functions 9607 and 9608 toggle play on and off. If the toggle turns play on, function 9610 causes function 9612 to be performed. In that case, if there has been no audio navigation since audio was last played, function 9614 starts playback a set period of time before the point at which the last play ended. This is done so that, when the user is transcribing, each successive playback starts slightly before the end of the previous one. The user can then recognize words that were only partly spoken in the previous playback, and hearing a little of the preceding language context makes it easier to interpret the speech sounds as words. If the user presses the play key for longer than a specified period, such as a third of a second, function 9616 causes functions 9618 through 9622 to be performed. These functions test whether play is on and, if so, turn it off. They also turn on large vocabulary recognition for the duration of the press, in continuous or discrete mode according to the current settings. They then insert the recognized text into the editor at the location corresponding to the end of the last portion of audio that was played. If the user double-clicks the play key, functions 9624 and 9626 inform the user that audio recording is not available in transcription mode and that transcription mode can be turned off in the audio menu under the edit options menu.
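The overlap rule of function 9614 can be sketched as follows. The class `TranscriptionPlayer`, its method names and the one second overlap are hypothetical; the description only speaks of "a set period of time". Playback itself is simulated by the `advance` method so that the positions can be followed without an audio device.

```python
class TranscriptionPlayer:
    """Tracks where playback should resume in transcription mode."""

    def __init__(self, overlap=1.0):
        self.overlap = overlap            # assumed rewind, in seconds
        self.position = 0.0               # current position in the recording
        self.last_play_end = None         # where the previous playback stopped
        self.navigated_since_play = False
        self.playing = False

    def navigate(self, seconds):
        """Explicit audio navigation, for example +5.0 or -1.0 seconds."""
        self.position = max(0.0, self.position + seconds)
        self.navigated_since_play = True

    def click_play(self):
        """Toggle play. Returns the start position when play begins,
        or None when the click turned play off."""
        if self.playing:
            self.stop()
            return None
        if self.last_play_end is not None and not self.navigated_since_play:
            # Resume slightly before the end of the previous playback.
            self.position = max(0.0, self.last_play_end - self.overlap)
        self.playing = True
        self.navigated_since_play = False
        return self.position

    def advance(self, seconds):
        """Simulate `seconds` of audio being played."""
        if self.playing:
            self.position += seconds

    def stop(self):
        if self.playing:
            self.playing = False
            self.last_play_end = self.position
```

In this sketch a sustained press of the play key would call `stop()` before recognition is turned on, and the recognized text would be inserted at the location corresponding to `last_play_end`.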
It can be seen that the transcription mode lets the user alternate between playing a portion of previously recorded audio and transcribing it by speech recognition, merely by alternating between clicks and sustained presses of the play key, which is the "6" phone key. The user is free to use the editor's other functions to correct any recognition mistakes made during transcription, and can then return to transcription by pressing the "6" key again to play the next segment of audio to be transcribed. Of course, the user will often not want a literal transcription of the audio. For example, the user may play back part of a phone call and transcribe only a summary of its more noteworthy portions.
FIG. 97 illustrates the operation of dialog box editing programming, which uses many features of the editor mode described above to let the user enter text and other information into a dialog box displayed on the cell phone's screen.
When a dialog box is first entered, function 9702 displays an editor window showing the first portion of the dialog box. If the dialog box is too large to fit on one screen, it is displayed in a scrollable window. As indicated by function 9704, the dialog box responds to all inputs in the same way as the editor mode described above with regard to FIGS. 76 through 78, except as indicated by functions 9704 through 9726. As indicated at 9707 and 9708, if the user provides navigation input in a dialog box, the cursor moves in a manner similar to the way it would in the editor, except that it can normally move only to a control into which the user can provide input. Thus, if the user moves a word left or right, the cursor moves left or right to the next dialog box control, moving up or down a line if that is necessary to find such a control. If the user moves up or down a line, the cursor moves to the nearest control in the line above or below the current cursor position. So that the user can read extended portions of text that may not contain any controls, the cursor normally will not move more than one page, even if there are no controls within that distance.
As indicated by functions 9700 through 9716, if the cursor has been moved to a field and the user provides any input of a type that would enter text into the editor, function 9712 displays a separate editor window for that field, showing the text currently in the field, if any. If the field has any vocabulary limitation associated with it, functions 9714 and 9716 limit recognition in the editor to that vocabulary. For example, if the field is limited to names, recognition in that field is limited in the same way. As long as the field-editing window is displayed, function 9718 directs all editor commands to perform editing within it. The user can exit the field-editing window by selecting OK, which causes the text currently in the window to be entered into the corresponding field of the dialog box.
If the cursor in the dialog box is moved to a choice list and the user selects a text input command, function 9722 displays a correction window that shows the list box's current value as the first choice and the other options provided in the list box as other available choices in a scrollable choice list. In this particular choice list, the scrollable options can be selected not only by their associated numbers but also by speech recognition that uses a vocabulary limited to those options.
If the cursor is in a check box or a radio button and the user selects any editor text input command, functions 9724 and 9726 change the state of the check box or radio button by toggling whether it is selected.
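The three kinds of control described above (text field, choice list, and check box or radio button) can be summarized in a small dispatch sketch. The `Control` class, the `on_text_input_command` function and the dictionary it returns are illustrative assumptions and are not structures taken from FIG. 97.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Control:
    kind: str                                   # "field", "list", "checkbox" or "radio"
    value: object = None                        # text, selected option, or True/False
    options: List[str] = field(default_factory=list)
    vocabulary: Optional[List[str]] = None      # optional limit on recognition


def on_text_input_command(control):
    """Describe how a dialog box reacts to a text input command
    issued while the cursor is on `control`."""
    if control.kind == "field":
        # Open a separate field-editing window, limited to the field's vocabulary.
        return {"window": "field_editor",
                "text": control.value or "",
                "vocabulary": control.vocabulary}
    if control.kind == "list":
        # Correction window: current value first, other options after it.
        others = [o for o in control.options if o != control.value]
        return {"window": "correction",
                "choices": [control.value] + others,
                "vocabulary": list(control.options)}
    if control.kind in ("checkbox", "radio"):
        # Toggle whether the control is selected.
        control.value = not control.value
        return {"window": None, "selected": control.value}
    raise ValueError("unknown control kind: " + control.kind)
```

A check box is therefore toggled directly, while a field or a choice list opens a window in which the usual editor or correction commands apply.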
FIG. 98 illustrates a help routine 9800, which is the cell phone embodiment's analog of the help mode described above with regard to FIG. 19. When this help mode is called while the cell phone is in a given state or mode of operation, function 9802 displays a scrollable help menu for that state, which includes a description of the state together with a selectable list of help options and of all of the state's commands. FIG. 99 shows such a help menu for the editor mode described above with regard to FIGS. 67 and 76 through 78. FIG. 100 shows such a help menu for the entry mode menu described above with regard to FIG. 68 and FIGS. 79 and 80. As shown in FIGS. 99 and 100, each of these help menus includes a help options selection, which can be selected by means of a scrollable highlight and operation of the help key, and which lets the user jump quickly to the various portions of the help menu as well as to other help-related functions. Each help menu also includes a brief statement, 9904, of the current state of the cell phone. Each help menu also includes a scrollable, selectable menu 9906 that lists all the options accessible by phone key. It also includes a portion 9908 that gives the user access to other help functions, including a description of how to use the help function and, in some cases, help on the function of the different portions of the screen that are available in the current mode.
As shown in FIG. 101, if the user makes a sustained press on the menu key while in the editor mode, as indicated at 10100, the help mode is entered for the editor mode, causing the cell phone to display screen 10102. This screen shows the selectable help options, option 9902, and the beginning of the brief description of the mode's operation, 9900, shown in FIG. 99. The right arrow key of the cell phone functions as a page right key because, in help mode, the navigation mode is a page/line navigation mode, as indicated by the characters "<P^L" shown in screen 1102. If the user presses this key, the display scrolls down one page, as indicated by screen 10104. If the user presses the page right key again, the screen scrolls down another page, giving the display shown at 10106. In this example, with just two clicks of the page right key, the user has been able to read the summary of the function of the editor mode, 9904, shown in FIG. 99.
As shown in screen 10108, if the user presses the page right key again, causing the screen to scroll down another page, the beginning of the command list associated with the editor mode can be seen. The user can use the navigation keys to scroll through the entire length of the help menu if desired. In the example shown, when the user finds the number of the key associated with the entry mode menu, he presses that key, as shown at 10110, causing the help mode to display the help menu associated with the entry mode menu, as shown in screen 10112.
It should be appreciated that whenever the user is in a help menu, he can immediately select any of the commands listed under the "select by key" line 9910 shown in FIG. 99. Thus the user does not need to scroll down to the portion of the help menu in which a command is listed in order to press the key associated with that command and see its function. In fact, a user who thinks he knows the function associated with a key can simply make a sustained press of the menu key and then press the desired key to see a brief explanation of its function and a list of the commands available under it.
The commands listed under the "select by OK" line 9912 shown in FIGS. 99 and 100 must be selected by scrolling the highlight to the command's line in the menu and selecting it with the OK command. This is because the commands listed below line 9912 are associated with keys that are used to operate the help menu itself. This is similar to the commands listed in screen 7506 of the editor mode command list shown in FIG. 75, which likewise can only be selected with the OK command in that command list.
In the example of FIG. 101, it is assumed that the user knows that the entry preference menu can be selected by pressing "9" in the entry mode menu, and that he presses this key as soon as he enters help for the entry mode menu, as shown at 10114. This causes the help menu for the entry preference menu to be displayed, as shown at 10116.
In the example, the user presses the "1" key followed by the escape key. The "1" key briefly calls the help menu for the dictation defaults option, and the escape key returns to the help menu for the entry preference menu at the location in that menu associated with the dictation defaults option, as shown in screen 10118. Selecting a key option and then immediately escaping in this way lets the user navigate rapidly to a desired portion of the help menu's command list, merely by pressing the number of a command in that portion of the list followed by escape.
In the example, the user presses the page right key, as shown at 10120, to scroll down one page in the command list, as shown in screen 1122. It is assumed that the user then selects the option associated with the "5" key by pressing that key, as shown at 10124, to obtain a description of the Press Continuous, Click Discrete to Utterance option. This causes a help menu for that option to be displayed, as shown in screen 10126. In the example, the user scrolls down two more screens to read the brief description of this option's function and then presses the escape key, as shown at 10128, to return to the help menu for the entry preference menu, as shown in screen 10130.
As shown in FIG. 102, in the example, once the user has returned to help for the entry preference menu, he or she selects the "4" key, as indicated by numeral 10200, which causes the help menu for the During Press and Click to Utterance option to be displayed, as shown in screen 10202. The user then scrolls down two more screens to read enough of the description to understand the option's function and then, as shown at 10204, escapes back to help for the entry preference menu, as shown in screen 10206. The user then presses escape again to return to the help menu from which help for the entry preference menu was called, which is help for the entry mode menu, as shown in screen 10210. The user presses escape once more to return to the help menu from which help for the entry mode menu was called, which is help for the editor mode, as shown in screen 10214.
In the example, it is assumed that the user presses the page right key six times to scroll down to the bottom portion, 9908, of the help menu for the editor mode shown in FIG. 99. If desired, the user can use a place command to reach the options in this portion of the help menu more quickly. Once in the "other help" portion of the help menu, the user presses the down line key, as shown at 10220, to select the editor screen option 10224 shown in screen 10222. At this point the user selects the OK key, causing help for the editor screen itself to be displayed, as shown in screen 10228. In the mode in which this screen is shown, phone key number indicators 10230 are used to label portions of the editor screen. If the user presses one of the associated phone numbers, a description of the corresponding portion of the screen is displayed. In the example of FIG. 102, the user presses the "4" key, which causes an editor screen help screen 10234 to be displayed. This screen describes the function of the navigation mode indicator "<W^L" shown at the top of the editor screen in help screen 10228.
In the example, the user presses the escape key three times, as shown at 10236. The first press returns from screen 10234 to screen 10228, which gives the user the option of selecting, by number, explanations of other portions of the screen being described. In the example, the user is not interested in these other options, so the first press of escape is followed by two more in quick succession. The first of these returns to the help menu for the editor mode, and the second returns to the editor mode itself.
As can be seen in FIGS. 101 and 102, the hierarchy of help menus lets the user quickly explore the command structure of the mobile phone. It can be used either to find a command that performs a desired function or simply to learn the command structure in linear order.
FIGS. 103 and 104 describe an example in which the user dictates some speech continuously in the editor mode and then uses the editor's interface to correct the resulting text output.
The sequence in FIG. 103 begins with the user pressing and holding the Talk key, as shown at 10300, during which he speaks utterance 10302. This causes the utterance to be recognized and, in the example, causes the recognized text to be displayed in the editor's text window 10305, as shown in screen 10304. The cursor, indicated by numeral 10306, is positioned at the end of the recognized text. It is a non-selection cursor at the end of the continuous dictation.
It is assumed that the system has been set to a mode that causes utterances to be recognized using continuous large-vocabulary speech recognition. This is indicated by the characters "_LV" 10306 in the title bar of the editor window shown in screen 10304.
In the example, the user presses the "3" key to enter the extended navigation menu illustrated in FIGS. 70 and 84, and then presses the "1" key to select the utterance option shown in those figures. This causes the cursor to select the first word of the text recognized for the most recent utterance, as shown at 10308 in screen 10310. Next, the user double-clicks the "7" key to select the capitalization cycle function described in FIG. 77. This causes the selected word to be displayed capitalized, as shown at 10312.
The user then presses the right key, which, in the current word/line navigation mode indicated by navigation mode indicator 10314, operates as a word right key. This moves the cursor one word to the right, 10316. The user then presses the "5" key to set the editor to an extended selection mode, as described by functions 7728 through 7732 of FIG. 77. The user then presses word right again, which moves the cursor to word 10318 and extends selection 10320 to include the text "got it".
The user then presses the "2" key to select the choice list command of FIG. 77, which causes a correction window 10322 to be displayed for selection 10320, with the selection as its first choice and with the first alphabetically ordered choice list shown at 10324. In this choice list, each choice is shown with an associated phone key number that can be used to select it.
In the example, it is assumed that the desired choice is not shown in the first choice list, so the user presses the page right key three times to scroll down to the third screen of the second alphabetically ordered choice list, shown at 10328, in which the desired word "product" is located.
As indicated by function 7706 of FIG. 77, when the user enters the correction window by a single press of the choice list key, the correction window's navigation is set to the page/item navigation mode, as indicated by the navigation mode indicator 10326 shown in screen 10332.
In the example, the user presses the "6" key to select the desired choice, which causes it to be inserted into the editor's text window at the location of the cursor selection, giving the editor's text window the appearance shown at 10330.
The user then presses the word right key three times to place the cursor at position 10332. In this case the recognized word is "results" and the desired word is its singular form, "result". For this reason the user presses the word form list key, which causes a word form list correction window 10334 to be displayed, with the desired alternate form as one of its displayed choices. The user selects the desired choice by pressing its associated phone key, causing the editor's text window to appear as shown at 10336.
As shown in FIG. 104, the user then presses the line down key to move the cursor down to position 10400. The user then presses the "5" key to start an extended selection and presses the word right key to move the cursor one word to the right, to position 10402, causing the current selection 10404 to be extended one word to the right.
Next, the user double-clicks the "2" key to select the filter choices option, described above with regard to functions 7712 through 7716 of FIG. 77. The second click of the "2" key is an extended press, as indicated by the down arrow 10406. During this extended press, the user continuously speaks the letter string "p, a, i, n, s, t", which are the initial letters of the desired word, "painstaking".
In the example, it is assumed that the correction window is in the continuous letter-name recognition mode, as indicated by the characters "_abc" in the title bar of correction window 10412.
In the example, recognition of utterance 10408 as filter input causes correction window 10412 to display a set of choices that have been filtered against an ambiguous-length filter corresponding to the results of recognizing the continuously spoken string of letter names. The correction window has a first choice, 10414, that starts with one of the character sequences associated with the ambiguous filter element. The portion of the first choice that corresponds to a sequence of characters associated with the ambiguous filter is indicated by ambiguous filter indicator 10416. The filter cursor, 10418, is located after the end of this portion of the first choice.
At this point the user presses the word right key, which, through the operation of functions 8124 and 8126 of FIG. 81, causes the filter cursor to be moved to, and to select, the first character of the current word, 10420. Functions 8151 and 8162 of FIG. 81 cause a filter character choice window 10422 to be displayed. Since the desired character is "p", the user presses the "7" key to select it, which causes that character to be made an unambiguous character of the filter string and causes a new correction window 10424 to be displayed as a result of the change in the filter.
The user then presses the character down key four times, which, through the operation of function 8150 of FIG. 81, moves the filter cursor's selection four characters to the right in the first choice, to what in the example is the letter "f", 10426. Since this is a portion of the first choice that still corresponds to the ambiguous portion of the filter string, as indicated by ambiguous filter indicator 10428, the call to filter character choice 8152 of FIG. 81 causes another character choice window to be displayed.
In the example, the desired character, the letter "s", is associated with the phone key "5" in the choice list, and the user presses that key, causing the correct character, 10430, to be inserted into the current filter string and causing all the characters before it to be unambiguously confirmed, as indicated by numeral 10432.
At this point, the correct choice is shown associated with the "6" key, and the user presses that phone key to cause the desired word to be inserted into the editor's text window, as shown at 10434.
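The effect that an unambiguous filter prefix has on a choice list can be illustrated with a minimal sketch. This is not the code of the embodiment: the function names `filter_choices` and `number_choices`, the sample candidate words, and the scheme of numbering choices starting from key 1 are all assumptions made purely for illustration.

```python
def filter_choices(candidates, prefix):
    """Keep candidates that begin with the unambiguous filter prefix.

    The result is sorted alphabetically, mirroring the alphabetically
    ordered choice lists described in the text.
    """
    p = prefix.lower()
    return sorted(w for w in candidates if w.lower().startswith(p))


def number_choices(choices, first_key=1):
    """Associate each displayed choice with a phone key number.

    Hypothetical numbering scheme: at most nine choices per screen,
    numbered consecutively from first_key.
    """
    return {str(first_key + i): w for i, w in enumerate(choices[:9])}


# Hypothetical candidates produced by recognition.
candidates = ["painstaking", "pain", "feigns", "pains", "paints", "plains"]

# After "p" ... "s" have been confirmed, the filter prefix is "pains".
filtered = filter_choices(candidates, "pains")
keyed = number_choices(filtered)
```

Each character that the user confirms lengthens the prefix, so the choice list can only shrink, which is why confirming a few characters is often enough to bring the desired word onto the first screen.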
Next, in the example, the user presses the line down key and the word right key once each to move the cursor down one line and to the right, so as to select the text "period", as shown at 10436. The user then presses the "8" key, or word form list key, which causes a word form list correction window 10438 to be displayed. The desired output, a period mark, is associated with the phone key "4". The user presses that key, causing the desired output to be inserted into the text of the editor window, as shown at 10440.
FIG. 105 illustrates how the user can scroll a choice list horizontally through the operation of functions 8132 and 8135 of FIG. 81, described above.
FIG. 106 illustrates how the key Alpha recognition mode can be used to enter alphabetic input into the editor's text window. Screen 10600 shows the editor's text window, in which cursor 10602 is displayed. In this example, the user presses the "1" key to open the entry mode menu shown in FIGS. 79 and 68, which results in screen 10604. In this mode, the user double-clicks the "3" key to select the key Alpha recognition mode described above with regard to function 7938 of FIG. 79. This causes the system to be set to the key Alpha mode described in FIG. 86, and the editor window displays the indication 10606 shown in FIG. 106.
In the example, the user makes an extended press of a phone key, as shown at 10608, which causes a prompt window, 10610, to display the ICA word associated with each letter on the phone key that has been pressed. In response, the user speaks "charley", 10612. This causes the corresponding letter "c" to be entered into the text window at the former position of the cursor, and gives the text window the appearance shown in screen 10614.
In the example, it is assumed that the user next presses the Talk key while continuously speaking two ICA words, "alpha" and "bravo", as shown at 10616. This causes the letters "a" and "b" associated with these two ICA words to be entered into the text window at the cursor, as indicated in screen 10618. Next in the example, the user presses the "8" key, is prompted with the three ICA words associated with that key, and speaks the word "uniform", causing the letter "u" to be inserted into the editor's text window, as shown at 10620.
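The relationship between phone keys, letters, and ICA words in this example can be sketched as follows. The tables and the function names `prompt_words`, `recognize_for_key`, and `decode_ica` are assumptions introduced for illustration, the word spellings are the common ones (the text above spells one of them "charley"), and the sketch takes already-recognized words as input rather than performing any speech recognition.

```python
# Standard phone keypad letter groups.
KEY_LETTERS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

# Commonly used spelling-alphabet words, one per letter.
ICA_WORDS = [
    "alpha", "bravo", "charlie", "delta", "echo", "foxtrot", "golf",
    "hotel", "india", "juliet", "kilo", "lima", "mike", "november",
    "oscar", "papa", "quebec", "romeo", "sierra", "tango", "uniform",
    "victor", "whiskey", "xray", "yankee", "zulu",
]
LETTER_TO_WORD = {w[0]: w for w in ICA_WORDS}
WORD_TO_LETTER = {w: w[0] for w in ICA_WORDS}


def prompt_words(key):
    """ICA words to show in the prompt window for a pressed phone key."""
    return [LETTER_TO_WORD[c] for c in KEY_LETTERS[key]]


def recognize_for_key(key, spoken_word):
    """Return the letter for spoken_word only if it belongs to the pressed key.

    Restricting the active vocabulary to the pressed key's words is the
    idea being illustrated; returns None when the word does not match.
    """
    letter = WORD_TO_LETTER.get(spoken_word.lower())
    if letter is not None and letter in KEY_LETTERS[key]:
        return letter
    return None


def decode_ica(words):
    """Turn a sequence of ICA words into the letters they stand for."""
    return "".join(WORD_TO_LETTER[w.lower()] for w in words)
```

For instance, `prompt_words("8")` yields the three words offered for the "8" key, and `decode_ica(["alpha", "bravo"])` yields the two letters entered during the Talk key press in the example.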
FIG. 107 provides an example in which the same key Alpha recognition mode is used to enter alphabetic filtering input. It shows that the key Alpha mode can be entered from within a correction window by pressing the "1" key followed by a double-click of the "3" key, just as it can from the text editor, as shown in FIG. 106.
FIGS. 108 and 109 show how a user can use the interface of the speech recognition text editor described above to enter and correct text and to send e-mail in the mobile phone embodiment.
In FIG. 108, screen 10800 shows the e-mail options screen, described with regard to FIG. 66, which is displayed if the user selects the e-mail option by double-clicking the "4" key from the main menu.
In the example shown, it is assumed that the user wants to create a new e-mail message and therefore selects the "1" option. This causes a new e-mail message window, 10802, to be displayed with the cursor located at the first editable position in that window. This is the first character of the portion of the e-mail message associated with the message's addressee. In the example, the user presses the Talk key and speaks the name "Dan Roth", as shown at 10804.
In the example, this causes the slightly incorrect name "Stan Roth" to be inserted into the message's addressee line, as shown at 10806. The user responds by pressing the "2" key to select a choice list for the selection, as shown at 10806. In the example, the desired name is shown on the choice list, and the user presses the "5" key to select it, causing the desired name to be inserted into the addressee line, as shown at 10808.
Then, thus the user pushes to twice of descending button moving cursor down to the target line starting position, shown in fluorescent screen 10810.The user presses Talk button, 10812 then when sounding " mobile phone speech interfaces " time.In example, this slightly false identification go out as " selling the call voice interface ", and this this paper is inserted in the score cursor position and causes that the electronic mail editing window shows shown in 10814.In response, the user pushes capable left button and the word left button comes positioning cursor to be chosen in position 10816.The user pushes " 8 " button then and causes that word forms tabulation correction window shows 10818.In example, the output of expection is relevant with " 4 " button, and the user selects this button and causes that the output of expection is placed within the cursor position, shown in fluorescent screen 10820.
The user then presses the line down key twice to place the cursor at the start of the body portion of the e-mail message, as shown in screen 10822. Once this is done, the user presses the Talk key while continuously speaking the utterance "the new Elvis interface is working really well". This causes the slightly misrecognized line "he knew elfish interface is working really well" to be inserted at the cursor position, as indicated in screen 10824.
In response, the user presses the line up key once and the word left key twice to place the cursor at the position shown in screen 10900 of FIG. 109. The user then presses the "5" key to start an extended selection and presses the word left key twice to place the cursor at position 10902, causing the selection to be extended, as shown at 10904. At this point the user double-clicks the "2" key to enter a correction window, 10906, for the current selection and, during that press, continuously speaks the characters "t, h, e, space, n". This causes a new correction window, 10908, to be displayed with an unambiguous filter 10910 corresponding to the continuously entered letter-name string.
Next, the user presses the word right key, which moves the filter cursor to the first character of the next word to the right, as shown at 10912. The user then presses the "1" key to enter the entry mode menu and presses the "3" key to select the AlphaBravo, or ICA word, input vocabulary. During an extended press of the "3" key, the user continuously speaks "echo, lima, victor, india, sierra", 10914. This is recognized as the letter sequence "ELVIS", which is inserted, starting at the previous filter cursor position, into the first choice of the correction window, 10916. In the example shown, it is assumed that AlphaBravo recognition is treated as unambiguous because of its reliability, so that the entered characters, and all the characters before them in the first choice, are treated as unambiguously confirmed, as indicated by the unambiguous confirmation indication 10916 in screen 10916.
Since this is the desired output, in the example the user selects the current first choice by pressing the "OK" key.
FIG. 110 illustrates how re-utterance can be used to help obtain the desired recognition output. It starts with the correction window in the same state as was indicated by screen 10906 in FIG. 109. In the example of FIG. 110, however, the user responds to the screen by pressing the "1" key twice, once to enter the entry mode menu and a second time to select large-vocabulary recognition. As indicated by functions 7908 through 7914 of FIG. 79, if large-vocabulary recognition is selected in the entry mode menu while a correction window is displayed, the system interprets this as an indication that the user wants to perform re-utterance, that is, to add a new utterance of the desired output to the utterance list used to help select the desired output. In the example, the user holds down the second press of the "1" key while speaking, as discrete speech, the three words "the", "new", and "Elvis" that correspond to the desired output. In the example, it is assumed that the additional discrete-utterance information provided by the new utterance list entry causes the system to recognize the first two of the three words correctly. It is also assumed that the third of the three words is not in the current vocabulary, which would require the user to spell the third word with filtering input, as was done with utterance 10914 of FIG. 109.
FIG. 111 illustrates how the editor functions can be used to enter a URL text string for the purpose of accessing a desired web page on a web browser that is part of the mobile phone's software.
The browser options screen, 11100, is the screen displayed if the user selects the web browser option associated with the "7" key in the main menu, as illustrated in FIG. 66. In the example, it is assumed that the user wishes to enter the URL of a desired web site and selects the associated URL window option by pressing the "1" key. This causes screen 11102 to display a brief prompt instructing the user. The user responds by spelling the name of the desired URL using continuous letter names during an extended press of the Talk key. In the embodiment shown, the URL editor is always in correction mode, so that recognition of utterance 11103 causes a correction window 11104 to be displayed. The user then uses the filter string editing described above to correct the originally misrecognized URL to the desired spelling, as shown in screen 11106, at which point he selects the first choice, causing the system to access the desired web site.
FIGS. 112 through 114 illustrate how the editor interface can be used to navigate web pages and to enter text into their fields.
Screen 11200 illustrates the appearance of the mobile phone's web browser when it first accesses a new web site. A URL field, 11201, is displayed before the top of the web page, 11204, to help the user identify the current web page. The user can scroll back to this position at any time if he wants to see the URL of the currently displayed web page. When web pages are first entered, they are in a document/page navigation mode, in which the left and right keys act like the page back and page forward controls on most web browsers. In this case the word "document" is used in place of "page", because the word "page" is used in other navigation modes to refer to a screenful of media on the mobile phone. If the user presses the up or down key, the display of the web page is scrolled by a full display page (or screen).
FIG. 116 illustrates how the mobile phone embodiment allows a special form of correction window to be used as a list box when editing a dialog box of the type shown in FIG. 115.
The example of FIG. 116 starts with the find dialog box in the state shown in screen 11504 of FIG. 115. From this state, the user presses the line down key twice to place the cursor in the "In:" list box, which defines in which portion of the mobile phone's data the search conducted in response to the find dialog box will take place. When the user presses the Talk key with the cursor in this window, a list box correction window 11512 is displayed. It shows the current selection in the list box as the current first choice and provides a scrollable list of the other list box options, each shown with an associated phone key number. The user can scroll through this list and select the desired option by phone key number or by using the selection highlight. In the example, the user continues pressing the Talk key and speaks the desired list box value with utterance 11514. In a list box correction window, the active vocabulary is substantially limited to the list values. With such a limited vocabulary correct recognition is very likely, as is indicated in the example by the fact that the desired list value is the first choice. The user responds by pressing the OK key, which causes the desired list value to be placed in the list box of the dialog box, as shown at 11518.
FIG. 117 illustrates a series of interactions between a user and the mobile phone interface, showing some of the functions that the interface allows the user to perform when making phone calls.
Screen 6400 in FIG. 117 is the same top-level phone mode screen shown in FIG. 64. If, while it is displayed, the user selects the up navigation key, which is mapped to the name dial command, the system enters the name dial mode, the basic functions of which are shown in the code illustrated in FIG. 119. As can be seen from that figure, this mode allows the user to select a name from a contact list by saying it and, if there is a misrecognition, to correct it by alphabetic filtering or by selecting from a potentially scrollable choice list in a correction window similar to those described above.
When the mobile phone enters the name dial mode, an initial prompt screen, 11700, is displayed, as indicated in FIG. 117. In the example, the user speaks a name, 11702, during a press of the Talk key. In name dial mode, such utterances are recognized with the vocabulary automatically limited to the name vocabulary, and the resulting recognition causes a correction window, 11704, to be displayed. In the example, the first choice is correct, so the user selects the "OK" key, causing the phone to place a call to the phone number associated with the named party in the user's contact list.
When the call is connected, a screen, 11706, is displayed having the same ongoing-call indicator, 7414, shown in FIG. 75. At the bottom of the screen, as indicated by numeral 11708, is an indication of the function associated with each navigation key during the call. In the example, the user selects the down key, which is associated with the same notes function described above with regard to FIG. 64. In response, an editor window, 11710, is displayed for a notes outline 11712 with an automatically generated heading item created in the notes outline for the current call, identifying the other party to whom the call is made, its start time and, ultimately, its end time. A cursor 11714 is placed at a new item under the call's heading.
In the example, the user speaks a continuous utterance, 11714, during a press of the Talk key, causing the recognized text corresponding to that utterance to be inserted into the notes outline at the cursor position, as shown in screen 11716. The user then double-clicks the "6" key to start recording, which causes an audio graphic representation of the sound to be placed in the notes of the editor window at the current position of the cursor. As shown at 11718, the graphics for the portions of the audio in which the mobile phone's operator is speaking are underlined. This makes it easier for the user to keep track of who was speaking and for how long and, if desired, to search the recorded audio for what was said by one party or the other.
In the example of FIG. 117, the user next double-clicks the star key to select the task list. This displays a screen, 11720, that lists the tasks currently open on the mobile phone. In the example, the user selects the task associated with the phone key "4", which is another notes editor window displaying a different location in the notes outline. In response, the phone displays a screen 11722 showing that portion of the notes outline.
In the example, the user presses the up key three times to move the cursor to position 11724 and then presses the "6" key to start playing the sound associated with the audio graphic at the current cursor position, as indicated by the motion of the cursor between screens 11726 and 11728.
Unless the play-only-to-me option 7513, described above with regard to FIG. 75, is on, the audio playback of screen 11728 is played to both sides of the current call, letting the user of the mobile phone share an audio recording with the other party during a phone call.
FIG. 118 illustrates that when an editor window is recording audio, as shown in screen 11717 near the bottom center of FIG. 117, the user can turn on speech recognition during the recording so that the audio recorded during that portion is also subjected to speech recognition. In the example shown, during the recording shown in screen 11717, the user presses the Talk key and speaks utterance 11800. This causes the text associated with that utterance, 11802, to be inserted into editor window 11806. Graphics of the recorded audio follow after the time of the recognition. Normally this would be used in a manner in which the user tries to speak clearly during an utterance that is to be recognized, such as utterance 11800, and can then speak more freely and casually during portions of conversation or dictation that are only being recorded. Normally, audio is recorded along with the recognition of speech, so that the user can later play back, listen to, and correct any dictation, such as dictation 11802, that was misrecognized during recording.
FIG. 119 illustrates how the system lets the user select a portion of audio, as shown at 11900, by combining the extended selection key with play or navigation, and then select the recognize audio dialog box, as described by functions 9000 through 9014 of FIG. 90. The text recognized from the selection is shown at 11902. In the example of FIG. 119, the user has selected the recognized-audio option, 9026, shown in FIG. 90, which causes the recognized text, 11902, to be underlined, indicating that it has playable audio associated with it.
FIG. 120 illustrates how the user can select a portion 12000 of recognized text that has recorded audio associated with it and then, by selecting option 9024, shown in FIG. 90, in a submenu under the editor options menu, have the recognized text stripped from its associated audio. This leaves just the audio 12002, whose corresponding audio graphic representation remains in the portion of media where the recognized text previously stood.
FIG. 121 illustrates how function 9020 of FIG. 90, available under the audio menu of the editor options menu, allows the user to strip the recognition audio that has been associated with a portion 12100 of recognized text from that text, as indicated at 12102 in FIG. 121.
FIGS. 122 through 125 provide illustrations of the operation of the digit dial mode described in the code of FIG. 126. If the user selects the digit dial mode, for example by pressing the phone key "2" in the main menu, as illustrated by function 6552 of FIG. 65, or by selecting the left navigation key when the system is in the top-level phone mode shown in screen 6400 of FIG. 64, the system enters the digit dial mode shown in FIG. 126 and displays a prompt screen 12202 prompting the user to say a phone number. When the user speaks an utterance of a phone number, as shown at 12204, the utterance is recognized. If the system is quite confident that its recognition of the phone number is correct, it automatically dials the recognized phone number, as shown at 12206. If the system is not that confident of its recognition, it displays a correction window 12208. If the correction window has the desired number as its first choice, as shown at 12210, the user can select it simply by pressing the OK key, which causes the system to dial the number, as shown at 12212. If the correct choice is in the first choice list, as shown at 12214, the user can simply press the phone key number associated with that choice, whereupon the system dials the number, as shown at 12216.
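The decision flow just described can be summarized in a minimal sketch. It is an illustration under stated assumptions rather than the code of the figures: the function names, the numeric confidence scale, the threshold value of 0.9, and the convention that phone key n selects the n-th entry after the first choice are all hypothetical.

```python
def digit_dial_action(choices, confidence, threshold=0.9):
    """Decide what to do after a spoken phone number is recognized.

    choices    -- candidate numbers as strings, best-scoring first
    confidence -- recognizer confidence in choices[0] (assumed 0.0 to 1.0)
    threshold  -- hypothetical cut-off above which the number is dialed
    """
    if confidence >= threshold:
        return ("dial", choices[0])      # confident: dial automatically
    return ("correct", list(choices))    # otherwise show a correction window


def resolve_correction(choices, key):
    """Map a key press in the correction window to the number to dial.

    "OK" selects the first choice; a digit key n selects the n-th entry
    of the choice list that follows the first choice (assumed numbering).
    """
    if key == "OK":
        return choices[0]
    return choices[int(key)]


# Hypothetical recognition results.
results = ["6175551234", "6175551284", "6175581234"]
confident = digit_dial_action(results, 0.95)
unsure = digit_dial_action(results, 0.50)
```

Dialing automatically only above a confidence threshold keeps the common case to a single utterance while still giving the user a chance to intervene when a wrong number would otherwise be called.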
If the correct number is neither the first choice nor in the first choice list, as indicated in screen 12300 shown at the top of FIG. 123, the user can check whether the desired number is on one of the screens of the two choice lists, either by repeatedly pressing the page down key, as shown at 12302, or by repeatedly pressing the item down key, as shown at 12304. If, by scrolling through the choice list in either of these ways, the user sees the desired number, the user can select it either by pressing its associated phone key or by moving the choice highlight to it and then pressing the OK key. This causes the system to dial the number, as shown in screen 12308. It should be appreciated that, because the phone numbers in the choice list are numerically ordered, the user can find the desired number quickly by scrolling through the list. In the embodiment shown in these figures, digit change indicators 12310 are provided to indicate the digit column of the most significant digit by which each choice differs from the choice before it on the list. This makes it easier for the eye to scan for the desired phone number.
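One way to compute such a digit change indicator is sketched below. The function names and the sample numbers are assumptions for illustration, and the sketch assumes that all candidates have the same number of digits, so that sorting them as strings also orders them numerically.

```python
def first_difference(prev, cur):
    """Index of the most significant digit at which cur differs from prev.

    Returns None when the two equal-length digit strings are identical.
    """
    for i, (a, b) in enumerate(zip(prev, cur)):
        if a != b:
            return i
    return None


def change_indicators(numbers):
    """Sort candidate numbers and pair each with its change position.

    The first entry has no predecessor, so its indicator is None.
    """
    ordered = sorted(numbers)
    return [
        (n, None if i == 0 else first_difference(ordered[i - 1], n))
        for i, n in enumerate(ordered)
    ]


# Hypothetical candidate phone numbers from recognition.
indicators = change_indicators(["6175551284", "6175551234", "6175581234"])
```

A display could place a mark under the indicated digit column of each choice, so that the user only has to read the digits from that column onward when scanning down the list.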
FIG. 124 illustrates how the digit dial mode allows the user to navigate to a digit position in the first choice and correct any error that exists in it. In FIG. 124 this is done by speaking the desired digit, but the user is also allowed to correct the digit by pressing the appropriate phone key.
As illustrated in FIG. 125, the user can also edit a misrecognized phone number by inserting a missing digit as well as by replacing a misrecognized digit.
The invention described above has many further aspects, which can be used for entering and correcting speech recognition, and other forms of recognition, on many different types of computing platforms, including all of those shown in accompanying drawings 3 through 8. Many of the features described with regard to accompanying drawing 94 can be used in situations where a user wants to enter and/or edit text without having to pay close visual attention to those tasks. For example, this could allow a user to listen to e-mail and dictate responses while in a park, without needing to look closely at his cell phone or other dictation device. One particular environment in which such audio feedback is useful for speech recognition and other control functions, such as phone dialing and phone control, is the automotive arena, as illustrated in accompanying drawing 126.
In the embodiment shown in accompanying drawing 126, the automobile has a computer 12600 connected to a cellular wireless communication system 12602 and to the car's audio system 12604. In many embodiments the car's electronic system will have a short-range wireless transceiver, such as a Bluetooth or other short-range transceiver 12606. This can be used to communicate with a wireless headset 12608 or with the user's cell phone 12610, so that the user can easily access information stored on his normal cell phone while using his car.
Preferably, the cell phone/wireless transceiver 12602 can be used not only to send and receive cell phone calls but also to send and receive e-mail, digital files such as text files, and audio Web pages that can be listened to and edited with the functionality described above.
The input device for controlling many of the functions described above with regard to the cell phone embodiment can be accessed through a phone keypad 12212, which is preferably located in a position, such as on the steering wheel of the automobile, where the user can reach its keys without unduly diverting his attention from driving. In fact, with a keypad positioned as shown in accompanying drawing 126, the user can keep the fingers of one hand around the rim of the steering wheel while selecting keypad buttons with the thumb of the same hand. In such an embodiment the system preferably has the TTS keys function described by functions 9404 through 9414 of accompanying drawing 94, so that the user can determine which key he is pressing, and what that key does, without looking at the keypad. In other embodiments a touch-sensitive keypad that responds to a mere touch of its keys with such information could also be provided, which might be even easier and quicker to use.
Accompanying drawings 127 and 128 illustrate that most of the capabilities described above with regard to the cell phone embodiment can be used on other types of phones, such as the cordless phone shown in accompanying drawing 127 or the landline phone shown in accompanying drawing 128.
It should be understood that the foregoing description and drawings are given merely to explain and illustrate, and that the invention is not limited thereto except insofar as the interpretation of the appended claims is so limited. Those skilled in the art who have the disclosure before them will be able to make modifications and variations therein without departing from the scope of the invention.
The invention of the present application, as broadly claimed, is not limited to use with any one type of operating system, computer hardware, or computer network, and thus other embodiments of the invention could use differing software and hardware systems.
Furthermore, it should be understood that the program behaviors described in the claims below, like virtually all program behaviors, can be performed by many different programming and data structures, using substantially different organization and sequencing. This is because programming is an extremely flexible art in which a given idea of any complexity, once understood by those skilled in the art, can be manifested in a virtually unlimited number of ways. Thus, the claims are not meant to be limited to the exact functions and/or sequence of functions described in the drawings. This is particularly true since the code described above has been highly simplified so that it more efficiently communicates what one skilled in the art needs to know to implement the invention, without burdening him or her with unnecessary details. In the interest of such simplification, the structure of the code described above often differs significantly from the structure of the actual code that a skilled programmer would use when implementing the invention. Furthermore, many of the programmed behaviors that are shown in the specification as being performed in software could be performed in hardware in other embodiments.
In the many embodiments of the invention discussed above, various aspects of the invention are shown occurring together that could also occur separately in other embodiments of those aspects of the invention.
It should be understood that the present invention extends to methods, apparatus, systems, and programming recorded in machine-readable form, for all of the features and aspects of the invention described in this application as filed, including its specification, its drawings, and its original claims.

Claims (248)

1. the method for a speech recognition, comprising:
Provide and allow the user producing the user interface of selecting between first and second users input;
By in the pattern that changes with previous language context relation of first word of the identification of discerning the language model context relation that this word that depends on previous identification partially creates at least, finishing generation about the identification of vocabulary greatly response first user input of one or more sounding; And
By finishing generation in the irrelevant pattern of the language context relation first word and previous of the irrelevant identification of the language model context relation created at the word of discerning the sort of and any previous identification at least about big vocabulary identification response second user input of one or more sounding.
2. The method according to claim 1, wherein:
The user interface includes a first button and a second button;
The first user input is generated by pressing the first button; and
The second user input is generated by pressing the second button.
3. The method according to claim 1, wherein the prior-language-context-independent mode uses a language model context created by the first and any subsequently recognized words of an utterance in recognizing the second and subsequent words, if any, of that utterance.
4. The method according to claim 1, further comprising providing the words output by recognition in both the prior-language-context-dependent and prior-language-context-independent modes as text input to another program.
5. The method according to claim 4, wherein said method is performed by a software input panel in Microsoft Windows CE.
6. the method for a speech recognition, comprising:
Provide and allow the user producing the user interface of selecting between first and second users input;
Respond the generation of first user input as the one or more words in the given vocabulary by one or more sounding of identification in continuous speech recognition mode; And
Respond the generation of second user input as the one or more words in the same given vocabulary by one or more sounding of identification in discontinuous speech recognition mode.
7. The method according to claim 6, wherein the given vocabulary is a large vocabulary.
8. The method according to claim 6, wherein the given vocabulary is an alphabetic entry vocabulary.
9. The method according to claim 6, wherein:
The user interface allows the user to select to generate a third and a fourth input independently of the selection between the first and second inputs; and
The method further comprises responding to the third and fourth inputs, respectively, by selecting a first vocabulary or a second vocabulary as the given vocabulary.
10. The method according to claim 9, wherein the first and second vocabularies are a large vocabulary of words and an alphabetic entry vocabulary.
11. The method according to claim 9, wherein the first and second vocabularies are two different alphabetic entry vocabularies.
12. The method according to claim 6, wherein:
The user interface provided includes a first button and a second button;
The first user input is generated by pressing the first button; and
The second user input is generated by pressing the second button.
13. The method according to claim 12, wherein:
Pressing the first and second buttons causes recognition in their respective recognition modes to take place from substantially the time such a button is pressed until the next end of utterance is detected;
Wherein the discrete recognition is substantially limited to recognizing one or more candidates for a single word matching the utterance, and the continuous recognition mode has no such limitation.
14. The method according to claim 6, wherein the acoustic models used to represent words in the discrete recognition mode differ from the acoustic models used to represent the same words in the continuous recognition mode.
15. the method for a speech recognition, comprising:
Provide and allow the user to select to produce the user interface of first and second users input;
Respond the generation of first user input as one or more words by one or more sounding of identification in the vocabulary that enters at first alphabet sequence; And
Respond the generation of first user input as one or more words by one or more sounding of identification in the vocabulary that enters at second alphabet sequence.
16. The method according to claim 15, wherein:
The first alphabetic entry vocabulary includes the name of each letter of the alphabet, and the second alphabetic entry vocabulary does not; and
The second alphabetic entry vocabulary includes one or more words that start with each letter of the alphabet, and the first alphabetic entry vocabulary does not.
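The two alphabetic entry vocabularies of claims 15 and 16 could be sketched, purely for illustration, as two lookup tables that each map a recognized token to a letter. The choice of the ICAO/NATO spelling words for the second vocabulary, and the names `LETTER_NAME_VOCAB`, `WORD_VOCAB`, and `spell`, are assumptions of this sketch; the claims do not name any particular word set.

```python
import string

# First vocabulary: the names of the letters themselves ("a", "b", ...).
LETTER_NAME_VOCAB = {letter: letter for letter in string.ascii_lowercase}

# Second vocabulary: one word starting with each letter (assumed word set).
_WORDS = (
    "alpha bravo charlie delta echo foxtrot golf hotel india juliett kilo "
    "lima mike november oscar papa quebec romeo sierra tango uniform victor "
    "whiskey xray yankee zulu"
).split()
WORD_VOCAB = {word: word[0] for word in _WORDS}


def spell(tokens, vocabulary):
    """Convert recognized tokens to letters using the selected vocabulary.
    Raises KeyError if a token is not in that vocabulary."""
    return "".join(vocabulary[token.lower()] for token in tokens)
```

Under this sketch, the first user input would select `LETTER_NAME_VOCAB` and the second would select `WORD_VOCAB` before recognition of the spelled input begins.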
17. The method according to claim 15, wherein the user interface provides separate buttons for generating the first and second inputs.
18. The method according to claim 17, wherein touching each of said buttons turns on recognition in the alphabetic entry mode associated with that button.
19. The method according to claim 15, wherein:
The user interface allows:
- the user to select a filtering mode in which the word choices for recognition of a given word are limited to those whose spelling matches a sequence of one or more characters input by the user;
- the user to input said one or more filtering characters by voice recognition in either the first or the second alphabetic entry mode; and
The first and second inputs select whether such filtering character recognition is performed using the first alphabetic entry mode or the second alphabetic entry mode.
20. the method for a speech recognition, comprising:
The user interface that allows the user to select to produce first, second and the 3rd user input is provided;
By discerning one or more sounding respond first user input as the one or more words in first general big vocabulary generation;
By discerning one or more sounding respond second user input as the one or more words in the vocabulary that second alphabet sequence enters generation;
By discerning one or more sounding respond the 3rd user input as the one or more words in the 3rd vocabulary of expressing non-spelling text input generation; And
Receive the identification of arbitrary vocabulary is received among three vocabularies output continuously and that output is put into public text.
21. The method according to claim 20, wherein the third vocabulary is a digits vocabulary.
22. The method according to claim 20, wherein the third vocabulary is a punctuation vocabulary.
23. The method according to claim 20, wherein the user interface provides different buttons for selecting the first, second, and third inputs.
24. The method according to claim 23, wherein pressing the button associated with one of the three vocabularies turns on recognition using that vocabulary.
25. A method of performing word recognition, comprising:
Receiving a word input signal containing non-textual user input representing a sequence of one or more words;
Performing word recognition on the input signal to produce a choice list of best-scoring recognition candidates, each of which is a sequence of one or more words and/or numbers found by the recognizer to have a relatively high probability of corresponding to the input signal;
Producing user-perceivable output representing the choice list of best-scoring recognition candidates, in which the candidates are ordered in the choice list according to the character ordering of the character sequences of the one or more words associated with each candidate;
Providing a user interface that enables the user to select one of the character-ordered recognition candidates from the choice list; and
Responding to the user's selection of one of the recognition candidates from the choice list by treating the selected candidate as the one or more words and/or numbers corresponding to the word input signal.
26. The method according to claim 25, wherein:
The word recognition selects a best-scoring recognition candidate; and
In the user-perceivable output the best-scoring candidate is placed in a position that is independent of where the character sequence of the one or more words associated with the best-scoring candidate would fall in the character-ordered list according to said character ordering.
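One possible reading of claims 25 and 26 is sketched below in Python for illustration only: the best-scoring candidate is shown first regardless of its spelling, and the remaining candidates follow in alphabetical order. The function name `build_choice_list` and the representation of candidates as (text, score) pairs with unique text are assumptions of this sketch.

```python
def build_choice_list(scored_candidates):
    """Order candidates for display: the best-scoring one first, then the
    rest alphabetically (case-insensitive). Expects (text, score) pairs."""
    if not scored_candidates:
        return []
    best_text = max(scored_candidates, key=lambda pair: pair[1])[0]
    others = [text for text, _ in scored_candidates if text != best_text]
    return [best_text] + sorted(others, key=str.lower)
```

Placing the first choice at a fixed position lets the user accept it with a single key press, while the alphabetical ordering of the remainder makes a desired alternative easier to locate by eye.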
27. The method according to claim 25, wherein:
The word input signal is a representation of an utterance of spoken words; and
The word recognition is speech recognition.
28. The method according to claim 25, wherein the user-perceivable output includes displaying the character-ordered list of best-scoring recognition candidates on a visual display.
29. The method according to claim 28, wherein:
The choice list contains more recognition candidates than fit on the display screen at one time; and
The choice list is scrollable, so that the user can choose to move the choice list relative to the display screen in order to see more recognition candidates on the list than fit on the display screen at one time.
30. The method according to claim 28, wherein:
The character-ordered list is an alphabetically ordered list; and
The display of an individual recognition candidate in the list includes a sequence of one or more alphabetically spelled words.
31. The method according to claim 30, wherein:
The choice list contains more recognition candidates than fit on the display screen at one time; and
The choice list is scrollable, so that the user can choose to move the choice list relative to the display screen in order to see more recognition candidates on the list than fit on the display screen at one time.
32. The method according to claim 31, wherein:
The choice list comprises alphabetically ordered sublists;
The first sublist contains the highest-scoring choice candidates that fit on the display screen at one time; and
The second sublist contains the other best-scoring choice candidates.
33. The method according to claim 32, wherein the second sublist has more candidates than fit on the display screen at one time.
34. The method according to claim 30, further comprising:
Providing a user interface that allows the user to select a filtering sequence of one or more letter-indicating inputs after the character-ordered list of best-scoring recognition candidates has been displayed; and
Responding to the selection of the filtering sequence by producing and displaying on the display screen a new alphabetically ordered choice list of recognition candidates, in which the new choice list is limited to candidates whose sequence of one or more characters starts with the filtering sequence; and
Providing a user interface that enables the user to select one of the alphabetically ordered recognition candidates from the new choice list;
Responding to the user's selection of one of the recognition candidates in the new choice list by treating the selected candidate as the one or more words and/or numbers corresponding to the word input signal.
35. The method according to claim 34, wherein said responding to the selection of the filtering sequence by producing and displaying a new alphabetically ordered choice list includes:
Detecting whether the number of recognition candidates is below a desired number; and
When the detection indicates that the number of recognition candidates is below the desired number, selecting from a vocabulary one or more additional candidates that start with the filtering sequence for inclusion in the new alphabetically ordered choice list.
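The filtering and padding behavior of claims 34 and 35 could be sketched as follows. This is an illustrative Python sketch only; the function name `filtered_choice_list`, the default desired number of six, and the policy of adding vocabulary words in alphabetical order are assumptions, since the claims leave the selection of additional candidates open.

```python
def filtered_choice_list(candidates, vocabulary, prefix, desired=6):
    """Keep the recognition candidates that start with the filter prefix and,
    if fewer than `desired` remain, pad with vocabulary words that also start
    with the prefix. Returns an alphabetically ordered list."""
    prefix = prefix.lower()
    matches = {c for c in candidates if c.lower().startswith(prefix)}
    if len(matches) < desired:
        # Assumed policy: add vocabulary words in alphabetical order.
        for word in sorted(vocabulary, key=str.lower):
            if len(matches) >= desired:
                break
            if word.lower().startswith(prefix):
                matches.add(word)
    return sorted(matches, key=str.lower)
```

A practical system might instead prefer the additional words with the highest language model probability; the alphabetical policy is used here only to keep the sketch short.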
36. The method according to claim 35, wherein:
The new alphabetically ordered choice list contains more recognition candidates than fit on the display screen at one time; and
The choice list is scrollable, so that the user can choose to move the choice list relative to the display screen in order to see more recognition candidates on the list than fit on the display screen at one time.
37. The method according to claim 34, wherein:
The method is performed on a telephone having a telephone keypad;
The user interface that allows the user to enter the letter-indicating inputs allows the user to make such inputs by pressing one or more keys of the telephone keypad, in which pressing a given key of the telephone keypad indicates that a corresponding letter in the sequence of one or more characters associated with the desired recognition candidate is one of a set of multiple letters associated with the given key; and
The new candidate list is limited to candidates whose sequence of one or more words starts with an initial letter sequence corresponding to the sequence of letter-indicating inputs, in which each letter of the initial letter sequence corresponds to one of the set of letters indicated by the corresponding letter-indicating input in the sequence of letter-indicating inputs.
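For illustration, the ambiguous keypad filtering of claim 37 could be sketched as below. The letter groups follow the conventional telephone keypad layout; the function names `matches_keys` and `filter_by_keys` are hypothetical and do not appear in the specification.

```python
# Conventional letter groups on telephone keys 2 through 9.
KEY_LETTERS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def matches_keys(word, keys):
    """True if each of the first len(keys) letters of `word` belongs to the
    letter group of the corresponding pressed key."""
    if len(word) < len(keys):
        return False
    return all(ch in KEY_LETTERS.get(key, "")
               for ch, key in zip(word.lower(), keys))


def filter_by_keys(candidates, keys):
    """Alphabetically ordered candidates consistent with the key presses."""
    return sorted((c for c in candidates if matches_keys(c, keys)),
                  key=str.lower)
```

Each key press therefore narrows the list without requiring the user to disambiguate which of the three or four letters on the key was intended.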
38. The method according to claim 37, wherein:
The new choice list contains more recognition candidates than fit on the display screen at one time; and
The choice list is scrollable, so that the user can choose to move the choice list relative to the display screen in order to see more recognition candidates on the list than fit on the display screen at one time.
39. The method according to claim 34, wherein:
The user interface that allows the user to select a sequence of one or more letter-indicating inputs allows the user to select a desired number of characters from the start of the string of alphabetic characters of a selected recognition candidate displayed in the choice list; and
The user interface responds to such a selection by using all or a part of the selected one or more characters as said sequence of one or more letter-indicating inputs.
40. The method according to claim 30, further comprising:
Providing a user interface that allows the user to indicate selection of a position in the displayed alphabetically ordered choice list between two listed candidates, or between a listed candidate and the start or end of the list; and
Responding to such a selection by displaying a new alphabetically ordered choice list limited to recognition candidates whose spelling falls, respectively, between the two candidates or between the candidate and the start or end of the alphabet.
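The range restriction of claim 40 could be sketched as follows, as an illustration only. The function name `candidates_between`, the use of `None` to represent the start or end of the alphabet, and the case-insensitive comparison are assumptions of this sketch.

```python
def candidates_between(pool, lower=None, upper=None):
    """Return, in alphabetical order, the candidates in `pool` whose spelling
    falls strictly between `lower` and `upper`. A bound of None stands for
    the start or the end of the alphabet."""
    selected = []
    for word in pool:
        key = word.lower()
        if lower is not None and key <= lower.lower():
            continue
        if upper is not None and key >= upper.lower():
            continue
        selected.append(word)
    return sorted(selected, key=str.lower)
```

In use, `lower` and `upper` would be the two listed candidates adjacent to the position the user selected, and `pool` would be a larger set of recognition candidates or vocabulary words.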
41. The method according to claim 28, wherein:
The input signal represents an utterance of one or more successive digits; and
The choice list is a numerically ordered list of recognition candidates displayed as numbers.
42. The method according to claim 30, wherein:
The input signal represents an utterance of a telephone number;
The word recognition is speech recognition; and
The response to the user's selection of a recognition candidate causes the telephone number displayed for the selected recognition candidate to be dialed automatically.
43. The method according to claim 28, wherein:
The input signal represents an utterance of one or more names from contact information; and
The choice list represents multiple best-scoring names from the contact information in alphabetical order.
44. The method according to claim 43, wherein:
The choice list contains more recognition candidates than fit on the display screen at one time; and
The choice list is scrollable, so that the user can choose to move the choice list relative to the display screen in order to see more recognition candidates on the list than fit on the display screen at one time.
45. A method of performing word recognition, comprising:
Receiving a word input signal containing non-textual user input representing a sequence of one or more words;
Performing word recognition on the input signal to produce a choice list of best-scoring recognition candidates, each of which is a sequence of one or more words and/or numbers found by the recognizer to have a relatively high probability of corresponding to the input signal;
Displaying the choice list in a user-scrollable display, in which the choice list has more recognition candidates than fit on the display screen at one time, so that only a sub-portion of the choice list is displayed at one time;
Responding to user inputs selecting to scroll the choice list up or down by moving the choice list up or down, respectively, relative to the display screen, so as to change the portion of the choice list shown on the display screen.
46. The method according to claim 45, wherein the word input signal is a representation of an utterance of spoken words, and the word recognition is speech recognition.
47. The method according to claim 45, wherein:
The user inputs selecting to scroll the choice list up or down include multiple-candidate scroll inputs; and
The responding to user inputs includes responding to each multiple-candidate scroll input by moving the choice list up or down relative to the display screen by multiple recognition candidates.
48. The method according to claim 45, wherein:
The method is performed on a mobile phone; and
The display screen is the display screen of the mobile phone.
49. The method according to claim 48, wherein:
The display of the choice list on the display screen of the mobile phone includes displaying a different number in association with each recognition candidate in the portion of the choice list displayed on the display screen at one time;
A user interface is provided that enables the user to select a recognition candidate from the choice list by pressing the phone key on the mobile phone that corresponds to the number of the desired recognition candidate; and
The user's selection of one of the recognition candidates from the choice list is responded to by treating the selected candidate as the one or more words and/or numbers corresponding to the word input signal.
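The scrolling and key-based selection of claims 45 through 49 could be sketched, for illustration only, as a window over the choice list. The class name `ScrollableChoiceList`, the default page size of six, and the clamping behavior at the ends of the list are assumptions of this sketch.

```python
class ScrollableChoiceList:
    """A window of `page_size` rows over a longer list of candidates."""

    def __init__(self, choices, page_size=6):
        self.choices = list(choices)
        self.page_size = page_size
        self.top = 0  # index of the first visible candidate

    def visible(self):
        return self.choices[self.top:self.top + self.page_size]

    def scroll(self, delta):
        # Clamp so the window never runs past either end of the list.
        max_top = max(0, len(self.choices) - self.page_size)
        self.top = min(max(self.top + delta, 0), max_top)

    def item_down(self):
        self.scroll(1)

    def item_up(self):
        self.scroll(-1)

    def page_down(self):
        self.scroll(self.page_size)

    def page_up(self):
        self.scroll(-self.page_size)

    def select_by_key(self, digit):
        """Candidate shown next to number `digit` (1-based), or None."""
        shown = self.visible()
        if 1 <= digit <= len(shown):
            return shown[digit - 1]
        return None
```

Here `page_down` and `page_up` correspond to the multiple-candidate scroll inputs of claim 47, and `select_by_key` corresponds to the numbered phone key selection of claim 49.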
50. The method according to claim 45, wherein:
Each recognition candidate has a character string associated with it; and
The recognition candidates in the scrollable choice list are ordered by the character ordering of their respective character strings.
51. The method according to claim 45, wherein the recognition candidates in the scrollable choice list are ordered by their recognition scores against the word signal.
52. The method according to claim 45, further comprising responding to user inputs selecting to scroll the choice list left or right by moving the choice list left or right, respectively, relative to the display screen, so as to change the portion of the individual choices in the choice list that is shown on the display screen.
53. A method of performing word recognition, comprising:
Receiving a word input signal containing non-textual user input representing a sequence of one or more words;
Receiving a sequence of one or more filter input signals, each containing non-textual user input representing a sequence of one or more characters;
Responding to the one or more filter input signals by producing a filter representing one or more possible character sequences, each of which is found to have a probability of corresponding to the one or more characters of the filter input signals;
Producing a list of recognition candidates that start with one of the character sequences represented by the filter, which list includes one or more candidates from word recognition of the input signal when one or more such word recognition candidates that start with one of the character sequences represented by the filter have a recognition probability above a certain minimum level;
Producing user-perceivable output representing:
Said list of best-scoring recognition candidates; and
The character sequence represented by said filter that corresponds to the initial characters of one of the best-scoring recognition candidates of the list;
Enabling the user to select one of the recognition candidates from said list and/or to select a character from said filter;
Responding to the selection of one of the recognition candidates from the choice list by treating the selected candidate as the one or more words corresponding to the word input signal;
Responding to the selection of a filter character by displaying a character choice list of the other characters that, in the possible character sequences represented by the filter, correspond to the position of the selected character in the user-perceivable filter;
Enabling the user to select one of the characters in the character choice list;
Responding to the selection of a character in the character choice list by:
Limiting the possible character sequences represented by the filter to those having the chosen character in the selected character position; and
Repeating the production of said list of recognition candidates using the filter as limited by the chosen character.
54. The method according to claim 53, wherein limiting the possible character sequences represented by the filter includes limiting such character sequences to those having the characters, if any, that appear before the selected character in the user-perceivable filter.
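The filter narrowing of claims 53 and 54 could be sketched as follows, for illustration only. The sketch assumes that the filter is represented as a set of possible character strings, here generated from ambiguous telephone key presses; all function names, and the `keep_prefix` flag used to model the additional limitation of claim 54, are hypothetical.

```python
from itertools import product

# Conventional letter groups on telephone keys 2 through 9.
KEY_LETTERS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def sequences_from_keys(keys):
    """All character strings consistent with a sequence of key presses."""
    groups = [KEY_LETTERS[key] for key in keys]
    return {"".join(chars) for chars in product(*groups)}


def char_choices(sequences, displayed, position):
    """Other characters the filter allows at `position`, excluding the
    character currently shown in the displayed filter string."""
    allowed = {s[position] for s in sequences if len(s) > position}
    return sorted(allowed - {displayed[position]})


def limit_filter(sequences, displayed, position, chosen, keep_prefix=True):
    """Keep only sequences with `chosen` at `position`; with keep_prefix,
    also require the characters shown before that position."""
    kept = set()
    for s in sequences:
        if len(s) <= position or s[position] != chosen:
            continue
        if keep_prefix and s[:position] != displayed[:position]:
            continue
        kept.add(s)
    return kept


def candidates_for(sequences, pool):
    """Candidates in `pool` that start with any sequence in the filter."""
    return sorted(w for w in pool if any(w.startswith(s) for s in sequences))
```

After `limit_filter` is applied, `candidates_for` would be called again with the narrowed filter, which corresponds to repeating the production of the candidate list.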
55. The method according to claim 53, wherein:
The production of the list of recognition candidates limits the recognition candidates to those that start with a single character sequence represented by the filter; and
The user-perceivable output representing the candidate list includes that single character sequence as the user-perceivable filter.
56. The method according to claim 53, wherein the production of the list of recognition candidates limits the recognition candidates to those that start with any one of multiple character sequences represented by the filter.
57. The method according to claim 53, wherein:
The filter input signals correspond to a sequence of one or more presses of telephone keys, in which each pressed telephone key has an associated set of letters; and
The responding to the filter input signals produces a filter representing one or more character sequences, in which each such sequence has, for each such key press, a character corresponding to one of the letters of the set associated with the corresponding key.
58. The method according to claim 53, wherein:
The filter input signals correspond to a sequence of one or more utterances, each representing a sequence of one or more letter-indicating inputs; and
The responding to the filter input signals includes performing speech recognition on the sequence of one or more utterances to produce a filter representing one or more character sequences corresponding to the characters recognized from the utterances.
59. A method of performing word recognition, comprising:
Receiving a word input signal containing non-textual user input representing a sequence of one or more words;
Performing word recognition on the input signal to produce a choice list of best-scoring recognition candidates, each of which is a sequence of one or more words and/or numbers found by the recognizer to have a relatively high probability of corresponding to the input signal;
Displaying the choice list on a user-scrollable display screen;
Responding to user inputs selecting to scroll the choice list left or right by moving the choice list left or right, respectively, relative to the display screen, so as to change the portion of the individual choices in the choice list that is shown on the display screen.
60. The method according to claim 59, wherein said method is practiced on a mobile phone, and the user inputs selecting horizontal scrolling are presses of a button or key on the mobile phone.
61. A method of performing word recognition, comprising:
Receiving a word input signal representing one or more words;
Performing word recognition on the signal to produce one or more best-scoring words corresponding to the word input signal;
Providing a user interface that enables a user to select among a plurality of word transformation commands, each of which has a different type of transformation associated with it;
Responding to the user's selection of one of the word transformation commands by using the transformation associated with the selected command to convert a currently selected word into a corresponding but different word, spelled with a different sequence of the letters a through z.
62. The method according to claim 61, wherein at least one of the word transformation commands converts the currently selected word into a different grammatical form.
63. The method according to claim 62, wherein at least one of the word transformation commands converts the currently selected word into a different tense.
64. The method according to claim 62, wherein at least one of the word transformation commands converts the currently selected word into a plural or singular form.
65. The method according to claim 62, wherein at least one of the word transformation commands converts the currently selected word into a possessive or non-possessive form.
66. The method according to claim 61, wherein at least one of the word transformation commands converts the currently selected word into a homonym of the selected word.
67. The method according to claim 61, wherein at least one of the word transformation commands transforms the currently selected word by changing its ending to one of a set of common suffixes.
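The word transformation commands of claims 61 to 67 can be pictured as a table that maps each command to a function on the currently selected word. The sketch below is only illustrative: the command names, the naive English spelling rules, and the tiny `HOMOPHONES` table are all assumptions, not anything specified by the claims.

```python
from typing import Callable

# Hypothetical homonym table; a real system would use a pronunciation lexicon.
HOMOPHONES = {"their": "there", "there": "their", "to": "two", "two": "to"}

def pluralize(word: str) -> str:
    """Naive English pluralization, for illustration only."""
    if word.endswith(("s", "x", "z", "ch", "sh")):
        return word + "es"
    if word.endswith("y") and len(word) > 1 and word[-2] not in "aeiou":
        return word[:-1] + "ies"
    return word + "s"

def possessive(word: str) -> str:
    """Naive possessive form: add an apostrophe, plus 's' unless the word ends in 's'."""
    return word + "'" if word.endswith("s") else word + "'s"

def homophone(word: str) -> str:
    """Swap the word for a listed homonym, or return it unchanged."""
    return HOMOPHONES.get(word, word)

# Each command name selects a different type of transformation.
TRANSFORMS: dict[str, Callable[[str], str]] = {
    "plural": pluralize,
    "possessive": possessive,
    "homophone": homophone,
}

def apply_transform(word: str, command: str) -> str:
    """Apply the transformation associated with the selected command."""
    return TRANSFORMS[command](word)
```

On a phone, each entry of `TRANSFORMS` could be bound to a telephone key, in the manner of claim 69.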
68. The method according to claim 61, wherein:
Word recognition produces a choice list of best-scoring recognition candidates, each consisting of one or more words found by the recognizer to have a relatively high probability of corresponding to the word signal;
The user interface outputs the recognition candidates in the choice list in user-perceivable form; and
The user interface enables the user to select a word from one of the recognition candidates output in the choice list, to select one of the transformation commands to be performed on that selected word, and to have the resulting transformed word produced as the output of the recognizer.
69. The method according to claim 61, wherein:
Word recognition is speech recognition performed on a phone; and
The user interface enables the user to select one of the transformation commands by pressing a telephone key.
70. A method of performing word recognition, comprising:
Receiving a word input signal representing one or more words;
Performing word recognition on the signal to produce one or more best-scoring words corresponding to the word input signal;
Providing a user interface that enables a user to select among a plurality of word transformation commands;
Responding to the user's selection of one of the word transformation commands by converting a currently selected word between an alphabetic representation and a non-alphabetic representation.
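One plausible reading of converting between an alphabetic and a non-alphabetic representation is toggling a spelled-out number and its digit form. The sketch below assumes that reading; the `NUMBER_WORDS` table and the function name are hypothetical.

```python
# Hypothetical table of spelled-out numbers and their digit forms.
NUMBER_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
    "ten": "10", "eleven": "11", "twelve": "12",
}
DIGIT_FORMS = {digits: word for word, digits in NUMBER_WORDS.items()}

def toggle_representation(word: str) -> str:
    """Convert 'seven' to '7' or '7' to 'seven'; return other words unchanged."""
    lowered = word.lower()
    if lowered in NUMBER_WORDS:
        return NUMBER_WORDS[lowered]
    if word in DIGIT_FORMS:
        return DIGIT_FORMS[word]
    return word
```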
71. The method according to claim 70, wherein:
Word recognition produces a choice list of best-scoring recognition candidates, each consisting of one or more words found by the recognizer to have a relatively high probability of corresponding to the signal;
The user interface outputs the recognition candidates in the choice list in user-perceivable form; and
The user interface enables the user to select a word from one of the recognition candidates output in the choice list, to select the transformation that converts between alphabetic and non-alphabetic representations to be performed on that selected word, and to have the resulting transformed word produced as the output of the recognizer.
72. A method of performing word recognition, comprising:
Receiving a word input signal representing one or more words;
Performing word recognition on the signal to produce one or more best-scoring words corresponding to the word input signal;
Providing a user interface that enables a user to select to display a transformation list for a word produced by said recognition;
Responding to the user's selection by producing a choice list of transformed words corresponding to the recognized word;
The user interface enabling the user to select one of the transformed words in the choice list; and
Responding to the selection of a transformed word by producing the selected transformed word as the output of the recognizer.
73. The method according to claim 72, wherein:
The choice list of transformed words is displayed on a user-scrollable display, and the choice list has more transformed words than fit on the display at one time, so that only a sub-portion of the choice list is shown at one time;
Responding to user input selecting to scroll up or down by moving the choice list up or down, respectively, relative to the display, so as to change the portion of the choice list shown on the display.
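Scrolling a choice list that is longer than the display amounts to moving a fixed-size window over the list. The following sketch shows one way to do this; the function names and the clamping behavior at the ends of the list are assumptions.

```python
def scroll(top: int, delta: int, total: int, rows: int) -> int:
    """Move the window's first visible index by delta, clamped to the list bounds."""
    last_top = max(0, total - rows)
    return max(0, min(top + delta, last_top))

def visible_choices(choices: list[str], top: int, rows: int) -> list[str]:
    """Return the sub-portion of the choice list currently shown on the display."""
    return choices[top:top + rows]
```

The same windowing applies to the horizontal scrolling of claim 59 if the window is taken over columns instead of rows.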
74. The method according to claim 72, wherein the user interface:
Places the word output by the recognizer into a text; and
Allows the user to select, from among one or more words in the text, the word for which the choice list of transformations is to be produced.
75. The method according to claim 72, wherein the user interface:
Produces a choice list of best-scoring word candidates from the word recognition; and
Allows the user to select, from among one or more best-scoring words in that choice list, the word for which the choice list of transformations is to be produced.
76. The method according to claim 72, wherein the words in the list of transformed words include one or more homonyms, if any, of the word for which the choice list of transformations is produced.
77. The method according to claim 72, wherein the words in the list of transformed words include one or more different representations, if any, of the word for which the choice list of transformations is produced.
78. The method according to claim 72, wherein the words in the list of transformed words include one or more different grammatical forms, if any, of the word for which the choice list of transformations is produced.
79. A method of performing word recognition, comprising:
Responding to a start-recognition command input from the user by turning on large-vocabulary speech recognition after the command is received, and subsequently turning the large-vocabulary speech recognition off automatically, so that it is not used again until another start-recognition command input is received from the user.
80. The method according to claim 79, wherein the turning off of speech recognition occurs automatically after a given period of time has elapsed.
81. The method according to claim 79, wherein the turning off of speech recognition occurs automatically after detection of the first end of utterance following the turning on of speech recognition.
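The behavior of claims 79 to 81 can be sketched as a small gate that opens on a start command and closes itself either after a timeout or at the first detected end of utterance. The class name, the method names, and the default timeout below are assumptions chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class RecognitionGate:
    """Tracks whether large-vocabulary recognition is currently turned on."""
    timeout_s: float = 5.0   # assumed value; the claims give no specific period
    active: bool = False
    started_at: float = 0.0

    def start_command(self, now: float) -> None:
        """User command input: turn recognition on."""
        self.active = True
        self.started_at = now

    def end_of_utterance(self) -> None:
        """First end of utterance after turn-on: turn recognition off automatically."""
        self.active = False

    def tick(self, now: float) -> None:
        """Turn recognition off automatically once the given period has elapsed."""
        if self.active and now - self.started_at >= self.timeout_s:
            self.active = False
```

Once the gate has closed, audio is ignored until `start_command` is called again, which mirrors the requirement that recognition is not used again until another command input is received.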
82. The method according to claim 79, wherein the command input that causes speech recognition to be turned on is a non-acoustic input.
83. The method according to claim 82, wherein speech recognition is turned off in response to the next end-of-utterance detection performed by the speech recognition, and is not used again until the next non-acoustic user input starts recognition.
84. The method according to claim 83, wherein the speech recognition is continuous speech recognition.
85. The method according to claim 83, wherein the speech recognition is discrete speech recognition.
86. The method according to claim 83, further comprising:
Outputting a user-perceivable representation of the one or more words recognized as the best choice for the utterance before the end-of-utterance detection;
Providing a user interface that allows the user to provide correction input to correct errors in the best choice output in response to the recognition of the utterance;
Responding to receipt of a start-recognition command input, after the best choice recognized for the utterance has been output and before any correction input for said best choice has been received, by confirming the best choice as correct and by repeating said method for a new utterance.
87. The method according to claim 86, further comprising responding to such a confirmation of the utterance by including the one or more recognized words as part of the current context used to calculate language model scores for subsequent speech recognition.
88. The method according to claim 86, further comprising responding to such a confirmation of the utterance by using the one or more recognized words as data for altering the language model.
89. The method according to claim 86, further comprising responding to such a confirmation of the utterance by labeling acoustic data from the utterance as corresponding to a given recognized word, for use in updating one or more acoustic models used in the recognition of that word.
90. The method according to claim 83, further comprising allowing the user to select between a first mode, in which recognition is turned off after the next end of utterance detected after receipt of a non-acoustic input, and a second mode, in which recognition is not turned off after the next end-of-utterance detection.
91. The method according to claim 90, wherein, in said second mode, recognition is turned off automatically in response to the passage of a period of time longer than the time that normally elapses between utterances in conversation.
92. The method according to claim 83, wherein:
Said method is performed by software running on a handheld computing device; and
The non-acoustic input is the pressing of a button, including a graphical user interface button.
93. The method according to claim 92, wherein:
The handheld computing device is a mobile phone; and
The button is a mobile phone button.
94. The method according to claim 83, wherein said method is performed by software running on a computer that is part of an automobile.
95. The method according to claim 82, wherein:
The start-recognition command input is the pressing of a hardware or software button; and
Recognition is turned off automatically less than one second after the pressing of the button stops.
96. The method according to claim 82, wherein:
Said method provides a user interface having a plurality of speech mode selection buttons, each available for selection by the user at the same time and each selecting a different speech recognition mode;
The non-acoustic input that causes speech recognition to be turned on is the pressing of one of said buttons;
Said method responds to the pressing of a speech mode button by turning on speech recognition in the mode associated with that button and subsequently turning said recognition off automatically.
97. The method according to claim 96, wherein:
The speech recognition mode associated with one of said buttons is said large-vocabulary recognition;
The recognition mode associated with another of said buttons is a mode that performs recognition with an alphabetic-entry vocabulary.
98. The method according to claim 96, wherein:
The speech recognition mode associated with one of said buttons is continuous recognition;
The recognition mode associated with another of said buttons is discrete recognition.
99. The method according to claim 96, wherein:
The handheld computing device is a mobile phone; and
The button is a mobile phone button.
100. the method for a speech recognition, comprising:
User interface is provided, and this user interface provides response ratio as short sustained touch of first duration of clicking with such as with pushing the button of the sustained touch of the second such longer duration;
Push by in duration, making speech recognition on sound, finish to respond along with the compressing time length variations; And
Respond click by in the duration that has nothing to do in length speech recognition being finished on sound with click.
101. The method according to claim 100, wherein:
Said responding to a click causes speech recognition to be performed on sound received from substantially the time of the click until the next detected end of utterance; and
Said responding to a press causes speech recognition to be performed on sound received during the press.
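The click and press behavior of claims 100 and 101 can be sketched as a function that classifies a touch by its duration and returns the interval of sound to recognize. The threshold `CLICK_MAX_S` and the function name are assumptions; the claims do not state how long a click may last.

```python
CLICK_MAX_S = 0.3  # assumed boundary between a click and a press, in seconds

def recognition_window(touch_down: float, touch_up: float,
                       next_utterance_end: float) -> tuple[str, float, float]:
    """Return (touch kind, start time, end time) of the sound to be recognized."""
    if touch_up - touch_down <= CLICK_MAX_S:
        # Click: recognize from the click until the next detected end of utterance,
        # regardless of how long the click itself lasted.
        return ("click", touch_down, next_utterance_end)
    # Press: recognize only the sound received while the button is held down.
    return ("press", touch_down, touch_up)
```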
102. The method according to claim 101, wherein the recognition performed in response to a click is discrete recognition, and the recognition performed in response to a press is continuous recognition.
103. The method according to claim 102, wherein the user interface allows the user to select between two modes: one in which the recognition performed in response to a click and the recognition performed in response to a press are both continuous or both discrete, and another in which the recognition performed in response to a click is discrete and the recognition performed in response to a press is continuous.
104. The method according to claim 100, wherein:
Said responding to a click causes speech recognition to be performed on sound received during a period of at least one minute starting substantially from the time of the click; and
Said responding to a press causes speech recognition to be performed on sound received during the press and for at most one second thereafter.
105. The method according to claim 100, wherein:
The user interface has a plurality of speech mode selection buttons, each available for selection by the user at the same time and each selecting a different speech recognition mode;
The user interface responds, for each mode selection button, both to a touch of the first, shorter duration, such as a click, and to a touch of the second, longer duration, such as a press;
Said method responds to a press of a mode button by causing speech recognition to be performed on sound, in the mode associated with the button, for a duration that varies with the length of the press; and
Said method responds to a click of a mode button by causing speech recognition to be performed on sound, in the mode associated with the button, for a duration that is independent of the length of the click.
106. The method according to claim 105, wherein:
The recognition mode associated with a first of said mode buttons is a mode that performs recognition with a large vocabulary; and
The recognition mode associated with a second of said mode buttons is a mode that performs recognition with an alphabetic-entry vocabulary.
107. The method according to claim 105, wherein:
The speech recognition mode associated with one of said mode buttons is continuous recognition; and
The recognition mode associated with another of said mode buttons is discrete recognition.
108. The method according to claim 105, wherein:
Said method is practiced on a mobile phone; and
Numbered mobile phone keys serve as said mode buttons.
109. A computing device that functions as a phone, comprising:
A user-perceivable output device;
A set of telephone keys including at least a standard twelve-key telephone keypad;
One or more microprocessors;
Memory readable by the microprocessors;
A microphone or audio input from which said phone can receive an electronic representation of sound;
A speaker or audio output for converting an electronic representation of sound produced in said phone into corresponding sound;
Transmitting and receiving circuitry;
Programming recorded in the memory, comprising:
Telephone programming having instructions for performing telephone functions including making and receiving calls; and
Speech recognition programming including instructions for performing large-vocabulary speech recognition on electronic representations of sound received from the microphone or audio input, and instructions for responding to the pressing of one or more telephone keys by controlling the operation of the speech recognition.
110. The computing device according to claim 109, wherein the device is a mobile phone.
111. The computing device according to claim 109, wherein the device is a cordless phone.
112. The computing device according to claim 109, wherein the device is a wired (landline) phone.
113. The computing device according to claim 109, wherein the speech recognition programming includes:
Instructions for responding to a given utterance by performing speech recognition to produce a choice list of best-scoring speech recognition candidates, each consisting of one or more words found by the recognizer to have a relatively high probability of corresponding to the given utterance or part of the utterance;
Instructions for producing user-perceivable output that indicates a plurality of the choice list candidates and associates a separate telephone key with each such candidate; and
Instructions for responding to the pressing of a telephone key associated with a choice list candidate by selecting the associated candidate as the output for the given utterance.
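A minimal sketch of associating telephone keys with choice list candidates follows. Using the digits 1 to 9 in list order is an assumption (claim 114 only says numbered keys are used), and the function names are hypothetical.

```python
from typing import Optional

def assign_keys(candidates: list[str], keys: str = "123456789") -> dict[str, str]:
    """Associate a separate numbered key with each displayed candidate, in list order."""
    return dict(zip(keys, candidates))

def select_by_key(assignment: dict[str, str], pressed: str) -> Optional[str]:
    """Return the candidate bound to the pressed key, or None if the key is unbound."""
    return assignment.get(pressed)
```

Keys left unbound here remain available for other speech recognition functions, as claim 115 contemplates.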
114. The computing device according to claim 113, wherein the speech recognition programming includes instructions for using a plurality of numbered telephone keys as said telephone keys associated with choice list candidates.
115. The computing device according to claim 114, wherein the speech recognition programming includes instructions for using other numbered telephone keys for other speech recognition functions while some numbered telephone keys are associated with choice list candidates.
116. The computing device according to claim 113, wherein the speech recognition programming includes:
Instructions for operating in a first mode that responds to the pressing of each key in a set of telephone keys by selecting an associated choice list candidate; and
Instructions for operating in a second mode that responds to the pressing of each key in the same set of telephone keys as letter-identifying input.
117. The computing device according to claim 116, wherein the speech recognition programming includes instructions for using said letter-identifying input for alphabetic filtering of the choice list.
118. The computing device according to claim 109, wherein the speech recognition programming includes:
Instructions for responding to the recognition of a given utterance by producing a recognition output corresponding to a sequence of one or more recognized words;
Instructions for placing the recognition output at a current cursor position in a text sequence stored in the memory, which text sequence may previously contain zero or more words; and
Instructions for responding to the pressing of different ones of the telephone keys by moving the cursor position forward and backward, respectively, in the text sequence.
119. The computing device according to claim 118, wherein the instructions for moving the current text position include instructions for responding to the pressing of one of two word-at-a-time movement telephone keys (one associated with forward word movement and the other with backward word movement) by moving the current text position forward or backward, respectively, one whole word at a time.
120. The computing device according to claim 119, wherein the instructions for moving the current text position forward and backward one whole word at a time include:
Instructions for responding, under a first condition, to the pressing of the key associated with forward or backward word movement by selecting the whole word after or before the previous cursor position, respectively; and
Instructions for responding, under a second condition, to the pressing of the key associated with forward or backward word movement by placing a non-selection cursor immediately after or before the previous cursor position, respectively;
Whereby the same two keys can be used both to move through the text one word at a time and to choose between a cursor that selects a whole word and a non-selection cursor placed before or after a word.
121. The computing device according to claim 120, wherein said second condition includes receiving, as the next input after the pressing of one of said two word-at-a-time movement keys, the pressing of the other of said word-at-a-time movement keys.
122. The computing device according to claim 118, wherein:
The user-perceivable output device is a display;
The speech recognition programming includes instructions for displaying all or part of the text sequence across multiple successive lines on the display; and
The instructions for moving the current text position include instructions for responding to the pressing of different ones of the telephone keys by moving the current text position up one line or down one line, respectively.
123. The computing device according to claim 118, wherein the instructions for moving the current text position include instructions for responding to the pressing of different ones of the telephone keys by moving the current text position to the beginning or the end, respectively, of a word sequence comprising all or part of the words in the text sequence.
124. The computing device according to claim 118, wherein the speech recognition programming includes:
Instructions for responding to the pressing of one telephone key by starting an extendable selection at the current text position; and
Instructions for responding to the pressing of different ones of the telephone keys associated with moving the current text position forward and backward by extending the selection forward or backward, respectively, by the amount associated with that key.
125. The computing device according to claim 118, wherein the programming includes instructions for responding to the pressing of one of the telephone keys that moves the current text position by using text-to-speech programming to produce speech output of one or more words at the current text position after the move.
126. The computing device according to claim 118, wherein:
The user-perceivable output device is a display;
The speech recognition programming includes instructions for responding to the pressing of one of the telephone keys that moves the current text position by showing on the display one or more words at the current position after the move.
127. The computing device according to claim 109, wherein the speech recognition programming includes instructions for responding to a selection of a given one of the telephone keys by entering a help mode, which responds to a subsequent pressing of a telephone key by providing, in user-perceivable form, an explanation of the function that was associated with the subsequently pressed key before the help mode was entered.
128. The computing device according to claim 127, wherein:
The instructions for responding to the pressing of one or more telephone keys to control the operation of the speech recognition define a hierarchical command structure, in which the user can navigate and select commands by sequences of one or more telephone key presses; and
The instructions for entering the help mode include instructions for responding to each key press in a sequence of two or more key presses made after entering said help mode by providing, in user-perceivable form, an explanation of the function that key press would have had in the hierarchical command structure if the same key press sequence had been entered before the help mode was entered.
129. The computing device according to claim 109, wherein the speech recognition programming includes instructions for responding to the pressing of a first telephone key by outputting a user-perceivable list indicating the function currently associated with each of a plurality of individual telephone keys.
130. The computing device according to claim 129, wherein the user-perceivable output includes generating an audio signal that speaks the list of function indications.
131. The computing device according to claim 129, wherein:
The telephone keys include said first key and a set of one or more navigation keys; and
The speech recognition programming includes instructions for operating in a text mode, in which:
The navigation keys allow user-perceivable navigation through text entered by recognition;
Other telephone keys each have mapped to them one of a set of functions for controlling the entry and editing of said text; and
The pressing of the first key is responded to by entering a command list mode, in which the navigation keys allow user-perceivable navigation through the list of functions associated with each of the plurality of telephone keys in the text mode.
132. The computing device according to claim 131, wherein:
The user-perceivable function list of the command list mode associates telephone key numbers with a plurality of the functions in the list; and
The speech recognition programming includes instructions for responding, during operation in the command list mode, to the pressing of a numbered telephone key associated with a function in said list by returning to the text mode and selecting the associated function.
133. The computing device according to claim 131, wherein:
The speech recognition programming includes the following instructions for use in the command list mode:
Instructions for responding to one or more presses of the navigation keys by moving a function selection relative to the user-perceivable function list; and
Instructions for responding to the pressing of a selection telephone key by returning to the text mode and selecting the currently selected function.
134. The computing device according to claim 133, wherein the command list includes functions, in addition to those that can be selected in the text mode by the pressing of a telephone key, that can be selected in the command list mode by said navigation and selection.
135. The computing device according to claim 133, wherein:
The command list lists the functions associated with the navigation keys in the text mode;
Said text mode navigation key functions differ from those associated with the navigation keys in the command list mode; and
The text mode navigation key functions can be selected in the command list mode by said navigation and selection.
136. The computing device according to claim 131, wherein:
Said telephone keys include a menu key;
Said programming recorded in the memory includes instructions for responding to the pressing of the menu key, in each of a plurality of modes other than said text mode, by displaying a list of functions that can be selected with telephone keys but that could not be selected with the same telephone keys before the menu key was pressed; and
Said first key, used to select the command list mode in said text mode, is the menu key.
137. The computing device according to claim 109, wherein the speech recognition programming includes instructions for operating in a text mode in which:
Navigation keys allow user-perceivable navigation through recognized text;
A plurality of numbered telephone keys act as key mapping keys, each of which selects a different key mapping mode that maps a different set of functions onto a plurality of the numbered telephone keys;
Whereby the user can quickly select a desired key mapping from among a plurality of such mappings by pressing a numbered telephone key, greatly increasing the speed with which the user can select a command from among a relatively large number of commands available from the text mode.
138. The computing device according to claim 137, wherein the speech recognition programming includes instructions for responding to the pressing of one of said key mapping keys by entering a menu mode, associated with the key mapping mode of the pressed mapping key, in which the navigation keys allow user-perceivable navigation through a menu that indicates the function associated with each of a plurality of the numbered telephone keys.
139. A method of performing large-vocabulary speech recognition, comprising:
Receiving a filter sequence of one or more key-press signals, each of which indicates which of a plurality of keys the user has selected, wherein each key represents two or more letters;
Receiving an acoustic representation of a sound;
Performing speech recognition on the acoustic representation that scores word candidates as a function of the match between the acoustic representation of the sound and acoustic models of the words;
Wherein:
The scoring of word candidates favors word candidates containing a sequence of one or more alphabetic characters corresponding to the key-press filter sequence, a word candidate being considered to contain a character string corresponding to the filter sequence if each successive character in the string corresponds to one of the letters represented by its corresponding successive key-press signal.
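The favoring of word candidates that agree with a key-press filter sequence can be sketched as a score adjustment. The example below assumes that the filter applies to the leading letters of each candidate, that higher scores are better, and that favoring is an additive bonus; the keypad table, the function names, and the bonus value are all illustrative assumptions.

```python
# Assumed letter groups of a standard phone keypad (illustrative only).
KEY_LETTERS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

def matches_key_filter(word: str, keys: str) -> bool:
    """True if each leading letter of the word is one of the letters on the matching key."""
    if len(word) < len(keys):
        return False
    return all(ch in KEY_LETTERS.get(key, "") for ch, key in zip(word.lower(), keys))

def favor_filtered(acoustic_scores: dict[str, float], keys: str,
                   bonus: float = 5.0) -> list[str]:
    """Rank candidates, adding a bonus to those consistent with the filter sequence."""
    adjusted = {
        word: score + (bonus if matches_key_filter(word, keys) else 0.0)
        for word, score in acoustic_scores.items()
    }
    return sorted(adjusted, key=adjusted.get, reverse=True)
```

Applying the same check to an existing candidate list corresponds to the variant of claim 144, while using it inside a second recognition pass corresponds to claim 145.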
140. The method according to claim 139, further comprising:
Responding to an additional utterance made in association with a given key-press signal in said filter sequence by performing speech recognition on that utterance as a letter-identifying word; and
Limiting the set of letters represented by that key press in the filter sequence to the letter identified by the letter-identifying word recognized for the associated utterance.
141. The method according to claim 140, further comprising:
Responding to a key-press signal by displaying, in user-perceivable form, a set of letter-identifying words, each of which starts with one of the letters represented by the pressed key; and
Favoring, in the recognition of an utterance made after the display, the letter-identifying words that correspond to the displayed words and are associated with the pressed key.
142. The method of claim 139, further comprising providing a user interface that:
Outputs, in user-perceptible form, a plurality of the word candidates produced by said speech recognition in a choice list;
Allows the user to select one of the output candidates as the desired word; and
Responds to the user's selection of one of the output candidates by treating it as the recognized word.
143. according to the method for claim 139, wherein said receiving filtration sequence and describedly finish reception that the speech recognition of supporting to comprise with the candidate item of the corresponding character of filter sequence can respond pressing keys signal continuous in the described filtration sequence and express at given acoustics and repeatedly finished.
144. according to the method for claim 139, wherein preferentially keeping the score of word candidates is by from before finishing by selecting those to comprise with the candidate item of the sequence of the corresponding one or more characters of filtration sequence the selected word candidates of recognizer.
145. according to the method for claim 139, wherein preferentially keeping the score of word candidates is to finish by finish speech recognition for the second time on acoustics is expressed during the word candidates of the sequence of the corresponding one or more characters of filtration sequence that comprise and receive is supported.
146. according to the method for claim 139, wherein the sequence of pressing keys signal be before finishing the initial identification that acoustics expresses, receive and also the alphabetical volume of word candidates regulate and during initial identification, finish.
147. The method of claim 139, wherein said method is performed by software running on a telephone, and the keys are keys of the telephone keypad.
148. The method of claim 139, wherein the telephone is a mobile phone.
149. The method of claim 139, wherein the favored scoring of word candidates is performed by performing speech recognition on an acoustic representation of a second utterance of the desired word, in which word candidates containing a sequence of one or more characters corresponding to the received filter sequence are favored.
150. The method of claim 149, wherein the favored scoring of word candidates is performed by scoring word candidates against both the original utterance and the second utterance of the desired word.
151. The method of claim 139, wherein the favored scoring of word candidates not only favors word candidates containing a sequence of one or more alphabetic characters corresponding to the filter sequence, but also favors word candidates according to their language model scores.
152. The method of claim 151, wherein the language model used in combination with such a filter sequence in the scoring of word candidates is a context-dependent language model.
153. A method of performing large-vocabulary speech recognition, comprising:
Receiving a key-press sequence of one or more telephone key-press signals, each of which indicates which of a plurality of keys the user has selected;
Decoding the key-press sequence by using the number of times a given key is pressed, within a given time of each other, to select which of the plurality of letters associated with the given key is the desired letter;
Storing the sequence of one or more letters decoded from said key-press sequence as a letter filter sequence;
Receiving an acoustic representation of an utterance;
Performing speech recognition on the acoustic representation, which scores word candidates as a function of the match between the acoustic representation of the utterance and acoustic models of words;
Wherein:
The scoring of word candidates favors word candidates containing a sequence of one or more alphabetic characters corresponding to the letters of said letter filter sequence.
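The key-press decoding step of claim 153 (often called multi-tap entry) can be sketched as follows in Python. The keypad layout, the `timeout` value of one second, the representation of presses as `(key, timestamp)` pairs, and the wrap-around when a key is pressed more times than it has letters are assumptions for illustration only.

```python
# Hypothetical keypad layout (the common telephone letter assignment).
KEY_LETTERS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def decode_multitap(presses, timeout=1.0):
    """Decode a time-ordered list of (key, timestamp_seconds) presses.

    Consecutive presses of the same key, each within `timeout` seconds
    of the previous press, are counted together; the count selects which
    of the key's letters is meant."""
    letters = []
    i = 0
    while i < len(presses):
        key, _ = presses[i]
        count = 1
        while (
            i + count < len(presses)
            and presses[i + count][0] == key
            and presses[i + count][1] - presses[i + count - 1][1] <= timeout
        ):
            count += 1
        options = KEY_LETTERS[key]
        # Assumed behavior: extra presses wrap around to the first letter.
        letters.append(options[(count - 1) % len(options)])
        i += count
    return "".join(letters)
```

Under these assumptions, three quick presses of key 2, a pause, one press of key 2, and one press of key 8 decode to the letter filter sequence "cat", which can then be used to favor matching word candidates during recognition.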
154. A method of performing large-vocabulary speech recognition to enter a sequence of one or more alphabetic characters, comprising:
Pressing a sequence of one or more selected telephone keys, each of which represents two or more letters;
Uttering a corresponding sequence of one or more letter-identifying words;
Performing speech recognition on the utterance of each letter-identifying word, wherein the recognition of each such utterance favors the recognition of a letter-identifying word identifying one of the two or more letters represented by the telephone key associated with the utterance; and
Treating the sequence of one or more letters identified by the letter-identifying words associated with each telephone key press as a letter sequence input by the user.
155. The method of claim 154, wherein:
Said method is used in combination with a large-vocabulary recognition system; and
Most of the words in the vocabulary of the large-vocabulary recognition system that start with a given letter can be used as letter-identifying words for that letter.
156. The method of claim 154, wherein:
The letter-identifying words associated with each of most of the letters belong to a limited set of five or fewer letter-identifying words that start with that given letter; and
The recognition of the utterance of a letter-identifying word favors the recognition of one of the limited set of letter-identifying words identifying one of the two or more letters represented by the telephone key associated with the utterance.
157. The method of claim 156, further comprising:
Responding to a key-press signal by displaying, in user-perceptible form, a set of letter-identifying words including one or more words that start with each of the letters represented by the pressed key; and
Favoring, in the recognition of the utterance made after the display, the letter-identifying words associated with the pressed key that correspond to said displayed words.
158. The method of claim 156, wherein:
Said method is performed on a telephone having a display screen; and
The output of the subset of letter-identifying words is performed by displaying such words on the display screen of the telephone.
159. A method of performing large-vocabulary speech recognition on a device having telephone keys, said method comprising:
Performing large-vocabulary speech recognition on one or more utterances to produce corresponding output text containing one or more words recognized by said speech recognition;
Receiving a sequence of one or more telephone key-press signals and interpreting said key-press sequence as corresponding to a sequence of one or more alphabetic characters; and
Outputting said sequence of one or more alphabetic characters into said output text.
160. The method of claim 159, wherein the telephone is a mobile phone.
161. The method of claim 159, wherein:
The sequence of one or more key-press signals is automatically treated by the program as having multiple interpretations, in the sense that each key-press signal represents two or more letters; and
Information from sources other than such key presses is used to select which of the one or more letters associated with each key press in the sequence is to be interpreted as corresponding to that key press.
162. The method of claim 161, wherein the information from sources other than the key presses includes language model information.
163. The method of claim 162, wherein the information from sources other than the key presses includes context-dependent language model information.
164. The method of claim 159:
Wherein the sequence of one or more key-press signals is automatically treated by the program as having multiple interpretations, in the sense that each key-press signal represents two or more letters; and
Further comprising:
Outputting, in user-perceptible form in a choice list, a plurality of word candidates whose spellings correspond to the key-press signals;
Allowing the user to select one of the output candidates as the desired word; and
Responding to the user's selection of one of the output candidates by treating it as the recognized word.
165. The method of claim 159, wherein the interpretation of the key-press sequence includes decoding the key-press sequence by using the number of times a given key is pressed, within a given time of each other, to select which of the letters associated with the given key is the desired letter.
166. A method of speech recognition, comprising:
Receiving an original utterance of one or more words;
Performing an initial speech recognition on the original utterance;
Producing user-perceptible output representing one or more sequences of one or more words selected by the recognition as most likely to correspond to the utterance;
Providing a user interface that allows the user to select to perform re-utterance recognition on the portion of the original utterance corresponding to all or a selected part of the user-perceptible output; and
Responding to the user's selection to perform re-utterance recognition on all or part of the original utterance by:
Treating a received re-utterance associated with the selection as a re-utterance of the selected portion of the original utterance; and
Performing speech recognition on the re-utterance so as to select one or more sequences of one or more words most likely to match the re-utterance, based on the scoring of the one or more words against both the re-utterance and the selected portion of the original utterance.
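One way to picture the final step of claim 166 is to combine, for each candidate word sequence, its score against the original utterance with its score against the re-utterance. The Python sketch below assumes that scores are log-likelihood-style numbers (higher is better), that they are combined by simple addition, and that a candidate missing from one score table receives a large penalty `floor`; none of these choices is stated in the claim.

```python
def combine_utterance_scores(original_scores, reutterance_scores, floor=-1e9):
    """Rank candidates by the sum of their scores against the selected
    portion of the original utterance and against the re-utterance.

    Both arguments map candidate text to a score (higher is better).
    A candidate missing from either table is given `floor` for it."""
    candidates = set(original_scores) | set(reutterance_scores)
    combined = {
        text: original_scores.get(text, floor) + reutterance_scores.get(text, floor)
        for text in candidates
    }
    return sorted(combined.items(), key=lambda pair: pair[1], reverse=True)
```

With this scheme a candidate that scored slightly worse on the original utterance can still win if it matches the re-utterance much better, which is the effect the claim describes.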
167. The method of claim 166, wherein:
The initial recognition of the original utterance is by continuous speech recognition; and
The re-utterance is recognized by discrete speech recognition.
168. The method of claim 167, wherein the number of utterances detected in the re-utterance by discrete recognition is used to determine the number of words allowed in the sequence of one or more words recognized for the original utterance after the re-utterance.
169. The method of claim 166, wherein both the original utterance and the re-utterance are recognized by discrete speech recognition.
170. The method of claim 166, wherein both the original utterance and the re-utterance are recognized by continuous speech recognition.
171. The method of claim 166, wherein the selection of the sequence of one or more words that appears most likely to match both the re-utterance and the selected portion of the original utterance is used to update acoustic models with data from the selected portion of the original utterance.
172. The method of claim 166, wherein:
The user interface allows the user to select one or more word filter inputs, each indicating that the desired output has a certain feature, for use in combination with the re-utterance recognition; and
The selecting of one or more sequences as most likely to match the re-utterance and the original utterance also uses the selected filter input to favor the selection of any recognition candidate having the selected feature.
173. The method of claim 172, wherein the user interface allows the user to select an alphabetic filter input indicating that the desired output includes a word containing a sequence of one or more specified letters.
174. A computing device for performing large-vocabulary speech recognition, comprising:
Microprocessor-readable memory;
A microphone or audio input for providing an electronic signal representing an utterance to be recognized;
A speaker or audio output for enabling an electronic representation of sound produced in said phone to be converted into corresponding sound;
Programs recorded in the memory, including a speech recognition program, the programs comprising:
A speech recognition program for performing large-vocabulary speech recognition that responds to electronic representations of a sequence of one or more utterances received from the microphone or audio input by producing text output corresponding to the one or more words recognized as corresponding to the utterances; and
A TTS program for providing TTS output to the speaker or audio output that speaks one or more words of said text recognized from the utterances by the speech recognition;
Shared speech model data, stored in the memory, used both by said speech recognition program to recognize words corresponding to spoken utterances and by the TTS program to generate sounds corresponding to the speech of a sequence of one or more words.
175. The computing device of claim 174, wherein said shared speech model data include letter-to-sound rules.
176. The computing device of claim 174, wherein said shared speech model data include a mapping between words and one or more phonetic spellings for each of at least several thousand vocabulary words.
177. The computing device of claim 176, wherein said mapping includes an indication of the different phonetic spellings appropriate for certain words when they occur as different parts of speech.
178. The computing device of claim 177, wherein said shared speech model data include language model information indicating which parts of speech of one or more words are more likely to occur in a given linguistic context.
179. The computing device of claim 174, wherein the device is a handheld device.
180. The computing device of claim 179, wherein the device is a mobile phone.
181. A computing device for performing large-vocabulary speech recognition, comprising:
Microprocessor-readable memory;
A microphone or audio input for providing an electronic signal representing an utterance to be recognized;
A speaker or audio output for enabling an electronic representation of sound produced in said phone to be converted into corresponding sound;
Programs recorded in the memory, including a speech recognition program, wherein the speech recognition program comprises:
Instructions for performing large-vocabulary speech recognition on electronic representations of utterances received from the microphone or audio input to produce text output;
Instructions for providing TTS output to the speaker or audio output that speaks one or more words of said text output;
Instructions for recognizing utterances as voice commands;
Instructions for providing TTS or recorded audio output to said speaker or audio output that speaks the name of the recognized command.
182. The computing device of claim 181, wherein the device is a handheld device.
183. The computing device of claim 182, wherein the device is a mobile phone.
184. A computing device for performing large-vocabulary speech recognition, comprising:
Microprocessor-readable memory;
A microphone or audio input for providing an electronic signal representing an utterance to be recognized;
A speaker or audio output for enabling an electronic representation of sound produced in said phone to be converted into corresponding sound;
Programs recorded in the memory, including a speech recognition program, wherein the speech recognition program comprises instructions for performing large-vocabulary speech recognition that responds to the electronic representation of each of a sequence of one or more utterances received from the microphone or audio input by:
Producing text output corresponding to the one or more words recognized as corresponding to the utterance; and then
Providing TTS output to said speaker or audio output that speaks the one or more words of said text recognized from the utterance by the speech recognition.
185. The computing device of claim 184, wherein said speech recognition is discrete speech recognition, and said TTS output speaks the text word recognized for each utterance.
186. The computing device of claim 184, wherein the speech recognition is continuous speech recognition, and said TTS output speaks the one or more text words recognized for each utterance after the utterance ends.
187. The computing device of claim 184, wherein the device is a handheld device.
188. The computing device of claim 187, wherein the device is a mobile phone.
189. A computing device for performing large-vocabulary speech recognition, comprising:
Microprocessor-readable memory;
A microphone or audio input for providing an electronic signal representing an utterance to be recognized;
A speaker or audio output for enabling an electronic representation of sound produced in said phone to be converted into corresponding sound;
Programs recorded in the memory, including a speech recognition program comprising:
Instructions for performing large-vocabulary speech recognition on electronic representations of utterances received from the microphone or audio input to produce text output;
Instructions for responding to text navigation commands by moving a cursor backward and forward among the one or more words of said text output;
Instructions for responding to each movement made in response to one of said navigation commands by providing TTS output to said speaker or audio output that speaks one or more words starting or ending at the position of the cursor after said movement.
190. The computing device of claim 189, wherein said program further comprises instructions for responding to a selection extension command by:
Recording the cursor position at the time the command is received as a selection start;
Starting a selection at the selection start; and
Entering a selection extension mode, in which the response to one of said navigation commands further includes extending the selection from the selection start to the cursor position after the cursor is moved in response to said navigation command.
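The selection extension behavior of claim 190 amounts to a small state machine: remember an anchor when the command arrives, then let every subsequent cursor movement redefine the selection as the span between the anchor and the new cursor position. The Python sketch below uses a hypothetical `TextCursor` class with word-level positions; the class, its method names, and the word granularity are assumptions made for illustration.

```python
class TextCursor:
    """Word-level cursor over recognized text with an optional selection."""

    def __init__(self, words):
        self.words = list(words)
        self.pos = 0          # cursor sits before self.words[self.pos]
        self.anchor = None    # selection start; None when not extending

    def start_selection(self):
        """Record the current cursor position as the selection start and
        enter selection extension mode."""
        self.anchor = self.pos

    def move(self, delta):
        """Move the cursor by `delta` words (negative moves backward),
        clamped to the text, and return the current selection."""
        self.pos = max(0, min(len(self.words), self.pos + delta))
        return self.selection()

    def selection(self):
        """Words between the selection start and the cursor, if any."""
        if self.anchor is None:
            return []
        low, high = sorted((self.anchor, self.pos))
        return self.words[low:high]
```

In a device of the kind claimed, the list returned by `selection()` is what a play-selection command would hand to the TTS output.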
191. The computing device of claim 190, wherein said program further comprises instructions for responding to a play-selection command by providing TTS output to said speaker or audio output that speaks the one or more words currently in the selection.
192. The computing device of claim 189, wherein said speaking of one or more words starts speaking words of said text beginning at the current cursor position and continues speaking them until the end of a text unit larger than a word is reached or until user input ending such playback is received.
193. The computing device of claim 189, wherein the device is a handheld device.
194. The computing device of claim 193, wherein the device is a mobile phone.
195. A computing device for performing large-vocabulary speech recognition, comprising:
Microprocessor-readable memory;
A microphone or audio input for providing an electronic signal representing an utterance to be recognized;
A speaker or audio output for enabling an electronic representation of sound produced in said phone to be converted into corresponding sound;
Programs recorded in the memory, including a speech recognition program, wherein the speech recognition program comprises:
Instructions for performing large-vocabulary speech recognition on the electronic representation of an utterance received from the microphone or audio input to produce a choice list of recognition candidates, each consisting of a sequence of one or more words selected by the recognition as scoring best against said utterance; and
Instructions for providing spoken output to said speaker or audio output that speaks the one or more words of one of the recognition candidates in the choice list.
196. The computing device of claim 195, wherein said program comprises:
Instructions for responding to choice navigation commands by moving which recognition candidate in the choice list is currently selected; and
Instructions for responding to each movement made in response to one of said navigation commands by providing spoken output that speaks the one or more words of the currently selected recognition candidate.
197. The computing device of claim 195, wherein:
Said spoken output speaks the words of a plurality of recognition candidates in said list and includes a spoken indication of the choice input signal associated with each of said plurality of candidates; and
Said program further comprises instructions for responding to the receipt of one of said choice input signals by selecting the associated recognition candidate as the output for said utterance.
198. The computing device of claim 197, wherein:
Said device has a telephone keypad;
Said choice input signals include telephone key numbers; and
Said responding to the receipt of one of said choice input signals includes responding to the pressing of a numbered telephone key as a choice input signal.
199. The computing device of claim 197, wherein said spoken output speaks the best-scoring recognition candidate first.
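Claims 197 to 199 together describe a spoken choice list in which each candidate is announced with the number key that selects it, best-scoring candidate first. A minimal Python sketch of that idea follows; the function name, the limit of nine choices (one per key 1 to 9), and the wording of the prompt string handed to a text-to-speech engine are assumptions, not details taken from the claims.

```python
def build_spoken_choice_list(candidates, max_choices=9):
    """Given (text, score) recognition candidates (higher score is
    better), return a mapping from number key to candidate text and a
    prompt string listing each key with its candidate, best first."""
    ordered = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    ordered = ordered[:max_choices]
    key_map = {str(i + 1): text for i, (text, _) in enumerate(ordered)}
    prompt = ". ".join(f"{key}, {text}" for key, text in key_map.items())
    return key_map, prompt
```

A key press is then resolved with a dictionary lookup such as `key_map.get(pressed_key)`, and the returned text is treated as the output for the utterance.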
200. The computing device of claim 195, wherein said program comprises instructions for responding to the receipt of filter input by producing a filtered choice list of recognition candidates, each consisting of a sequence of one or more words that is consistent with said filter input and that is selected by the recognition as scoring best against said utterance, and by providing spoken output to said speaker or audio output that speaks the one or more words of one of the recognition candidates in the filtered choice list.
201. The computing device of claim 200, wherein said program further comprises instructions for providing spoken output that speaks the current value of the filter.
202. The computing device of claim 201, wherein the filter input is a sequence of letters, and the spoken output speaks the letters in the filter sequence.
203. The computing device of claim 195, wherein the spoken output includes the spelling of one or more of the choices.
204. The computing device of claim 195, wherein the device is a handheld device.
205. The computing device of claim 204, wherein the device is a mobile phone.
206. A method of word recognition, comprising:
Receiving a handwritten representation of all or part of a given sequence of one or more words to be recognized;
Receiving a spoken representation of said sequence of one or more words;
Performing speech recognition on the spoken representation and handwriting recognition on the handwritten representation, and selecting one or more best-scoring recognition candidates, each consisting of a sequence of one or more words, based on the scoring of recognition candidates against both the handwritten and spoken representations.
207. A method of word recognition, comprising:
Receiving a spoken representation of a given sequence of one or more words to be recognized;
Receiving filter input consisting of handwriting or character image input;
Using handwriting or character recognition to define a filter representing one or more character strings respectively selected by said recognition as most likely to correspond to said filter input; and
Using a combination of said filter and speech recognition performed on said spoken representation to select one or more recognition candidates, each consisting of a sequence of one or more words selected as a function both of whether it matches one of the one or more character strings associated with said filter and of how closely it matches the spoken representation.
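The selection step of claim 207 can be sketched in Python as a filter applied to scored speech recognition candidates. Here the filter is a list of character strings that a handwriting or character recognizer judged likely (possibly of different lengths), a candidate "matches" when it starts with one of those strings, and the closeness of the acoustic match is a single number per candidate. The prefix interpretation, the function name, and the `n_best` parameter are assumptions for illustration.

```python
def select_with_filter(speech_candidates, filter_strings, n_best=3):
    """Keep the speech candidates that match any filter string and
    return the `n_best` with the highest acoustic score.

    speech_candidates: list of (text, acoustic_score), higher is better.
    filter_strings: strings the handwriting or character recognizer
    selected as likely readings of the filter input."""
    allowed = [s.lower() for s in filter_strings]
    matching = [
        (text, score)
        for text, score in speech_candidates
        if any(text.lower().startswith(prefix) for prefix in allowed)
    ]
    return sorted(matching, key=lambda pair: pair[1], reverse=True)[:n_best]
```

Because the filter may hold several alternative strings, different best-scoring candidates can match different strings, so an ambiguous handwritten "ca" or "co" still narrows the choice list without forcing a single reading.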
208. The method of claim 207, wherein said filter input consists of handwriting.
209. The method of claim 208, wherein:
Said filter represents a plurality of character strings; and
Said selecting of recognition candidates selects a plurality of best-scoring recognition candidates, wherein different best-scoring recognition candidates can match different character strings represented by said filter.
210. The method of claim 209, wherein the plurality of character strings represented by said filter and used in said selecting of recognition candidates can have different character lengths.
211. The method of claim 208, wherein:
Said filter represents only one character string for use in filtering; and
Said selecting of recognition candidates selects a plurality of best-scoring recognition candidates that all match said character string.
212. The method of claim 207, wherein said filter input consists of one or more separate character images.
213. The method of claim 212, wherein:
Said filter represents a plurality of character strings; and
Said selecting of recognition candidates selects a plurality of best-scoring recognition candidates, wherein different best-scoring recognition candidates can match different character strings represented by said filter.
214. The method of claim 212, wherein:
Said filter represents only one character string for use in filtering; and
Said selecting of recognition candidates selects a plurality of best-scoring recognition candidates that all match said character string.
215. The method of claim 207, further comprising:
Receiving a spoken representation of a second sequence of one or more words to be recognized;
Using speech recognition to output a corresponding sequence of one or more words into a continuous body of text;
Responding to user input from a pointing device that touches a sequence of one or more words in said body of text by selecting the touched sequence as a sequence to be corrected;
Treating a portion of the spoken representation of said second sequence of words as said given word sequence; and then
Receiving said filter input;
Using said handwriting or character recognition to define said filter; and
Using the combination of said filter and speech recognition to select the one or more recognition candidates.
216. A method of word recognition, comprising:
Receiving a handwritten representation of a given sequence of one or more words to be recognized;
Receiving filter input comprising one or more utterances representing a sequence of one or more letter-identifying words;
Using speech recognition to define a filter representing one or more character strings selected by said recognition as most likely to correspond to said filter input; and
Using a combination of said filter and handwriting recognition performed on said handwritten representation to select one or more recognition candidates, each consisting of a sequence of one or more words selected as a function both of whether it matches one of the one or more character strings associated with said filter and of how closely it matches the handwritten representation.
217. according to the method for claim 216, wherein:
Filtering input is the letter sign word sequence of continuous oral expression; And speech recognition is continuous speech recognition.
218. according to the method for claim 216, wherein:
Filtering input is the letter sign word sequence of oral expression discontinuously; And speech recognition is discontinuous speech recognition.
219. according to the method for claim 216, wherein:
Described filtrator is represented numerous character strings; And
Described selection identification candidate item is selected numerous the bests identification candidate item of keeping the score, and wherein different the bests identification candidate item of keeping the score can be complementary with the different character string of representing with described filtrator.
220. according to the method for claim 219, the representative of filtrator of wherein said usefulness and numerous character strings that be used among the described selection identification candidate item can have different character lengths.
221. according to the method for claim 220, wherein:
Filtering input is the alphabetical name name sequence of continuous oral expression; And
Speech recognition is continuous speech recognition.
222. according to the method for claim 216, wherein:
Described filtrator is only represented one of character string that is used to filter; And
Numerous the bests that described selection identification candidate item is selected all to mate with described character string identification candidate item of keeping the score.
223. according to claim 216 method, actually or further comprising providing makes the user can select to filter the user interface of importing with discontinuous identification with continuous identification identification.
224., further comprise providing making the user can select the user interface whether the identification filtration is imported in the recognized patterns of the identification of supporting the alphabetical name name or non-alphabetical name names letters sign word according to the method for claim 216.
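The filtered selection described in claims 216 to 222 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the `Candidate` type, the numeric scores, and the reading of "matches a character string" as a case-insensitive prefix match are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    words: Tuple[str, ...]  # hypothesised word sequence
    score: float            # closeness of match to the handwriting (higher is better)


def matches_filter(candidate: Candidate, filter_strings: Iterable[str]) -> bool:
    """Assumed matching rule: the candidate text starts with one of the filter strings."""
    text = " ".join(candidate.words).lower()
    return any(text.startswith(s.lower()) for s in filter_strings)


def select_candidates(candidates: Iterable[Candidate],
                      filter_strings: Iterable[str],
                      n_best: int = 3) -> List[Candidate]:
    """Keep candidates that match the filter, then return the best-scoring ones."""
    filter_strings = list(filter_strings)
    kept = [c for c in candidates if matches_filter(c, filter_strings)]
    return sorted(kept, key=lambda c: c.score, reverse=True)[:n_best]


# Hypothetical handwriting candidates; the filter holds strings of different lengths.
candidates = [
    Candidate(("chip",), 0.95),
    Candidate(("ship",), 0.90),
    Candidate(("shop",), 0.80),
    Candidate(("school",), 0.70),
]
best = select_candidates(candidates, ["sh", "sch"], n_best=3)
print([" ".join(c.words) for c in best])
```

Here the highest-scoring handwriting candidate, "chip", is dropped because it matches neither filter string, while the surviving candidates match different strings of different lengths, as claims 219 and 220 allow.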
225. the method for a word identification, comprising:
The hand-written expression of the given sequence of one or more words that reception will be discerned;
On described hand-written expression, finish handwriting recognition produce each all comprise be chosen to be might with one or more the bests of the corresponding one or more words of one or more words of described hand-written expression identification candidate item of keeping the score;
Receive the oral expression of the given sequence of the one or more words that will discern then;
On described oral expression, finish speech recognition produce each all comprise be chosen to be might with one or more the bests of corresponding one or more words of one or more words of described oral expression identification candidate item of keeping the score;
Use the keep the score previous identification of the described hand-written expression of information correction in one of candidate item of described speech recognition the best.
226. according to the method for claim 225, wherein said use voice recognition information correction handwriting recognition comprises that the best that produces with speech recognition identification candidate item of keeping the score replaces with the best that handwriting recognition produces identification candidate item of keeping the score.
227. according to the method for claim 225, wherein said use voice recognition information correction handwriting recognition comprises and one of identification candidate item that produces with speech recognition is interpreted as instructing and keeps the score and carry out described instruction in the identification candidate item revising the best that produces with handwriting recognition.
228. A handheld computing device for performing large-vocabulary speech recognition, comprising:
One or more processing devices;
Memory readable by the processing devices;
A microphone or audio input for providing an electronic signal representing sound input;
A speaker or audio output for enabling electronic representations of sound generated in said device to be converted into corresponding sound;
Programming recorded in one or more of the memory devices, including:
Speech recognition programming for performing large-vocabulary speech recognition that, in response to an electronic representation of sound received from the microphone or audio input and recognized as a sequence of one or more utterances, produces a text output of the words corresponding to those utterances;
Sound recording programming for recording an electronically readable representation of said sound in one or more of said memory devices; and
Sound playback programming for playing back said recorded sound representations and providing corresponding audio signals to said speaker or audio output;
Wherein the device's programming includes instructions for enabling the user to select, when sound input is received, between two of the following three possible modes of recording sound input:
A first mode, in which text output by the speech recognition in response to said sound input is placed at the current cursor position in a user-navigable document, without a representation of the recording of said sound input;
A second mode, in which a representation of the recording of said sound input is placed at said cursor in the user-navigable document, without text produced by speech recognition of said sound input; and
A third mode, in which text output by the speech recognition in response to said sound input is placed at the current cursor position in the user-navigable document, wherein the words of the text output themselves represent the portions of the recording of the sound input from which each such word was recognized; and
Wherein the sound playback programming includes instructions for enabling the user, in a playback mode, to select for playback the recorded sound represented by a sound representation placed in the document by the second or third recording mode, by placing the cursor in such a representation.
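The three recording modes of claim 228 can be modelled with a small Python sketch. The data model is an assumption made for illustration: the document is a list of `Item` objects, audio is referred to by (start, end) offsets in seconds, and the names `Mode`, `Item`, `insert_input`, and `audio_at` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple

Span = Tuple[float, float]  # (start, end) offsets in seconds within a recording


class Mode(Enum):
    TEXT_ONLY = 1        # first mode: recognized text, no recording kept
    AUDIO_ONLY = 2       # second mode: recording marker, no recognized text
    TEXT_WITH_AUDIO = 3  # third mode: each word stands for its own audio


@dataclass
class Item:
    text: Optional[str]    # None for an audio-only marker
    audio: Optional[Span]  # None when no recording is attached


def insert_input(doc: List[Item], cursor: int, mode: Mode,
                 words: List[str], spans: List[Span]) -> int:
    """Insert the result of one sound input at the cursor; return the new cursor."""
    if mode is Mode.TEXT_ONLY:
        items = [Item(w, None) for w in words]
    elif mode is Mode.AUDIO_ONLY:
        # One marker covering the whole recording.
        items = [Item(None, (spans[0][0], spans[-1][1]))]
    else:
        items = [Item(w, s) for w, s in zip(words, spans)]
    doc[cursor:cursor] = items
    return cursor + len(items)


def audio_at(doc: List[Item], cursor: int) -> Optional[Span]:
    """Audio to play when the cursor sits on an item, or None if it has none."""
    if 0 <= cursor < len(doc):
        return doc[cursor].audio
    return None


doc: List[Item] = []
cur = insert_input(doc, 0, Mode.TEXT_WITH_AUDIO,
                   ["hello", "world"], [(0.0, 0.4), (0.4, 0.9)])
cur = insert_input(doc, cur, Mode.AUDIO_ONLY, [], [(1.0, 2.0)])
cur = insert_input(doc, cur, Mode.TEXT_ONLY, ["note"], [])
```

Placing the cursor on a third-mode word or a second-mode marker yields a playable span, while first-mode text yields none, which mirrors the playback condition at the end of the claim.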
229. The device of claim 228, wherein the device's programming further includes instructions for enabling the user to switch back and forth between the second mode and the first or third mode, wherein each such switch takes less than one second.
230. The device of claim 228, wherein the device's programming further includes instructions for enabling the user to select a portion of sound that was recorded without corresponding recognition and to have speech recognition performed on the selected portion of the sound recording so as to produce text output corresponding to the selected sound.
231. The device of claim 228, wherein the device's programming further includes instructions for enabling the user to select a sub-portion of text output by speech recognition in the third mode, whose words have recorded sound associated with them, and to have the recording associated with the selected text deleted.
232. The device of claim 228, wherein the device's programming further includes instructions for enabling the user to select a sub-portion of text output by speech recognition in the third mode, whose words have recorded sound associated with them, and to have the selected text deleted and replaced at its location in the document by a representation of the recording of the type produced by recording in the second mode.
233. The device of claim 228, wherein the representation of sound placed in the document by the second recording mode is an audio graphic representation whose length varies with the duration of the portion of the recording that it represents.
234. The computing device of claim 228, wherein the device is a handheld device.
235. The computing device of claim 234, wherein the device is a mobile phone.
236. A handheld computing device for performing large-vocabulary speech recognition, comprising:
One or more processing devices;
Memory readable by the processing devices;
A microphone or audio input for providing an electronic signal representing sound input;
A speaker or audio output for enabling electronic representations of sound generated in said device to be converted into corresponding sound;
Programming recorded in one or more of the memory devices, including:
Speech recognition programming for performing large-vocabulary speech recognition that, in response to an electronic representation of sound received from the microphone or audio input and recognized as a sequence of one or more utterances, produces a text output of the one or more words corresponding to those utterances;
Sound recording programming for recording an electronically readable representation of said sound in one or more of said memory devices; and
Sound playback programming for playing back said recorded sound representations and providing corresponding audio signals to said speaker or audio output;
Wherein the device's programming further includes instructions for enabling the user to select a portion of sound that was recorded without corresponding recognition and to have speech recognition performed on the selected portion of the sound recording so as to produce text output corresponding to the selected sound.
237. A handheld computing device for performing large-vocabulary speech recognition, comprising:
One or more processing devices;
Memory readable by the processing devices;
A microphone or audio input for providing an electronic signal representing sound input;
A speaker or audio output for enabling electronic representations of sound generated in said device to be converted into corresponding sound;
Programming recorded in one or more of the memory devices, including:
Speech recognition programming for performing large-vocabulary speech recognition that, in response to an electronic representation of sound received from the microphone or audio input and recognized as a sequence of one or more utterances, produces a text output of the one or more words corresponding to those utterances;
Sound recording programming for recording an electronically readable representation of said sound in one or more of said memory devices; and
Sound playback programming for playing back the recorded sound representations and providing corresponding audio signals to said speaker or audio output;
Wherein the programming of said device further includes:
Instructions for enabling the user to associate portions of text output by said speech recognition with portions of a recorded sound representation that have not previously been so labeled;
Instructions for enabling the user to select to have text output by said speech recognition used as a text search string; and
Instructions for performing a search for associated text output that matches the search string;
Whereby the user can find a portion of a recorded sound representation by searching for the text associated with it.
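The text-based search for recorded sound in claim 237 can be sketched in a few lines of Python. The label store (a list of text and time-span pairs), the case-insensitive substring match, and the function name are assumptions for illustration only; the patent does not specify a data structure or matching rule here.

```python
from typing import List, Tuple

Span = Tuple[float, float]  # (start, end) offsets in seconds within a recording


def find_audio_by_text(labels: List[Tuple[str, Span]], query: str) -> List[Span]:
    """Return the spans of recorded sound whose associated text contains the query."""
    q = query.lower()
    return [span for text, span in labels if q in text.lower()]


# Hypothetical labels a user attached to portions of a longer recording.
labels = [
    ("budget meeting", (0.0, 95.0)),
    ("call with supplier", (95.0, 240.0)),
    ("Budget follow-up", (240.0, 300.0)),
]
hits = find_audio_by_text(labels, "budget")
```

In the claimed device the query string would itself come from speech recognition, so the user could speak a label and jump to the matching portion of the recording.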
238. A computing device for performing large-vocabulary speech recognition, comprising:
One or more processing devices;
Memory readable by the processing devices;
A microphone or audio input for providing an electronic signal representing sound input;
A speaker or audio output for enabling electronic representations of sound generated in said device to be converted into corresponding sound;
Programming recorded in one or more of the memory devices, including:
Speech recognition programming for performing large-vocabulary speech recognition that, in response to an electronic representation of sound received from the microphone or audio input and recognized as a sequence of one or more utterances, produces a text output of the one or more words corresponding to those utterances;
Sound recording programming for recording an electronically readable representation of said sound in one or more of said memory devices;
Sound playback programming for playing back said recorded sound representations and providing corresponding audio signals to said speaker or audio output; and
Instructions for switching back and forth between said sound playback and said speech recognition in response to a user input that causes each such switch, wherein each successive sound playback starts slightly before the end of the previous playback.
239. The computing device of claim 238, wherein said instructions for switching back and forth between said sound playback and said speech recognition respond to user selection of a single input device to perform both such switches.
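The switching behavior of claims 238 and 239 can be sketched as a small state machine. This is an illustrative assumption rather than the patented implementation: the class name, the half-second default backup, and the way the stop position is reported are all hypothetical.

```python
from typing import Optional


class PlaybackDictationToggle:
    """One input toggles between playback and recognition (claim 239 style)."""

    def __init__(self, backup_seconds: float = 0.5) -> None:
        self.backup_seconds = backup_seconds  # assumed overlap between playbacks
        self.mode = "recognition"
        self.last_stop = 0.0                  # where the previous playback ended

    def toggle(self, stop_position: Optional[float] = None) -> Optional[float]:
        """Switch modes.

        When leaving playback, pass the position at which playback stopped.
        When entering playback, the return value is the position to start
        from, slightly before the end of the previous playback.
        """
        if self.mode == "playback":
            if stop_position is not None:
                self.last_stop = stop_position
            self.mode = "recognition"
            return None
        self.mode = "playback"
        return max(0.0, self.last_stop - self.backup_seconds)


toggle = PlaybackDictationToggle(backup_seconds=0.5)
first_start = toggle.toggle()        # enter playback from the beginning
toggle.toggle(stop_position=4.0)     # user stops at 4.0 s and dictates
second_start = toggle.toggle()       # playback resumes slightly earlier
```

Backing up slightly gives the listener some overlap with what was just heard, which is useful when transcribing a recording in short bursts.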
240. A computing device that functions as a mobile phone, comprising:
A user-perceivable output device;
A set of telephone keys including at least a standard twelve-key telephone keypad;
One or more processing devices;
Memory readable by the processing devices;
A microphone or audio input from which said phone can receive an electronic representation of sound;
A speaker or audio output for enabling electronic representations of sound generated in said device to be converted into corresponding sound;
Transmitting and receiving circuitry;
Programming recorded in the memory, including:
Telephone programming having instructions for performing telephone functions, including placing and receiving calls;
Speech recognition programming for performing large-vocabulary speech recognition that, in response to an electronic representation of sound received from the microphone or audio input and recognized as a sequence of one or more utterances, produces a text output of the one or more words corresponding to those utterances;
Sound recording programming for recording an electronically readable representation of said sound in said one or more memory devices; and
Sound playback programming for playing back said recorded sound representations and providing corresponding audio signals to said speaker or audio output.
241. The computing device of claim 240, wherein said playback programming includes:
Instructions for enabling the user to select a sub-portion of said recorded sound representation; and
Instructions for enabling the user to select to have the selected sub-portion of said sound representation played to the other party of a mobile phone call.
242. The computing device of claim 240, wherein said recording programming includes instructions for enabling the user to select to record an electronically readable representation of one or both sides of a mobile phone conversation.
243. The computing device of claim 240, wherein the device's programming further includes instructions for enabling the user to associate portions of text output by said speech recognition with portions of a recorded sound representation that have not previously been so labeled.
244. The computing device of claim 243, wherein the device's programming further includes:
Instructions for enabling the user to select to have the text output of said speech recognition used as a text search string; and
Instructions for performing a search for associated text output that matches said search string;
Whereby said user can find a portion of a recorded sound representation by searching for the text associated with it.
245. The computing device of claim 240, wherein the device's programming further includes instructions for enabling the user to select a previously unrecognized sub-portion of said recorded sound representation and to have said large-vocabulary speech recognition performed on said selected sub-portion.
246. The computing device of claim 245, wherein:
Said speech recognition programming includes instructions for performing speech recognition at different quality levels, wherein higher-quality recognition takes more time to recognize a given length of sound; and
The instructions for enabling the user to select to have speech recognition performed on said selected sub-portion of recorded sound include instructions for causing the selected recorded sound to be recognized at said higher quality.
247. The computing device of claim 245, wherein said speech recognition programming includes:
Instructions for producing a time alignment between the individual words recognized in the text output by speech recognition and the portions of recorded sound associated with each recognized word in said text; and
Instructions for enabling the user to select a sequence of one or more words and to have the recorded sound associated with those words played back.
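The word-to-audio time alignment of claim 247 can be illustrated with a short Python sketch. The alignment format (a list of word, start, end triples in seconds) and the function name are assumptions for illustration; the patent does not specify how the alignment is stored.

```python
from typing import List, Tuple

Aligned = Tuple[str, float, float]  # (word, start seconds, end seconds)


def audio_span_for_words(alignment: List[Aligned],
                         first: int, last: int) -> Tuple[float, float]:
    """Return the (start, end) of the recording covering words first..last inclusive."""
    if not (0 <= first <= last < len(alignment)):
        raise IndexError("word selection out of range")
    return alignment[first][1], alignment[last][2]


# Hypothetical alignment produced while recognizing "please call me back".
alignment = [
    ("please", 0.00, 0.35),
    ("call", 0.35, 0.60),
    ("me", 0.60, 0.75),
    ("back", 0.75, 1.10),
]
span = audio_span_for_words(alignment, 1, 2)  # user selects "call me"
```

The returned span would then be handed to the playback programming, so selecting words in the text plays exactly the sound from which they were recognized.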
248. The computing device of claim 240, wherein the device's programming further includes instructions for switching back and forth between sound playback and speech recognition, wherein each successive sound playback starts slightly before the end of the previous playback.
CNA028298519A 2002-09-06 2002-09-06 Methods, systems and programming for performing speech recognition Pending CN1864204A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2002/028590 WO2004023455A2 (en) 2002-09-06 2002-09-06 Methods, systems, and programming for performing speech recognition

Publications (1)

Publication Number Publication Date
CN1864204A true CN1864204A (en) 2006-11-15

Family

ID=34271640

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA028298519A Pending CN1864204A (en) 2002-09-06 2002-09-06 Methods, systems and programming for performing speech recognition

Country Status (5)

Country Link
EP (1) EP1604350A4 (en)
JP (1) JP2006515073A (en)
KR (1) KR100996212B1 (en)
CN (1) CN1864204A (en)
AU (1) AU2002336458A1 (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7720682B2 (en) * 1998-12-04 2010-05-18 Tegic Communications, Inc. Method and apparatus utilizing voice input to resolve ambiguous manually entered text input
US7697827B2 (en) 2005-10-17 2010-04-13 Konicek Jeffrey C User-friendlier interfaces for a camera
JP4672686B2 (en) * 2007-02-16 2011-04-20 株式会社デンソー Voice recognition device and navigation device
US8457946B2 (en) * 2007-04-26 2013-06-04 Microsoft Corporation Recognition architecture for generating Asian characters
JP4862740B2 (en) * 2007-05-14 2012-01-25 ソニー株式会社 Imaging apparatus, information display apparatus, display data control method, and computer program
KR100998566B1 (en) 2008-08-11 2010-12-07 엘지전자 주식회사 Method And Apparatus Of Translating Language Using Voice Recognition
US8494852B2 (en) * 2010-01-05 2013-07-23 Google Inc. Word-level correction of speech input
KR101687614B1 (en) * 2010-08-04 2016-12-19 엘지전자 주식회사 Method for voice recognition and image display device thereof
KR101218332B1 (en) * 2011-05-23 2013-01-21 휴텍 주식회사 Method and apparatus for character input by hybrid-type speech recognition, and computer-readable recording medium with character input program based on hybrid-type speech recognition for the same
KR101330671B1 (en) * 2012-09-28 2013-11-15 삼성전자주식회사 Electronic device, server and control methods thereof
KR102009423B1 (en) * 2012-10-08 2019-08-09 삼성전자주식회사 Method and apparatus for action of preset performance mode using voice recognition
EP2933067B1 (en) * 2014-04-17 2019-09-18 Softbank Robotics Europe Method of performing multi-modal dialogue between a humanoid robot and user, computer program product and humanoid robot for implementing said method
US11455148B2 (en) 2020-07-13 2022-09-27 International Business Machines Corporation Software programming assistant
KR102494627B1 (en) * 2020-08-03 2023-02-01 한양대학교 산학협력단 Data label correction for speech recognition system and method thereof
US11880645B2 (en) 2022-06-15 2024-01-23 T-Mobile Usa, Inc. Generating encoded text based on spoken utterances using machine learning systems and methods

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE19635754A1 (en) * 1996-09-03 1998-03-05 Siemens Ag Speech processing system and method for speech processing
US6122613A (en) * 1997-01-30 2000-09-19 Dragon Systems, Inc. Speech recognition using multiple recognizers (selectively) applied to the same input sample
US6526380B1 (en) * 1999-03-26 2003-02-25 Koninklijke Philips Electronics N.V. Speech recognition system having parallel large vocabulary recognition engines

Cited By (190)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9646614B2 (en) 2000-03-16 2017-05-09 Apple Inc. Fast, language-independent method for user authentication by voice
US10318871B2 (en) 2005-09-08 2019-06-11 Apple Inc. Method and apparatus for building an intelligent automated assistant
US8942986B2 (en) 2006-09-08 2015-01-27 Apple Inc. Determining user intent based on ontologies of domains
US9117447B2 (en) 2006-09-08 2015-08-25 Apple Inc. Using event alert text as input to an automated assistant
US8930191B2 (en) 2006-09-08 2015-01-06 Apple Inc. Paraphrasing of user requests and results by automated digital assistant
US10568032B2 (en) 2007-04-03 2020-02-18 Apple Inc. Method and system for operating a multi-function portable electronic device using voice-activation
US9330720B2 (en) 2008-01-03 2016-05-03 Apple Inc. Methods and apparatus for altering audio output signals
US10381016B2 (en) 2008-01-03 2019-08-13 Apple Inc. Methods and apparatus for altering audio output signals
US9626955B2 (en) 2008-04-05 2017-04-18 Apple Inc. Intelligent text-to-speech conversion
US9865248B2 (en) 2008-04-05 2018-01-09 Apple Inc. Intelligent text-to-speech conversion
US10108612B2 (en) 2008-07-31 2018-10-23 Apple Inc. Mobile device having human language translation capability with positional feedback
US9535906B2 (en) 2008-07-31 2017-01-03 Apple Inc. Mobile device having human language translation capability with positional feedback
US9959870B2 (en) 2008-12-11 2018-05-01 Apple Inc. Speech recognition involving a mobile device
US10475446B2 (en) 2009-06-05 2019-11-12 Apple Inc. Using context information to facilitate processing of commands in a virtual assistant
US10795541B2 (en) 2009-06-05 2020-10-06 Apple Inc. Intelligent organization of tasks items
US11080012B2 (en) 2009-06-05 2021-08-03 Apple Inc. Interface for a virtual digital assistant
US9858925B2 (en) 2009-06-05 2018-01-02 Apple Inc. Using context information to facilitate processing of commands in a virtual assistant
US10283110B2 (en) 2009-07-02 2019-05-07 Apple Inc. Methods and apparatuses for automatic speech recognition
US9548050B2 (en) 2010-01-18 2017-01-17 Apple Inc. Intelligent automated assistant
US10496753B2 (en) 2010-01-18 2019-12-03 Apple Inc. Automatically adapting user interfaces for hands-free interaction
US8892446B2 (en) 2010-01-18 2014-11-18 Apple Inc. Service orchestration for intelligent automated assistant
US8903716B2 (en) 2010-01-18 2014-12-02 Apple Inc. Personalized vocabulary for digital assistant
US11423886B2 (en) 2010-01-18 2022-08-23 Apple Inc. Task flow identification based on user intent
US10706841B2 (en) 2010-01-18 2020-07-07 Apple Inc. Task flow identification based on user intent
US10679605B2 (en) 2010-01-18 2020-06-09 Apple Inc. Hands-free list-reading by intelligent automated assistant
US9318108B2 (en) 2010-01-18 2016-04-19 Apple Inc. Intelligent automated assistant
US10553209B2 (en) 2010-01-18 2020-02-04 Apple Inc. Systems and methods for hands-free notification summaries
US10705794B2 (en) 2010-01-18 2020-07-07 Apple Inc. Automatically adapting user interfaces for hands-free interaction
US10276170B2 (en) 2010-01-18 2019-04-30 Apple Inc. Intelligent automated assistant
US10984326B2 (en) 2010-01-25 2021-04-20 Newvaluexchange Ltd. Apparatuses, methods and systems for a digital conversation management platform
US10607141B2 (en) 2010-01-25 2020-03-31 Newvaluexchange Ltd. Apparatuses, methods and systems for a digital conversation management platform
US10984327B2 (en) 2010-01-25 2021-04-20 Newvaluexchange Ltd. Apparatuses, methods and systems for a digital conversation management platform
US11410053B2 (en) 2010-01-25 2022-08-09 Newvaluexchange Ltd. Apparatuses, methods and systems for a digital conversation management platform
US10607140B2 (en) 2010-01-25 2020-03-31 Newvaluexchange Ltd. Apparatuses, methods and systems for a digital conversation management platform
US10049675B2 (en) 2010-02-25 2018-08-14 Apple Inc. User profiling for voice input processing
US9633660B2 (en) 2010-02-25 2017-04-25 Apple Inc. User profiling for voice input processing
CN102541438A (en) * 2010-11-01 2012-07-04 微软公司 Integrated voice command modal user interface
US10762293B2 (en) 2010-12-22 2020-09-01 Apple Inc. Using parts-of-speech tagging and named entity recognition for spelling correction
US9262612B2 (en) 2011-03-21 2016-02-16 Apple Inc. Device access using voice authentication
US10102359B2 (en) 2011-03-21 2018-10-16 Apple Inc. Device access using voice authentication
US10241644B2 (en) 2011-06-03 2019-03-26 Apple Inc. Actionable reminder entries
US10706373B2 (en) 2011-06-03 2020-07-07 Apple Inc. Performing actions associated with task items that represent tasks to perform
US10057736B2 (en) 2011-06-03 2018-08-21 Apple Inc. Active transport based notifications
US11120372B2 (en) 2011-06-03 2021-09-14 Apple Inc. Performing actions associated with task items that represent tasks to perform
US9798393B2 (en) 2011-08-29 2017-10-24 Apple Inc. Text correction processing
CN103035240A (en) * 2011-09-28 2013-04-10 苹果公司 Speech recognition repair using contextual information
CN103035240B (en) * 2011-09-28 2015-11-25 苹果公司 Method and system for speech recognition repair using contextual information
US10241752B2 (en) 2011-09-30 2019-03-26 Apple Inc. Interface for a virtual digital assistant
US9256396B2 (en) 2011-10-10 2016-02-09 Microsoft Technology Licensing, Llc Speech recognition for context switching
CN103019535B (en) * 2011-10-10 2016-12-21 微软技术许可有限责任公司 Speech recognition for context switching
CN103019535A (en) * 2011-10-10 2013-04-03 微软公司 Speech recognition for context switching
TWI601128B (en) * 2011-10-10 2017-10-01 微軟技術授權有限責任公司 Computer-implemented method and system of speech recognition for context switching
US10134385B2 (en) 2012-03-02 2018-11-20 Apple Inc. Systems and methods for name pronunciation
US9483461B2 (en) 2012-03-06 2016-11-01 Apple Inc. Handling speech synthesis of content for multiple languages
US9953088B2 (en) 2012-05-14 2018-04-24 Apple Inc. Crowd sourcing information to fulfill user requests
US10079014B2 (en) 2012-06-08 2018-09-18 Apple Inc. Name recognition system
US9576574B2 (en) 2012-09-10 2017-02-21 Apple Inc. Context-sensitive handling of interruptions by intelligent digital assistant
US9971774B2 (en) 2012-09-19 2018-05-15 Apple Inc. Voice-based media searching
CN104756062B (en) * 2012-10-19 2018-01-12 谷歌公司 Decoding imprecise gestures for graphical keyboards
CN104756062A (en) * 2012-10-19 2015-07-01 谷歌公司 Decoding imprecise gestures for gesture-keyboards
CN103823547B (en) * 2012-11-16 2017-05-17 中国电信股份有限公司 Mobile terminal and cursor control method thereof
CN103823547A (en) * 2012-11-16 2014-05-28 中国电信股份有限公司 Mobile terminal and cursor control method thereof
CN104903846A (en) * 2013-01-08 2015-09-09 歌乐株式会社 Voice recognition device, voice recognition program, and voice recognition method
CN104903846B (en) * 2013-01-08 2017-07-28 歌乐株式会社 Voice recognition device and voice recognition method
US10978090B2 (en) 2013-02-07 2021-04-13 Apple Inc. Voice trigger for a digital assistant
US10199051B2 (en) 2013-02-07 2019-02-05 Apple Inc. Voice trigger for a digital assistant
US9368114B2 (en) 2013-03-14 2016-06-14 Apple Inc. Context-sensitive handling of interruptions
US9922642B2 (en) 2013-03-15 2018-03-20 Apple Inc. Training an at least partial voice command system
US9697822B1 (en) 2013-03-15 2017-07-04 Apple Inc. System and method for updating an adaptive speech recognition model
US9620104B2 (en) 2013-06-07 2017-04-11 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US9582608B2 (en) 2013-06-07 2017-02-28 Apple Inc. Unified ranking with entropy-weighted information for phrase-based semantic auto-completion
US9966060B2 (en) 2013-06-07 2018-05-08 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US9633674B2 (en) 2013-06-07 2017-04-25 Apple Inc. System and method for detecting errors in interactions with a voice-based digital assistant
US9966068B2 (en) 2013-06-08 2018-05-08 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
US10657961B2 (en) 2013-06-08 2020-05-19 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
US10185542B2 (en) 2013-06-09 2019-01-22 Apple Inc. Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant
US10176167B2 (en) 2013-06-09 2019-01-08 Apple Inc. System and method for inferring user intent from speech inputs
US9300784B2 (en) 2013-06-13 2016-03-29 Apple Inc. System and method for emergency calls initiated by voice command
US10791216B2 (en) 2013-08-06 2020-09-29 Apple Inc. Auto-activating smart responses based on activities from remote devices
US9620105B2 (en) 2014-05-15 2017-04-11 Apple Inc. Analyzing audio input for efficient speech and music recognition
US10592095B2 (en) 2014-05-23 2020-03-17 Apple Inc. Instantaneous speaking of content on touch devices
US9502031B2 (en) 2014-05-27 2016-11-22 Apple Inc. Method for supporting dynamic grammars in WFST-based ASR
US10170123B2 (en) 2014-05-30 2019-01-01 Apple Inc. Intelligent assistant for home automation
US10289433B2 (en) 2014-05-30 2019-05-14 Apple Inc. Domain specific language for encoding assistant dialog
US10083690B2 (en) 2014-05-30 2018-09-25 Apple Inc. Better resolution when referencing to concepts
US11133008B2 (en) 2014-05-30 2021-09-28 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US9785630B2 (en) 2014-05-30 2017-10-10 Apple Inc. Text prediction using combined word N-gram and unigram language models
US10078631B2 (en) 2014-05-30 2018-09-18 Apple Inc. Entropy-guided text prediction using combined word and character n-gram language models
US9430463B2 (en) 2014-05-30 2016-08-30 Apple Inc. Exemplar-based natural language processing
US10497365B2 (en) 2014-05-30 2019-12-03 Apple Inc. Multi-command single utterance input method
US9633004B2 (en) 2014-05-30 2017-04-25 Apple Inc. Better resolution when referencing to concepts
US9760559B2 (en) 2014-05-30 2017-09-12 Apple Inc. Predictive text input
US9842101B2 (en) 2014-05-30 2017-12-12 Apple Inc. Predictive conversion of language input
US9734193B2 (en) 2014-05-30 2017-08-15 Apple Inc. Determining domain salience ranking from ambiguous words in natural speech
US10169329B2 (en) 2014-05-30 2019-01-01 Apple Inc. Exemplar-based natural language processing
US9715875B2 (en) 2014-05-30 2017-07-25 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US9966065B2 (en) 2014-05-30 2018-05-08 Apple Inc. Multi-command single utterance input method
US11257504B2 (en) 2014-05-30 2022-02-22 Apple Inc. Intelligent assistant for home automation
US9668024B2 (en) 2014-06-30 2017-05-30 Apple Inc. Intelligent automated assistant for TV user interactions
US10904611B2 (en) 2014-06-30 2021-01-26 Apple Inc. Intelligent automated assistant for TV user interactions
US10659851B2 (en) 2014-06-30 2020-05-19 Apple Inc. Real-time digital assistant knowledge updates
US9338493B2 (en) 2014-06-30 2016-05-10 Apple Inc. Intelligent automated assistant for TV user interactions
US10446141B2 (en) 2014-08-28 2019-10-15 Apple Inc. Automatic speech recognition based on user feedback
US9818400B2 (en) 2014-09-11 2017-11-14 Apple Inc. Method and apparatus for discovering trending terms in speech requests
US10431204B2 (en) 2014-09-11 2019-10-01 Apple Inc. Method and apparatus for discovering trending terms in speech requests
US10789041B2 (en) 2014-09-12 2020-09-29 Apple Inc. Dynamic thresholds for always listening speech trigger
CN104267922A (en) * 2014-09-16 2015-01-07 联想(北京)有限公司 Information processing method and electronic equipment
CN104267922B (en) * 2014-09-16 2019-05-31 联想(北京)有限公司 Information processing method and electronic equipment
US9606986B2 (en) 2014-09-29 2017-03-28 Apple Inc. Integrated word N-gram and class M-gram language models
US9986419B2 (en) 2014-09-30 2018-05-29 Apple Inc. Social reminders
US9668121B2 (en) 2014-09-30 2017-05-30 Apple Inc. Social reminders
US9646609B2 (en) 2014-09-30 2017-05-09 Apple Inc. Caching apparatus for serving phonetic pronunciations
US10127911B2 (en) 2014-09-30 2018-11-13 Apple Inc. Speaker identification and unsupervised speaker adaptation techniques
US10074360B2 (en) 2014-09-30 2018-09-11 Apple Inc. Providing an indication of the suitability of speech recognition
US9886432B2 (en) 2014-09-30 2018-02-06 Apple Inc. Parsimonious handling of word inflection via categorical stem + suffix N-gram language models
US10552013B2 (en) 2014-12-02 2020-02-04 Apple Inc. Data detection
US11556230B2 (en) 2014-12-02 2023-01-17 Apple Inc. Data detection
US9711141B2 (en) 2014-12-09 2017-07-18 Apple Inc. Disambiguating heteronyms in speech synthesis
US9865280B2 (en) 2015-03-06 2018-01-09 Apple Inc. Structured dictation using intelligent automated assistants
US9886953B2 (en) 2015-03-08 2018-02-06 Apple Inc. Virtual assistant activation
US11087759B2 (en) 2015-03-08 2021-08-10 Apple Inc. Virtual assistant activation
US10311871B2 (en) 2015-03-08 2019-06-04 Apple Inc. Competing devices responding to voice triggers
US9721566B2 (en) 2015-03-08 2017-08-01 Apple Inc. Competing devices responding to voice triggers
US10567477B2 (en) 2015-03-08 2020-02-18 Apple Inc. Virtual assistant continuity
US9899019B2 (en) 2015-03-18 2018-02-20 Apple Inc. Systems and methods for structured stem and suffix language models
US9842105B2 (en) 2015-04-16 2017-12-12 Apple Inc. Parsimonious continuous-space phrase representations for natural language processing
US10083688B2 (en) 2015-05-27 2018-09-25 Apple Inc. Device voice control for selecting a displayed affordance
US10127220B2 (en) 2015-06-04 2018-11-13 Apple Inc. Language identification from short strings
US10356243B2 (en) 2015-06-05 2019-07-16 Apple Inc. Virtual assistant aided communication with 3rd party service in a communication session
US10101822B2 (en) 2015-06-05 2018-10-16 Apple Inc. Language input correction
US11025565B2 (en) 2015-06-07 2021-06-01 Apple Inc. Personalized prediction of responses for instant messaging
US10255907B2 (en) 2015-06-07 2019-04-09 Apple Inc. Automatic accent detection using acoustic models
US10186254B2 (en) 2015-06-07 2019-01-22 Apple Inc. Context-based endpoint detection
US11500672B2 (en) 2015-09-08 2022-11-15 Apple Inc. Distributed personal assistant
US10671428B2 (en) 2015-09-08 2020-06-02 Apple Inc. Distributed personal assistant
US10747498B2 (en) 2015-09-08 2020-08-18 Apple Inc. Zero latency digital assistant
CN108028042A (en) * 2015-09-18 2018-05-11 微软技术许可有限责任公司 Transcription of verbal messages
US9697820B2 (en) 2015-09-24 2017-07-04 Apple Inc. Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks
US11010550B2 (en) 2015-09-29 2021-05-18 Apple Inc. Unified language modeling framework for word prediction, auto-completion and auto-correction
US10366158B2 (en) 2015-09-29 2019-07-30 Apple Inc. Efficient word encoding for recurrent neural network language models
US11587559B2 (en) 2015-09-30 2023-02-21 Apple Inc. Intelligent device identification
US10691473B2 (en) 2015-11-06 2020-06-23 Apple Inc. Intelligent automated assistant in a messaging environment
US11526368B2 (en) 2015-11-06 2022-12-13 Apple Inc. Intelligent automated assistant in a messaging environment
US10049668B2 (en) 2015-12-02 2018-08-14 Apple Inc. Applying neural network language models to weighted finite state transducers for automatic speech recognition
US10223066B2 (en) 2015-12-23 2019-03-05 Apple Inc. Proactive assistance based on dialog communication between devices
US10446143B2 (en) 2016-03-14 2019-10-15 Apple Inc. Identification of voice inputs providing credentials
US9934775B2 (en) 2016-05-26 2018-04-03 Apple Inc. Unit-selection text-to-speech synthesis based on predicted concatenation parameters
US9972304B2 (en) 2016-06-03 2018-05-15 Apple Inc. Privacy preserving distributed evaluation framework for embedded personalized systems
US10249300B2 (en) 2016-06-06 2019-04-02 Apple Inc. Intelligent list reading
US11069347B2 (en) 2016-06-08 2021-07-20 Apple Inc. Intelligent automated assistant for media exploration
US10049663B2 (en) 2016-06-08 2018-08-14 Apple Inc. Intelligent automated assistant for media exploration
US10354011B2 (en) 2016-06-09 2019-07-16 Apple Inc. Intelligent automated assistant in a home environment
US10490187B2 (en) 2016-06-10 2019-11-26 Apple Inc. Digital assistant providing automated status report
US11037565B2 (en) 2016-06-10 2021-06-15 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US10733993B2 (en) 2016-06-10 2020-08-04 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US10192552B2 (en) 2016-06-10 2019-01-29 Apple Inc. Digital assistant providing whispered speech
US10067938B2 (en) 2016-06-10 2018-09-04 Apple Inc. Multilingual word prediction
US10509862B2 (en) 2016-06-10 2019-12-17 Apple Inc. Dynamic phrase expansion of language input
US10089072B2 (en) 2016-06-11 2018-10-02 Apple Inc. Intelligent device arbitration and control
US10297253B2 (en) 2016-06-11 2019-05-21 Apple Inc. Application integration with a digital assistant
US10269345B2 (en) 2016-06-11 2019-04-23 Apple Inc. Intelligent task discovery
US11152002B2 (en) 2016-06-11 2021-10-19 Apple Inc. Application integration with a digital assistant
US10521466B2 (en) 2016-06-11 2019-12-31 Apple Inc. Data driven natural language event detection and classification
CN106126156A (en) * 2016-06-13 2016-11-16 北京云知声信息技术有限公司 Voice input method and device based on hospital information system
CN106126156B (en) * 2016-06-13 2019-04-05 北京云知声信息技术有限公司 Voice input method and device based on hospital information system
US10553215B2 (en) 2016-09-23 2020-02-04 Apple Inc. Intelligent automated assistant
US10043516B2 (en) 2016-09-23 2018-08-07 Apple Inc. Intelligent automated assistant
CN108231066A (en) * 2016-12-13 2018-06-29 财团法人工业技术研究院 Speech recognition system and method thereof and vocabulary establishing method
US10593346B2 (en) 2016-12-22 2020-03-17 Apple Inc. Rank-reduced token representation for automatic speech recognition
US10755703B2 (en) 2017-05-11 2020-08-25 Apple Inc. Offline personal assistant
US10791176B2 (en) 2017-05-12 2020-09-29 Apple Inc. Synchronization and task delegation of a digital assistant
US11405466B2 (en) 2017-05-12 2022-08-02 Apple Inc. Synchronization and task delegation of a digital assistant
US10410637B2 (en) 2017-05-12 2019-09-10 Apple Inc. User-specific acoustic models
US10810274B2 (en) 2017-05-15 2020-10-20 Apple Inc. Optimizing dialogue policy decisions for digital assistants using implicit feedback
US10482874B2 (en) 2017-05-15 2019-11-19 Apple Inc. Hierarchical belief states for digital assistants
US11217255B2 (en) 2017-05-16 2022-01-04 Apple Inc. Far-field extension for digital assistant services
CN111033509A (en) * 2017-07-18 2020-04-17 视语智能有限公司 Object re-identification
CN108899016A (en) * 2018-08-02 2018-11-27 科大讯飞股份有限公司 Voice text normalization method, device and equipment and readable storage medium
CN108899016B (en) * 2018-08-02 2020-09-11 科大讯飞股份有限公司 Voice text normalization method, device and equipment and readable storage medium
CN110880319A (en) * 2018-09-06 2020-03-13 丰田自动车株式会社 Voice interaction device, control method for voice interaction device, and non-transitory recording medium storing program
CN110955401A (en) * 2018-09-27 2020-04-03 富士通株式会社 Sound playback section control method, computer-readable storage medium, and information processing apparatus
CN110955401B (en) * 2018-09-27 2023-07-28 富士通株式会社 Sound playback interval control method, computer-readable storage medium, and information processing apparatus
CN110211576A (en) * 2019-04-28 2019-09-06 北京蓦然认知科技有限公司 Voice recognition method, device and system
CN110211576B (en) * 2019-04-28 2021-07-30 北京蓦然认知科技有限公司 Voice recognition method, device and system
CN110808035B (en) * 2019-11-06 2021-11-26 百度在线网络技术(北京)有限公司 Method and apparatus for training hybrid language recognition models
CN110808035A (en) * 2019-11-06 2020-02-18 百度在线网络技术(北京)有限公司 Method and apparatus for training hybrid language recognition models
CN112259100A (en) * 2020-09-15 2021-01-22 科大讯飞华南人工智能研究院(广州)有限公司 Speech recognition method, training method of related model, related equipment and device
CN112259100B (en) * 2020-09-15 2024-04-09 科大讯飞华南人工智能研究院(广州)有限公司 Speech recognition method, training method of related model, related equipment and device
CN114454164A (en) * 2022-01-14 2022-05-10 纳恩博(北京)科技有限公司 Robot control method and device
CN114454164B (en) * 2022-01-14 2024-01-09 纳恩博(北京)科技有限公司 Robot control method and device

Also Published As

Publication number Publication date
JP2006515073A (en) 2006-05-18
KR100996212B1 (en) 2010-11-24
AU2002336458A1 (en) 2004-03-29
EP1604350A4 (en) 2007-11-21
AU2002336458A8 (en) 2004-03-29
KR20060037228A (en) 2006-05-03
EP1604350A2 (en) 2005-12-14

Similar Documents

Publication Publication Date Title
CN1864204A (en) Methods, systems and programming for performing speech recognition
CN1218233C (en) Touch-typable devices based on ambiguous codes and method to design such devices
CN1293450C (en) Touch-type key input apparatus
CN1759593A (en) Apparatus and method for inputting alphabet characters
CN1149503C (en) Apparatus for inputting words and method therefor
JP5166255B2 (en) Data entry system
CN1099629C (en) Screen display key input unit
CN1140871C (en) Method and system for realizing voice frequency signal replay of multisource document
CN1387639A (en) Language input user interface
CN101067780A (en) Character inputting system and method for intelligent equipment
CN1648828A (en) System and method for disambiguating phonetic input
CN1326308A (en) Portable terminal, data inputting method, dictionary picking up method and device and media
CN1794159A (en) Personalization of user accessibility options
CN1441371A (en) Character input device
CN1484798A (en) Information processor and information processing method and its program
CN1618173A (en) Explicit character filtering of ambiguous text entry
CN1886717A (en) Method and apparatus for inputting data with a four way input device
CN1280748C (en) Speed typing apparatus and method
CN1755663A (en) Information-processing apparatus, information-processing methods and programs
CN1855223A (en) Audio font output device, font database, and language input front end processor
CN1813285A (en) Device and method for speech synthesis and program
CN1348559A (en) Portable character input device
CN1287348C (en) Stylus computer
CN1241101C (en) Chinese syllable double reading scheme, Chinese keyboard and information input and processing method
CN1598744A (en) Input device on key and operation mode thereof

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20061115