WO2005031995A1 - Method and apparatus for providing a text message - Google Patents

Method and apparatus for providing a text message

Info

Publication number
WO2005031995A1
Authority
WO
WIPO (PCT)
Prior art keywords
message
templates
utterance
text message
template
Prior art date
Application number
PCT/US2004/030553
Other languages
French (fr)
Inventor
Yaxin Zhang
Xin He
Xiao-Lin Ren
Fang Sun
Original Assignee
Motorola, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Motorola, Inc. filed Critical Motorola, Inc.
Priority to EP04784421A priority Critical patent/EP1665561A4/en
Publication of WO2005031995A1 publication Critical patent/WO2005031995A1/en

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/10 Speech classification or search using distance or distortion measures between unknown speech and reference templates
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M1/7243 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
    • H04M1/72436 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for text messaging, e.g. short messaging services [SMS] or e-mails
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/26 Devices for calling a subscriber
    • H04M1/27 Devices whereby a plurality of signals may be stored simultaneously
    • H04M1/271 Devices whereby a plurality of signals may be stored simultaneously controlled by voice recognition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2250/00 Details of telephonic subscriber devices
    • H04M2250/74 Details of telephonic subscriber devices with voice recognition means

Definitions

  • the invention relates to a method and apparatus for providing a text message using voice.
  • the invention is particularly useful for, but not necessarily limited to, providing a text message using voice inputs processed on a portable electronic device having limited memory and computational capacity.
  • SMS Short Messaging Service
  • Short text messaging often using the Short Messaging Service (SMS) format, is a very popular application in wireless communications. Billions of short text messages are sent each month, usually from one mobile phone to another. Such text messages are popular for a number of reasons. The messages are generally a fraction of the cost of a one-minute mobile telephone call and they do not require an engaged tone to send or to receive.
  • Text messages are generally created by typing characters into the keypad of a mobile telephone.
  • however, using such small, non-QWERTY keypads to compose a message can be awkward and generally requires more time than would be needed using a full-size QWERTY keyboard. But of course it is impractical to have a full-size keyboard attached to a mobile phone. Thus there is a need for a more effective method of composing short text messages.
  • although various types of speech recognition systems are well known, most are not suitable for use in portable electronic devices such as mobile phones. That is because prior art speech recognition systems generally require more processing power and memory than are available in portable electronic devices.
  • Prior art closed vocabulary speech recognition systems and methods employ a pre-defined, fixed vocabulary list.
  • the fixed vocabulary list may be large but may not be exhaustive and therefore, for instance, a person's family name and the names of many locations would not be included.
  • open vocabulary speech recognition systems and methods have a variable vocabulary list to which new words and phrases may be added by a user or otherwise.
  • current open vocabulary speech recognition systems and methods require relatively high computational overheads that may not be acceptable for portable electronic devices such as Personal Digital Assistants, radio-telephones and other portable devices.
  • a method of providing a text message includes the steps of receiving an utterance at an input of an electronic device. Speech recognition is then performed on the utterance guided by user-defined message templates stored in a memory associated with the electronic device, wherein speech recognition is defined by matching the utterance with one of the templates to create a matching template. A text message is then provided from the matching template.
  • At least one of the message templates may include a fixed language component.
  • At least one of the message templates may include a variable language component.
  • At least one of the message templates may include both a fixed and a variable language component.
  • the text message may be an SMS message.
  • the above method may also include the step of editing the user-defined message template by receiving typed characters from a keypad of the electronic device.
  • a component of the text message may be a transcription of the utterance.
  • the entirety of the text message may be a transcription of the utterance.
  • an electronic device for providing a text message includes a microphone operative to receive an utterance; a non-volatile memory for storing message templates; and a processor operative to perform speech recognition of the utterance guided by the message templates, wherein the processor is operative to match the utterance with one of the templates to create a matching template, and to provide a text message from the matching template.
  • the message templates may also include fixed or variable language components or both fixed and variable language components.
  • the text message may be an SMS message.
  • the electronic device may include a keypad operative for editing the message template.
  • the electronic device may be operative to match the utterance with a plurality of the templates and to calculate a likelihood score for each of the templates.
  • Fig. 1 is a schematic block diagram of a radio telephone in accordance with the present invention
  • Fig. 2 is a flow diagram illustrating a method for providing, editing and transmitting a text message in accordance with the present invention
  • Fig. 3 is a flow diagram that illustrates a method for providing a list of candidate message templates to a user in accordance with the present invention
  • Fig. 4 is a flow diagram illustrating a method for enabling a user to edit existing message templates and save new templates in a static programmable memory in accordance with the present invention.
  • a radio telephone 100 comprising a radio frequency communications unit 105 coupled to be in communication with a processor 110.
  • I/O Input/Output
  • the processor 110 includes an encoder/decoder 125 with an associated Read Only Memory (ROM) 130 storing data for encoding and decoding voice or other signals that may be transmitted or received by the radio telephone 100.
  • the processor 110 also includes a micro-processor 135 coupled, by a common data and address bus 140, to the encoder/decoder 125 and an associated character Read Only Memory (ROM) 145, a Random Access Memory (RAM) 150, static programmable memory 155 and a removable SIM module 160.
  • the static programmable memory 155 and SIM module 160 each can store, amongst other things, selected incoming text messages, a telephone book database, and, as described in more detail below, templates of outgoing text messages.
  • the microprocessor 135 has ports for coupling to the keypad 120, the display 115 and an alert module 165 that typically contains a speaker, vibrator motor and associated drivers.
  • the character Read Only Memory 145 stores code for decoding or encoding text messages that may be received by the communications unit 105 or input at the keypad 120.
  • the radio frequency communications unit 105 is a combined receiver and transmitter having a common antenna 170.
  • the communications unit 105 has a transceiver 175 coupled to antenna 170 via a radio frequency amplifier 180.
  • the transceiver 175 is also coupled to a combined modulator/demodulator 185 that couples the communications unit 105 to the processor 110.
  • referring to Fig. 2, there is a flow diagram illustrating one embodiment of the present invention including a method 200 for providing, editing and transmitting a text message using the radio telephone 100.
  • the method 200 is invoked at a start step 205.
  • an utterance is received at an input, such as the microphone 190, of the telephone 100.
  • the processor 110 then performs sampling and digitizing of the utterance waveform at step 215, then segmenting at a step 220 before processing to provide feature vectors representing the waveform at a step 225.
  • steps 215, 220, and 225 are well known in the art and therefore do not require a detailed explanation.
  • speech recognition is performed on the feature vectors resulting from step 225.
  • the speech recognition is guided by user- defined message templates stored in the static programmable memory 155 of the device 100.
  • the message templates are described in more detail later in this specification.
  • the method 200 then provides a text message to a user at step 235.
  • the message may be provided to the user using one of the I/O interfaces such as the display 115 or the speaker 195 of the device 100. After the message is provided to the user, the user is then able to decide whether to edit the message at step 240.
  • if the user decides not to edit the message, the message is transmitted at step 245 in a message format such as SMS. However if the user decides at step 240 to edit the message, the message is edited at step 250 before being transmitted at step 245.
  • the user may edit the message in several different ways, including speaking edits into the microphone 190 or typing edits into the keypad 120.
  • the method 200 then ends at step 255.
  • the provide a text message step 235 may include providing a user of the telephone 100 with a list of candidate message templates from which the user may select the template that is most appropriate for the intended text message.
  • Fig. 3 is a flow diagram that illustrates a method 300 for providing such a list of candidate templates to a user. The method 300 is invoked at start step 305 when a user inputs a command into the keypad 120 or into the microphone 190.
  • the method 300 first includes the processor 110 selecting at step 310 a message template from a list of available message templates. At step 315 the selected template is then compared with the feature vectors provided in step 225 of method 200. The processor 110 then calculates a likelihood score at step 320 that estimates the matching quality between aspects of the selected template and the feature vectors of the input utterance. The processor 110 then determines at step 325 whether the likelihood score is above a set threshold. The threshold may be automatically calculated by the processor 110, or it may be pre-set by a user of the telephone 100. If the likelihood score of the selected template is below the set threshold, the template is rejected at step 330.
  • however, if the likelihood score of the selected template is above the set threshold, then at step 335 the template is considered to be a reasonable match with the input utterance and is added to a list of candidate templates. At step 340 the method 300 determines whether all available templates have been evaluated. If all available templates have not been evaluated, at step 345 the method 300 selects the next message template and returns to step 315 where the next template is compared with the feature vectors of the input utterance. If all templates have been evaluated at step 340, the method 300 continues to step 350 and provides a list of all of the candidate templates to the user.
  • the candidate templates may be provided to the user using one of the I/O interfaces such as the display 115 or the speaker 195 of the device 100.
  • the method 300 then ends at step 355.
  • users of the telephone 100 are not limited to the use of templates supplied by a manufacturer of the device 100. Rather, users of the device 100 are able to edit existing templates stored in the static programmable memory 155 to create their own personalized message templates.
  • referring to Fig. 4, there is illustrated a method 400 for enabling a user to edit existing templates and save new templates in static programmable memory 155. The method 400 is invoked at start step 405 when a user inputs a command into the keypad 120 or into the microphone 190.
  • a list of existing templates is provided to the user of the device 100 through an I/O interface such as the display 115 or the speaker 195.
  • the user selects a desired message template at step 415 using an I/O interface such as the microphone 190 or the keypad 120.
  • the user then edits the template at step 420, again using an I/O interface such as the microphone 190 or the keypad 120, and at step 425 saves the edited template in the static programmable memory 155.
  • the method 400 then ends at step 430.
  • Other methods of editing the message templates are also within the scope of the present invention, including connecting the telephone 100 to a host computer using a communication channel such as a USB cable and then downloading or flashing edited templates to the static programmable memory 155.
  • the method of the present invention may further include message templates that comprise fixed and variable language components.
  • the fixed language components are not changed when a user selects a template and transmits a message.
  • the variable language components may be changed by the user from message to message.
  • the use of fixed and variable language components can greatly leverage the limited processing power and memory of the telephone 100.
  • a particular template of a short text message concerning a meeting request might include the following: "Meet me at $PLACE at $TIME".
  • the fixed language components are underlined and the variable language components are capitalized and begin with "$".
  • the variable language component $FESTIVAL may be edited by the user to include: $FESTIVAL = sp | birthday | new year | thanksgiving, etc. A minimal illustrative sketch of such a template grammar is given in the code example after this list.
  • the phone 100 is able to recognize the edited variable language components entered by a user. Because the variable language components consist of discrete sets of variables, the speech recognition processing overhead and memory requirements are minimized. The above method is thus particularly suited for devices having limited processing and memory resources such as mobile phones.
  • the use of templates including fixed and variable language components increases the efficiency of a speech recognition system for several reasons. First, the fixed language components of a particular template may generally be recognized quickly and efficiently because there are only a modest number of templates saved in the static programmable memory 155 compared with the almost unlimited number of sentence permutations associated with natural language sentence structures.
  • variable language components may also be recognized efficiently because the intra-sentence location of a variable language component in a message template automatically identifies a discrete set of possible responses. For example, referring to the "Happy $FESTIVAL" message template given above, the fixed language component "Happy" may act as a signal such that the processor 110 knows that the subsequent voice input received at the microphone 190 will be the variable language component "$FESTIVAL."
  • PDAs Personal Digital Assistants
  • a text message may be provided through voice inputs rather than through typed characters entered into a small keypad.
  • the invention may include open vocabulary speech recognition to avoid the memory intensive requirements of prior art closed vocabulary speech recognition.
  • Open vocabulary speech recognition uses speaker-independent sub-word acoustic models designed to cover all of the acoustic occurrences, or phonemes, of a language.
  • a user is not limited to a predefined vocabulary but can edit the variable language components as described above to include words not found in a dictionary, such as names and locations.
  • the result is that the text messages provided by the present invention may be highly personalized.
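
The fixed and variable language components described above lend themselves to a very small template grammar. The following Python fragment is a minimal, purely illustrative sketch and is not taken from the patent: the variable definitions (including the $TIME values), the expand helper and the example choices are assumptions made for this illustration, and "sp" is treated as the pause / no-voice event defined in the examples above.

```python
from typing import Dict, List

# Hypothetical user-edited variable language components, following the
# "$PLACE = sp | library | dormitory | cafeteria" notation used above.
VARIABLES: Dict[str, List[str]] = {
    "$PLACE": ["sp", "library", "dormitory", "cafeteria"],
    "$TIME": ["sp", "noon", "five o'clock", "tomorrow morning"],
    "$FESTIVAL": ["sp", "birthday", "new year", "thanksgiving"],
}

# Message templates: plain words are fixed language components,
# "$..." tokens are variable language components.
TEMPLATES = [
    "Meet me at $PLACE at $TIME",
    "Happy $FESTIVAL",
]


def expand(template: str, chosen: Dict[str, str]) -> str:
    """Fill the variable language components of a template with the user's choices.

    Choosing "sp" (pause / no voice event) simply leaves the slot empty.
    """
    words = []
    for token in template.split():
        if token in VARIABLES:
            value = chosen.get(token, "sp")
            if value != "sp":
                words.append(value)
        else:
            words.append(token)  # fixed language component, never changed
    return " ".join(words)


if __name__ == "__main__":
    print(expand(TEMPLATES[0], {"$PLACE": "library", "$TIME": "noon"}))
    # -> "Meet me at library at noon"
    print(expand(TEMPLATES[1], {"$FESTIVAL": "new year"}))
    # -> "Happy new year"
```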

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Telephone Function (AREA)
  • Telephonic Communication Services (AREA)

Abstract

A method and apparatus for providing a text message includes receiving an utterance (Step 210) at an input of an electronic device (100). Speech recognition is then performed on the utterance (Step 230) guided by user-defined message templates stored in a memory (155) associated with the electronic device (100). Speech recognition is defined by matching the utterance with one of the templates to create a matching template. A text message is then provided from the matching template (Step 235).

Description

METHOD AND APPARATUS FOR PROVIDING A TEXT MESSAGE
FIELD OF THE INVENTION

The invention relates to a method and apparatus for providing a text message using voice. The invention is particularly useful for, but not necessarily limited to, providing a text message using voice inputs processed on a portable electronic device having limited memory and computational capacity.

BACKGROUND OF THE INVENTION

Short text messaging, often using the Short Messaging Service (SMS) format, is a very popular application in wireless communications. Billions of short text messages are sent each month, usually from one mobile phone to another. Such text messages are popular for a number of reasons. The messages are generally a fraction of the cost of a one-minute mobile telephone call and they do not require an engaged tone to send or to receive. Therefore the messages can be created and sent at a time that is convenient to the sender, and received and read at a time that is convenient to the recipient. Text messages are generally created by typing characters into the keypad of a mobile telephone. However, using such small, non-QWERTY keypads to compose a message can be awkward and generally requires more time than would be needed using a full-size QWERTY keyboard. But of course it is impractical to have a full-size keyboard attached to a mobile phone. Thus there is a need for a more effective method of composing short text messages. Further, although various types of speech recognition systems are well known, most are not suitable for use in portable electronic devices such as mobile phones. That is because prior art speech recognition systems generally require more processing power and memory than are available in portable electronic devices. Prior art closed vocabulary speech recognition systems and methods employ a pre-defined, fixed vocabulary list. In use, the fixed vocabulary list may be large but may not be exhaustive and therefore, for instance, a person's family name and the names of many locations would not be included. In contrast, open vocabulary speech recognition systems and methods have a variable vocabulary list to which new words and phrases may be added by a user or otherwise. However, current open vocabulary speech recognition systems and methods require relatively high computational overheads that may not be acceptable for portable electronic devices such as Personal Digital Assistants, radio-telephones and other portable devices. In this specification, including the claims, the terms 'comprises', 'comprising' or similar terms are intended to mean a non-exclusive inclusion, such that a method or apparatus that comprises a list of elements does not include those elements solely, but may well include other elements not listed.
SUMMARY OF THE INVENTION

According to one aspect of the invention there is provided a method of providing a text message. The method includes the steps of receiving an utterance at an input of an electronic device. Speech recognition is then performed on the utterance guided by user-defined message templates stored in a memory associated with the electronic device, wherein speech recognition is defined by matching the utterance with one of the templates to create a matching template. A text message is then provided from the matching template. At least one of the message templates may include a fixed language component. At least one of the message templates may include a variable language component. At least one of the message templates may include both a fixed and a variable language component. The text message may be an SMS message. The above method may also include the step of editing the user-defined message template by receiving typed characters from a keypad of the electronic device. A component of the text message may be a transcription of the utterance. The entirety of the text message may be a transcription of the utterance. According to another aspect of the invention there is provided an electronic device for providing a text message. The device includes a microphone operative to receive an utterance; a non-volatile memory for storing message templates; and a processor operative to perform speech recognition of the utterance guided by the message templates, wherein the processor is operative to match the utterance with one of the templates to create a matching template, and to provide a text message from the matching template. With respect to the electronic device, the message templates may also include fixed or variable language components or both fixed and variable language components. With respect to the electronic device, the text message may be an SMS message. The electronic device may include a keypad operative for editing the message template. The electronic device may be operative to match the utterance with a plurality of the templates and to calculate a likelihood score for each of the templates.
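
As an editorial illustration only, the flow summarised above (receive an utterance, perform speech recognition guided by user-defined message templates, provide a text message from the matching template) can be sketched in a few lines of Python. The MessageTemplate type, the score_fn callback and the threshold default are assumptions introduced for this sketch; the patent does not prescribe a particular matcher or data structure.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class MessageTemplate:
    # A user-defined template, e.g. "Meet me at $PLACE at $TIME".
    text: str


def provide_text_message(
    feature_vectors: Sequence[Sequence[float]],     # features extracted from the utterance
    templates: List[MessageTemplate],                # user-defined message templates
    score_fn: Callable[[Sequence[Sequence[float]], MessageTemplate], float],
    threshold: float = 0.0,
) -> Optional[str]:
    """Return the text of the best-matching template, or None if no template
    scores above the threshold (recognition is guided by the templates)."""
    best_template: Optional[MessageTemplate] = None
    best_score = threshold
    for template in templates:
        score = score_fn(feature_vectors, template)
        if score > best_score:
            best_template, best_score = template, score
    return best_template.text if best_template else None
```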
BRIEF DESCRIPTION OF THE DRAWINGS

In order that the invention may be readily understood and put into practical effect, reference will now be made to preferred embodiments as illustrated with reference to the accompanying drawings in which:

Fig. 1 is a schematic block diagram of a radio telephone in accordance with the present invention;

Fig. 2 is a flow diagram illustrating a method for providing, editing and transmitting a text message in accordance with the present invention;

Fig. 3 is a flow diagram that illustrates a method for providing a list of candidate message templates to a user in accordance with the present invention; and

Fig. 4 is a flow diagram illustrating a method for enabling a user to edit existing message templates and save new templates in a static programmable memory in accordance with the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS OF THE INVENTION

With reference to FIG. 1, there is illustrated a radio telephone 100 comprising a radio frequency communications unit 105 coupled to be in communication with a processor 110. Input/Output (I/O) interfaces in the form of a display 115, a keypad 120, a microphone 190, and a speaker 195 are also coupled to be in communication with the processor 110. The processor 110 includes an encoder/decoder 125 with an associated
Read Only Memory (ROM) 130 storing data for encoding and decoding voice or other signals that may be transmitted or received by the radio telephone 100. The processor 110 also includes a micro-processor 135 coupled, by a common data and address bus 140, to the encoder/decoder 125 and an associated character Read Only Memory (ROM) 145, a Random Access Memory (RAM)
150, static programmable memory 155 and a removable SIM module 160. The static programmable memory 155 and SIM module 160 each can store, amongst other things, selected incoming text messages, a telephone book database, and, as described in more detail below, templates of outgoing text messages. The microprocessor 135 has ports for coupling to the keypad 120, the display 115 and an alert module 165 that typically contains a speaker, vibrator motor and associated drivers. The character Read Only Memory 145 stores code for decoding or encoding text messages that may be received by the communications unit 105 or input at the keypad 120. The radio frequency communications unit 105 is a combined receiver and transmitter having a common antenna 170. The communications unit 105 has a transceiver 175 coupled to antenna 170 via a radio frequency amplifier 180. The transceiver 175 is also coupled to a combined modulator/demodulator 185 that couples the communications unit 105 to the processor 110. Referring to Fig. 2 there is a flow diagram illustrating one embodiment of the present invention including a method 200 for providing, editing and transmitting a text message using the radio telephone 100. The method 200 is invoked at a start step 205. At step 210 an utterance is received at an input, such as the microphone 190, of the telephone 100. The processor 110 then performs sampling and digitizing of the utterance waveform at step 215, then segmenting at a step 220 before processing to provide feature vectors representing the waveform at a step 225. It should be noted that steps 215, 220, and 225 are well known in the art and therefore do not require a detailed explanation. Next, at step 230, speech recognition is performed on the feature vectors resulting from step 225. The speech recognition is guided by user-defined message templates stored in the static programmable memory 155 of the device 100. The message templates are described in more detail later in this specification. The method 200 then provides a text message to a user at step 235. The message may be provided to the user using one of the I/O interfaces such as the display 115 or the speaker 195 of the device 100. After the message is provided to the user, the user is then able to decide whether to edit the message at step 240. If the user decides not to edit the message, the message is transmitted at step 245 in a message format such as SMS. However if the user decides at step 240 to edit the message, the message is edited at step 250 before being transmitted at step 245. In various embodiments of the present invention, the user may edit the message in several different ways including speaking edits into the microphone 190 or typing edits into the keypad 120. The method 200 then ends at step 255. In an alternative embodiment of the present invention, after the speech recognition step 230 described above, the provide a text message step 235 may include providing a user of the telephone 100 with a list of candidate message templates from which the user may select the template that is most appropriate for the intended text message. Fig. 3 is a flow diagram that illustrates a method 300 for providing such a list of candidate templates to a user. The method 300 is invoked at start step 305 when a user inputs a command into the keypad 120 or into the microphone
190. The method 300 first includes the processor 110 selecting at step 310 a message template from a list of available message templates. At step 315 the selected template is then compared with the feature vectors provided in step 225 of method 200. The processor 110 then calculates a likelihood score at step 320 that estimates the matching quality between aspects of the selected template and the feature vectors of the input utterance. The processor 110 then determines at step 325 whether the likelihood score is above a set threshold. The threshold may be automatically calculated by the processor 110, or it may be pre-set by a user of the telephone 100. If the likelihood score of the selected template is below the set threshold, the template is rejected at step 330. However if the likelihood score of the selected template is above the set threshold, then at step 335 the template is considered to be a reasonable match with the input utterance and the template is added to a list of candidate templates. Regardless of whether the selected template is rejected or added to the list of candidate templates, the method 300 then proceeds to step 340 where the processor 110 determines whether all available templates have been evaluated. If all available templates have not been evaluated, at step 345 the method 300 selects the next message template and returns to step 315 where the next template is compared with the feature vectors of the input utterance. If all templates have been evaluated at step 340, the method 300 continues to step 350 and provides a list of all of the candidate templates to the user. The candidate templates may be provided to the user using one of the I/O interfaces such as the display 115 or the speaker 195 of the device 100. The method 300 then ends at step 355. According to one embodiment of the present invention, users of the telephone 100 are not limited to the use of templates supplied by a manufacturer of the device 100. Rather, users of the device 100 are able to edit existing templates stored in the static programmable memory 155 to create their own personalized message templates. Referring to Fig. 4, there is illustrated a method 400 for enabling a user to edit existing templates and save new templates in static programmable memory 155. The method 400 is invoked at start step 405 when a user inputs a command into the keypad 120 or into the microphone 190. At step 410 a list of existing templates is provided to the user of the device 100 through an I/O interface such as the display 115 or the speaker 195. The user then selects a desired message template at step 415 using an I/O interface such as the microphone 190 or the keypad 120. Next, the user edits the template at step
420, again using an I/O interface such as the microphone 190 or the keypad 120. Finally, at step 425 the user saves the edited template in static programmable memory 155. The method 400 then ends at step 430. Other methods of editing the message templates are also within the scope of the present invention, including connecting the telephone 100 to a host computer using a communication channel such as a USB cable and then downloading or flashing edited templates to the static programmable memory 155. The method of the present invention may further include message templates that comprise fixed and variable language components. The fixed language components are not changed when a user selects a template and transmits a message. However the variable language components may be changed by the user from message to message. The use of fixed and variable language components can greatly leverage the limited processing power and memory of the telephone 100. For example, a particular template of a short text message concerning a meeting request might include the following: "Meet me at $PLACE at $TIME". Here the fixed language components are underlined and the variable language components are capitalized and begin with "$". Different users of the template may then edit the variables such as $PLACE to suit their particular circumstances. For example, a university student might define the variable $PLACE as: $PLACE = sp | library | dormitory | cafeteria, etc.
Whereas a lawyer might define the variable $PLACE as:
$PLACE = sp|office|courthouse|home, etc.
In the above, "sp" means a pause or no voice event, and "|" means the logic operator "OR". Another example of a message template that may be used in the present invention is "Happy $FESTIVAL." Here the variable language component $FESTINAL may be edited by the user to include:
$FESTIVAL = sp|birthday|new year|thanksgiving, etc.
Using open vocabulary speech recognition, the phone 100 is able to recognize the edited variable language components entered by a user. Because the variable language components consist of discrete sets of variables, the speech recognition processing overhead and memory requirements are minimized. The above method is thus particularly suited for devices having limited processing and memory resources such as mobile phones. The use of templates including fixed and variable language components increases the efficiency of a speech recognition system for several reasons. First, the fixed language components of a particular template may generally be recognized quickly and efficiently because there are only a modest number of templates saved in the static programmable memory 155 compared with the almost unlimited number of sentence permutations associated with natural language sentence structures. Second, the variable language components may also be recognized efficiently because the intra-sentence location of a variable language component in a message template automatically identifies a discrete set of possible responses. For example, referring to the "Happy $FESTIVAL" message template given above, the fixed language component "Happy" may act as a signal such that the processor 110 knows that the subsequent voice input received at the microphone 190 will be the variable language component "$FESTIVAL." Although the above embodiments of the present invention are described in relation to a radio telephone 100, the method and apparatus of the present invention could also include other electronic devices that provide text messages such as Personal Digital Assistants (PDAs). Accordingly, the present invention simplifies the steps required for providing and transmitting a text message from a portable electronic device. A text message may be provided through voice inputs rather than through typed characters entered into a small keypad. Further, the invention may include open vocabulary speech recognition to avoid the memory-intensive requirements of prior art closed vocabulary speech recognition. Open vocabulary speech recognition uses speaker-independent sub-word acoustic models designed to cover all of the acoustic occurrences, or phonemes, of a language. Thus a user is not limited to a predefined vocabulary but can edit the variable language components as described above to include words not found in a dictionary, such as names and locations. The result is that the text messages provided by the present invention may be highly personalized. The above detailed description provides preferred exemplary embodiments only, and is not intended to limit the scope, applicability, or configuration of the invention. Rather, the detailed description of the preferred exemplary embodiments provides those skilled in the art with an enabling description for implementing the preferred exemplary embodiments of the invention. It should be understood that various changes may be made in the function and arrangement of elements without departing from the spirit and scope of the invention as set forth in the appended claims.
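
A hedged sketch of the candidate-selection loop of method 300 (steps 310 to 355) may make the scoring and thresholding concrete. The likelihood callback, the FeatureVectors alias and the sorting of candidates are assumptions introduced for illustration; the patent leaves the actual acoustic matching and the threshold calculation open (the threshold may be computed automatically or pre-set by the user).

```python
from typing import Callable, List, Sequence, Tuple

# Feature vectors extracted from the utterance (steps 215-225 of method 200).
FeatureVectors = Sequence[Sequence[float]]


def candidate_templates(
    feature_vectors: FeatureVectors,
    templates: List[str],
    likelihood: Callable[[FeatureVectors, str], float],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Score every available template against the utterance (steps 310-320),
    keep those above the threshold (steps 325/335), reject the rest (step 330),
    and return the candidate list for presentation to the user (step 350)."""
    candidates: List[Tuple[str, float]] = []
    for template in templates:          # steps 310 and 345: walk the template list
        score = likelihood(feature_vectors, template)
        if score >= threshold:
            candidates.append((template, score))
        # templates scoring below the threshold are simply dropped
    # present the best-scoring candidates first
    return sorted(candidates, key=lambda item: item[1], reverse=True)
```

A user would then select one of the returned candidates and, if desired, edit it as in method 400 before the message is transmitted.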

Claims

WE CLAIM:
1. A method of providing a text message, said method comprising the steps of: receiving an utterance at an input of an electronic device; performing speech recognition of said utterance guided by user- defined message templates stored in a memory associated with said electronic device, wherein speech recognition is defined by matching said utterance with one of said templates to create a matching template; and providing a text message from said matching template.
2. The method of claim 1, wherein at least one of said message templates comprises a fixed language component.
3. The method of claim 1, wherein at least one of said message templates comprises a variable language component.
4. The method of claim 1, wherein at least one of said message templates comprises both a fixed and a variable language component.
5. The method of claim 1, wherein said text message is an SMS message.
6. The method of claim 1, further comprising the step of editing said user-defined message template by receiving typed characters from a keypad of said electronic device.
7. The method of claim 1, wherein a component of said text message is a transcription of said utterance.
8. The method of claim 1, wherein the entirety of said text message is a transcription of said utterance.
9. An electronic device for providing a text message, said device comprising: a microphone operative to receive an utterance; a non-volatile memory for storing message templates; and a processor operative to perform speech recognition of said utterance guided by said message templates, said processor operative to match said utterance with one of said templates to create a matching template and to provide a text message from said matching template.
10. The device of claim 9, wherein at least one of said message templates comprises a fixed language component.
11. The device of claim 9, wherein at least one of said message templates comprises a variable language component.
12. The device of claim 9, wherein at least one of said message templates comprises both a fixed and a variable language component.
13. The device of claim 9, wherein said text message is an SMS message.
14. The device of claim 9, further comprising a keypad operative for editing said message template.
15. The device of claim 9, wherein said processor is operative to match said utterance with a plurality of said templates and to calculate a likelihood score for each of said templates.
PCT/US2004/030553 2003-09-23 2004-09-17 Method and apparatus for providing a text message WO2005031995A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP04784421A EP1665561A4 (en) 2003-09-23 2004-09-17 Method and apparatus for providing a text message

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN03124963.9 2003-09-23
CNB031249639A CN100353417C (en) 2003-09-23 2003-09-23 Method and device for providing text message

Publications (1)

Publication Number Publication Date
WO2005031995A1 true WO2005031995A1 (en) 2005-04-07

Family

ID=34383973

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2004/030553 WO2005031995A1 (en) 2003-09-23 2004-09-17 Method and apparatus for providing a text message

Country Status (5)

Country Link
EP (1) EP1665561A4 (en)
KR (1) KR100759728B1 (en)
CN (1) CN100353417C (en)
RU (1) RU2320082C2 (en)
WO (1) WO2005031995A1 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100719942B1 (en) * 2002-03-27 2007-05-18 Nokia Corporation Pattern recognition
KR100805252B1 (en) 2005-06-27 2008-02-21 서울통신기술 주식회사 Apparatus And Method Of Communication Processing In IP Terminal
EP2073581A1 (en) * 2007-12-17 2009-06-24 Vodafone Holding GmbH Transmission of text messages generated from voice messages in telecommunication networks
CN102263851A (en) * 2010-05-31 2011-11-30 北京迅捷英翔网络科技有限公司 Message conversion method
US8566101B2 (en) 2009-05-07 2013-10-22 Samsung Electronics Co., Ltd. Apparatus and method for generating avatar based video message
US9185211B2 (en) 2013-11-08 2015-11-10 Sorenson Communications, Inc. Apparatuses and methods for operating a communication system in one of a tone mode and a text mode
US9473627B2 (en) 2013-11-08 2016-10-18 Sorenson Communications, Inc. Video endpoints and related methods for transmitting stored text to other video endpoints
WO2022081571A1 (en) * 2020-10-15 2022-04-21 Google Llc Composition of complex content via user interaction with an automated assistant

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103366741B (en) * 2012-03-31 2019-05-17 上海果壳电子有限公司 Voice inputs error correction method and system
US10026400B2 (en) 2013-06-27 2018-07-17 Google Llc Generating dialog recommendations for chat information systems based on user interaction and environmental data
KR101894928B1 (en) 2017-02-14 2018-09-05 (주)스톤아이 Bonus calculating apparatus using number of visit and method thereof

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6526292B1 (en) * 1999-03-26 2003-02-25 Ericsson Inc. System and method for creating a digit string for use by a portable phone
US20040176139A1 (en) * 2003-02-19 2004-09-09 Motorola, Inc. Method and wireless communication device using voice recognition for entering text characters
US6795808B1 (en) * 2000-10-30 2004-09-21 Koninklijke Philips Electronics N.V. User interface/entertainment device that simulates personal interaction and charges external database with relevant data
US20040204115A1 (en) * 2002-09-27 2004-10-14 International Business Machines Corporation Method, apparatus and computer program product for transcribing a telephone communication

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU707122B2 (en) * 1994-10-25 1999-07-01 British Telecommunications Public Limited Company Voice-operated services
US6173316B1 (en) * 1998-04-08 2001-01-09 Geoworks Corporation Wireless communication device with markup language based man-machine interface
DE19959903A1 (en) * 1999-12-07 2001-06-13 Bruno Jentner Module for supporting text messaging communications in mobile radio networks uses text-to-speech converter for speech output, speech-to-text converter for speech input and detection
KR20020028501A (en) * 2000-10-10 2002-04-17 김철권 Method for conversion between sound data and text data in network and apparatus thereof
WO2002077975A1 (en) * 2001-03-27 2002-10-03 Koninklijke Philips Electronics N.V. Method to select and send text messages with a mobile
DE50104036D1 (en) * 2001-12-12 2004-11-11 Siemens Ag Speech recognition system and method for operating such a system
US6895257B2 (en) * 2002-02-18 2005-05-17 Matsushita Electric Industrial Co., Ltd. Personalized agent for portable devices and cellular phone

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6526292B1 (en) * 1999-03-26 2003-02-25 Ericsson Inc. System and method for creating a digit string for use by a portable phone
US6795808B1 (en) * 2000-10-30 2004-09-21 Koninklijke Philips Electronics N.V. User interface/entertainment device that simulates personal interaction and charges external database with relevant data
US20040204115A1 (en) * 2002-09-27 2004-10-14 International Business Machines Corporation Method, apparatus and computer program product for transcribing a telephone communication
US20040176139A1 (en) * 2003-02-19 2004-09-09 Motorola, Inc. Method and wireless communication device using voice recognition for entering text characters

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP1665561A4 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100719942B1 (en) * 2002-03-27 2007-05-18 Nokia Corporation Pattern recognition
KR100805252B1 (en) 2005-06-27 2008-02-21 서울통신기술 주식회사 Apparatus And Method Of Communication Processing In IP Terminal
EP2073581A1 (en) * 2007-12-17 2009-06-24 Vodafone Holding GmbH Transmission of text messages generated from voice messages in telecommunication networks
US8566101B2 (en) 2009-05-07 2013-10-22 Samsung Electronics Co., Ltd. Apparatus and method for generating avatar based video message
CN102263851A (en) * 2010-05-31 2011-11-30 北京迅捷英翔网络科技有限公司 Message conversion method
US9185211B2 (en) 2013-11-08 2015-11-10 Sorenson Communications, Inc. Apparatuses and methods for operating a communication system in one of a tone mode and a text mode
US9473627B2 (en) 2013-11-08 2016-10-18 Sorenson Communications, Inc. Video endpoints and related methods for transmitting stored text to other video endpoints
US10165225B2 (en) 2013-11-08 2018-12-25 Sorenson Ip Holdings, Llc Video endpoints and related methods for transmitting stored text to other video endpoints
US10250847B2 (en) 2013-11-08 2019-04-02 Sorenson Ip Holdings Llc Video endpoints and related methods for transmitting stored text to other video endpoints
WO2022081571A1 (en) * 2020-10-15 2022-04-21 Google Llc Composition of complex content via user interaction with an automated assistant
US11924149B2 (en) 2020-10-15 2024-03-05 Google Llc Composition of complex content via user interaction with an automated assistant

Also Published As

Publication number Publication date
CN1601548A (en) 2005-03-30
EP1665561A1 (en) 2006-06-07
RU2320082C2 (en) 2008-03-20
RU2006113581A (en) 2007-10-27
KR100759728B1 (en) 2007-09-20
EP1665561A4 (en) 2011-03-23
KR20060054469A (en) 2006-05-22
CN100353417C (en) 2007-12-05

Similar Documents

Publication Publication Date Title
US6424945B1 (en) Voice packet data network browsing for mobile terminals system and method using a dual-mode wireless connection
CN100403828C (en) Portable digital mobile communication apparatus and voice control method and system thereof
AU684872B2 (en) Communication system
EP2224705B1 (en) Mobile wireless communications device with speech to text conversion and related method
US6694295B2 (en) Method and a device for recognizing speech
US6895257B2 (en) Personalized agent for portable devices and cellular phone
US8577681B2 (en) Pronunciation discovery for spoken words
US6526292B1 (en) System and method for creating a digit string for use by a portable phone
WO2005027482A1 (en) Text messaging via phrase recognition
EP1751742A1 (en) Mobile station and method for transmitting and receiving messages
US7043436B1 (en) Apparatus for synthesizing speech sounds of a short message in a hands free kit for a mobile phone
KR100759728B1 (en) Method and apparatus for providing a text message
CN111325039A (en) Language translation method, system, program and handheld terminal based on real-time call
WO2008118038A1 (en) Message exchange method and devices for carrying out said method
US20050256710A1 (en) Text message generation
JP4070963B2 (en) Mobile communication equipment
CN111274828B (en) Language translation method, system, computer program and handheld terminal based on message leaving
KR100724848B1 (en) Method for voice announcing input character in portable terminal
KR19990043026A (en) Speech Recognition Korean Input Device
JP2002140086A (en) Device for conversion from short message for portable telephone set into voice output
KR20060063420A (en) Voice recognition for portable terminal
JP2001223816A (en) Method and device for generating text message by telephone set
JP2000151827A (en) Telephone voice recognizing system
JP2005286886A (en) Server
JP2002330194A (en) Telephone unit, voice synthesizing system, voice element registration unit, and voice element registration and voice synthesizing unit

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 1309/DELNP/2006

Country of ref document: IN

WWE Wipo information: entry into national phase

Ref document number: 2004784421

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 1020067005735

Country of ref document: KR

WWE Wipo information: entry into national phase

Ref document number: 2006113581

Country of ref document: RU

WWP Wipo information: published in national office

Ref document number: 1020067005735

Country of ref document: KR

WWP Wipo information: published in national office

Ref document number: 2004784421

Country of ref document: EP