WO2013039459A1 - A method for creating keyboard and/or speech-assisted text input on electronic devices

Publication number
WO2013039459A1
Authority
WO
WIPO (PCT)
Prior art keywords
text
speech
voice
letter
sentence
Prior art date
Application number
PCT/TR2011/000207
Other languages
French (fr)
Inventor
Levent Arslan
Original Assignee
Sestek Ses Ve Iletisim Bilgisayar Teknolojileri Sanayi Ve Ticaret Anonim Sirketi
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sestek Ses Ve Iletisim Bilgisayar Teknolojileri Sanayi Ve Ticaret Anonim Sirketi
Priority to PCT/TR2011/000207
Publication of WO2013039459A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/24 Speech recognition using non-acoustical features
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02 Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023 Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F3/0233 Character input methods
    • G06F3/0237 Character input methods using prediction or retrieval techniques

Abstract

The invention is a method for converting speech to text in electronic devices such as computers and mobile phones by using a character input unit and a voice-over in a manner supporting one another. The method comprises the operation steps of: the user (U) repeating aloud the sentence he wishes to obtain as text, and that data being received and recorded by means of the said voice-over (4); typing in the first letter, or the first letter and last letter, of the words contained in the said sentence by using the said character entry unit (2); forming an output text by combining the speech input and the character input by means of a decision module (6); and displaying the output text obtained on the monitor of the device.

Description

A METHOD FOR CREATING KEYBOARD AND/OR SPEECH-ASSISTED TEXT INPUT ON ELECTRONIC DEVICES

The Related Art

The invention relates to converting speech into text and placing it into units that allow text entry. The invention relates to a method that provides text entry by selecting and filtering words directly from those stored in memory, based on the typed first letters of the words contained in a spoken utterance, on electronic devices that accept text input, in particular desktop, laptop and tablet PCs, mobile phones and PDAs.

The Prior Art

Today, in parallel with the development of technology, the use of computers and mobile phones is also increasing. These devices are becoming an indispensable part of users' daily lives. Using the existing keyboards of computers and mobile phones, users can write e-mails, short text messages, and official or unofficial texts.

However, entering text into mobile phones and tablet computers with a keyboard is difficult and time-consuming. What is more, recent mobile phones and tablet computers have touch screens whose keyboards are displayed only when needed, which makes text entry even more difficult. On mobile phones or computers that have a physical keyboard, text entry is relatively easier. Even so, most users cannot type at the pace of speaking without making mistakes; when the pace increases, the error rate increases as well. In the prior art, voice recognition systems are gradually being improved and offered as alternative input methods for such devices. However, the accuracy rates of these input alternatives are not yet very high.

In the literature, one of the patents relating to the subject is Canadian patent application CA 2416592 (A1). The said invention comprises a network for transmission of speech packets, a phone for transmission of speech data and a monitor for displaying the spoken data as text.

In the prior art, there is also a method that filters a list as the first letter of a word (and then the subsequent letters) is typed in. However, this method currently filters only according to the first word; it therefore cannot separate the names in the list that begin with the same letters and does not produce a practically sufficient result for the user.

In conclusion, methods for transferring speech as text into computers, mobile phones or PDAs with minimum error are still being improved; new embodiments that eliminate the disadvantages mentioned above and bring solutions to existing systems are therefore needed.

Purpose of the Invention

The present invention relates to a method that meets the above-mentioned requirements, eliminates the disadvantages and introduces some additional advantages, allowing all kinds of text input, such as SMS messages and e-mails, to be entered into computers by using keyboard entry and voice recognition in unison.

A purpose of the invention is that a person types in only the first letter of each word of the sentence he is going to enter by using a keyboard and repeats the sentence aloud at the same time, so that the typed first letters are used as an input to the voice recognition algorithm. In this way, the sentence can be displayed as text on the monitor of the device by performing the voice recognition process over the probable word list and order. The aim is thereby to reduce probable voice recognition errors to a very low level.

The first letters of the words of the entered sentence are deleted from the monitor after the pause that follows the end of the spoken repetition. When the result of voice recognition is displayed, only words whose first letters match those typed in with the keyboard are offered to the user as alternatives.

In languages with lengthy words, the aim is to improve the recognition result by preferably typing in the letters at the beginning and at the end of each word. In that case, keyboard entry is slower than when only the first letter of each word is typed in. Accordingly, another purpose of the invention is, in any case, to achieve faster entry than text entry made by keyboard alone.

Still another purpose of the invention is that the said method can be used for any type of data entry into the computer.

Still another purpose of the invention is to develop a method that facilitates entry of any type of text input, such as short messages (SMS), e-mails or similar, into electronic devices by use of voice recognition and/or a keyboard.

Still another purpose of the invention is to offer optional methods, depending on user choice, for text entry into electronic devices, and to use the keyboard and the voice recognition application in unison in all of these optional methods. By using voice recognition and the keyboard in unison, this method provides much faster and easier text entry than the traditional method in which the keyboard is used on its own. In addition, use of the keyboard also helps to increase the accuracy of voice recognition.

A further purpose of the invention is, thanks to the use of keyboard entry and the voice recognition application in unison, to make it possible to enter text at a pace close to the pace of speaking into electronic devices on which text entry via touch screen or keyboard is quite slow and troublesome, such as mobile phones, PDAs and tablet computers. Similarly, the aim is to reach a data entry pace close to the pace of speaking on desktop or laptop computers as well and, at this pace, to reduce voice recognition errors to a very low level with the support of the keyboard.

Another purpose of the invention is to increase the accuracy of voice recognition technology, in particular in noisy environments, with problematic articulation and with accented speakers.

Another purpose of the invention is to make voice entry optional for the user. In that case, if the user wishes, he enters the text by keyboard only; yet instead of typing in the whole word in the text area, as in the traditional method, it is sufficient for him to type in two letters of his choice (for example the first and the last letter). For example, when searching the mobile phone directory for the name and family name of a person to be called, pressing the first letters of the name and family name suffices, since the directory is filtered down to the persons whose name and family name begin with those letters. Data is thus entered into the device with far fewer key presses, in a practical and rapid manner.

In the new method presented in the invention, on the other hand, filtering need not be made according to the first word only; it can be made by entering the beginning and/or last letters of all of the words contained in the search, so that the data to be entered can be accessed in a much more practical manner.
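The directory example can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation described in the application: the function names `matches_initials` and `filter_contacts` and the sample contact names are hypothetical, and the sketch assumes that the i-th typed letter is compared with the first letter of the i-th word of each entry.

```python
def matches_initials(name: str, initials: str) -> bool:
    """Return True if the i-th word of `name` starts with the i-th typed letter."""
    words = name.lower().split()
    typed = initials.lower()
    if len(typed) > len(words):
        return False  # more letters typed than the entry has words
    return all(word.startswith(letter) for word, letter in zip(words, typed))


def filter_contacts(contacts: list[str], initials: str) -> list[str]:
    """Keep only the directory entries consistent with the typed initials."""
    return [contact for contact in contacts if matches_initials(contact, initials)]


# Hypothetical directory used only for illustration.
contacts = ["Ayse Demir", "Ahmet Kaya", "Ali Demir", "Deniz Arslan"]
print(filter_contacts(contacts, "ad"))
```

Typing a single letter filters on the first word only, as in the prior-art method; each additional letter narrows the list further by constraining the next word.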

To meet the objectives mentioned above, the invention is a method providing conversion of speech to text in electronic devices such as computers and mobile phones by using a character input unit and a voice-over in a manner supporting one another, wherein the following operation steps are performed:

- user's repeating the sentence he wishes to get as a text print and receiving and recording that data by means of the said voice - over,

- typing in the first letter or first letter and last letter of word contained in the said sentence by using the said character entry unit,

- forming an output text by compiling the expressions in the speech input and character input by means of a decision module,

- displaying the output text obtained on the monitor of the device.

To meet the objectives mentioned above, the invention is also a method providing conversion of speech to text in electronic devices such as computers and mobile phones by use of a character entry unit, wherein the following operation steps are performed:

- typing in the first letter, or the first and last letters, of the words contained in the sentence that will be converted to text, by using the said character entry unit,

- selecting the words, by means of a decision module, matching the character input and forming output text by compiling them,

- displaying the output text obtained on the monitor of device.

The structural and characteristic features of the invention and all of its advantages will be better understood from the figures given below and from the detailed description written with reference to those figures; the assessment should therefore be made taking the said figures and detailed description into account.

Brief Description of the Drawings

For the best understanding of the embodiment of the present invention and its advantages, together with its additional components, it should be evaluated together with the figures described below.

Figure 1: Schematic illustration of the interaction between the elements required to implement the keypad-assisted speech-to-text conversion method that is the subject of the invention.

The drawings are not necessarily to scale, and details not essential for understanding the present invention may have been omitted. Furthermore, elements that are identical, or that at least have substantially identical functions, are denoted by the same number.

Reference Numbers

1. Voice recognition system

2. Character input unit (keyboard)

3. Sound card

4. Voice-over

5. Input forming area

6. Decision module

7. Output display area

U: User

Detailed Description of the Invention

In this detailed description, the preferred embodiments of the method that is the subject of the invention, which displays speech as text on a device that allows text entry with the support of a keyboard, are disclosed only for better understanding of the subject and without any restrictive effect.

In the text-forming method being the subject of the invention;

- First, the user (U) types in, with his fingers, the letters found at the beginning, or at the beginning and end, of the words of the sentence that he wants to enter as text, using the keyboard, i.e. the character input unit (2), which may be a touch-sensitive or a traditional keyboard. Meanwhile, he repeats the said sentence aloud, and it is picked up from the environment by the voice-over (4), that is, the microphone.

- The sentence(s) received by the said voice-over (4) are transmitted to the sound card (3).

- The speech data in the sound card (3) and the data comprising the letters typed in by means of the keyboard are transmitted to the voice recognition system (1). From the existing directory of words, the words matching both the letters typed in and the sentence repeated aloud are selected.

- The updated data at the output of the said voice recognition system (1), that is, the data converted to text, is presented to the user (U) as text in the output display area (7) located on the monitor of the device.
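One way to picture the selection step is as a filter over recognizer hypotheses. The following Python sketch is an assumption made for illustration, not the method as implemented by the applicant: it supposes that the voice recognition system returns an N-best list of `(text, score)` pairs with higher scores being better, and the names `initials_of`, `decide` and `n_best`, as well as the scores, are hypothetical.

```python
from typing import Optional


def initials_of(sentence: str) -> str:
    """Concatenate the lower-cased first letter of every word."""
    return "".join(word[0] for word in sentence.lower().split())


def decide(hypotheses: list[tuple[str, float]], typed: str) -> Optional[str]:
    """Return the best-scoring hypothesis whose initials equal the typed letters.

    `hypotheses` is an assumed N-best list of (text, score) pairs; returns
    None when no hypothesis is consistent with the keyboard input.
    """
    consistent = [
        (text, score)
        for text, score in hypotheses
        if initials_of(text) == typed.lower()
    ]
    if not consistent:
        return None
    return max(consistent, key=lambda pair: pair[1])[0]


# Hypothetical recognizer output (log-likelihood-style scores, invented here).
n_best = [
    ("I want to go on today", -12.1),
    ("I want to go home today", -12.4),
    ("I won't go home today", -13.0),
]
print(decide(n_best, "iwtght"))
```

In this sketch the top acoustic hypothesis is rejected because its initials do not match the typed letters, which is the sense in which the keyboard input is said to raise recognition accuracy. A real system could instead apply the constraint during decoding rather than afterwards.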

Via the touch-sensitive or standard keyboard, the first and/or last letters of the words are typed in as data in the input forming area (5). By combining the voice recognition system (1) with the text entry technique, the entered data is corrected and verified at the decision module (6). The result from the decision module (6) is then displayed in the output display area (7) for the user (U). The keyboard, which is the character input unit (2), increases the accuracy of voice recognition by assisting the voice recognition system (1). The voice-over (4) is located on the unit in which the text conversion is made; it receives the sound data produced by the speech of the user (U) from the external environment and transmits it to the sound card (3).

An example of the operation of the method that is the subject of the invention is as follows:

When the sentence "I want to go home today." is uttered aloud as input, the user (U) at the same time types in the letters "iwtght" via the keyboard. After the pause that follows the end of the spoken sentence, the letters "iwtght" shown on the monitor of the device to which the text is to be transmitted are deleted and replaced with the sentence formed as a result of voice recognition. During that voice recognition process, only words whose first letters are the letters typed in via the keyboard are offered to the user (U) as alternatives, with the order of the words also restricted to the order in which the letters were typed.
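The effect of the typed letters on the recognizer's search space can be illustrated with a small Python sketch. It is a simplified assumption rather than the applicant's implementation: the toy `vocabulary` and the helper names `index_by_first_letter` and `candidates_per_position` are hypothetical, and a real recognizer would hold a far larger word directory.

```python
from collections import defaultdict


def index_by_first_letter(vocabulary):
    """Group the word directory by lower-cased first letter."""
    index = defaultdict(list)
    for word in vocabulary:
        index[word[0].lower()].append(word)
    return index


def candidates_per_position(typed, index):
    """For each typed letter, list the words allowed at that position."""
    return [sorted(index.get(letter, [])) for letter in typed.lower()]


# Toy vocabulary, invented for illustration only.
vocabulary = ["I", "in", "want", "went", "to", "two", "go", "get",
              "home", "house", "today", "tomorrow"]
index = index_by_first_letter(vocabulary)

for letter, words in zip("iwtght", candidates_per_position("iwtght", index)):
    print(letter, words)
```

Each position of the sentence is limited to the words starting with the corresponding letter, and the number of typed letters fixes the number of words, so the recognizer only has to choose among a handful of alternatives per position.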

A further option is that the first and last letters of the words in a sentence can be given to the voice recognition system as input, so that voice recognition errors can be reduced to the lowest level. Besides, it is also possible to enter the first and last letters of the words by the keyboard only, without any voice input. In that case, for the example above, the entry "iwttogohety" is made. Following that entry, sentence alternatives matching that pattern in the statistical language model can appear on the monitor as choices. The method is thus made usable for hearing-impaired or mute people, and voice input to the system is optional.
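The keyboard-only variant can be sketched as follows. This is an illustrative assumption, not the applicant's implementation: a short list of stored sentences stands in for the statistical language model, a one-letter word is assumed to contribute a single key press, and the names `first_last_key`, `encode`, `keyboard_only_candidates` and `stored` are hypothetical. Under these assumptions "I want to go home today." encodes to "iwttogohety".

```python
def first_last_key(word: str) -> str:
    """First and last letter of a word; a one-letter word yields one letter."""
    word = word.lower()
    return word if len(word) == 1 else word[0] + word[-1]


def encode(sentence: str) -> str:
    """Key sequence a user would type for `sentence` in first/last-letter mode."""
    words = [w.strip(".,!?") for w in sentence.split()]
    return "".join(first_last_key(w) for w in words if w)


def keyboard_only_candidates(typed: str, sentences: list[str]) -> list[str]:
    """Return every stored sentence whose key sequence equals the typed one."""
    return [s for s in sentences if encode(s) == typed.lower()]


# Stand-in for sentences proposed by a language model (invented examples).
stored = [
    "I want to go home today.",
    "I went to go home today.",
    "I want to go home tomorrow.",
]
print(encode(stored[0]))
print(keyboard_only_candidates("iwttogohety", stored))
```

Two of the stored sentences share the same key sequence, which is why the text speaks of sentence alternatives being offered as a choice rather than a single result.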

Claims

1. The invention is a method providing conversion of speech to text in electronic devices such as computers and mobile phones by use of a character entry unit (2) and a voice-over (4) in a manner supporting one another, and it is characterised in that;
- the user repeats the sentence he wishes to get as a text print and that data is received and recorded by means of the said voice-over (4),
- the first letter, or the first letter and last letter, of the words contained in the said sentence are typed in by using the said character entry unit (2),
- an output text is formed by compiling the expressions in the speech input and character input by means of a decision module (6),
- the output text obtained is displayed on the monitor of the device.
2. The invention is a method providing conversion of speech to text in electronic devices such as computers and mobile phones by use of a character entry unit (2) and a voice-over (4) in a manner supporting one another, and it is characterised in that; the operation steps of
- typing in the first letter, or the first and last letters, of the words contained in the sentence that will be converted to text, by using the said character entry unit (2),
- selecting the words, by means of a decision module (6), matching the character input and forming output text by compiling them,
- displaying the output text obtained on the monitor of the device are performed.
3. A method according to Claim 1, and it is characterised in that; the said speech input and character input are converted to text by means of a voice recognition system (1).
4. A method according to Claim 1 and 2, and it is characterised in that; voiced inputs coming over by the speech are recorded in a sound card (3).
5. A method according to Claim 1 to 3, and it is characterised in that; the said character entry unit (2) is a keyboard and the voice-over (4) is a microphone.

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/TR2011/000207 WO2013039459A1 (en) 2011-09-12 2011-09-12 A method for creating keyboard and/or speech - assisted text input on electronic devices

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/TR2011/000207 WO2013039459A1 (en) 2011-09-12 2011-09-12 A method for creating keyboard and/or speech - assisted text input on electronic devices

Publications (1)

Publication Number Publication Date
WO2013039459A1 (en) 2013-03-21

Family

ID=44863199

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/TR2011/000207 WO2013039459A1 (en) 2011-09-12 2011-09-12 A method for creating keyboard and/or speech - assisted text input on electronic devices

Country Status (1)

Country Link
WO (1) WO2013039459A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2416592A1 (en) 2002-01-22 2003-07-22 At&T Corp. Method and device for providing speech-to-text encoding and telephony service
US20100031143A1 (en) * 2006-11-30 2010-02-04 Rao Ashwin P Multimodal interface for input of text

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WATANABE Y ET AL: "Semi-Synchronous Speech and Pen Input", 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING 15-20 APRIL 2007 HONOLULU, HI, USA, IEEE, PISCATAWAY, NJ, USA, 15 April 2007 (2007-04-15), pages IV-409, XP031463873, ISBN: 978-1-4244-0727-9 *

Similar Documents

Publication Publication Date Title
EP2339576B1 (en) Multi-modal input on an electronic device
KR101042119B1 (en) Semantic object synchronous understanding implemented with speech application language tags
US10126936B2 (en) Typing assistance for editing
US7941316B2 (en) Combined speech and alternate input modality to a mobile device
US8812325B2 (en) Use of multiple speech recognition software instances
US9183843B2 (en) Configurable speech recognition system using multiple recognizers
US8498872B2 (en) Filtering transcriptions of utterances
US8571862B2 (en) Multimodal interface for input of text
EP2959476B1 (en) Recognizing accented speech
CN101840300B (en) Method and system for receiving text input on a touch-sensitive display device
CN106471570B (en) Multi-command single utterance input method
KR20130006596A (en) Word-level correction of speech input
US20030157968A1 (en) Personalized agent for portable devices and cellular phone
US20130024195A1 (en) Corrective feedback loop for automated speech recognition
US20060293889A1 (en) Error correction for speech recognition systems
TWI281146B (en) Apparatus and method for synthesized audible response to an utterance in speaker-independent voice recognition
US7010490B2 (en) Method, system, and apparatus for limiting available selections in a speech recognition system
US9674328B2 (en) Hybridized client-server speech recognition
US7389235B2 (en) Method and system for unified speech and graphic user interfaces
US7979425B2 (en) Server-side match
US20070033037A1 (en) Redictation of misrecognized words using a list of alternatives
US8775156B2 (en) Translating languages in response to device motion
KR101532447B1 (en) Recognition architecture for generating asian characters
JP2013068952A (en) Consolidating speech recognition results
KR101606229B1 (en) Textual disambiguation using social connections

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11775861

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct app. not ent. europ. phase

Ref document number: 11775861

Country of ref document: EP

Kind code of ref document: A1