WO2001015140A1 - Speech recognition system for data entry
- Publication number
- WO2001015140A1 (application PCT/CA2000/000776)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- textual output
- output message
- index
- user
- entry
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/243—Natural language query formulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/226—Validation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
Definitions
- Speech recognition systems also suffer from disadvantages in that they must be trained by each user for the vocabulary to be recognized and this can require a significant amount of time and effort. Further, less than desired results can be obtained due to a variety of factors including background noise, poor enunciation by the user, etc.
- a data entry system comprising: a speech recognition engine operable to receive speech and to recognize a search phrase therein; a database engine in communication with the speech recognition engine, the database engine including an index against which said recognized search phrase is applied to identify a corresponding index entry, each index entry having at least one textual output message defined therefor; a user terminal in communication with the database engine, the user terminal (24) including a display device for displaying said at least one textual output message corresponding to said identified index entry, and a user input device for receiving a user input representing an approval and/or a completion of said displayed textual output message, the database engine (40) being configured for outputting said approved and/or completed textual output message upon receipt of said user input.
- a method of performing data entry comprising the steps of:
- Figure 1 shows a schematic representation of the data entry system in accordance with the present invention.
- System 20 includes a data entry terminal 24 which can be any suitable data entry terminal such as a VT-100 or other "dumb terminal" or a personal computer. As shown, terminal 24 includes a keyboard and a display. Data input by a user of system 20 is passed to a data processing system 28, as discussed in more detail below.
- Data processing system 28 can be any computer-implemented system requiring data input such as an order entry system, an inventory control system and, in a preferred embodiment of the present invention, is a wireless paging network.
- System 20 also includes a microphone 32 which, in a preferred embodiment of the invention, is the mouthpiece of a telephone headset or handset but which can be any suitable microphone or other mechanism for capturing the voice of a user.
- Microphone 32 is connected to a speech recognition engine 36 which can be any appropriate speech recognition system.
- speech recognition engine 36 can employ Hidden Markov Models (HMM) or other known algorithms to recognize speech and can be implemented in dedicated hardware or as an application running on a general purpose personal computer with adequate memory and processing capacity.
- the output of speech recognition engine 36 is applied to a database engine 40, which can be any suitable database such as those sold by Oracle, a Microsoft Access database, etc. As described below in more detail, database engine 40 maintains at least one table relating predefined recognized phrases to corresponding textual message outputs. Textual message outputs selected by database engine 40 can be reviewed, approved, amended or modified from user terminal 24, or alternative textual message outputs can be selected from user terminal 24, before they are output to data processing system 28.
- a user defines a set of textual output messages of interest. These messages are selected as being text strings which will be commonly used by the user and can be represented in any language or character set desired, including multi-byte Unicode character sets and/or ideographic character sets.
- textual output messages of interest and their corresponding index phrases can, for example, include:
- index phrase "Cell number is": textual output message "I can be reached at my cellular and the number is"
- textual output messages can be added, amended or deleted from database engine 40 by users as desired.
- speech recognition engine 36 need not be extremely sophisticated. In fact it is contemplated that in some circumstances speech recognition engine 36 may not require training for each individual user and yet can provide acceptably accurate recognition of index entries.
- a paging operator (i.e., a user)
- Microphone 32 can either be an additional microphone into which the operator can speak when desired, or can be the mouthpiece of an otherwise conventional telephone headset or handset.
- a switch (not shown) is provided which allows the operator to speak such that the person on the other end of the telephone (the caller) can hear the operator or to speak such that the caller and speech recognition engine 36 can each "hear" the operator.
- speech recognition engine 36 will analyze the speech it has heard and will provide the output of its analysis, as a search input, to database engine 40.
- Database engine 40 compares the received search input to the index entries in its table or tables and selects the appropriate table entry.
- the corresponding textual output string, in this example "For flight arrival information, call 555-1212. Please pick me up at the airport at", is selected by database engine 40 and is displayed on user terminal 24 for approval and/or completion by the operator.
- the operator verifies that the correct textual output message has been identified and completes the output message by entering the text "5:00PM", representing variable information, in a conventional manner such as by the keyboard.
- index entries and output textual messages in database engine 40 can be in different languages.
- the index entries in database engine 40 can be in English (in any suitable form such as textual or phonetic) and the corresponding textual output messages can be in Unicode Mandarin Chinese. In this manner an operator speaking with an English language caller will be able to create output messages in Mandarin Chinese.
- variable completion information can be selected from a list of appropriate choices displayed to the operator in English and, once a selection is made, database engine 40 will complete the textual output message with predefined corresponding Mandarin Chinese text.
- database engine 40 can include multiple textual output messages, arranged by languages of interest, for each index entry.
- the textual output messages displayed to the operator on user terminal 24 for approval and/or completion will be in a language selected by the operator, who can, once the message is completed and/or approved, indicate in which of the available languages it is to be input to data processing system 28.
- the present invention provides an efficient real-time data entry system in which user speech is analyzed to extract a search phrase.
- This search phrase is used to search an index to locate an index entry for which one or more textual output phrases have been defined.
- a corresponding textual output message is presented to the user for approval and/or completion by the user and is then provided as input to a data processing system, such as a paging system.
- the user can select the desired textual output message.
- the corresponding textual output messages can include additional information, defined fields to be completed by the user and/or can be in a different language from the index entry.
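The flow described above (recognized search phrase, index lookup, operator approval and completion, output in a chosen language) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `PhraseIndex`, `IndexEntry` and `complete_message` are hypothetical, the fuzzy matching via Python's `difflib` is a swapped-in stand-in for whatever matching database engine 40 performs, the "airport pickup" index phrase is invented for the example, and the French message is an illustrative translation that does not come from the source.

```python
from dataclasses import dataclass, field
from difflib import get_close_matches
from typing import Dict, Optional


@dataclass
class IndexEntry:
    """One index entry: an index phrase plus its output messages keyed by language."""
    phrase: str
    messages: Dict[str, str] = field(default_factory=dict)


class PhraseIndex:
    """Hypothetical stand-in for the table maintained by database engine 40."""

    def __init__(self) -> None:
        self._entries: Dict[str, IndexEntry] = {}

    def add(self, phrase: str, language: str, message: str) -> None:
        # Index phrases are stored lower-cased so lookups are case-insensitive.
        entry = self._entries.setdefault(phrase.lower(), IndexEntry(phrase))
        entry.messages[language] = message

    def lookup(self, search_phrase: str, cutoff: float = 0.6) -> Optional[IndexEntry]:
        # Assumption: tolerate small recognition errors by picking the closest
        # index phrase; return None when nothing is similar enough.
        matches = get_close_matches(
            search_phrase.lower(), list(self._entries), n=1, cutoff=cutoff
        )
        return self._entries[matches[0]] if matches else None


def complete_message(entry: IndexEntry, language: str, variable_text: str = "") -> str:
    """Append operator-entered variable text to the message in the chosen language."""
    return (entry.messages[language] + " " + variable_text).strip()


index = PhraseIndex()
index.add("cell number is", "en",
          "I can be reached at my cellular and the number is")
# Illustrative translation only; not taken from the source document.
index.add("cell number is", "fr",
          "Je suis joignable sur mon cellulaire au numéro")
# "airport pickup" is an invented index phrase for the airport message.
index.add("airport pickup", "en",
          "For flight arrival information, call 555-1212. "
          "Please pick me up at the airport at")

# A slightly misrecognized search phrase still resolves to the right entry.
entry = index.lookup("cel number is")
if entry is not None:
    # The operator approves the match and types the variable part.
    print(complete_message(entry, "en", "555-0100"))
    print(complete_message(entry, "fr", "555-0100"))
```

Keeping the messages keyed by language inside each index entry mirrors the idea that one spoken index phrase can yield output in any of several languages; a production system would presumably add an explicit approval step and richer variable fields rather than a simple suffix.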
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU56683/00A AU5668300A (en) | 1999-07-01 | 2000-07-04 | Speech recognition system for data entry |
CA002342787A CA2342787A1 (fr) | 1999-07-01 | 2000-07-04 | Speech recognition system for data entry |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14183999P | 1999-07-01 | 1999-07-01 | |
US60/141,839 | 1999-07-01 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2001015140A1 (fr) | 2001-03-01 |
Family
ID=22497496
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CA2000/000776 WO2001015140A1 (fr) | 1999-07-01 | 2000-07-04 | Speech recognition system for data entry |
Country Status (3)
Country | Link |
---|---|
AU (1) | AU5668300A (fr) |
CA (1) | CA2342787A1 (fr) |
WO (1) | WO2001015140A1 (fr) |
2000
- 2000-07-04 AU AU56683/00A patent/AU5668300A/en not_active Abandoned
- 2000-07-04 CA CA002342787A patent/CA2342787A1/fr not_active Abandoned
- 2000-07-04 WO PCT/CA2000/000776 patent/WO2001015140A1/fr active Application Filing
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5758318A (en) * | 1993-09-20 | 1998-05-26 | Fujitsu Limited | Speech recognition apparatus having means for delaying output of recognition result |
US5500920A (en) * | 1993-09-23 | 1996-03-19 | Xerox Corporation | Semantic co-occurrence filtering for speech recognition and signal transcription applications |
US5724526A (en) * | 1994-12-27 | 1998-03-03 | Sharp Kabushiki Kaisha | Electronic interpreting machine |
WO1999003092A2 (fr) * | 1997-07-07 | 1999-01-21 | Motorola Inc. | Modular speech recognition system and method |
Non-Patent Citations (1)
Title |
---|
KONDO K ET AL: "SURFIN' THE WORLD WIDE WEB WITH JAPANESE", IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP '97), 21 April 1997 (1997-04-21), IEEE COMP. SOC. PRESS, Los Alamitos, US, pages 1151 - 1154, XP000822656, ISBN: 0-8186-7920-4 * |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2002077975A1 (fr) * | 2001-03-27 | 2002-10-03 | Koninklijke Philips Electronics N.V. | Method to select and send text messages with a mobile |
US6934552B2 (en) | 2001-03-27 | 2005-08-23 | Koninklijke Philips Electronics, N.V. | Method to select and send text messages with a mobile |
EP1315098A1 (fr) * | 2001-11-27 | 2003-05-28 | Telefonaktiebolaget L M Ericsson (Publ) | Searching for voice messages |
EP1361740A1 (fr) * | 2002-05-08 | 2003-11-12 | Sap Ag | Method and system for processing the speech information of a dialogue |
EP1361736A1 (fr) * | 2002-05-08 | 2003-11-12 | Sap Ag | Method for recognizing speech information |
EP1361737A1 (fr) * | 2002-05-08 | 2003-11-12 | Sap Ag | Method and system for speech signal processing and dialogue classification |
EP1361738A1 (fr) * | 2002-05-08 | 2003-11-12 | Sap Ag | Method and system for speech signal processing using speech recognition and frequency analysis |
EP1361739A1 (fr) * | 2002-05-08 | 2003-11-12 | Sap Ag | Method and system for speech signal processing after language recognition |
EP1363271A1 (fr) * | 2002-05-08 | 2003-11-19 | Sap Ag | Method and system for processing and storing the speech signal of a dialogue |
EP2279508A2 (fr) * | 2008-04-23 | 2011-02-02 | nVoq Incorporated | Methods and systems for measuring user performance with speech-to-text conversion for dictation systems |
EP2279508A4 (fr) * | 2008-04-23 | 2012-08-29 | Nvoq Inc | Methods and systems for measuring user performance with speech-to-text conversion for dictation systems |
Also Published As
Publication number | Publication date |
---|---|
CA2342787A1 (fr) | 2001-03-01 |
AU5668300A (en) | 2001-03-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7369988B1 (en) | | Method and system for voice-enabled text entry |
US6570964B1 (en) | | Technique for recognizing telephone numbers and other spoken information embedded in voice messages stored in a voice messaging system |
KR101109265B1 (ko) | | Text input method |
US6895257B2 (en) | | Personalized agent for portable devices and cellular phone |
KR100769029B1 (ko) | | Method and system for speech recognition of names in multiple languages |
US20100217591A1 (en) | | Vowel recognition system and method in speech to text applictions |
CN108305626A (zh) | | Voice control method and device for an application program |
US20060247932A1 (en) | | Conversation aid device |
US7715531B1 (en) | | Charting audible choices |
US20060069563A1 (en) | | Constrained mixed-initiative in a voice-activated command system |
US20070016420A1 (en) | | Dictionary lookup for mobile devices using spelling recognition |
WO2001015140A1 (fr) | | Speech recognition system for data entry |
Callejas et al. | | Implementing modular dialogue systems: A case of study |
JP4230142B2 (ja) | | Hybrid oriental character recognition technology using keypad/voice in adverse environments |
Collingham et al. | | The Durham telephone enquiry system |
Kouroupetroglou et al. | | Speech-enabled e-Commerce for disabled and elderly persons |
JP3221477B2 (ja) | | Database collation type input method and device, database collation type Japanese input device, and telephone number guidance service system |
EP1187431B1 (fr) | | Portable terminal with voice dialing minimizing memory usage |
EP1895748B1 (fr) | | Method, program and system for uniquely identifying a contact in a contacts database by a single voice command |
EP1554864B1 (fr) | | Directory assistance method and apparatus |
US11902466B2 (en) | | Captioned telephone service system having text-to-speech and answer assistance functions |
JP4067483B2 (ja) | | Telephone reception translation system |
Sharman | | Speech interfaces for computer systems: Problems and potential |
Goldman et al. | | Voice Portals—Where Theory Meets Practice |
JP2001309049A (ja) | | Mail creation system, device, method and recording medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | AK | Designated states | Kind code of ref document: A1; Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A1; Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
| | ENP | Entry into the national phase | Ref document number: 2342787; Country of ref document: CA; Kind code of ref document: A |
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | |
| | DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101) | |
| | REG | Reference to national code | Ref country code: DE; Ref legal event code: 8642 |
| | 122 | Ep: PCT application non-entry in European phase | |
| | NENP | Non-entry into the national phase | Ref country code: JP |