WO2001067435A9 - Method for the voice-controlled initiation of actions executable in a device by a limited group of users - Google Patents
Method for the voice-controlled initiation of actions executable in a device by a limited group of users
- Publication number
- WO2001067435A9 (application PCT/DE2001/000891)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- speech
- user
- language
- recognition
- pattern
- Prior art date
Links
- 230000009471 action Effects 0.000 title claims abstract description 26
- 230000000977 initiatory effect Effects 0.000 title claims abstract description 9
- 238000000034 method Methods 0.000 title claims description 38
- 238000001514 detection method Methods 0.000 claims abstract 5
- 238000012549 training Methods 0.000 claims description 15
- 230000000694 effects Effects 0.000 claims description 3
- 238000013528 artificial neural network Methods 0.000 claims 1
- 230000003993 interaction Effects 0.000 claims 1
- 238000011017 operating method Methods 0.000 claims 1
- 230000001419 dependent effect Effects 0.000 abstract description 9
- 230000014509 gene expression Effects 0.000 abstract description 2
- 238000012790 confirmation Methods 0.000 description 5
- 230000008569 process Effects 0.000 description 4
- 238000004891 communication Methods 0.000 description 3
- 238000011161 development Methods 0.000 description 1
- 230000018109 developmental process Effects 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 230000006870 function Effects 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 238000002955 isolation Methods 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 230000009467 reduction Effects 0.000 description 1
- 238000013518 transcription Methods 0.000 description 1
- 230000035897 transcription Effects 0.000 description 1
- 238000005406 washing Methods 0.000 description 1
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
Definitions
- entering information, data or commands into a device using speech, the natural form of human communication, for the voice-controlled initiation of actions that can be carried out in that device. Examples of such devices are a telecommunication terminal (corded or cordless telephone, mobile phone, etc.), a household appliance (washing machine, electric cooker, refrigerator, etc.), a vehicle (car, plane, ship, etc.), a consumer electronics device (television, hi-fi system, etc.) and an electronic device for control and command input (personal computer, personal digital assistant, etc.)
- the primary goal is to free up the hands used for data or command input for other routine activities.
- the device has a speech recognition device, which the specialist literature also refers to as a speech recognizer.
- the field of automatic recognition of speech as a system of characters and sounds ranges from recognizing characters and sounds spoken in isolation (e.g. single words, commands) to recognizing fluently spoken characters and sounds (e.g. several connected words, one or more sentences, a speech), in keeping with the human form of communication.
- the automatic speech recognition is in principle a search process, which according to the document
- the speaker-independent speech recognizer works almost exclusively on the basis of phonemes, while the speaker-dependent speech recognizer is more or less a single word recognizer.
- the speaker-independent speech recognizers are used in particular in devices where, on the one hand, fluently spoken language (e.g. several connected words, sentences, etc.) and a large to very large vocabulary must be processed, i.e. the device is used by an unlimited number of users, and, on the other hand, the computing and storage effort for recognizing this language and vocabulary is irrelevant because the corresponding capacities are available.
- the speaker-dependent speech recognizers have their preferred field of application in devices where, on the one hand, discretely spoken language (e.g. individual words and commands) and a small to medium-sized vocabulary have to be processed, i.e. a limited group of users uses the device, and, on the other hand, the computing and storage effort for recognizing this language and vocabulary matters because the corresponding capacities are not available.
- the speaker-dependent speech recognizers are therefore characterized by a low level of complexity in terms of computation and memory requirements.
- with the speaker-dependent speech recognizers currently in use, sufficiently high word recognition rates are already achieved for small to medium-sized vocabularies (10-100 words), so that these speech recognizers are particularly suitable for control and command input (command-and-control) and also for voice-controlled database access (e.g. voice dialing from a phone book). These speech recognizers are therefore increasingly used in mass-market devices, for example in telephones, household appliances, consumer electronics devices, devices with control and command input, toys, and also motor vehicles.
- the problem with these applications is that the devices are often used not by just one user but by several users, frequently the members of a household or a family (a limited group of users).
- the object on which the invention is based is to control the initiation of actions which can be carried out in a device by means of speech by users of a limited group of users of the device, the speech being recognized independently of the user and without user identification on the basis of a speaker-dependent speech recognition system.
- the idea on which the invention is based is that the speech expressions to be recognized from the users of the user group, for example the words of a vocabulary, are assigned the reference speech patterns of all users of the speech recognition system that are necessary for recognition.
- the vocabulary (telephone book, command word list, ...) contains, for example, "i" words (names, commands, ...), each of which is assigned an action to be performed (telephone number to be dialed, action of a connected device, ...), a possible acoustic confirmation (voice prompt, usually the pronunciation of the word) and up to "j" reference speech patterns of the "k" users of the speech recognition system, where i ∈ N, j ∈ N and k ∈ N.
- the inventive step lies in the use of a common vocabulary for all users of a speech recognition system, with one word being assigned the reference speech patterns of several users.
- the method requires the rejection strategy described above for voice training and for voice recognition.
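The shared-vocabulary idea in the definitions above can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `VocabularyEntry`, `train` and `recognize`, the limit `MAX_PATTERNS_PER_WORD` (standing in for "j"), the Euclidean distance on short feature vectors, and the fixed rejection threshold are all assumptions made for illustration. The source only states that a rejection strategy is required; it does not specify one here.

```python
import math
from dataclasses import dataclass, field

MAX_PATTERNS_PER_WORD = 4  # hypothetical value for "j"


@dataclass
class VocabularyEntry:
    word: str
    action: str          # e.g. a telephone number to dial
    prompt: str = ""     # optional acoustic confirmation (voice prompt)
    patterns: list = field(default_factory=list)  # reference patterns of any user


def train(vocab, word, action, pattern, prompt=""):
    """Add one user's reference pattern to the common entry for `word`."""
    entry = vocab.setdefault(word, VocabularyEntry(word, action, prompt))
    if len(entry.patterns) >= MAX_PATTERNS_PER_WORD:
        return False  # entry already holds the maximum number of patterns
    entry.patterns.append(pattern)
    return True


def recognize(vocab, utterance, reject_threshold=1.0):
    """Return the action of the closest word, or None if rejected.

    All patterns of all users are searched, so no user identification
    takes place. Euclidean distance is a stand-in for a real pattern
    comparison; the threshold is an assumed, simplistic rejection rule.
    """
    best_entry, best_dist = None, math.inf
    for entry in vocab.values():
        for pattern in entry.patterns:
            d = math.dist(utterance, pattern)
            if d < best_dist:
                best_entry, best_dist = entry, d
    if best_entry is None or best_dist > reject_threshold:
        return None
    return best_entry.action


vocab = {}
train(vocab, "office", "dial:0123", [0.1, 0.9])  # trained by one user
train(vocab, "office", "dial:0123", [0.3, 0.7])  # same word, another user
train(vocab, "home", "dial:4567", [0.9, 0.1])

print(recognize(vocab, [0.28, 0.72]))  # dial:0123
print(recognize(vocab, [5.0, 5.0]))    # None (rejected)
```

Because every word keeps the patterns of several users in one entry, the lookup never has to decide who is speaking; it only has to find the nearest stored pattern and reject utterances that are too far from all of them.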
Landscapes
- Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Electrically Operated Instructional Devices (AREA)
- Machine Translation (AREA)
- Telephonic Communication Services (AREA)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP01921173A EP1261964A1 (de) | 2000-03-08 | 2001-03-08 | Method for the voice-controlled initiation of actions executable in a device by a limited group of users |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE10011178.5 | 2000-03-08 | ||
DE10011178A DE10011178A1 (de) | 2000-03-08 | 2000-03-08 | Method for the voice-controlled initiation of actions executable in a device by a limited group of users |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2001067435A1 (de) | 2001-09-13 |
WO2001067435A9 (de) | 2002-11-28 |
Family
ID=7633897
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/DE2001/000891 WO2001067435A1 (de) | Method for the voice-controlled initiation of actions executable in a device by a limited group of users | 2000-03-08 | 2001-03-08 |
Country Status (5)
Country | Link |
---|---|
US (1) | US20030040915A1 (de) |
EP (1) | EP1261964A1 (de) |
CN (1) | CN1217314C (de) |
DE (1) | DE10011178A1 (de) |
WO (1) | WO2001067435A1 (de) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1665748B1 (de) | 2003-09-17 | 2013-05-15 | Gigaset Communications GmbH | Method and telecommunication system with wireless telecommunication between a handset and a base station for registering a handset |
US20060287864A1 (en) * | 2005-06-16 | 2006-12-21 | Juha Pusa | Electronic device, computer program product and voice control method |
DE102008024257A1 (de) * | 2008-05-20 | 2009-11-26 | Siemens Aktiengesellschaft | Method for speaker identification in speech recognition |
CN102262879B (zh) * | 2010-05-24 | 2015-05-13 | 乐金电子(中国)研究开发中心有限公司 | Voice command contention processing method and device, voice remote control and digital television |
US9316400B2 (en) * | 2013-09-03 | 2016-04-19 | Panasonic Intellectual Property Corporation of America | Appliance control method, speech-based appliance control system, and cooking appliance |
US10767879B1 (en) * | 2014-02-13 | 2020-09-08 | Gregg W Burnett | Controlling and monitoring indoor air quality (IAQ) devices |
US20150336786A1 (en) * | 2014-05-20 | 2015-11-26 | General Electric Company | Refrigerators for providing dispensing in response to voice commands |
CN105224523A (zh) * | 2014-06-08 | 2016-01-06 | 上海能感物联网有限公司 | Controller device for remote automatic navigation and driving of a car by speaker-independent foreign-language speech |
US10257629B2 (en) | 2017-04-18 | 2019-04-09 | Vivint, Inc. | Event detection by microphone |
JP6771681B2 (ja) * | 2017-10-11 | 2020-10-21 | 三菱電機株式会社 | Air conditioning controller |
CN108509225B (zh) * | 2018-03-28 | 2021-07-16 | 联想(北京)有限公司 | Information processing method and electronic device |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4181821A (en) * | 1978-10-31 | 1980-01-01 | Bell Telephone Laboratories, Incorporated | Multiple template speech recognition system |
US5040213A (en) * | 1989-01-27 | 1991-08-13 | Ricoh Company, Ltd. | Method of renewing reference pattern stored in dictionary |
US5794205A (en) * | 1995-10-19 | 1998-08-11 | Voice It Worldwide, Inc. | Voice recognition interface apparatus and method for interacting with a programmable timekeeping device |
US6073101A (en) * | 1996-02-02 | 2000-06-06 | International Business Machines Corporation | Text independent speaker recognition for transparent command ambiguity resolution and continuous access control |
US5719921A (en) * | 1996-02-29 | 1998-02-17 | Nynex Science & Technology | Methods and apparatus for activating telephone services in response to speech |
DE19636452A1 (de) * | 1996-09-07 | 1998-03-12 | Altenburger Ind Naehmasch | Multi-user system for speech input |
US5777571A (en) * | 1996-10-02 | 1998-07-07 | Holtek Microelectronics, Inc. | Remote control device for voice recognition and user identification restrictions |
DE69720224T2 (de) * | 1996-12-24 | 2003-12-04 | Cellon France Sas Le Mans | Method for training a speech recognition system and a device for practicing the method, in particular a portable telephone |
FR2761848B1 (fr) * | 1997-04-04 | 2004-09-17 | Parrot Sa | Voice control device for a radiotelephone, in particular for use in a motor vehicle |
US6289140B1 (en) * | 1998-02-19 | 2001-09-11 | Hewlett-Packard Company | Voice control input for portable capture devices |
US6018711A (en) * | 1998-04-21 | 2000-01-25 | Nortel Networks Corporation | Communication system user interface with animated representation of time remaining for input to recognizer |
DE19841166A1 (de) * | 1998-09-09 | 2000-03-16 | Deutsche Telekom Ag | Method for checking access authorization for voice telephony at a fixed-network or mobile telephone connection, and communication network |
US20030093281A1 (en) * | 1999-05-21 | 2003-05-15 | Michael Geilhufe | Method and apparatus for machine to machine communication using speech |
2000
- 2000-03-08: DE application DE10011178A, published as DE10011178A1 (de); status: Withdrawn
2001
- 2001-03-08: US application US10/220,906, published as US20030040915A1 (en); status: Abandoned
- 2001-03-08: CN application CN01806169.9A, published as CN1217314C (zh); status: Expired - Fee Related
- 2001-03-08: WO application PCT/DE2001/000891, published as WO2001067435A1 (de); status: Active, Application Filing
- 2001-03-08: EP application EP01921173A, published as EP1261964A1 (de); status: Withdrawn
Also Published As
Publication number | Publication date |
---|---|
EP1261964A1 (de) | 2002-12-04 |
CN1416560A (zh) | 2003-05-07 |
DE10011178A1 (de) | 2001-09-13 |
US20030040915A1 (en) | 2003-02-27 |
WO2001067435A1 (de) | 2001-09-13 |
CN1217314C (zh) | 2005-08-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
DE69922104T2 (de) | Speech recognizer with vocabulary adaptable by spelled word input | |
DE60125542T2 (de) | System and method for speech recognition with a plurality of speech recognition devices | |
DE60313706T2 (de) | Speech recognition and response system, speech recognition and response program, and associated recording medium | |
WO2005013261A1 (de) | Method for speech recognition and communication device | |
WO2003060877A1 (de) | Operating method of an automatic speech recognizer for speaker-independent speech recognition of words from different languages, and automatic speech recognizer | |
EP0925578A1 (de) | Speech processing system and method for speech processing | |
DE102006006069A1 (de) | Distributed speech processing system and method for outputting an intermediate signal thereof | |
DE60212725T2 (de) | Method for automatic speech recognition | |
DE10054583C2 (de) | Method and device for recording, searching and playing back notes | |
DE60034772T2 (de) | Rejection method in speech recognition | |
EP1884924A1 (de) | Method for generating a context-based speech dialog output in a speech dialog system | |
WO2001067435A9 (de) | Method for the voice-controlled initiation of actions executable in a device by a limited group of users | |
DE60214850T2 (de) | Pattern processing system specific to a user group | |
DE4111995A1 (de) | Circuit arrangement for speech recognition | |
EP1249016B1 (de) | Method for the voice-controlled identification of the user of a telecommunications line in the telecommunications network during a dialog with a voice-controlled dialog system | |
WO2001086634A1 (de) | Method for generating a speech database for a target vocabulary for training a speech recognition system | |
DE19532114C2 (de) | Speech dialog system for the automated output of information | |
US20010056345A1 (en) | Method and system for speech recognition of the alphabet | |
WO1993002448A1 (de) | Method and arrangement for recognizing individual words of spoken language | |
DE10220522B4 (de) | Method and system for processing speech data by means of speech recognition and frequency analysis | |
DE19851287A1 (de) | Data processing system or communication terminal with a device for recognizing spoken language and method for recognizing certain acoustic objects | |
DE19912405A1 (de) | Determination of a regression class tree structure for speech recognizers | |
DE10229207B3 (de) | Method for natural speech recognition based on a generative transformation/phrase structure grammar | |
EP1457966A1 (de) | Method for determining the risk of confusion of vocabulary entries in phoneme-based speech recognition | |
EP1063633A2 (de) | Method for training an automatic speech recognizer |
Legal Events
Code | Title | Description |
---|---|---|
AK | Designated states | Kind code of ref document: A1; Designated state(s): BR CA CN HU JP KR PL RU US |
AL | Designated countries for regional patents | Kind code of ref document: A1; Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR |
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101) | |
WWE | WIPO information: entry into national phase | Ref document number: 2001921173; Country of ref document: EP |
WWE | WIPO information: entry into national phase | Ref document number: 018061699; Country of ref document: CN. Ref document number: 10220906; Country of ref document: US |
AK | Designated states | Kind code of ref document: C2; Designated state(s): BR CA CN HU JP KR PL RU US |
AL | Designated countries for regional patents | Kind code of ref document: C2; Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR |
WWP | WIPO information: published in national office | Ref document number: 2001921173; Country of ref document: EP |
NENP | Non-entry into the national phase | Ref country code: JP |