WO2003017252A1 - Verfahren und vorrichtung zum erkennen einer phonetischen lautfolge oder zeichenfolge - Google Patents
Verfahren und vorrichtung zum erkennen einer phonetischen lautfolge oder zeichenfolge Download PDFInfo
- Publication number
- WO2003017252A1 WO2003017252A1 PCT/EP2001/009353 EP0109353W WO03017252A1 WO 2003017252 A1 WO2003017252 A1 WO 2003017252A1 EP 0109353 W EP0109353 W EP 0109353W WO 03017252 A1 WO03017252 A1 WO 03017252A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sequence
- features
- feature
- phonetic
- combination
- Prior art date
Links
- 238000000034 method Methods 0.000 title claims abstract description 18
- 230000009471 action Effects 0.000 claims abstract description 11
- 238000013528 artificial neural network Methods 0.000 claims description 28
- 238000001514 detection method Methods 0.000 claims description 8
- 239000013598 vector Substances 0.000 claims description 6
- 230000015572 biosynthetic process Effects 0.000 claims description 2
- 230000008859 change Effects 0.000 claims description 2
- 230000008094 contradictory effect Effects 0.000 claims description 2
- 238000011156 evaluation Methods 0.000 claims 2
- 230000001537 neural effect Effects 0.000 abstract description 3
- 238000012360 testing method Methods 0.000 description 3
- 238000004590 computer program Methods 0.000 description 2
- 230000006870 function Effects 0.000 description 2
- 210000002569 neuron Anatomy 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- 238000004891 communication Methods 0.000 description 1
- 238000004883 computer application Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000018109 developmental process Effects 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 108090000623 proteins and genes Proteins 0.000 description 1
- 238000012549 training Methods 0.000 description 1
- 230000001960 triggered effect Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/16—Speech classification or search using artificial neural networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G10L2015/025—Phonemes, fenemes or fenones being the recognition units
Definitions
- the present invention relates to a method and an apparatus for recognizing a sequence of phonetic sounds or characters, e.g. a character string according to the ASCII standard.
- control commands are entered in a predetermined manner, which is defined by the command language of the computer software. Entries that do not exactly correspond to the specified command sets are simply not recognized and processed. The input can be made either by entering a character string on the keyboard or by speaking a control command into a microphone that is connected to a speech recognition system. Regardless of what type of control command is entered, it is in any case necessary that the entered or recognized control command corresponds exactly to a control command specified by the computer software.
- US Pat. No. 5,737,485 shows a device for speech recognition in which feature extractors are provided both behind a microphone for short-range communication and behind a microphone arrangement for room detection, which form a feature sequence from a phonetic sound sequence.
- the feature sequences of the close-range and room detection are fed to a neural network, which is intended to recognize correlations between disturbed features and largely undisturbed features in a learning phase and later, in accordance with the knowledge learned, to replace disturbed features with largely undisturbed features before the features are supplied to a speech recognition stage.
- the disadvantage of this device is the inability of the system to react dynamically to changes in sound generation; the recognition of indistinctly spoken words is not improved by the system.
- DE 198 04 603 shows a speech recognition system that generates a feature sequence in the form of test signals from a phonetic sound sequence. These test signals are then compared with lexical knowledge in the form of reference signals, with sequences of reference signals representing words that are obtained from a training signal. Speech recognition is to be improved in that the features in the form of the test signals are compared not only with the vocabulary learned in this way but also with learned combinations of sentence elements that are more likely to occur. The accuracy of the recognition is improved in this way.
- the input character string or phonetic sound sequence is fed to a neural network in which the character string is broken down into individual features either on the basis of stored phonetic, semantic and/or lexical information or merely on the basis of separating marks (e.g. spaces, speech pauses).
- these features can be short character strings/words or certain sounds and combinations of sounds.
- the neural network compiles combinations of features, combined with lexical, semantic and/or phonetic information and taking into account the information from a lexicon, in such a way that they have a defined meaning content.
- the neural network forms many of these combinations of features and compares them in the form of time-coded neurons with the sequence of features.
- the defined information content can then, if necessary, be transformed into an executable command according to a predefined command set of computer software, which causes an action such as output, query, etc.
- the sequence of features formed in the neural network is redefined, for example by delimiting and combining the related sounds, i.e. features, in the sound sequence. With the newly formed feature sequence, a comparison with the feature combinations found by the neural network can then be carried out again.
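The segmentation of the input into a feature sequence and its comparison with candidate feature combinations, as described above, can be sketched as follows; the whitespace split and the position-wise match score are illustrative assumptions standing in for the patent's neural comparison:

```python
def segment(input_str):
    # Split the character string into individual features at separating
    # marks (for a phonetic sound sequence, speech pauses would play
    # this role).
    return input_str.split()

def match_score(features, combination):
    # Count positions at which the feature sequence and a candidate
    # feature combination agree.
    return sum(1 for f, c in zip(features, combination) if f == c)

def recognize(input_str, combinations):
    # Choose the candidate feature combination that best matches
    # the feature sequence formed from the input.
    features = segment(input_str)
    return max(combinations, key=lambda c: match_score(features, c))
```

For example, `recognize("open the file", [...])` would select whichever stored combination shares the most features with the input, mirroring the re-comparison step above.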
- the matching part can be shown as recognized on a display or output unit, and a new entry of the unrecognized part can be requested.
- a temporary storage area is provided in the storage area for the lexical and/or semantic knowledge, in which matching features and combinations of features recognized during the previous recognition activity are stored and which the neural network preferentially accesses when forming the combinations of features.
- the system focuses on a specific topic or statement area. This considerably improves the recognition of the characters or sequence of sounds.
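The temporary storage area described above behaves like a recency-ordered cache of feature combinations. The following sketch illustrates the idea; the class name, capacity, and eviction policy are assumptions, not details given in the patent:

```python
from collections import OrderedDict

class FeatureCache:
    """Temporary storage area for recently recognized feature
    combinations, consulted most-recent-first so that recognition
    stays focused on the current topic."""

    def __init__(self, capacity=32):
        self.capacity = capacity
        self._store = OrderedDict()

    def remember(self, combination):
        key = tuple(combination)
        self._store.pop(key, None)       # re-remembering moves it to the end
        self._store[key] = list(combination)
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict the oldest entry

    def preferred(self):
        # Most recently recognized combinations first.
        return list(self._store.values())[::-1]
```

A recognizer would consult `preferred()` before the full lexicon, which is one way to realize the topic-focusing effect described above.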
- character sequences or sound sequences can be stored together with their meaning content and semantic information, so that it is possible for the neural network, on the basis of the semantic information, to generate a meaningful information content for the compiled combination of features that comes close to the feature sequence in terms of sound or characters.
- the proximity can be recognized by characteristic vowel sequences, consonant sequences or vowel / consonant combinations.
- the recognition of the phonetic sound sequence or character string requires an input device into which a text can be spoken or, for example, entered via a keyboard. While for an entered character string the feature sequence is given by delimiters such as spaces, in the case of a phonetic sound sequence the feature sequence must be formed by the neural network itself. For this purpose it uses the lexical and semantic knowledge stored in the memory areas accessed by the neural network.
- the formation of the feature sequence can be carried out using associative word recognition, in which sound or character sequences are formed from sound or character components whose presence in the sound or character sequence is represented by a vector.
- the vector can also contain statements, e.g. of a semantic type, and/or statements in connection with given control parameters of a computer software.
- the recognition can then be realized by comparing the vector of the entered sound or character string with stored vectors.
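The vector representation and comparison described above can be illustrated with character bigrams as the components and cosine similarity as the comparison; both choices are assumptions for this sketch, since the patent fixes neither the component type nor the metric:

```python
from collections import Counter
import math

def to_vector(sequence, n=2):
    # Represent the presence of sound/character components in the
    # sequence; character bigrams stand in for the components here.
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))

def similarity(v1, v2):
    # Cosine similarity between two component vectors.
    dot = sum(count * v2[component] for component, count in v1.items())
    norm = (math.sqrt(sum(c * c for c in v1.values()))
            * math.sqrt(sum(c * c for c in v2.values())))
    return dot / norm if norm else 0.0

def recognize_by_vector(entry, stored):
    # Compare the vector of the entered string with stored vectors.
    v = to_vector(entry)
    return max(stored, key=lambda s: similarity(v, to_vector(s)))
```

Because the vector records component presence rather than exact spelling, a slightly misspelled or mispronounced entry still lands near its stored counterpart, which is the point of the associative recognition above.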
- when the combination of features is formed by the neural network, it can ultimately be formed directly on the basis of a meaning content which, as a predetermined control command of a computer software, immediately initiates a certain action.
- the feature comparison can, however, also be carried out with combinations of features that contain any meaningful statement at all. The matching combination of features can then be checked to determine whether its meaningful statement corresponds to a given control command.
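The subsequent check of a meaningful statement against the permissible command set can be sketched as a simple normalized lookup; the example commands and the normalization step are assumptions, not taken from the patent:

```python
# Assumed example command set; the patent leaves the concrete commands
# to the computer software of the output interface.
COMMAND_SET = {"show departures", "book ticket", "query schedule"}

def to_command(statement, command_set=COMMAND_SET):
    # Check whether the meaningful statement of the matched feature
    # combination corresponds to a permissible control command.
    normalized = " ".join(statement.lower().split())
    return normalized if normalized in command_set else None
```

A `None` result would mean the statement, although meaningful, cannot be transformed into an executable command of the output interface.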
- Figure 1 shows a recognition device 10 for phonetic sound sequences
- the input device 12 can be, for example, a keyboard or a microphone.
- the phonetic sound sequence or character sequence 1 is fed to a neural network 14, which is equipped with a memory 16 for lexical and semantic knowledge.
- the neural network is also connected to fuzzy logic 20, which serves to process features or combinations of features that are not recognized as matching. Furthermore, the neural network 14 is connected to an output interface 22, within which permissible command sets for desired actions, for example an output or a query or the like, are stored.
- the operation of the recognition device from FIG. 1 is described below.
- the character string or phonetic string transferred into the neural network 14 is divided into a sequence of features.
- this structuring can be based either on separating marks in a character string or, for example, on pauses in speech in a phonetic sound sequence. However, the structuring can also be based on information from the memory 16.
- the subdivided individual components of the sequence form features, so that the phonetic or character sequence is converted into a feature sequence that can be examined for its meaning in the neural network. This recognition of the content of meaning is realized by a comparison with time-coded neurons.
- the neural network creates hypotheses in the form of combinations of features that are similar to the sequence of features and have a defined meaning.
- the neural network forms a large number of such combinations of features, which are compared over time with the feature sequence of the phonetic or character string. In Figure 2, such a comparison scheme is shown: the abscissa shows a list of the features to be compared, in the present example 1 to 6, and the ordinate represents a time axis showing the comparison of the feature sequence of the phonetic or character string with the feature combinations of different hypotheses H1 to H4. For each hypothesis, the number of contradictions with the features of the feature sequence and/or with the lexical/semantic knowledge and/or with the command set of the data interface 22 is determined.
- hypotheses H1 to H4 are compared with the sequence of features. After the first comparison, the hypothesis with the most contradictions is eliminated. At the second point in time, the remaining hypotheses are compared again, possibly taking new parameters into account (command set compatibility, etc.). This in turn leads to the elimination of the most contradictory hypothesis.
- the comparisons are carried out until a hypothesis (combination of features) remains that shows good agreement with the feature sequence and few contradictions.
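The stepwise elimination of hypotheses described above can be sketched as follows; the contradiction measure here counts only position-wise disagreements with the feature sequence, which is a simplifying assumption (the patent also counts contradictions with the lexical/semantic knowledge and the command set):

```python
def contradictions(features, hypothesis):
    # Illustrative contradiction measure: positions at which the
    # hypothesis disagrees with the feature sequence.
    return sum(1 for f, h in zip(features, hypothesis) if f != h)

def eliminate_hypotheses(features, hypotheses):
    # At each point in time, drop the hypothesis with the most
    # contradictions until a single feature combination remains.
    remaining = list(hypotheses)
    while len(remaining) > 1:
        worst = max(remaining, key=lambda h: contradictions(features, h))
        remaining.remove(worst)
    return remaining[0]
```

The surviving combination is the one with good agreement and few contradictions, matching the termination condition above.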
- the meaning content assigned to the remaining hypothesis is thus defined. In the comparison it was already taken into account as a parameter whether the statement of the combination of features corresponds to a command from the output interface 22 or can be transformed into such a command.
- recognition of the unrecognized part of the feature sequence can be attempted by feeding it into the fuzzy logic 20, which, in conjunction with the lexical and semantic knowledge from the memory 16, weights the similarity of the unrecognized part to known features and combinations of features.
- the sequence of the features can be taken into account as an additional parameter when weighting the similarity.
- unrecognized phonetic or character strings may still be recognized by interpreting the sequence. If this method does not lead to a result either, the recognized character or sound component can be shown on a display and the meaning, description or changed input of the undetected phonetic or character string can be asked. In this way, the operator is taught which statements they need to clarify.
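The fuzzy weighting of an unrecognized part against known features can be sketched with a generic string-similarity measure; `difflib` and the 0.6 threshold are illustrative assumptions and do not describe the patent's fuzzy logic 20:

```python
import difflib

def fuzzy_recover(unrecognized, known_features, threshold=0.6):
    # Weight each known feature by its similarity to the unrecognized
    # part; propose the best match, or None so that the operator can
    # be asked for a clarified or changed entry.
    scored = [(difflib.SequenceMatcher(None, unrecognized, f).ratio(), f)
              for f in known_features]
    score, best = max(scored)
    return best if score >= threshold else None
```

Returning `None` corresponds to the fallback described above: the recognized part is displayed and the operator is asked about the unrecognized remainder.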
- in this way it can also be checked whether one of the hypotheses corresponds to the command set of an output or action interface 22.
- the output interface 22 can be the input area of a computer program or different computer programs, by means of which different actions can be initiated.
- the recognition device 10 could be used in an inquiry or information terminal of an airport or a train station.
- such a recognition device can also be used to control terminals or computers in order to enable access to databases, the system being able to establish new assignments between different data on the basis of the entered/spoken links.
- Such a terminal could therefore not only be used to output data, but also to put together new data or to generate new data.
- the device preferably contains a display for displaying the command recognized or derived from the character or phonetic sequence, which may have to be confirmed by the operator before the action is carried out. In this way, the triggering of wrong actions can be avoided.
Landscapes
- Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Claims
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/EP2001/009353 WO2003017252A1 (de) | 2001-08-13 | 2001-08-13 | Verfahren und vorrichtung zum erkennen einer phonetischen lautfolge oder zeichenfolge |
US10/486,847 US7966177B2 (en) | 2001-08-13 | 2001-08-13 | Method and device for recognising a phonetic sound sequence or character sequence |
EP01974180A EP1417678A1 (de) | 2001-08-13 | 2001-08-13 | Verfahren und vorrichtung zum erkennen einer phonetischen lautfolge oder zeichenfolge |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/EP2001/009353 WO2003017252A1 (de) | 2001-08-13 | 2001-08-13 | Verfahren und vorrichtung zum erkennen einer phonetischen lautfolge oder zeichenfolge |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2003017252A1 true WO2003017252A1 (de) | 2003-02-27 |
Family
ID=8164543
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/EP2001/009353 WO2003017252A1 (de) | 2001-08-13 | 2001-08-13 | Verfahren und vorrichtung zum erkennen einer phonetischen lautfolge oder zeichenfolge |
Country Status (3)
Country | Link |
---|---|
US (1) | US7966177B2 (de) |
EP (1) | EP1417678A1 (de) |
WO (1) | WO2003017252A1 (de) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE102007005560A1 (de) | 2007-01-24 | 2008-07-31 | Sänger, Bernhard | Verfahren zum Betreiben einer Bohrvorrichtung für geologische Strukturen, Verfahren zum Erkennen von geologischen Strukturen sowie Bohrvorrichtung für geologische Strukturen |
WO2015082723A1 (en) | 2013-12-06 | 2015-06-11 | Mic Ag | Pattern recognition system and method |
US10119845B2 (en) | 2014-04-18 | 2018-11-06 | Mic Ag | Optical fibre sensor system |
Families Citing this family (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8073681B2 (en) | 2006-10-16 | 2011-12-06 | Voicebox Technologies, Inc. | System and method for a cooperative conversational voice user interface |
US7818176B2 (en) | 2007-02-06 | 2010-10-19 | Voicebox Technologies, Inc. | System and method for selecting and presenting advertisements based on natural language processing of voice-based input |
US8475815B2 (en) * | 2007-10-29 | 2013-07-02 | Ayman Boutros | Alloplastic injectable dermal filler and methods of use thereof |
US8140335B2 (en) | 2007-12-11 | 2012-03-20 | Voicebox Technologies, Inc. | System and method for providing a natural language voice user interface in an integrated voice navigation services environment |
US9305548B2 (en) | 2008-05-27 | 2016-04-05 | Voicebox Technologies Corporation | System and method for an integrated, multi-modal, multi-device natural language voice services environment |
US8918383B2 (en) * | 2008-07-09 | 2014-12-23 | International Business Machines Corporation | Vector space lightweight directory access protocol data search |
DE102008046339A1 (de) * | 2008-09-09 | 2010-03-11 | Giesecke & Devrient Gmbh | Freigabe von Transaktionsdaten |
US8326637B2 (en) | 2009-02-20 | 2012-12-04 | Voicebox Technologies, Inc. | System and method for processing multi-modal device interactions in a natural language voice services environment |
EP2221805B1 (de) * | 2009-02-20 | 2014-06-25 | Nuance Communications, Inc. | Verfahren zum automatisierten Training einer Vielzahl künstlicher neuronaler Netzwerke |
WO2014189486A1 (en) | 2013-05-20 | 2014-11-27 | Intel Corporation | Natural human-computer interaction for virtual personal assistant systems |
CN107003996A (zh) | 2014-09-16 | 2017-08-01 | 声钰科技 | 语音商务 |
US9898459B2 (en) | 2014-09-16 | 2018-02-20 | Voicebox Technologies Corporation | Integration of domain information into state transitions of a finite state transducer for natural language processing |
WO2016061309A1 (en) | 2014-10-15 | 2016-04-21 | Voicebox Technologies Corporation | System and method for providing follow-up responses to prior natural language inputs of a user |
US10431214B2 (en) | 2014-11-26 | 2019-10-01 | Voicebox Technologies Corporation | System and method of determining a domain and/or an action related to a natural language input |
US10614799B2 (en) | 2014-11-26 | 2020-04-07 | Voicebox Technologies Corporation | System and method of providing intent predictions for an utterance prior to a system detection of an end of the utterance |
US20180033425A1 (en) * | 2016-07-28 | 2018-02-01 | Fujitsu Limited | Evaluation device and evaluation method |
US10331784B2 (en) | 2016-07-29 | 2019-06-25 | Voicebox Technologies Corporation | System and method of disambiguating natural language processing requests |
CN108133706B (zh) * | 2017-12-21 | 2020-10-27 | 深圳市沃特沃德股份有限公司 | 语义识别方法及装置 |
RU2692051C1 (ru) | 2017-12-29 | 2019-06-19 | Общество С Ограниченной Ответственностью "Яндекс" | Способ и система для синтеза речи из текста |
CN111160003B (zh) * | 2018-11-07 | 2023-12-08 | 北京猎户星空科技有限公司 | 一种断句方法及装置 |
Family Cites Families (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5179624A (en) * | 1988-09-07 | 1993-01-12 | Hitachi, Ltd. | Speech recognition apparatus using neural network and fuzzy logic |
JP2764277B2 (ja) * | 1988-09-07 | 1998-06-11 | 株式会社日立製作所 | 音声認識装置 |
EP0435282B1 (de) * | 1989-12-28 | 1997-04-23 | Sharp Kabushiki Kaisha | Spracherkennungseinrichtung |
US5276741A (en) * | 1991-05-16 | 1994-01-04 | Trw Financial Systems & Services, Inc. | Fuzzy string matcher |
US5440651A (en) * | 1991-06-12 | 1995-08-08 | Microelectronics And Computer Technology Corp. | Pattern recognition neural network |
KR100202425B1 (ko) * | 1992-08-27 | 1999-06-15 | 정호선 | 가전제품의 리모콘 명령어를 인식하기 위한 음성 인식 시스템 |
AU5803394A (en) * | 1992-12-17 | 1994-07-04 | Bell Atlantic Network Services, Inc. | Mechanized directory assistance |
DE4322372A1 (de) * | 1993-07-06 | 1995-01-12 | Sel Alcatel Ag | Verfahren und Vorrichtung zur Spracherkennung |
US5528728A (en) * | 1993-07-12 | 1996-06-18 | Kabushiki Kaisha Meidensha | Speaker independent speech recognition system and method using neural network and DTW matching technique |
US5727124A (en) * | 1994-06-21 | 1998-03-10 | Lucent Technologies, Inc. | Method of and apparatus for signal recognition that compensates for mismatching |
US5640490A (en) * | 1994-11-14 | 1997-06-17 | Fonix Corporation | User independent, real-time speech recognition system and method |
US5638487A (en) * | 1994-12-30 | 1997-06-10 | Purespeech, Inc. | Automatic speech recognition |
US5737485A (en) * | 1995-03-07 | 1998-04-07 | Rutgers The State University Of New Jersey | Method and apparatus including microphone arrays and neural networks for speech/speaker recognition systems |
US5749066A (en) * | 1995-04-24 | 1998-05-05 | Ericsson Messaging Systems Inc. | Method and apparatus for developing a neural network for phoneme recognition |
US6026177A (en) * | 1995-08-29 | 2000-02-15 | The Hong Kong University Of Science & Technology | Method for identifying a sequence of alphanumeric characters |
US5799276A (en) * | 1995-11-07 | 1998-08-25 | Accent Incorporated | Knowledge-based speech recognition system and methods having frame length computed based upon estimated pitch period of vocalic intervals |
US5937383A (en) * | 1996-02-02 | 1999-08-10 | International Business Machines Corporation | Apparatus and methods for speech recognition including individual or speaker class dependent decoding history caches for fast word acceptance or rejection |
EP0859332A1 (de) * | 1997-02-12 | 1998-08-19 | STMicroelectronics S.r.l. | Einrichtung und Verfahren zum Erkennen von Wörtern |
US6138098A (en) * | 1997-06-30 | 2000-10-24 | Lernout & Hauspie Speech Products N.V. | Command parsing and rewrite system |
DE19804603A1 (de) | 1998-02-06 | 1999-08-12 | Philips Patentverwaltung | Verfahren zum Ermitteln von Wörtern in einem Sprachsignal |
ITTO980383A1 (it) * | 1998-05-07 | 1999-11-07 | Cselt Centro Studi Lab Telecom | Procedimento e dispositivo di riconoscimento vocale con doppio passo di riconoscimento neurale e markoviano. |
US6208963B1 (en) * | 1998-06-24 | 2001-03-27 | Tony R. Martinez | Method and apparatus for signal classification using a multilayer network |
WO2000038175A1 (en) * | 1998-12-21 | 2000-06-29 | Koninklijke Philips Electronics N.V. | Language model based on the speech recognition history |
JP2000221990A (ja) * | 1999-01-28 | 2000-08-11 | Ricoh Co Ltd | 音声認識装置 |
US6374217B1 (en) * | 1999-03-12 | 2002-04-16 | Apple Computer, Inc. | Fast update implementation for efficient latent semantic language modeling |
US7177798B2 (en) * | 2000-04-07 | 2007-02-13 | Rensselaer Polytechnic Institute | Natural language interface using constrained intermediate dictionary of results |
GB0028277D0 (en) * | 2000-11-20 | 2001-01-03 | Canon Kk | Speech processing system |
US6937983B2 (en) * | 2000-12-20 | 2005-08-30 | International Business Machines Corporation | Method and system for semantic speech recognition |
US20020087317A1 (en) * | 2000-12-29 | 2002-07-04 | Lee Victor Wai Leung | Computer-implemented dynamic pronunciation method and system |
-
2001
- 2001-08-13 EP EP01974180A patent/EP1417678A1/de not_active Ceased
- 2001-08-13 WO PCT/EP2001/009353 patent/WO2003017252A1/de active Application Filing
- 2001-08-13 US US10/486,847 patent/US7966177B2/en not_active Expired - Fee Related
Non-Patent Citations (3)
Title |
---|
ALVAREZ-CERCADILLO J ET AL: "Context modeling using RNN for keyword detection", ICASSP-93. 1993 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (CAT. NO.92CH3252-4), PROCEEDINGS OF ICASSP '93, MINNEAPOLIS, MN, USA, 27-30 APRIL 1993, 1993, New York, NY, USA, IEEE, USA, pages 569 - 572 vol.1, XP002198612, ISBN: 0-7803-0946-4 * |
CASTANO I A ET AL: "PRELIMINARY EXPERIMENTS FOR AUTOMATIC SPEECH UNDERSTANDING THROUGH SIMPLE RECURRENT NETWORKS", 4TH EUROPEAN CONFERENCE ON SPEECH COMMUNICATION AND TECHNOLOGY. EUROSPEECH '95. MADRID, SPAIN, SEPT. 18 - 21, 1995, EUROPEAN CONFERENCE ON SPEECH COMMUNICATION AND TECHNOLOGY. (EUROSPEECH), MADRID: GRAFICAS BRENS, ES, vol. 3 CONF. 4, 18 September 1995 (1995-09-18), pages 1673 - 1676, XP000855024 * |
OPPIZZI O ET AL: "Rescoring under fuzzy measures with a multilayer neural network in a rule-based speech recognition system", 1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (CAT. NO.97CB36052), 1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, MUNICH, GERMANY, 21-24 APRIL 1997, 1997, Los Alamitos, CA, USA, IEEE Comput. Soc. Press, USA, pages 1723 - 1726 vol.3, XP002198611, ISBN: 0-8186-7919-0 * |
Also Published As
Publication number | Publication date |
---|---|
EP1417678A1 (de) | 2004-05-12 |
US20040199389A1 (en) | 2004-10-07 |
US7966177B2 (en) | 2011-06-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2003017252A1 (de) | Verfahren und vorrichtung zum erkennen einer phonetischen lautfolge oder zeichenfolge | |
DE60111329T2 (de) | Anpassung des phonetischen Kontextes zur Verbesserung der Spracherkennung | |
EP1927980B1 (de) | Verfahren zur Klassifizierung der gesprochenen Sprache in Sprachdialogsystemen | |
DE69829235T2 (de) | Registrierung für die Spracherkennung | |
DE3337353C2 (de) | Sprachanalysator auf der Grundlage eines verborgenen Markov-Modells | |
DE69818231T2 (de) | Verfahren zum diskriminativen training von spracherkennungsmodellen | |
EP0925578B1 (de) | Sprachverarbeitungssystem und verfahren zur sprachverarbeitung | |
DE60016722T2 (de) | Spracherkennung in zwei Durchgängen mit Restriktion des aktiven Vokabulars | |
DE69908047T2 (de) | Verfahren und System zur automatischen Bestimmung von phonetischen Transkriptionen in Verbindung mit buchstabierten Wörtern | |
DE60313706T2 (de) | Spracherkennungs- und -antwortsystem, Spracherkennungs- und -antwortprogramm und zugehöriges Aufzeichnungsmedium | |
EP0994461A2 (de) | Verfahren zur automatischen Erkennung einer buchstabierten sprachlichen Äusserung | |
EP1139333A2 (de) | Spracherkennungsverfahren und Spracherkennungsvorrichtung | |
DE602004006641T2 (de) | Audio-dialogsystem und sprachgesteuertes browsing-verfahren | |
WO2006111230A1 (de) | Verfahren zur gezielten ermittlung eines vollständigen eingabedatensatzes in einem sprachdialogsystem | |
EP1182646A2 (de) | Verfahren zur Zuordnung von Phonemen | |
WO2000005709A1 (de) | Verfahren und vorrichtung zur erkennung vorgegebener schlüsselwörter in gesprochener sprache | |
EP0813734B1 (de) | Verfahren zur erkennung mindestens eines definierten, durch hidden-markov-modelle modellierten musters in einem zeitvarianten messignal, welches von mindestens einem störsignal überlagert wird | |
EP1039447B1 (de) | Bestimmung einer Regressionsklassen-Baumstruktur für einen Spracherkenner | |
EP0817167B1 (de) | Spracherkennungsverfahren und Anordnung zum Durchführen des Verfahrens | |
DE10006725A1 (de) | Verfahren und Vorrichtung zum Erkennen einer phonetischen Lautfolge oder Zeichenfolge | |
DE102016125162B4 (de) | Verfahren und Vorrichtung zum maschinellen Verarbeiten von Texten | |
EP1345208A2 (de) | Automatische Detektion von Sprecherwechseln in sprecheradaptiven Spracherkennungssystemen | |
DE10308611A1 (de) | Ermittlung der Verwechslungsgefahr von Vokabulareinträgen bei der phonembasierten Spracherkennung | |
EP1400951A2 (de) | Verfahren zur rechnergestützten Spracherkennung, Spracherkennungssystem und Steuereinrichtung zum Steuern eines technischen Systems und Telekommunikationsgerät | |
EP3570189B1 (de) | Computerimplementiertes verfahren zum bereitstellen eines adaptiven dialogsystems und ein adaptives dialogsystem |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AE AG AL AU BA BB BG BR BZ CA CO CR CU CZ DM DZ EC EE GD GE HU ID IL IN IS JP KP KR LC LK LR LT MA MG MK MN MX NO NZ PL RO SG SK TT UA US UZ VN YU Kind code of ref document: A1 Designated state(s): AE AG AL AU BA BB BG BR BZ CA CN CO CR CU CZ DM DZ EC EE GD GE HR HU ID IL IN IS JP KP KR LC LK LR LT LV MA MG MK MN MX NO NZ PL RO SG SI SK TT UA US UZ VN YU ZA |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GH GM KE LS MW MZ SD SL SZ UG ZW AM AZ BY KG KZ MD TJ TM AT BE CH CY DE DK ES FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW MR NE SN TD TG Kind code of ref document: A1 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
WWE | Wipo information: entry into national phase |
Ref document number: 10486847 Country of ref document: US |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2001974180 Country of ref document: EP |
|
WWP | Wipo information: published in national office |
Ref document number: 2001974180 Country of ref document: EP |
|
NENP | Non-entry into the national phase |
Ref country code: JP |