EP1285434A1 - Dynamic language models for speech recognition - Google Patents

Dynamic language models for speech recognition

Info

Publication number
EP1285434A1
Authority
EP
European Patent Office
Prior art keywords
node
branch
language model
voice recognition
development
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP01936519A
Other languages
English (en)
French (fr)
Inventor
Frédéric Soufflet (Thomson Multimedia)
Serge Le Huitouze (Thomson Multimedia)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
THOMSON LICENSING
Original Assignee
Thomson Licensing SAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Thomson Licensing SAS filed Critical Thomson Licensing SAS
Priority to EP01936519A priority Critical patent/EP1285434A1/de
Publication of EP1285434A1 publication Critical patent/EP1285434A1/de
Withdrawn legal-status Critical Current

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/08: Speech classification or search
    • G10L15/18: Speech classification or search using natural language modelling
    • G10L15/183: Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G10L15/19: Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
    • G10L15/193: Formal grammars, e.g. finite state automata, context free grammars or word networks

Definitions

  • the present invention relates to the field of voice recognition. More specifically, the invention relates to wide vocabulary voice interfaces.
  • Information and control systems are increasingly using a voice interface to make interaction with the user quick and intuitive.
  • as these systems become more complex, the styles of dialogue supported are becoming richer, and we are entering the field of continuous speech recognition with a very wide vocabulary.
  • the language model therefore allows the voice processing module to construct the most likely sentence (that is to say, sequence of words) in relation to the acoustic signal presented to it. This sentence must then be analyzed by a comprehension module in order to transform it into a series of appropriate actions at the level of the voice-controlled system.
  • This approach has the advantage of minimizing the apparent cost of execution, since the grammar is transformed once and for all before execution (by a compilation process) into an internal representation perfectly tailored to the needs of the voice processing module.
  • it has the drawback of building a representation (an automaton) which can consume a great deal of memory in the case of complex grammars, which can pose resource problems on the executing computer system, and may even slow execution if paging of the execution system's virtual memory becomes too frequent.
  • the invention aims in particular to overcome these drawbacks of the prior art.
  • an objective of the invention is to provide a system and a method of voice recognition optimizing the use of memory, in particular for applications with a large vocabulary.
  • the invention also aims to reduce the costs of implementation or use.
  • An additional objective of the invention is to provide a method allowing energy saving, in particular when the method is implemented in a device with an autonomous energy source (for example an infrared remote control or a mobile telephone).
  • An objective of the invention is also to improve the speed of speech recognition.
  • the invention proposes a voice recognition method, remarkable in that it includes a voice recognition step taking into account at least one grammatical language model and implementing a decoding algorithm intended to identify a series of words from a series of vocal samples, the language model being associated with at least one state machine, finite or infinite, dynamically developed.
  • the finite state machine(s) are developed dynamically, in particular as a function of need, as opposed to statically developed machines, which are developed completely and systematically.
  • the method is remarkable in that it comprises a step of dynamic development in width of the automaton (s) from at least one grammar defining a language model.
  • the method is remarkable in that it comprises a step of constructing at least part of an automaton comprising at least one branch, each branch comprising at least one node, the step of construction comprising a sub-step of selective development of the node (s), according to a predetermined rule.
  • the method does not develop all the nodes systematically, but selectively, according to a predetermined rule.
  • the method is remarkable in that the algorithm includes a step of requesting the development of at least one undeveloped node allowing development of the node or nodes according to the predetermined rule.
  • the method advantageously allows the development of the nodes required by the algorithm itself according to its needs, linked in particular to the incoming acoustic information.
  • if a passage through a node is considered unlikely, the algorithm will not require the development of this node; conversely, a likely passage through this node will lead to its development.
  • the method is remarkable in that, according to the predetermined rule, for each branch, each first node of the branch is developed.
  • the method systematically authorizes the development of the first node of each branch originating from a developed node.
  • the method is remarkable in that, for at least one branch comprising a first node and at least one node following the first node, the construction step comprises a sub-step for replacing the following node(s) by a special undeveloped node.
  • the method advantageously only allows the development of necessary nodes, thus saving the resources of a device implementing the method.
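The selective-development rule can be pictured with a minimal Python sketch (an illustration only; the names `SpecialNode`, `build_branch` and the `develop` helper are assumptions, not the patent's code). Each branch develops its first node; any following nodes are folded into one undeveloped placeholder, the "DynX" special node of the figures, to be expanded only if the decoder later asks for it:

```python
class SpecialNode:
    """Undeveloped placeholder ('DynX') standing for the rest of a branch."""
    def __init__(self, deferred_symbols):
        self.deferred_symbols = deferred_symbols  # grammar symbols not yet built

def build_branch(symbols, develop):
    """Apply the predetermined rule to one branch.

    `symbols` is the branch's sequence of grammar symbols; `develop`
    (an assumed helper) expands one symbol into a concrete node.
    Only the first node is developed; the tail becomes one SpecialNode.
    """
    first = develop(symbols[0])
    if len(symbols) > 1:
        return [first, SpecialNode(symbols[1:])]
    return [first]
```

If the decoder never revisits the branch, the `SpecialNode` is simply discarded without ever having cost the memory of a full expansion.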
  • the method is remarkable in that the decoding algorithm is a maximum likelihood decoding algorithm.
  • the method is advantageously compatible with a maximum likelihood algorithm, such as in particular the Viterbi algorithm, thus allowing reliable speech recognition and a reasonable complexity of implementation, in particular in the case of applications with a large vocabulary.
  • the invention also relates to a voice recognition device, remarkable in that it comprises voice recognition means taking into account at least one grammatical language model and implementing a decoding algorithm intended to identify a series of words from a series of vocal samples, the language model being associated with a state machine, finite or infinite, dynamically developed.
  • the invention further relates to a computer program product comprising program elements, recorded on a medium readable by at least one microprocessor, remarkable in that the program elements control the microprocessor(s) so that they carry out a voice recognition step taking into account at least one grammatical language model and implementing a decoding algorithm intended to identify a series of words from a series of voice samples, the language model being associated with a state machine, finite or infinite, dynamically developed.
  • the invention also relates to a computer program product, remarkable in that the program comprises sequences of instructions adapted to the implementation of the voice recognition method described above when the program is executed on a computer.
  • FIG. 1 shows a general block diagram of a system comprising a voice-controlled unit, in which the technique of the invention is implemented;
  • FIG. 2 shows a block diagram of the voice recognition unit of the system of Figure 1;
  • Figure 3 describes an electronic diagram of a voice recognition unit implementing the block diagram of Figure 2;
  • FIG. 4 illustrates a static voice recognition automaton known per se;
  • FIG. 5 shows an algorithm for the dynamic development in width of a node, implemented by the unit of Figures 1 and 3;
  • FIGS. 6 to 10 illustrate requests for the development of a dynamic voice recognition network, according to the algorithm of FIG. 5.
  • the general principle of the invention is based on the replacement of the statically calculated automaton representation by a dynamic representation allowing progressive development of the grammar, which makes it possible to solve the size problem.
  • the invention consists in using a representation making it possible to develop sentence beginnings in a progressive manner.
  • FIG. 1 shows a general block diagram of a system comprising a voice-controlled unit 102 implementing the technique of the invention.
  • this system notably includes:
  • a voice source 100 which may in particular consist of a microphone intended to pick up a voice signal produced by a speaker;
  • a voice recognition unit 102;
  • a control unit 105 intended to control an apparatus 107;
  • a controlled device 107 for example of the television or video recorder type.
  • the source 100 is connected to the voice recognition unit 102 via a link 101.
  • the unit 102 can retrieve context information (such as, for example, the type of device 107 that can be controlled by the control unit 105, or the list of command codes) via a link 104, and send commands to the control unit 105 via a link 103.
  • the control unit 105 issues commands via a link 106, for example infrared, to the device 107.
  • the source 100, the voice recognition unit 102 and the control unit 105 are part of the same device, and thus the links 101, 103 and 104 are internal to the device.
  • the link 106 is typically a wireless link.
  • the elements 100, 102 and 105 are partly or completely separate and do not form part of the same device.
  • the links 101, 103 and 104 are external connections, wired or not.
  • the source 100, the boxes 102 and 105 and the device 107 are part of the same device and are connected to each other by internal buses (links 101, 103, 104 and 106).
  • This variant is particularly advantageous when the device is, for example, a telephone or portable telecommunication terminal.
  • FIG. 2 shows a block diagram of a voice-activated unit such as the unit 102 of FIG. 1.
  • the unit 102 receives from the outside the analog source wave 101, which is processed by an acoustic-phonetic decoder 200 or DAP (called a "front end" in English).
  • the DAP 200 samples the source wave 101 at regular intervals (typically every 10 ms) to produce real vectors, or vectors belonging to code books, typically representing oral resonances, which are emitted via a link 201 to a recognition engine 203.
  • an acousto-phonetic decoder translates the digital samples into acoustic symbols chosen from a predetermined alphabet.
  • a linguistic decoder processes these symbols in order to determine, for a sequence A of symbols, the most probable sequence W of words, given the sequence A.
  • the linguistic decoder comprises a recognition engine using an acoustic model and a language model.
  • the acoustic model is, for example, a so-called Hidden Markov Model (HMM). It calculates, in a manner known per se, the acoustic scores of the word sequences considered.
  • the language model implemented in this embodiment is based on a grammar described using syntax rules of Backus Naur form. The language model is used to determine a plurality of word sequence hypotheses and to calculate linguistic scores.
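As an illustration of such a grammar-based language model, the rules can be held as plain data. The sketch below is hypothetical (loosely modelled on the "what is there this afternoon on FR3?" example analysed later in the text, not the patent's actual grammar): each non-terminal maps to its alternative branches, each branch being a sequence of terminals and non-terminals.

```python
# Hypothetical Backus-Naur-style grammar held as a mapping:
# non-terminal -> list of alternative branches (sequences of symbols).
# Bracketed symbols are non-terminals; plain words are terminals.
GRAMMAR = {
    "<G>":             [["what is there", "<Date>", "<Channel>"]],
    "<Date>":          [["<Day>", "<DayComplement>"]],
    "<Day>":           [["this"], ["tomorrow"]],
    "<DayComplement>": [["noon"], ["evening"]],
    "<Channel>":       [["on", "<Channel2>"]],
    # "<Channel3>" is deliberately left undefined, mirroring the
    # truncated grammar fragment quoted later in the text.
    "<Channel2>":      [["FR3"], ["the", "<Channel3>"]],
}

def is_terminal(symbol):
    """A symbol is terminal when it is not a bracketed non-terminal."""
    return not (symbol.startswith("<") and symbol.endswith(">"))
```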
  • the recognition engine is based on a Viterbi type algorithm called "n-best".
  • the n-best algorithm determines at each stage of the analysis of a sentence the n most likely word sequences. At the end of the sentence, the most likely solution is chosen from among the n candidates, from the scores provided by the acoustic model and the language model.
  • the recognition engine uses a Viterbi type algorithm (n-best algorithm) to analyze a sentence composed of a sequence of acoustic symbols (vectors).
  • the algorithm determines the N most probable word sequences, given the sequence A of acoustic symbols observed up to the current symbol.
  • the most probable word sequences are determined through the stochastic grammar type language model.
  • the Viterbi algorithm is implemented in parallel, but instead of retaining a single transition to each state during iteration i, we retain for each state the N most likely transitions.
  • Information concerning in particular the Viterbi, beam search and "n-best" algorithms is given in the work:
  • the analysis performed by the recognition engine stops when all of the acoustic symbols relating to a sentence have been processed.
  • the recognition engine then has a trellis consisting of the states at each previous iteration of the algorithm and the transitions between these states, up to the final states. Finally, we retain among the final states and their N associated transitions the N most likely transitions. By retracing the transitions from the final states, the N most probable word sequences corresponding to the acoustic symbols are determined. These sequences are then subjected to a processing using a parser in order to select the unique final sequence on grammatical criteria.
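The per-state bookkeeping of the n-best search described above can be sketched as follows (a toy illustration under assumed data structures, not the engine's actual code): each state carries its N best-scoring partial paths, each iteration extends them along the transitions, and the result is pruned back to N per state.

```python
import heapq

def nbest_step(prev, transitions, n):
    """One iteration of an n-best Viterbi search (illustrative sketch).

    prev:        state -> list of (log-score, path) hypotheses
    transitions: (src, dst) -> log-probability of that transition
    Returns the hypothesis table for the next iteration, retaining at
    most the N most likely hypotheses per destination state.
    """
    nxt = {}
    for (src, dst), logp in transitions.items():
        for score, path in prev.get(src, []):
            nxt.setdefault(dst, []).append((score + logp, path + [dst]))
    # keep the N best hypotheses per state instead of the single survivor
    return {state: heapq.nlargest(n, hyps) for state, hyps in nxt.items()}
```

Iterating this step over the acoustic symbols of a sentence and then ranking the hypotheses at the final states yields the N candidate word sequences described above.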
  • the recognition engine 203 analyzes the real vectors it receives using, in particular, hidden Markov models (HMMs) and language models, which represent the probability that a word follows another word.
  • the recognition engine 203 supplies words which it has identified from the vectors received to a means for translating these words into commands which can be understood by the apparatus 107.
  • This means uses an artificial intelligence translation method which itself takes into account a context 104 provided by the control unit 105 before issuing one or more commands 103 to the control unit 105.
  • FIG. 3 schematically illustrates a voice recognition module or device 102 as illustrated with reference to FIG. 1, and implementing the block diagram of FIG. 2.
  • the unit 102 comprises, interconnected by an address and data bus: a voice interface 301; an analog-to-digital converter 302; a processor 304; a non-volatile memory 305; a random access memory 306; and an apparatus control interface 307.
  • Each of the elements illustrated in Figure 3 is well known to those skilled in the art; these common elements are not described here.
  • the word "register" designates, in each of the memories mentioned, both a low-capacity memory area (a few binary data) and a high-capacity memory area.
  • Non-volatile memory 305 stores in registers which, for convenience, have the same names as the data they store:
  • the RAM 306 stores data, variables and intermediate processing results, and includes in particular an automaton 313;
  • FIG. 4 illustrates a static voice recognition automaton, known per se, which makes it possible to describe a Viterbi trellis used for voice recognition. According to the state of the art, the entirety of this trellis is taken into account.
  • the corresponding automaton is developed in extenso according to FIG. 4 and comprises: nodes represented in a rectangular form, which are expanded; and terminal nodes in an elliptical form, which are not expanded and which correspond to a word or an expression of the current language.
  • the basic node 400 "G" is expanded into four nodes 401, 403, 404 and
  • node 406 (“Chain") is developed as an alternative:
  • this automaton, although corresponding to a small model, includes many developed states and leads to a Viterbi trellis already requiring significant memory and computational resources relative to the size of the model (note that the size of the trellis increases with the number of states of the automaton).
  • an automaton entirely statically calculated is replaced by an automaton calculated as and when the Viterbi algorithm needs it in order to determine the best path in this automaton. This is what is called "dynamic development in width", since the grammar is developed on all fronts deemed interesting in relation to the incoming acoustic information.
  • FIG. 5 describes an algorithm for dynamic development in width of a node capable of being expanded according to the invention. This algorithm is implemented by the processor 304 of the voice recognition device or module 102 as illustrated with reference to FIG. 3.
  • This algorithm is applied recursively to the nodes to be developed (as chosen by the Viterbi algorithm) to form an automaton comprising a developed node as its base, until all of the immediate successors are labeled by a Markovian model, i.e. it is necessary to recursively develop all the non-terminals in the left part of an automaton (assuming that the automaton is built from left to right, the first element of a branch therefore being found on the left).
  • the processor 304 dynamically uses: - the dictionary 310 associated with the non-terminal nodes (which makes it possible to obtain their definition); and - the dictionary 309 associated with the words (which makes it possible to obtain their HMM). It should be noted that such dictionaries are known per se since they are also used in the static construction of complete automata according to the state of the art. Thus, according to the invention, the special nodes introduced (called “DynX” in the figures) also refer to portions of dictionary definitions and are expanded to the bare minimum of requirements.
  • the processor 304 initializes working variables related to the node under consideration, in particular a branch counter i. Then, during a step 501, the processor 304 takes into account the i-th branch originating from a first development of the node considered, which becomes the active branch to be developed.
  • the processor 304 determines whether the first node of the active branch is a terminal node.
  • the processor 304 develops the first node of the active branch, on the basis of the algorithm defined with reference to FIG. 5 according to a recursive mechanism.
  • the processor 304 determines whether the active branch comprises a single node.
  • the processor 304 groups the following nodes of the branch i into a single special node DynX, which will only be developed later if necessary.
  • the execution of the Viterbi algorithm can indeed lead to this branch being eliminated if the probability of occurrence associated with the first node of the branch (materialized by the node metric in the trellis developed from the automaton) is too weak compared with one or more alternatives.
  • the development of the special node DynX is then not carried out, which saves microprocessor computation time and memory.
  • the processor 304 determines whether the active branch is the last branch resulting from the first development of the node considered.
  • during a step 508, the branch counter i is incremented by one and step 501 is repeated.
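The loop of steps 501 to 508 can be summarised in a short recursive sketch (assumed names and data layout, not the patent's code; `grammar` maps each non-terminal to its branches, as in the dictionaries 309 and 310). Each branch develops its first symbol, recursing while that symbol is a non-terminal so that every immediate successor ends up labelled by a word, and collapses any remaining symbols into one undeveloped "Dyn" placeholder:

```python
def develop(symbol, grammar, is_terminal):
    """Develop one node to the bare minimum (sketch of Figure 5's loop).

    Returns the node's branches; in each branch only the first symbol is
    built (recursively if it is a non-terminal), and the tail of the
    branch is replaced by a single ('Dyn', ...) placeholder node.
    """
    branches = []
    for alternative in grammar[symbol]:          # steps 501/508: next branch
        first, rest = alternative[0], alternative[1:]
        if is_terminal(first):                   # step 502: terminal test
            head = ("word", first)
        else:                                    # step 503: recursive development
            head = ("node", first, develop(first, grammar, is_terminal))
        if rest:                                 # steps 504/505: defer the tail
            branches.append([head, ("Dyn", tuple(rest))])
        else:
            branches.append([head])
    return branches
```

Calling `develop` again on the symbols stored inside a `("Dyn", ...)` placeholder, when the decoder requests it, reproduces the request-driven growth illustrated in Figures 6 to 10.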
  • this algorithm is applied to an acoustic input corresponding to the sentence "what is there this afternoon on FR3?", with the following grammar:
  • ⁇ Channel> the ⁇ Channel2>
  • the automaton will be built little by little, as and when the requests of the Viterbi algorithm are made.
  • when the Viterbi algorithm requires dynamic development from a state of the automaton, the development must continue until all the immediate successors are labeled by a Markovian model, that is to say, recursively develop all the non-terminals on the left side (example: in Figure 3, the development of ⟨Date⟩ is obviously necessary, but that of ⟨Day⟩ is also necessary in order to make the words "this" and "tomorrow" visible).
  • FIG. 6 shows the automaton resulting from the application to a first base node "G" 600, of the algorithm for developing a node presented with reference to FIG. 5, according to the invention.
  • node "G” 600 is broken down into a single branch.
  • the first node “what is it” 601 of this branch is a terminal node. It is therefore associated directly with the corresponding expression 603.
  • the branch contains at least one other node according to the grammar describing this node. We will therefore represent this branch in the form of a first node and a special Dyn1 node which is not developed.
  • the node 600 is broken down into a single branch. The development of the node
  • the first "Date" node 700 of this branch is not a terminal node. It is therefore developed recursively according to the development algorithm illustrated with reference to FIG. 5.
  • the node 700 is broken down into a single branch.
  • the first "Day" node 702 of this branch is not a terminal node. It is therefore itself developed.
  • the node 702 is broken down into two branches symbolizing an alternative.
  • the first node of each of these two branches respectively "ce” 704 and "tomorrow” 706 is a terminal node. It is therefore directly associated with the corresponding expression 705 and 707 respectively.
  • FIG. 8 presents the automaton resulting from the application to the special node Dyn3 703, of the algorithm for developing a node presented with reference to FIG. 5, according to the invention.
  • node 703 breaks down into a single branch.
  • the only “Day Complement" node 800 in this branch is not a terminal node. It is therefore developed recursively according to the development algorithm illustrated with reference to FIG. 5.
  • the node 800 is broken down into two branches symbolizing an alternative.
  • the first node of each of these two branches, respectively "noon" 801 and "evening" 804, is a terminal node. It is therefore associated directly with the corresponding expression, respectively 802 and 804.
  • FIG. 9 presents the automaton resulting from the application to the special node Dyn2 701, of the algorithm for developing a node presented with reference to FIG. 5, according to the invention.
  • the Viterbi algorithm, considering the beginning of sentence "what is it this noon" as likely, will require the development of node 703.
  • the node 701 breaks down into a single branch.
  • the first node "on" 901 of this branch is a terminal node. It is therefore associated directly with the corresponding expression 903.
  • since the branch contains more than one node, it is broken down into the developed terminal node "on" 901 and a special node Dyn4 902.
  • The development of node 701 is ended in this way and, in summary, the automaton resulting from node 701 thus constructed is defined, according to the formalism previously used, in the following way:
  • FIG. 10 presents the automaton resulting from the application to the special node Dyn4 902 of the algorithm for developing a node presented with reference to FIG. 5, according to the invention, the Viterbi algorithm considering the beginning of the sentence as likely.
  • the node 902 is broken down into two branches symbolizing an alternative.
  • the first node of each of these two branches respectively "la" 1000 and "FR3" 1004 is a terminal node. It is therefore associated directly with the corresponding expression 1002 and 1004 respectively.
  • the Viterbi algorithm eliminates the possibility of having the word "the" corresponding to the terminal node 1002, its probability of occurrence being very low compared with the alternative represented by the terminal node "FR3". It will therefore not request the development of the special node Dyn5 1001 which follows the node "la" 1002 on the same branch.
  • the expansion of the automaton is limited as a function of the incoming acoustic data.
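A minimal way to picture this on-demand behaviour (illustrative only; `LazyNode`, its `expand_fn` parameter and the `expansions` counter are assumptions for this sketch) is a node that builds its children on first request and never otherwise, so that a branch pruned by the decoder, like Dyn5 above, costs nothing:

```python
class LazyNode:
    """Node whose expansion is built only when the search first reaches it."""
    def __init__(self, symbol, expand_fn):
        self.symbol = symbol
        self._expand_fn = expand_fn    # callback producing the children
        self._children = None
        self.expansions = 0            # instrumentation for this example only

    def children(self):
        if self._children is None:     # built on the first request...
            self.expansions += 1
            self._children = self._expand_fn(self.symbol)
        return self._children          # ...then simply reused
```

A special node that the decoder never revisits keeps `expansions == 0`, which is exactly the memory and computation saving the invention claims over a statically built automaton.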
  • the vocabulary is relatively narrow for reasons of clarity, but it is clear that the difference in size between a dynamically constructed automaton and a static automaton grows with the size of the vocabulary.
  • the invention is not limited to the exemplary embodiments mentioned above.
  • the person skilled in the art can make any variant in dynamic development in width and in particular in determining the cases where a special node is inserted in an automaton.
  • many variants for this insertion are possible between the two extreme cases which are the embodiment of the invention described in FIG. 5 (a node is only developed if necessary), on the one hand, and the state of the art static case, on the other hand.
  • the voice recognition process is not limited to the case where a Viterbi algorithm is implemented, but extends to all algorithms using a Markov model, in particular algorithms based on trellises.
  • the invention is not limited to a purely hardware implementation but can also be implemented in the form of a sequence of instructions of a computer program, or in any form mixing a hardware part and a software part.
  • the corresponding sequence of instructions may be stored in a storage means which may be removable (such as, for example, a floppy disk, a CD-ROM or a DVD-ROM) or not, this storage means being partially or totally readable by a computer or a microprocessor.

Landscapes

  • Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)
EP01936519A 2000-05-23 2001-05-15 Dynamische sprachmodelle für die spracherkennung Withdrawn EP1285434A1 (de)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP01936519A EP1285434A1 (de) 2000-05-23 2001-05-15 Dynamische sprachmodelle für die spracherkennung

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
EP00401433 2000-05-23
EP00401433 2000-05-23
EP01936519A EP1285434A1 (de) 2000-05-23 2001-05-15 Dynamische sprachmodelle für die spracherkennung
PCT/FR2001/001469 WO2001091107A1 (fr) 2000-05-23 2001-05-15 Modeles de langage dynamiques pour la reconnaissance de la parole

Publications (1)

Publication Number Publication Date
EP1285434A1 true EP1285434A1 (de) 2003-02-26

Family

ID=8173699

Family Applications (1)

Application Number Title Priority Date Filing Date
EP01936519A Withdrawn EP1285434A1 (de) 2000-05-23 2001-05-15 Dynamische sprachmodelle für die spracherkennung

Country Status (4)

Country Link
US (1) US20040034519A1 (de)
EP (1) EP1285434A1 (de)
AU (1) AU2001262407A1 (de)
WO (1) WO2001091107A1 (de)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003005345A1 (en) * 2001-07-05 2003-01-16 Speechworks International, Inc. Speech recognition with dynamic grammars
US7149688B2 (en) 2002-11-04 2006-12-12 Speechworks International, Inc. Multi-lingual speech recognition with cross-language context modeling
FR2857528B1 (fr) * 2003-07-08 2006-01-06 Telisma Reconnaissance vocale pour les larges vocabulaires dynamiques
US7849855B2 (en) * 2005-06-17 2010-12-14 Nellcor Puritan Bennett Llc Gas exhaust system for a gas delivery mask
US7490608B2 (en) * 2005-06-17 2009-02-17 Nellcorr Puritan Bennett Llc System and method for adjusting a gas delivery mask
US7900630B2 (en) * 2005-06-17 2011-03-08 Nellcor Puritan Bennett Llc Gas delivery mask with flexible bellows
US7827987B2 (en) * 2005-06-17 2010-11-09 Nellcor Puritan Bennett Llc Ball joint for providing flexibility to a gas delivery pathway
US7697827B2 (en) 2005-10-17 2010-04-13 Konicek Jeffrey C User-friendlier interfaces for a camera
US20080053450A1 (en) * 2006-08-31 2008-03-06 Nellcor Puritan Bennett Incorporated Patient interface assembly for a breathing assistance system
JP5837341B2 (ja) * 2011-06-24 2015-12-24 株式会社ブリヂストン 路面状態判定方法とその装置
JP5875569B2 (ja) * 2013-10-31 2016-03-02 日本電信電話株式会社 音声認識装置とその方法とプログラムとその記録媒体

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0938077B1 (de) * 1992-12-31 2001-06-13 Apple Computer, Inc. Spracherkennungssystem
US5699456A (en) * 1994-01-21 1997-12-16 Lucent Technologies Inc. Large vocabulary connected speech recognition system and method of language representation using evolutional grammar to represent context free grammars
IT1279171B1 (it) * 1995-03-17 1997-12-04 Ist Trentino Di Cultura Sistema di riconoscimento di parlato continuo
US6249761B1 (en) * 1997-09-30 2001-06-19 At&T Corp. Assigning and processing states and arcs of a speech recognition model in parallel processors
US6594393B1 (en) * 2000-05-12 2003-07-15 Thomas P. Minka Dynamic programming operation with skip mode for text line image decoding

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO0191107A1 *

Also Published As

Publication number Publication date
WO2001091107A1 (fr) 2001-11-29
US20040034519A1 (en) 2004-02-19
AU2001262407A1 (en) 2001-12-03

Similar Documents

Publication Publication Date Title
EP3373293B1 (de) Spracherkennungsverfahren und -vorrichtung
US7720683B1 (en) Method and apparatus of specifying and performing speech recognition operations
US11776533B2 (en) Building a natural language understanding application using a received electronic record containing programming code including an interpret-block, an interpret-statement, a pattern expression and an action statement
KR20150065171A (ko) 하이브리드 지피유/씨피유(gpu/cpu) 데이터 처리 방법
EP0838073A2 (de) Verfahren und vorrichtung zur dynamischen anpassung eines spracherkennungssystems mit grossem wortschatz und zur verwendung von einschränkungen aus einer datenbank in einem spracherkennungssystem mit grossem wortschatz
EP1669886A1 (de) Konstruktion eines Automaten, der Regeln zur Transkription von Graphem/Phonem für einen Phonetisierer kompiliert
US20030009331A1 (en) Grammars for speech recognition
EP1285434A1 (de) Dynamische sprachmodelle für die spracherkennung
JPH09127978A (ja) 音声認識方法及び装置及びコンピュータ制御装置
EP1234303A1 (de) Verfahren und vorrichtung zur spracherkennung mit verschiedenen sprachmodellen
EP1236198B1 (de) Spracherkennung mit einem komplementären sprachmodel für typischen fehlern im sprachdialog
EP1642264B1 (de) Spracherkennung für grosse dynamische vokabulare
JP3634863B2 (ja) 音声認識システム
Buchsbaum et al. Algorithmic aspects in speech recognition: An introduction
EP1285435B1 (de) Syntax- und semantische-analyse von sprachbefehlen
JP2003208195A5 (de)
EP1803116B1 (de) Spracherkennungsverfahren mit temporaler markereinfügung und entsprechendes system
FR3031823A1 (fr) Lemmatisateur semantique base sur des dictionnaires ontologiques.
FR2801716A1 (fr) Dispositif de reconnaissance vocale mettant en oeuvre une regle syntaxique de permutation
EP1981020A1 (de) Automatisches Worterkennungsverfahren und -system zur Erkennung von bereichsfremden Angaben
Kabré ECHO: A speech recognition package for the design of robust interactive speech-based applications
FR2878990A1 (fr) Construction informatique d'un automate compilant des regles de transcription grapheme/phoneme
FR2837969A1 (fr) Procede de traduction de donnees autorisant une gestion de memoire simplifiee
EP1428205A1 (de) Grammatiken für die spracherkennung
WO2007026094A1 (fr) Identification de concepts avec tolerance

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20021203

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR


AX Request for extension of the european patent

Extension state: AL LT LV MK RO SI

RBV Designated contracting states (corrected)

Designated state(s): AT BE CH CY DE DK ES FR GB IT LI NL

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: THOMSON LICENSING

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20060607