EP1588351A1 - Automatic production of speech recognition interfaces for an application domain - Google Patents
Automatic production of speech recognition interfaces for an application domain
- Publication number
- EP1588351A1 (application EP03799565A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- grammar
- application
- generic
- model
- conceptual model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/226—Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
- G10L2015/228—Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of application context
Definitions
- the present invention relates to a generic method for automatically producing speech recognition interfaces for an application domain and to a device for implementing this method.
- Speech recognition interfaces are used, in particular in operator-system interaction systems, which are special cases of human-machine interfaces.
- An interface of this type is the means by which an operator can access functions included in a system or machine. More precisely, this interface enables the operator to evaluate the state of the system through perception methods and to modify this state by means of action modalities.
- Such an interface is generally the result of prior analysis and design work on the operator-system interaction, a discipline aimed at studying the relations between a user and the system with which he interacts.
- the interface of a system, for example the human-machine interface of a computer system, must be natural, efficient, intelligent (adaptable to the context), reliable and intuitive (i.e., easy to understand and to use), in other words as "transparent" as possible, so as to allow the user to accomplish his task without increasing his workload with activities that do not serve his primary objective.
- voice interfaces are both more user-friendly and more efficient. Nevertheless, their implementation proves more complex than that of traditional interfaces (graphical ones, for example), because it requires the acquisition of multiple, usually high-level, kinds of knowledge, and the implementation of complex processing to exploit this knowledge and "intelligently" manage the dialogue between the operator and the system.
- the present invention relates to a method for automating the realization of voice interfaces as easily and simply as possible, with the lowest possible development time and cost.
- the present invention also relates to a device for implementing this method, which device is simple to use and inexpensive.
- the method according to the invention is characterized by the fact that a conceptual model of the application domain of the voice interface is captured, that a set of generic grammar rules representative of a class of applications is produced, that the various generic grammar rules whose constraints are satisfied are instantiated, that the grammar of the application domain in question is derived from the instantiated generic grammar and the conceptual model, and that the operator/system interaction is managed.
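Purely as an illustration of the instantiation step just described, the following sketch shows generic grammar rules carrying conceptual constraints being instantiated against a small conceptual model; all names, data structures and example data are hypothetical, not taken from the patent:

```python
# Hypothetical sketch only: generic rules with conceptual constraints,
# instantiated against a conceptual model of the application domain.

# Conceptual model: predicative relations between entities of the application.
relations = {("AIRCRAFT", "activates", "FUNCTION")}

# Generic grammar rules for a class of applications; each rule carries a
# conceptual constraint that must be satisfied for the rule to apply.
generic_rules = [
    ("command", "<subject> <verb> <object>",
     lambda s, v, o: (s, v, o) in relations),
]

def instantiate(rules, relations):
    """Instantiate the generic rules whose constraints are satisfied,
    yielding the grammar specific to the application domain."""
    grammar = []
    for name, template, constraint in rules:
        for s, v, o in relations:
            if constraint(s, v, o):
                grammar.append((name, template.replace("<subject>", s)
                                              .replace("<verb>", v)
                                              .replace("<object>", o)))
    return grammar

domain_grammar = instantiate(generic_rules, relations)
# domain_grammar == [('command', 'AIRCRAFT activates FUNCTION')]
```

Rules whose constraints find no support in the conceptual model simply produce no instantiated entries, so the domain grammar contains only statements that are meaningful for the application.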
- the device for automated production of voice interfaces according to the invention comprises means for capturing a conceptual model, derivation means, means for supplying a generic model and means for executing the grammar specific to the application domain. The present invention will be better understood on reading the detailed description of an embodiment, given by way of nonlimiting example and illustrated by the appended drawing, in which:
- FIG. 1 is a block diagram of the principal means implemented by the invention
- FIG. 2 is a more detailed block diagram than that of FIG. 1, and
- FIG. 3 is a detailed block diagram of the execution means of FIGS. 1 and 2.
- FIG. 1 shows input means 1 making it possible to input the various data describing the conceptual model of the considered field of application and the relationships linking these data.
- This data can be, for example, in the case of the voice command used to control an aircraft, the terminology of all the aircraft and all the functions of an aircraft, as well as their different mutual relations.
- a set of grammar rules 2 is constructed and stored to form a generic model representing a class of applications (for the example mentioned above, this class would be that relating to the control of vehicles in general).
- from the conceptual model 1 and the generic model 2, derivation means 3 automatically calculate all the resources required to produce the desired voice interface, and deduce all of the language statements that can be handled by this interface in the context of the application being processed.
- the device of the invention comprises revision means 4 and explanation means 5.
- the revision means 4 are supervised by the operator of the device or by its designer. Their role is to review the data entered by the operator using means 1, in order to correct terms contrary to the semantics of the application in question and/or to add new terms to enrich the grammar of the application domain.
- the explanatory means 5 make it easier to review the data entered by the operator by explaining the rules that were applied during the development of the grammar specific to the application domain.
- the execution means 6 are responsible for automatically producing the voice interface of the considered application domain. The production method of this interface is based on the distinction between the application-dependent (specific) resources, i.e. the set of concepts constituting the conceptual model captured via means 1 and the set of terms making up the lexicon, and the resources that do not depend on this application (generic resources), i.e. the syntactic rules of the grammar and the basic lexicon, which are specific to the language used.
- the voice interface designer must describe, using input means 1, the resources specific to the application in question, that is to say the conceptual model and the lexicon of this application. It is up to him to define the concepts of the application that he wishes to make controllable by voice, then to verbalize these concepts.
- This input work can be facilitated by the use of a formal model of the intended application, provided that this model exists and is available.
- the derivation means 3 which operate entirely automatically, calculate from these specific resources and generic resources provided by the means 2 the linguistic model of the voice interface for said application.
- This linguistic model consists of the grammar and lexicon of the sub-language dedicated to this interface.
- the derivation means 3 also make it possible to calculate all the statements of this sub-language (that is to say its phraseology), as well as all the knowledge relating to the application and necessary for the management of the operator-system dialogue.
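As a hedged example of what computing the full phraseology of a sub-language can amount to, the sketch below enumerates every statement a toy slot grammar and lexicon can produce; the grammar pattern and vocabulary are invented for illustration only:

```python
from itertools import product

# Toy lexicon and grammar pattern, invented for illustration.
lexicon = {
    "SUBJECT": ["the pilot"],
    "VERB": ["selects", "displays"],
    "OBJECT": ["the map", "the radar"],
}

def phraseology(pattern, lexicon):
    """Enumerate every statement the pattern can generate over the lexicon."""
    slots = [lexicon[symbol] for symbol in pattern]
    return [" ".join(words) for words in product(*slots)]

statements = phraseology(["SUBJECT", "VERB", "OBJECT"], lexicon)
# 1 x 2 x 2 = 4 statements in this toy sub-language
```

Exhaustive enumeration of this kind is what allows the designer to review the whole phraseology, since the sub-language dedicated to an interface is deliberately finite and small compared to the full natural language.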
- the revision means 4 then allow the operator to view all or part of the phraseology corresponding to his input work, in order to refine this phraseology by additions, deletions or modifications.
- the means for producing explanations make it possible to automatically identify which conceptual and lexical data entered by the operator are at the origin of a given characteristic of a statement, or of a set of statements, of the produced sub-language.
- execution means 6 constitute the environment which is called upon when using the voice interface produced, in order to validate this interface.
- the execution means exploit all the data provided by the input means 1 and the derivation means 3.
- FIG. 2 shows an exemplary embodiment of the device for implementing the method of the invention.
- the operator has an input interface 7, such as a graphical interface, to enter the conceptual model 8 of the application in question. The device also includes a database 9 containing the entities or concepts of the application, and a lexicon 10 of this application.
- the conceptual model is formed of the entities of the application and their mutual associations, that is to say predicative relationships linking the concepts of the application.
- the capture of the conceptual model is conceived as an iterative and assisted process using two main sources of knowledge, which are generic grammar 11 and basic lexicon 12.
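A minimal sketch of such a conceptual model, holding the entities and the predicative relations that link them, might look as follows; the field names and the validation rule are assumptions for illustration, not the patent's data model:

```python
from dataclasses import dataclass, field

@dataclass
class ConceptualModel:
    """Entities of the application and the predicative relations linking them."""
    entities: set = field(default_factory=set)
    relations: list = field(default_factory=list)  # (predicate, subject, object)

    def add_relation(self, predicate, subject, obj):
        # A predicative relation may only link entities declared in the model,
        # which supports the assisted, iterative capture process.
        if subject not in self.entities or obj not in self.entities:
            raise ValueError("relation refers to an undeclared entity")
        self.relations.append((predicate, subject, obj))

model = ConceptualModel(entities={"PROGRAM", "CHANNEL"})
model.add_relation("broadcasts", "CHANNEL", "PROGRAM")
```

Rejecting relations over undeclared entities is one simple way the capture process can be "assisted": errors are caught at input time rather than surfacing later in the generated grammar.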
- One of the ways to realize the derivation means 3 is to extend a syntactic and semantic grammar so as to allow conceptual constraints to be taken into account. It is thus possible to define in this high-level formalism a generic grammar whose adaptation to the application domain is done automatically through the data entered by the operator. The derivation means thus make it possible to calculate the syntactic-semantic grammar and the lexicon specific to the application domain. Thus, as schematized in FIG. 2, from the conceptual model 8 entered by the operator, the device deduces the linguistic model that it transmits to the derivation means 13.
- the conceptual model is used not only to calculate the linguistic model and related sub-models (linguistic model for recognition, linguistic model for analysis and linguistic model for generation), but also for the management of the operator-system dialogue for everything related to reference to the concepts and objects of the application.
- the revision-explanation means 14, for their revision function, are accessible via the graphic interface 7 for inputting the conceptual model of the application. They use a grammar generator 15 which calculates the grammar corresponding to the model entered and provides mechanisms for displaying all or part of the corresponding statements.
- the grammar generator 15 includes a syntactic and semantic utterance analysis grammar 16, a statement generation grammar 17 and a speech recognition grammar 18.
- the revision-explanation means 14, for their explanatory function, are based on a formal analysis of the calculation made by the derivation means 13 to identify the data which are at the origin of the characteristics of these statements. These means allow the operator to iteratively design his model while ensuring that the statements that will be produced meet his expectations.
- FIG. 3 shows an embodiment of the execution means 6 of the voice interface.
- These means include: a speech recognition device 19, which uses the grammar 18 automatically derived from the linguistic model;
- an utterance analyzer 20 that uses the linguistic model provided by the derivation means 13, and verifies the syntactic and semantic correctness of the utterances;
- a dialog processor 21 that uses the conceptual model entered by the operator, as well as the database 9 of the linguistic entities of the application, entered by the operator or automatically constructed by the application 22;
- an utterance generator 23 which uses the utterance generation grammar 17 automatically derived from the linguistic model;
- a speech synthesis device 24. The execution of the voice interface by the elements 19 to 21, 23 and 24 is managed in this case by a multi-agent system 25.
- the input means make it possible to help the designer of the voice interface during the constitution of the lexicon.
- mechanisms are implemented to propose, for a given term (for example "movie" for the English version of the lexicon and "film" for the French version), all the inflected forms corresponding to this term (singular and plural of a common noun, or conjugations of a verb, for example).
- the designer of the lexicon then has only to select, from all these forms, those he wants to find in the voice interface.
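The inflected-form assistance described above might be sketched as follows; the naive English inflection rules and all names are assumptions made for illustration only, not the mechanism actually used:

```python
def inflected_forms(term, part_of_speech):
    """Propose candidate inflected forms for a lexicon term (naive rules)."""
    if part_of_speech == "noun":
        # Naive English pluralization, for illustration only.
        plural = term + "es" if term.endswith(("s", "x", "ch", "sh")) else term + "s"
        return [term, plural]
    if part_of_speech == "verb":
        # Very rough English conjugation candidates.
        return [term, term + "s", term + "ed", term + "ing"]
    return [term]

candidates = inflected_forms("movie", "noun")
# The lexicon designer then keeps only the forms wanted in the interface:
selected = [form for form in candidates if form in {"movie", "movies"}]
```

The point is the division of labor: the machine proposes every plausible form, and the designer only filters, instead of typing each inflection by hand.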
- the revision means allow the voice interface designer to validate or correct the conceptual model that has been created via the input means.
- a first step in the review process is to visualize all or part of the phraseology that corresponds to the conceptual model. Displaying the statements may reveal, for example, that in a given statement:
- PROGRAM plays the role of the subject, or
- CHANNEL acts as the subject.
- the revision means allow the voice interface designer to see such an error and to modify the conceptual model to correct it.
- the means of explanation have the function of identifying and describing the subset or characteristic of the conceptual model whose compilation produces the sub-grammar corresponding to a particular statement, to a particular language expression (a piece of an utterance), or to a particular language property (a characteristic of an expression).
- the user has the possibility, by selecting a statement, an expression or a property engendered by the grammar, to find and understand the subset or the characteristic of the conceptual model which is at its origin.
- he can then modify the statement, the expression or the generated property and, by reiterating the process, refine the conceptual model in order to obtain the grammar of the desired language.
- the possibility of using the plural in the relation between the unit entity and the mission entity in the four expressions below is a function of the cardinality of this relation.
- the means of explanation must allow the user to identify that the cardinality of the conceptual rule must be modified to obtain the grammar corresponding to the plural expressions that he wishes to include in his language.
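As a hedged sketch of how a relation's cardinality could govern the availability of plural expressions in the grammar (the cardinality encoding and names are assumptions):

```python
def number_forms(cardinality):
    """Grammatical numbers the grammar may generate for a relation,
    given the relation's cardinality in the conceptual model."""
    if cardinality == "1":      # at most one related object: singular only
        return ["singular"]
    if cardinality == "N":      # several related objects allowed
        return ["singular", "plural"]
    raise ValueError("unknown cardinality")

# A unit assigned to at most one mission admits only singular expressions;
# raising the cardinality to N makes the plural expressions available.
forms_one = number_forms("1")
forms_many = number_forms("N")
```

This makes concrete why the explanation means must point the user back to the cardinality: the missing plural in the generated language is a direct consequence of a single conceptual rule.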
- One embodiment of the means of explanation consists in constructing a backtracking procedure over the grammar compilation process, which makes it possible to start from the result, find the conceptual rules that lead to this result, and then describe them to the user.
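One way such backtracking could be sketched is to record, during compilation, which conceptual rule produced each grammar production, so that any generated statement can be traced back to its origin; all structures here are hypothetical:

```python
def compile_grammar(conceptual_rules):
    """Compile conceptual rules into productions, recording for each
    production the conceptual rule that produced it (its provenance)."""
    provenance = {}
    for rule in conceptual_rules:
        production = f"{rule['subject']} {rule['predicate']} {rule['object']}"
        provenance[production] = rule
    return provenance

def explain(production, provenance):
    """Backtrack from a generated statement to the conceptual rule at its origin."""
    return provenance.get(production)

rules = [{"subject": "UNIT", "predicate": "performs", "object": "MISSION"}]
provenance = compile_grammar(rules)
origin = explain("UNIT performs MISSION", provenance)
```

Keeping provenance as a by-product of compilation, rather than re-deriving it afterwards, is the simpler design: the explanation is then a lookup instead of a search.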
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR0216902 | 2002-12-16 | ||
FR0216902A FR2849515B1 (fr) | 2002-12-31 | 2002-12-31 | Procede generique de production automatique d'interfaces de reconnaissance vocale pour un domaine d'application et dispositif de mise en oeuvre |
PCT/EP2003/051001 WO2004059617A1 (fr) | 2002-12-31 | 2003-12-15 | Production automatique d'interfaces de reconnaissance vocale pour un domaine d'application |
Publications (1)
Publication Number | Publication Date |
---|---|
EP1588351A1 true EP1588351A1 (fr) | 2005-10-26 |
Family
ID=32480321
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP03799565A Withdrawn EP1588351A1 (fr) | 2002-12-31 | 2003-12-15 | Production automatique d'interfaces de reconnaissance vocale pour un domaine d'application |
Country Status (6)
Country | Link |
---|---|
US (1) | US20060089835A1 (fr) |
EP (1) | EP1588351A1 (fr) |
CN (1) | CN1745409A (fr) |
AU (1) | AU2003299231A1 (fr) |
FR (1) | FR2849515B1 (fr) |
WO (1) | WO2004059617A1 (fr) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR2862780A1 (fr) * | 2003-11-25 | 2005-05-27 | Thales Sa | Procede d'elaboration d'une grammaire specifique a un domaine a partir d'une grammaire sous-specifiee |
FR2864646B1 (fr) * | 2003-12-24 | 2006-04-21 | Thales Sa | Procede d'augmentation d'un modele de tache pour permettre la gestion de l'interaction homme-machine |
US20080201148A1 (en) * | 2007-02-15 | 2008-08-21 | Adacel, Inc. | System and method for generating and using an array of dynamic grammar |
CN101329868B (zh) * | 2008-07-31 | 2011-06-01 | 林超 | 一种针对地区语言使用偏好的语音识别优化系统及其方法 |
US8442826B2 (en) * | 2009-06-10 | 2013-05-14 | Microsoft Corporation | Application-dependent information for recognition processing |
EP2680599A1 (fr) * | 2012-06-29 | 2014-01-01 | Thomson Licensing | Fourniture d'un contenu multimédia personnalisé |
US11100291B1 (en) | 2015-03-13 | 2021-08-24 | Soundhound, Inc. | Semantic grammar extensibility within a software development framework |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5642519A (en) * | 1994-04-29 | 1997-06-24 | Sun Microsystems, Inc. | Speech interpreter with a unified grammer compiler |
CN1163869C (zh) * | 1997-05-06 | 2004-08-25 | 语音工程国际公司 | 用于开发交互式语音应用程序的系统和方法 |
US6188976B1 (en) * | 1998-10-23 | 2001-02-13 | International Business Machines Corporation | Apparatus and method for building domain-specific language models |
US6321198B1 (en) * | 1999-02-23 | 2001-11-20 | Unisys Corporation | Apparatus for design and simulation of dialogue |
US6434523B1 (en) * | 1999-04-23 | 2002-08-13 | Nuance Communications | Creating and editing grammars for speech recognition graphically |
US6985852B2 (en) * | 2001-08-21 | 2006-01-10 | Microsoft Corporation | Method and apparatus for dynamic grammars and focused semantic parsing |
FR2845174B1 (fr) * | 2002-09-27 | 2005-04-08 | Thales Sa | Procede permettant de rendre l'interaction utilisateur-systeme independante de l'application et des medias d'interaction |
- 2002-12-31 FR FR0216902A patent/FR2849515B1/fr not_active Expired - Lifetime
- 2003-12-15 US US10/541,192 patent/US20060089835A1/en not_active Abandoned
- 2003-12-15 AU AU2003299231A patent/AU2003299231A1/en not_active Abandoned
- 2003-12-15 CN CNA2003801093874A patent/CN1745409A/zh active Pending
- 2003-12-15 WO PCT/EP2003/051001 patent/WO2004059617A1/fr not_active Application Discontinuation
- 2003-12-15 EP EP03799565A patent/EP1588351A1/fr not_active Withdrawn
Non-Patent Citations (1)
Title |
---|
See references of WO2004059617A1 * |
Also Published As
Publication number | Publication date |
---|---|
US20060089835A1 (en) | 2006-04-27 |
WO2004059617A1 (fr) | 2004-07-15 |
CN1745409A (zh) | 2006-03-08 |
AU2003299231A1 (en) | 2004-07-22 |
FR2849515A1 (fr) | 2004-07-02 |
FR2849515B1 (fr) | 2007-01-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
17P | Request for examination filed |
Effective date: 20050630 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT RO SE SI SK TR |
|
AX | Request for extension of the european patent |
Extension state: AL LT LV MK |
|
DAX | Request for extension of the european patent (deleted) | ||
17Q | First examination report despatched |
Effective date: 20051201 |
|
GRAP | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
|
RTI1 | Title (correction) |
Free format text: AUTOMATIC PRODUCTION OF VOCAL RECOGNITION INTERFACES FOR AN APPLICATION ENVIRONMENT |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
|
18D | Application deemed to be withdrawn |
Effective date: 20071019 |