WO2021034613A1 - Development of voice and other interaction applications - Google Patents
Development of voice and other interaction applications
- Publication number
- WO2021034613A1 (application PCT/US2020/046201)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- interaction
- utterance
- markup language
- general
- developer
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/9038—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/958—Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
- G06F16/986—Document structures and storage, e.g. HTML extensions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/14—Tree-structured documents
- G06F40/143—Markup, e.g. Standard Generalized Markup Language [SGML] or Document Type Definition [DTD]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/151—Transformation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/226—Validation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/04—Details of speech synthesis systems, e.g. synthesiser structure or memory management
Definitions
- An intent represents a function that is bound to one or more utterances.
- An utterance may contain one or more slots to represent dynamic values (for example, a time of day).
- an intent is indicated by interaction of an end user with an interaction assistant (e.g., an Amazon Echo Dot)
- information about the interaction is delivered by the assistant platform to the endpoint for additional processing.
- An endpoint is essentially an application having a collection of functions or methods that map to the intents defined within the interaction model.
- the endpoint’s functions may contain references to items of content or literal content (we sometimes refer to the “items of content” and “literal content” simply as “content”) that becomes part of the responses sent back to the assistant platform.
- A feature of the development platform is its use of a “content-first” (or content-centric) development approach.
- the content-first development approach gives priority to the aspects of the app development and deployment process that involve development of content and management of relationships between end-user requests and responses.
- the following hard-coded interaction model can support only two user requests: Welcome and Weather.
- Using the entered content and questions and information contained in the template, the development platform has enough information to automatically process and generate a response to essentially any type of request an end user might pose and handle variations of utterances that don’t require exact matching. For example, end-user requests that use the general utterance pattern “how do I {Query}?” will map to a single intent within the development platform’s general interaction model.
- the development platform uses the value of {Query} to search for a content match that will provide a suitable answer to both the general “how do I” part of the request and the specific {Query} part of the request. Because {Query} can have a wide range of specific values representing a variety of implicit intents, the use of the general utterance pattern supports a wide range of requests.
- the interaction platform may determine (for example, through automated inspection of repeated developer updates) that particular intents are worth updating for all interaction models for all interaction applications. In these cases, administrative updates can be made automatically (or with human assistance) across all interaction models to add, remove, or edit one or more intents.
- the validation process will return an error stating the given unit, property, and element that does not allow it. Check that the node’s immediate children are among the child types allowed for the node. If any child nodes are not among the allowed child types, the validation process will return an error with the name of the child type that is not allowed for the specific node type.
- the development platform has divided the original tree into elements that are fully valid on the left segment, and what would be invalid on the right segment.
- the segmentation process can then either proceed with just the left branch or it could alter the right branch to remove the <voice> element, resulting in the two trees (segments, branches) shown in Figure 7.
- the segmenting process can also be applied separately to allow for using the separated trees to run custom logic. For example, some text-to-speech services support the <audio> element while others don’t. So when trying to generate audio files from the SSML that has <audio> elements, the segmentation engine can segment the trees separately, then generate the output speech audio files and keep the audio files separate but in order.
- the development platform can process them individually for text-to-speech, resulting in three .mp3 files that can be played back to back as one full representation of the entire input.
- the visual tool presents a small vertical value indicator 140 next to the icon to show where the current value 142 is on the scale.
- the user of the SSML visual tool can also cause the pointer to hover over the icon or the scale indicator to view a tooltip 144 explaining the details of the element including the name, value, and others. The user can then click the tooltip to open the
- the development platform leverages generalized, abstract intents and open-ended slot types that provide greater flexibility for utterance matching. This greater flexibility enables other features including that new content can be added without requiring an update to the general interaction model, and therefore without requiring re-deployment or recertification.
- the ability to create interaction applications without coding enables a broad non-technical user base to create voice, chat, and other interaction applications.
- the development platform also allows users to manage content without managing business logic, whereas content, business logic, and intents are tightly coupled in custom or flow-based tools.
- the development platform also uses a more traditional content form style of managing content which does not require a large canvas of intersecting items.
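The child-type validation and `<audio>` segmentation described in the excerpts above can be sketched as follows. This is a minimal illustration, not the platform's implementation: the `ALLOWED_CHILDREN` table is a hypothetical allow-list (it is not the SSML specification), and the function names `validate` and `segment_on` are invented for the example. Only top-level children of the root are segmented.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical allow-list: child element types permitted under each node type.
ALLOWED_CHILDREN = {
    "speak": {"p", "s", "prosody", "audio", "break"},
    "p": {"s", "prosody", "break"},
    "s": {"prosody", "break"},
    "prosody": {"break"},
    "audio": set(),
    "break": set(),
}


def validate(node):
    """Return one error string per child whose type is not allowed under its parent."""
    errors = []
    allowed = ALLOWED_CHILDREN.get(node.tag, set())
    for child in node:
        if child.tag not in allowed:
            errors.append(f"<{child.tag}> is not allowed inside <{node.tag}>")
        errors.extend(validate(child))
    return errors


def _has_content(segment):
    """True if the segment holds non-blank text or at least one child element."""
    return bool((segment.text or "").strip()) or len(segment) > 0


def segment_on(root, tag):
    """Split the root's top-level content into ordered trees, isolating each `tag` element."""
    segments = []
    current = ET.Element(root.tag)
    current.text = root.text
    for child in root:
        if child.tag == tag:
            if _has_content(current):
                segments.append(current)
            isolated = ET.Element(root.tag)
            lone = copy.deepcopy(child)
            lone.tail = None  # text after the element belongs to the next segment
            isolated.append(lone)
            segments.append(isolated)
            current = ET.Element(root.tag)
            current.text = child.tail
        else:
            current.append(copy.deepcopy(child))
    if _has_content(current):
        segments.append(current)
    return segments


if __name__ == "__main__":
    ssml = ET.fromstring('<speak>Welcome. <audio src="chime.mp3"/> Goodbye.</speak>')
    print(validate(ET.fromstring("<speak><s><p>hi</p></s></speak>")))
    for part in segment_on(ssml, "audio"):
        print(ET.tostring(part, encoding="unicode"))
```

Each returned tree could then be sent to a text-to-speech service (or, for the isolated `<audio>` tree, resolved to its source file) and the resulting audio files played back in list order.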
Abstract
Among other things, a developer of an interaction application for an enterprise can create items of content to be provided to an assistant platform for use in responses to requests of end users. The developer can deploy the interaction application using defined items of content and an available general interaction model that includes intents and sample utterances having slots. The developer can deploy the interaction application without needing to formulate any of the intents, sample utterances, or slots of the general interaction model.
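To make the idea of a general interaction model concrete, the sketch below maps a general utterance pattern with an open-ended slot to a single intent and then uses the slot value to look up content. It is an illustration under stated assumptions: the intent names, the `SAMPLES` and `CONTENT` structures, and the word-overlap search are all hypothetical and are not taken from the patent.

```python
import re

# Hypothetical general interaction model: sample utterance with slots -> intent name.
SAMPLES = [
    ("how do i {Query}", "HowDoIIntent"),
    ("what is {Query}", "WhatIsIntent"),
]

# Hypothetical content items entered by a developer; no intents are defined per item.
CONTENT = [
    {"title": "reset my password", "response": "Open Settings, then choose Reset Password."},
    {"title": "contact support", "response": "Call the help desk or use the chat window."},
]


def compile_pattern(sample):
    """Turn 'how do i {Query}' into a regex with one named group per slot."""
    parts = re.split(r"\{(\w+)\}", sample)  # alternates literal text and slot names
    regex = ""
    for index, part in enumerate(parts):
        regex += re.escape(part) if index % 2 == 0 else f"(?P<{part}>.+)"
    return re.compile(regex, re.IGNORECASE)


MODEL = [(compile_pattern(sample), intent) for sample, intent in SAMPLES]


def match_intent(utterance):
    """Return (intent, slots) for the first matching pattern, or None."""
    cleaned = utterance.strip().rstrip("?.!")
    for pattern, intent in MODEL:
        match = pattern.fullmatch(cleaned)
        if match:
            return intent, match.groupdict()
    return None


def find_content(query):
    """Pick the content item whose title shares the most words with the query."""
    words = set(query.lower().split())
    best, best_score = None, 0
    for item in CONTENT:
        score = len(words & set(item["title"].lower().split()))
        if score > best_score:
            best, best_score = item, score
    return best


def respond(utterance):
    """Resolve an utterance to a response using the general model and the content store."""
    matched = match_intent(utterance)
    if matched is None:
        return "Sorry, I didn't understand that."
    _intent, slots = matched
    item = find_content(slots["Query"])
    return item["response"] if item else "Sorry, I don't have an answer for that."


if __name__ == "__main__":
    print(respond("How do I reset my password?"))
```

In this sketch, appending a new item to `CONTENT` makes new requests answerable without touching `SAMPLES`, which mirrors the claim that content can be added without updating the general interaction model.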
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA3151910A CA3151910A1 (fr) | 2019-08-19 | 2020-08-13 | Development of voice and other interaction applications |
CN202080071550.6A CN114945979A (zh) | 2019-08-19 | 2020-08-13 | Development of voice and other interaction applications |
EP20853981.7A EP4018436A4 (fr) | 2019-08-19 | 2020-08-13 | Development of voice and other interaction applications |
Applications Claiming Priority (8)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/544,527 US10614800B1 (en) | 2019-08-19 | 2019-08-19 | Development of voice and other interaction applications |
US16/544,375 US11508365B2 (en) | 2019-08-19 | 2019-08-19 | Development of voice and other interaction applications |
US16/544,508 US10762890B1 (en) | 2019-08-19 | 2019-08-19 | Development of voice and other interaction applications |
US16/816,535 US11538466B2 (en) | 2019-08-19 | 2020-03-12 | Development of voice and other interaction applications |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021034613A1 (fr) | 2021-02-25 |
Family
ID=74660576
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2020/046201 WO2021034613A1 (fr) | Development of voice and other interaction applications | 2019-08-19 | 2020-08-13 |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP4018436A4 (fr) |
CN (1) | CN114945979A (fr) |
CA (1) | CA3151910A1 (fr) |
WO (1) | WO2021034613A1 (fr) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
IT202100012548A1 (it) | 2021-05-14 | 2022-11-14 | Hitbytes Srl | Method for creating cross-platform voice applications |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
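JP2022155107A (ja) * | 2021-03-30 | 2022-10-13 | 本田技研工業株式会社 | Information processing device, information processing method, mobile body control device, mobile body control method, and program |
CN115064166B (zh) * | 2022-08-17 | 2022-12-13 | 广州小鹏汽车科技有限公司 | Vehicle voice interaction method, server, and storage medium |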
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150100943A1 (en) * | 2013-10-09 | 2015-04-09 | Viv Labs, Inc. | Dynamically evolving cognitive architecture system based on contributions from third-party developers |
US20170212884A1 (en) * | 2016-01-23 | 2017-07-27 | Microsoft Technology Licensing, Llc | Tool for Facilitating the Development of New Language Understanding Scenarios |
US20180366114A1 (en) * | 2017-06-16 | 2018-12-20 | Amazon Technologies, Inc. | Exporting dialog-driven applications to digital communication platforms |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020010715A1 (en) * | 2001-07-26 | 2002-01-24 | Garry Chinn | System and method for browsing using a limited display device |
US20040194016A1 (en) * | 2003-03-28 | 2004-09-30 | International Business Machines Corporation | Dynamic data migration for structured markup language schema changes |
US10235999B1 (en) * | 2018-06-05 | 2019-03-19 | Voicify, LLC | Voice application platform |
-
2020
- 2020-08-13 CN CN202080071550.6A patent/CN114945979A/zh active Pending
- 2020-08-13 WO PCT/US2020/046201 patent/WO2021034613A1/fr unknown
- 2020-08-13 CA CA3151910A patent/CA3151910A1/fr active Pending
- 2020-08-13 EP EP20853981.7A patent/EP4018436A4/fr active Pending
Non-Patent Citations (1)
Title |
---|
See also references of EP4018436A4 * |
Also Published As
Publication number | Publication date |
---|---|
CA3151910A1 (fr) | 2021-02-25 |
EP4018436A4 (fr) | 2022-10-12 |
EP4018436A1 (fr) | 2022-06-29 |
CN114945979A (zh) | 2022-08-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11538466B2 (en) | Development of voice and other interaction applications | |
US11508365B2 (en) | Development of voice and other interaction applications | |
EP3545427B1 (fr) | Service for developing dialog-driven applications | |
US11749256B2 (en) | Development of voice and other interaction applications | |
US8117023B2 (en) | Language understanding apparatus, language understanding method, and computer program | |
JP3964134B2 (ja) | Method for creating a language grammar | |
US9081550B2 (en) | Adding speech capabilities to existing computer applications with complex graphical user interfaces | |
JP4237915B2 (ja) | Method executed on a computer to enable a user to set the pronunciation of a character string | |
US7630892B2 (en) | Method and apparatus for transducer-based text normalization and inverse text normalization | |
WO2021034613A1 (fr) | Development of voice and other interaction applications | |
US8447610B2 (en) | Method and apparatus for generating synthetic speech with contrastive stress | |
US11776533B2 (en) | Building a natural language understanding application using a received electronic record containing programming code including an interpret-block, an interpret-statement, a pattern expression and an action statement | |
US20150106101A1 (en) | Method and apparatus for providing speech output for speech-enabled applications | |
JP2005537532A (ja) | Integrated development tool for building natural language understanding applications | |
WO2002033542A2 (fr) | Software development methods and systems | |
US8914291B2 (en) | Method and apparatus for generating synthetic speech with contrastive stress | |
CA2671722A1 (fr) | Methods and systems for providing grammar services | |
US20100191519A1 (en) | Tool and framework for creating consistent normalization maps and grammars | |
Gruenstein et al. | Scalable and portable web-based multimodal dialogue interaction with geographical databases | |
US11604929B2 (en) | Guided text generation for task-oriented dialogue | |
US20140257816A1 (en) | Speech synthesis dictionary modification device, speech synthesis dictionary modification method, and computer program product | |
Di Fabbrizio et al. | AT&t help desk. | |
Wigmore | Speech-based creation and editing of mathematical content | |
Albin | Typologizing native language influence on intonation in a second language: Three transfer phenomena in Japanese EFL learners | |
TW201537372A (zh) | Motion design device and motion design program product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20853981; Country of ref document: EP; Kind code of ref document: A1 |
| ENP | Entry into the national phase | Ref document number: 3151910; Country of ref document: CA |
| NENP | Non-entry into the national phase | Ref country code: DE |
| ENP | Entry into the national phase | Ref document number: 2020853981; Country of ref document: EP; Effective date: 20220321 |