CN114582314A - ASR-based human-machine audio/video interaction logic model design method - Google Patents
ASR-based human-machine audio/video interaction logic model design method
- Publication number: CN114582314A
- Application number: CN202210187875.1A
- Authority: CN (China)
- Prior art keywords: node, intention, interaction, text, nodes
- Prior art date
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
ID | Name | Domain | Sections | Count |
---|---|---|---|---|
230000003993 | interaction | Effects | title, claims, abstract, description | 73 |
238000000034 | method | Methods | title, claims, abstract, description | 66 |
238000013461 | design | Methods | title, claims, abstract, description | 25 |
238000013515 | script | Methods | claims, abstract, description | 92 |
238000012549 | training | Methods | claims, abstract, description | 9 |
238000013518 | transcription | Methods | claims, abstract, description | 8 |
230000035897 | transcription | Effects | claims, abstract, description | 8 |
230000002452 | interceptive effect | Effects | claims, description | 23 |
238000000034 | process | Methods | claims, description | 14 |
238000009877 | rendering | Methods | claims, description | 6 |
230000004044 | response | Effects | claims, description | 3 |
238000011144 | upstream manufacturing | Methods | claims, description | 3 |
238000012546 | transfer | Methods | claims, description | 2 |
238000004458 | analytical method | Methods | abstract, description | 9 |
238000003860 | storage | Methods | abstract, description | 9 |
230000001960 | triggered effect | Effects | abstract, description | 5 |
239000008358 | core component | Substances | abstract, description | 4 |
238000000605 | extraction | Methods | abstract, description | 4 |
238000010586 | diagram | Methods | description | 5 |
230000006870 | function | Effects | description | 4 |
238000005096 | rolling process | Methods | description | 2 |
238000013473 | artificial intelligence | Methods | description | 1 |
230000015572 | biosynthetic process | Effects | description | 1 |
238000004590 | computer program | Methods | description | 1 |
238000012217 | deletion | Methods | description | 1 |
230000037430 | deletion | Effects | description | 1 |
238000005516 | engineering process | Methods | description | 1 |
230000009191 | jumping | Effects | description | 1 |
238000004519 | manufacturing process | Methods | description | 1 |
230000007246 | mechanism | Effects | description | 1 |
238000012986 | modification | Methods | description | 1 |
230000004048 | modification | Effects | description | 1 |
230000003287 | optical effect | Effects | description | 1 |
238000012545 | processing | Methods | description | 1 |
230000008439 | repair process | Effects | description | 1 |
238000003786 | synthesis reaction | Methods | description | 1 |
230000000007 | visual effect | Effects | description | 1 |
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
- G10L2015/0638—Interactive procedures
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Machine Translation (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
Description
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210187875.1A CN114582314B (zh) | 2022-02-28 | 2022-02-28 | ASR-based human-machine audio/video interaction logic model design method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210187875.1A CN114582314B (zh) | 2022-02-28 | 2022-02-28 | ASR-based human-machine audio/video interaction logic model design method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114582314A (zh) | 2022-06-03 |
CN114582314B (zh) | 2023-06-23 |
Family
ID=81771153
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN202210187875.1A CN114582314B (zh) | Active | 2022-02-28 | 2022-02-28 | ASR-based human-machine audio/video interaction logic model design method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114582314B (zh) |
Citations (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH11120003A (ja) * | 1997-10-17 | 1999-04-30 | Hitachi Ltd | Parallel execution method and parallel program generation method for loops containing loop exits |
CN1591315A (zh) * | 2003-05-29 | 2005-03-09 | Microsoft Corporation | Semantic object synchronous understanding for advanced interactive interfaces |
JP2008046399A (ja) * | 2006-08-17 | 2008-02-28 | Murata Mach Ltd | Voice dialogue apparatus, voice dialogue method and voice dialogue program |
CN101355490A (zh) * | 2007-07-25 | 2009-01-28 | Huawei Technologies Co., Ltd. | Message routing method, system and node device |
US8325880B1 (en) * | 2010-07-20 | 2012-12-04 | Convergys Customer Management Delaware Llc | Automated application testing |
US20160134752A1 (en) * | 2014-11-12 | 2016-05-12 | 24/7 Customer, Inc. | Method and apparatus for facilitating speech application testing |
US20180255180A1 (en) * | 2017-03-01 | 2018-09-06 | Speech-Soft Solutions LLC | Bridge for Non-Voice Communications User Interface to Voice-Enabled Interactive Voice Response System |
CN108510292A (zh) * | 2018-03-26 | 2018-09-07 | State Grid Corporation of China Customer Service Center | Automatic process assistance method for fault-scenario issues in electric power call services |
CN110209791A (zh) * | 2019-06-12 | 2019-09-06 | Bairong Yunchuang Technology Co., Ltd. | Multi-turn dialogue intelligent voice interaction system and apparatus |
CN110532515A (zh) * | 2019-08-05 | 2019-12-03 | Beijing Jiaotong University | Urban rail transit passenger trip inversion system based on AFC and video data |
US20200371818A1 (en) * | 2019-05-22 | 2020-11-26 | Software Ag | Systems and/or methods for computer-automated execution of digitized natural language video stream instructions |
CN112002323A (zh) * | 2020-08-24 | 2020-11-27 | Ping An Technology (Shenzhen) Co., Ltd. | Voice data processing method, apparatus, computer device and storage medium |
US20200409693A1 (en) * | 2019-06-28 | 2020-12-31 | Baidu Online Network Technology (Beijing) Co., Ltd. | File generation method, device and apparatus, and storage medium |
CN112256854A (zh) * | 2020-11-05 | 2021-01-22 | Yunnan Power Grid Co., Ltd. | Intelligent AI conversation method and apparatus based on AI natural language understanding |
WO2021027198A1 (zh) * | 2019-08-15 | 2021-02-18 | Suzhou AISpeech Information Technology Co., Ltd. | Voice dialogue processing method and apparatus |
CN113935337A (zh) * | 2021-10-22 | 2022-01-14 | Ping An Technology (Shenzhen) Co., Ltd. | Dialogue management method, system, terminal and storage medium |
Non-Patent Citations (2)
Title |
---|
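Liu Kai et al.: "Design of a task-oriented chatbot based on intent recognition and automata theory", Information Technology and Informatization (《信息技术与信息化》) * |
Liu Kai et al.: "Design of a task-oriented chatbot based on intent recognition and automata theory", Information Technology and Informatization (《信息技术与信息化》), no. 09, 28 September 2020 (2020-09-28) * |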
Also Published As
Publication number | Publication date |
---|---|
CN114582314B (zh) | 2023-06-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
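CN109101545A (zh) | | Natural language processing method, apparatus, device and medium based on human-computer interaction |
CN110910903B (zh) | | Speech emotion recognition method, apparatus, device and computer-readable storage medium |
WO2021169825A1 (zh) | | Speech synthesis method, apparatus, device and storage medium |
WO2024066920A1 (zh) | | Dialogue method for a virtual scene, apparatus, electronic device, computer program product and computer storage medium |
CN115599894A (zh) | | Emotion recognition method, apparatus, electronic device and storage medium |
CN112860871B (zh) | | Natural language understanding model training method, natural language understanding method and apparatus |
US20230075893A1 (en) | | Speech recognition model structure including context-dependent operations independent of future data |
CN117494761A (zh) | | Information processing and model training method, apparatus, device, medium and program product |
CN115394321A (zh) | | Audio emotion recognition method, apparatus, device, storage medium and product |
CN116913278B (zh) | | Speech processing method, apparatus, device and storage medium |
CN117556057A (zh) | | Knowledge question-answering method, vector database construction method and apparatus |
WO2023226767A1 (zh) | | Model training method and apparatus, and speech meaning understanding method and apparatus |
CN117636874A (zh) | | Robot dialogue method, system, robot and storage medium |
CN113393841A (zh) | | Training method, apparatus, device and storage medium for a speech recognition model |
CN116909435A (zh) | | Data processing method, apparatus, electronic device and storage medium |
CN117216206A (zh) | | Session processing method, apparatus, electronic device and storage medium |
CN114582314A (zh) | | ASR-based human-machine audio/video interaction logic model design method |
CN113035200B (zh) | | Speech recognition error correction method, apparatus and device based on human-computer interaction scenarios |
CN116089601A (zh) | | Dialogue summary generation method, apparatus, device and medium |
CN115114281A (zh) | | Query statement generation method and apparatus, storage medium and electronic device |
CN112233661A (zh) | | Method, system and device for generating subtitles for film and television content based on speech recognition |
CN113674745A (zh) | | Speech recognition method and apparatus |
CN113223513A (zh) | | Voice conversion method, apparatus, device and storage medium |
CN113505612B (zh) | | Real-time speech translation method, apparatus, device and storage medium for multi-person dialogue |
CN115062691B (zh) | | Attribute recognition method and apparatus |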
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 20230804; Address after: Room 705, Unit B, Building 15, Changzhou Tian'an Digital City, No. 588 Changwu South Road, Wujin National High-tech Industrial Development Zone, Changzhou City, Jiangsu Province, 213000; Patentee after: Changzhou Xiaowen Intelligent Technology Co., Ltd.; Address before: 213000 Room 706, Unit B, Building B, Tian'an Digital City, No. 588 Changwu South Road, Wujin National High-tech Industrial Development Zone, Changzhou City, Jiangsu Province; Patentee before: Jiangsu Kaiwen Telecom Technology Co., Ltd. |