CN113420783B - Intelligent human-computer interaction method and device based on image-text matching - Google Patents
- Publication number
- CN113420783B (application number CN202110587993.7A)
- Authority
- CN
- China
- Prior art keywords
- features
- image
- target
- matching
- entity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Concept | Type | Sections | Count |
---|---|---|---|
interaction | Effects | title, claims, abstract, description | 50 |
method | Methods | title, claims, abstract, description | 43 |
technology | Methods | claims, abstract, description | 25 |
artificial neural network | Methods | claims, abstract, description | 18 |
process | Effects | claims, abstract, description | 7 |
calculation method | Methods | claims, description | 48 |
extraction | Methods | claims, description | 24 |
convolutional neural network | Methods | claims, description | 18 |
training | Methods | claims, description | 17 |
normalization | Methods | claims, description | 14 |
compression | Effects | claims, description | 8 |
compression | Methods | claims, description | 8 |
function | Effects | claims, description | 8 |
quantization | Methods | claims, description | 8 |
statistical model | Methods | claims, description | 8 |
analysis | Methods | claims, description | 7 |
cutting process | Methods | claims, description | 7 |
perception | Effects | claims, description | 7 |
pooling | Methods | claims, description | 7 |
activation | Effects | claims, description | 6 |
processing | Methods | claims, description | 6 |
bidirectional | Effects | claims, description | 4 |
memory | Effects | claims, description | 4 |
mapping | Methods | claims, description | 3 |
mechanism | Effects | claims, description | 3 |
disappearance | Effects | claims, description | 2 |
reduction | Effects | claims, description | 2 |
natural language processing | Methods | abstract, description | 11 |
improvement | Effects | description | 9 |
visual | Effects | description | 6 |
Homo sapiens | Species | description | 5 |
benefit | Effects | description | 4 |
preprocessing | Methods | description | 4 |
recurrent | Effects | description | 4 |
diagram | Methods | description | 3 |
extract | Substances | description | 3 |
sound signal | Effects | description | 3 |
artificial intelligence | Methods | description | 2 |
communication | Methods | description | 2 |
deep learning | Methods | description | 2 |
development | Methods | description | 2 |
developmental process | Effects | description | 2 |
longterm | Effects | description | 2 |
beneficial | Effects | description | 1 |
fusion | Methods | description | 1 |
industrial production | Methods | description | 1 |
long-term memory | Effects | description | 1 |
processing method | Methods | description | 1 |
sampling | Methods | description | 1 |
segmentation | Effects | description | 1 |
short-term memory | Effects | description | 1 |
synthesizing | Effects | description | 1 |
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/29—Graphical models, e.g. Bayesian networks
- G06F18/295—Markov models or related models, e.g. semi-Markov models; Markov random fields; Networks embedding Markov models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Software Systems (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Molecular Biology (AREA)
- Mathematical Physics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Character Discrimination (AREA)
- Image Analysis (AREA)
Abstract
Description
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110587993.7A CN113420783B (zh) | 2021-05-27 | 2021-05-27 | Intelligent human-computer interaction method and device based on image-text matching |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110587993.7A CN113420783B (zh) | 2021-05-27 | 2021-05-27 | Intelligent human-computer interaction method and device based on image-text matching |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113420783A (zh) | 2021-09-21 |
CN113420783B (zh) | 2022-04-08 |
Family
ID=77713149
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110587993.7A Active CN113420783B (zh) | Intelligent human-computer interaction method and device based on image-text matching | 2021-05-27 | 2021-05-27 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113420783B (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114067233B (zh) * | 2021-09-26 | 2023-05-23 | Sichuan University | Cross-modal matching method and system |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
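JP2010523025A (ja) * | 2007-05-10 | 2010-07-08 | Huawei Technologies Co., Ltd. | System and method for controlling an image acquisition device to perform a target position search |
CN112562669A (zh) * | 2020-12-01 | 2021-03-26 | Zhejiang Founder Printing Co., Ltd. | Method and system for automatic summarization of an intelligent digital newspaper and voice-interactive news chat |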
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060085392A1 (en) * | 2004-09-30 | 2006-04-20 | Microsoft Corporation | System and method for automatic generation of search results based on local intention |
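CN106845499A (zh) * | 2017-01-19 | 2017-06-13 | Tsinghua University | Image object detection method based on natural language semantics |
CN109840287B (zh) * | 2019-01-31 | 2021-02-19 | Zhongke Artificial Intelligence Innovation Technology Research Institute (Qingdao) Co., Ltd. | Neural-network-based cross-modal information retrieval method and device |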
- 2021-05-27: CN application CN202110587993.7A, patent CN113420783B, status Active
Also Published As
Publication number | Publication date |
---|---|
CN113420783A (zh) | 2021-09-21 |
Similar Documents
Publication | Title |
---|---|
US12039454B2 (en) | Microexpression-based image recognition method and apparatus, and related device |
JP6810232B2 (ja) | Image processing apparatus and method |
CN113421547B (zh) | Speech processing method and related device |
CN112329525A (zh) | Gesture recognition method and device based on a spatio-temporal graph convolutional neural network |
CN111680550B (zh) | Emotion information recognition method and apparatus, storage medium, and computer device |
CN114029963B (zh) | Robot operation method based on visual-auditory fusion |
CN111967334B (zh) | Human intention recognition method, system, and storage medium |
CN111108508A (zh) | Facial emotion recognition method, intelligent device, and computer-readable storage medium |
CN114550057A (zh) | Video emotion recognition method based on multimodal representation learning |
Rwelli et al. | Gesture based Arabic sign language recognition for impaired people based on convolution neural network |
CN116758451A (zh) | Audio-visual emotion recognition method and system based on multi-scale and global cross-attention |
CN112906520A (zh) | Action recognition method and device based on pose encoding |
CN113420783B (zh) | Intelligent human-computer interaction method and device based on image-text matching |
Al Farid et al. | Single Shot Detector CNN and Deep Dilated Masks for Vision-Based Hand Gesture Recognition From Video Sequences |
CN112765955B (zh) | Cross-modal instance segmentation method under Chinese referring expressions |
Karthik et al. | Survey on Gestures Translation System for Hearing Impaired People in Emergency Situation using Deep Learning Approach |
Khubchandani et al. | Sign Language Recognition |
Manglani et al. | Lip Reading Into Text Using Deep Learning |
Shane et al. | Sign Language Detection Using Faster RCNN Resnet |
Perera et al. | Finger spelled Sign Language Translator for Deaf and Speech Impaired People in Srilanka using Convolutional Neural Network |
CN117576279B (zh) | Digital human driving method and system based on multimodal data |
CN118093840B (zh) | Visual question answering method, apparatus, device, and storage medium |
Thakore et al. | An Interface for Communication for the Deaf Using Hand Gesture Recognition through Computer Vision and Natural Language Processing |
Moorthi et al. | Novel Method for Recognizing Sign Language using Regularized Extreme Learning Machine |
Radha et al. | ILAPP: An Efficient Design of Sign Language Recognition System using Intelligent Learning Assisted Prediction Principles |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
CB03 | Change of inventor or designer information | Inventors after: Yin Erwei, Xie Liang, Zhang Junqian, Zhang Jing, Yan Huijiong, Luo Zhiguo, Zhang Yakun, Ai Yongbao, Yan Ye. Inventors before: Yin Erwei, Zhang Junqian, Xie Liang, Zhang Jing, Yan Huijiong, Luo Zhiguo, Zhang Yakun, Ai Yongbao, Yan Ye. |
GR01 | Patent grant | |