CN112860847B - Interactive method and system for video question answering - Google Patents
Interactive method and system for video question answering
- Publication number
- CN112860847B (application CN202110069976.4A)
- Authority
- CN
- China
- Prior art keywords
- visual
- semantic
- global
- feature
- frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/7844—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using original textual content or text extracted from visual content or transcript of audio data
- G06F16/3329—Natural language query formulation or dialogue systems
- G06F16/3344—Query execution using natural language analysis
- G06F40/30—Semantic analysis
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/048—Activation functions
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
- G06N3/084—Backpropagation, e.g. using gradient descent
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Physics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Databases & Information Systems (AREA)
- Multimedia (AREA)
- Library & Information Science (AREA)
- Human Computer Interaction (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Image Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110069976.4A (CN112860847B) | 2021-01-19 | 2021-01-19 | Interactive method and system for video question answering |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112860847A (zh) | 2021-05-28 |
CN112860847B (zh) | 2022-08-19 |
Family
ID=76007372
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110069976.4A (CN112860847B, Active) | Interactive method and system for video question answering | 2021-01-19 | 2021-01-19 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112860847B (zh) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
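CN113220859B (zh) * | 2021-06-01 | 2024-05-10 | Ping An Technology (Shenzhen) Co., Ltd. | Image-based question answering method and apparatus, computer device, and storage medium |
CN113901302B (zh) * | 2021-09-29 | 2022-09-27 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Data processing method and apparatus, electronic device, and medium |
CN115688083B (zh) * | 2022-12-29 | 2023-03-28 | Guangdong University of Technology | Method, apparatus, device, and storage medium for recognizing image-text CAPTCHAs |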
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
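CN107818306A (zh) * | 2017-10-31 | 2018-03-20 | Tianjin University | A video question answering method based on an attention model |
WO2019007041A1 (zh) * | 2017-07-06 | 2019-01-10 | Peking University Shenzhen Graduate School | Bidirectional image-text retrieval method based on a multi-view joint embedding space |
CN111464881A (zh) * | 2019-01-18 | 2020-07-28 | Fudan University | Fully convolutional video description generation method based on a self-optimization mechanism |
CN111652202A (zh) * | 2020-08-10 | 2020-09-11 | Zhejiang University | Method and system for solving video question answering by improving video-language representation learning with an adaptive spatio-temporal graph model |
CN111949824A (zh) * | 2020-07-08 | 2020-11-17 | Hefei University of Technology | Visual question answering method and system based on semantic alignment, and storage medium |
CN112036276A (zh) * | 2020-08-19 | 2020-12-04 | Beihang University | An artificial-intelligence video question answering method |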
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108228703B (zh) * | 2017-10-31 | 2020-05-08 | Beijing SenseTime Technology Development Co., Ltd. | Image question answering method, apparatus, system, and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN112860847A (zh) | 2021-05-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
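CN112860847B (zh) | | Interactive method and system for video question answering |
CN111078836B (zh) | | Machine reading comprehension method, system, and apparatus enhanced with external knowledge |
CN111695779B (zh) | | A knowledge tracing method, apparatus, and storage medium |
CN112183577A (zh) | | Training method for a semi-supervised learning model, image processing method, and device |
Sonkar et al. | | qdkt: Question-centric deep knowledge tracing |
CN111652202B (zh) | | Method and system for solving video question answering by improving video-language representation learning with an adaptive spatio-temporal graph model |
CN113656570A (zh) | | Visual question answering method and apparatus based on a deep learning model, medium, and device |
CN111046671A (zh) | | Chinese named entity recognition method incorporating a lexicon via a graph network |
CN107544960B (zh) | | An automatic question answering method based on variable binding and relation activation |
CN112257966A (zh) | | Model processing method and apparatus, electronic device, and storage medium |
CN116136870A (zh) | | Intelligent social dialogue method and dialogue system based on enhanced entity representation |
CN115238036A (zh) | | A cognitive diagnosis method and apparatus based on a graph attention network and text information |
CN114529917A (zh) | | A zero-shot Chinese single-character recognition method, system, apparatus, and storage medium |
CN111666375B (zh) | | Text similarity matching method, electronic device, and computer-readable medium |
CN113609355B (zh) | | A video question answering system, method, computer, and storage medium based on dynamic attention and graph network reasoning |
CN116362242A (zh) | | A few-shot slot value extraction method, apparatus, device, and storage medium |
CN116611517A (zh) | | Knowledge tracing method fusing graph embedding and attention |
CN112487811B (zh) | | Cascaded information extraction system and method based on reinforcement learning |
CN112785039B (zh) | | A method for predicting the scoring rate of test question answers, and related apparatus |
CN116012627A (zh) | | A causal and temporal dual-enhanced knowledge tracing method based on hypergraph clustering |
CN113239699B (zh) | | A deep knowledge tracing method and system fusing multiple features |
CN114971066A (zh) | | Knowledge tracing method and system integrating forgetting factors and learning ability |
CN114936564A (zh) | | A multilingual semantic matching method and system based on aligned variational autoencoding |
CN111061851B (zh) | | Question generation method and system based on given facts |
CN113987124A (zh) | | Deep knowledge tracing method, system, and storable medium |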
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 2024-04-19. Patentee after: BEIJING ZHONGZI SCIENCE AND TECHNOLOGY BUSINESS INCUBATOR CO.,LTD. Address after: Room 524, Automation Building, No. 95 Zhongguancun East Road, Haidian District, Beijing, 100190. Country or region after: China. Patentee before: INSTITUTE OF AUTOMATION, CHINESE ACADEMY OF SCIENCES. Address before: 100190 No. 95 East Zhongguancun Road, Beijing, Haidian District. Country or region before: China. |
TR01 | Transfer of patent right | Effective date of registration: 2024-04-23. Patentee after: Zhongke Zidong Taichu (Beijing) Technology Co.,Ltd. Address after: 200-19, 2nd Floor, Building B, Wanghai Building, No.10 West Third Ring Middle Road, Haidian District, Beijing, 100190. Country or region after: China. Patentee before: BEIJING ZHONGZI SCIENCE AND TECHNOLOGY BUSINESS INCUBATOR CO.,LTD. Address before: Room 524, Automation Building, No. 95 Zhongguancun East Road, Haidian District, Beijing, 100190. Country or region before: China. |