CN115599954B - A video question answering method based on scene graph reasoning - Google Patents
A video question answering method based on scene graph reasoning
- Publication number: CN115599954B (application CN202211587240.7A)
- Authority: CN (China)
- Prior art keywords: video, features, attention, information, feature
- Prior art date
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Claims (6)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211587240.7A CN115599954B (zh) | 2022-12-12 | 2022-12-12 | A video question answering method based on scene graph reasoning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211587240.7A CN115599954B (zh) | 2022-12-12 | 2022-12-12 | A video question answering method based on scene graph reasoning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115599954A (zh) | 2023-01-13 |
CN115599954B (zh) | 2023-03-31 |
Family ID: 84852707
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211587240.7A Active CN115599954B (zh) | A video question answering method based on scene graph reasoning | 2022-12-12 | 2022-12-12 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115599954B (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116069973B (zh) * | 2023-04-04 | 2023-06-06 | Shijiazhuang Tiedao University | A video summary generation method based on semantic self-mining |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107818306A (zh) * | 2017-10-31 | 2018-03-20 | Tianjin University | A video question answering method based on an attention model |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
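CN111898448B (zh) * | 2020-06-30 | 2023-10-24 | Peking University | A pedestrian attribute recognition method and system based on deep learning |
CN111652357B (zh) * | 2020-08-10 | 2021-01-15 | Zhejiang University | A method and system for solving video question answering using a graph-based object-specific network |
CN115391548A (zh) * | 2022-07-08 | 2022-11-25 | Zhejiang University of Technology | Method for generating a retrieval knowledge graph library combining scene graphs and ConceptNet |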
- 2022-12-12: CN application CN202211587240.7A, patent CN115599954B (zh), status Active
Also Published As
Publication number | Publication date |
---|---|
CN115599954A (zh) | 2023-01-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
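CN113902964B (zh) | | Keyword-aware multimodal attention video question answering method and system |
CN113297370B (zh) | | End-to-end multimodal question answering method and system based on multi-interaction attention |
CN112036276B (zh) | | An artificial intelligence video question answering method |
CN111652357B (zh) | | A method and system for solving video question answering using a graph-based object-specific network |
CN109685724B (zh) | | A symmetry-aware face image completion method based on deep learning |
CN114428866B (zh) | | A video question answering method based on an object-oriented dual-stream attention network |
CN115239944A (zh) | | Automatic image caption generation method based on causal reasoning |
CN115146100A (zh) | | A cross-modal retrieval model, method, and computer device based on counterfactual reasoning |
CN114049381A (zh) | | A Siamese cross object tracking method fusing multi-layer semantic information |
CN113191357A (zh) | | Multi-level image-text matching method based on graph attention networks |
CN116935438A (zh) | | A pedestrian image re-identification method based on autonomous evolution of model structure |
CN117235216A (zh) | | A knowledge reasoning method based on heterogeneous knowledge fusion |
CN112527993A (zh) | | A cross-media hierarchical deep video question answering reasoning framework |
CN117036545A (zh) | | An image description text generation method and system based on image scene features |
Xu et al. | | Isolated Word Sign Language Recognition Based on Improved SKResNet‐TCN Network |
CN115599954B (zh) | | A video question answering method based on scene graph reasoning |
CN113689514A (zh) | | A topic-oriented image scene graph generation method |
CN117975216A (zh) | | A salient object detection method based on multimodal feature refinement and fusion |
CN116912579A (zh) | | Scene graph generation method based on a multi-level attention mechanism |
Yumeng et al. | | News image-text matching with news knowledge graph |
CN114037936A (zh) | | A video description generation method based on transitive visual relation detection |
CN116151226B (zh) | | A machine-learning-based sign language error correction method, device, and medium for deaf-mute people |
CN117473279A (zh) | | Test-time VQA model debiasing method and system based on question-type awareness |
CN112765955B (zh) | | A cross-modal instance segmentation method under Chinese referring expressions |
Wang et al. | | How to make a BLT sandwich? learning to reason towards understanding web instructional videos |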
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 2024-08-07. Address after: 230000 B-1015, wo Yuan Garden, 81 Ganquan Road, Shushan District, Hefei, Anhui. Patentee after: HEFEI MINGLONG ELECTRONIC TECHNOLOGY Co.,Ltd. Country or region after: China. Address before: 510006 No. 100 West Ring Road, Guangzhou University, Guangzhou, Guangdong, Panyu District. Patentee before: GUANGDONG University OF TECHNOLOGY. Country or region before: China. |
TR01 | Transfer of patent right | Effective date of registration: 2024-12-30. Address after: Building 3, Phase III, Dachong Business Center, No. 18 Dachong 1st Road, Dachong Community, Yuehai Street, Nanshan District, Shenzhen City, Guangdong Province 518000. Patentee after: Yingxi Intelligent (Shenzhen) Co.,Ltd. Country or region after: China. Address before: 230000 B-1015, wo Yuan Garden, 81 Ganquan Road, Shushan District, Hefei, Anhui. Patentee before: HEFEI MINGLONG ELECTRONIC TECHNOLOGY Co.,Ltd. Country or region before: China. |