ZA202307784B - Video data processing method and apparatus, electronic device, and storage medium - Google Patents
Info
- Publication number
- ZA202307784B
- Authority
- ZA
- South Africa
- Prior art keywords
- video
- modal
- target video
- feature
- modalities
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/18—Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Probability & Statistics with Applications (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Signal Processing (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
- Closed-Circuit Television Systems (AREA)
Abstract
The present application relates to the field of multimedia technology and provides a video data processing method and apparatus, an electronic device, and a storage medium. The method includes: in response to a type identification instruction for a target video, inputting the target video into a multi-modal feature extraction model and outputting modal features of a plurality of different modalities for each video image frame in the target video; generating a fusion feature for each modal feature based on a preset mutual causal relationship between the different modalities; constructing a modal object diagram for the target video from the fusion features of all video image frames in the various modalities; determining, through the modal object diagram, an attention feature for the target video, the attention feature fusing the fusion features of the plurality of modalities; and determining a video type of the target video based on the attention feature. This method improves the accuracy of video surveillance and reduces its labor cost.
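The steps in the abstract can be illustrated with a toy sketch. This is not the patented implementation: the abstract does not specify the feature extractor, the form of the causal relationship, the graph construction, or the classifier. The sketch assumes that per-frame modal features are already extracted as small vectors, that the "preset mutual causal relationship" is a simple mapping from each modality to the modalities that influence it, that fusion is an additive average, that every (frame, modality) pair is one node scored by scaled dot-product attention against the mean node, and that the video type is chosen by a linear score. All function names, modality names, labels, and weights are hypothetical.

```python
import math


def fuse_by_causal_relation(modal_feats, causes):
    """Assumed fusion rule: add the mean of the 'cause' modalities' features, frame by frame."""
    fused = {}
    for name, frames in modal_feats.items():
        sources = causes.get(name, [])
        fused[name] = []
        for t, vec in enumerate(frames):
            out = list(vec)
            for src in sources:
                for i, x in enumerate(modal_feats[src][t]):
                    out[i] += x / len(sources)
            fused[name].append(out)
    return fused


def attention_feature(fused):
    """Treat each (frame, modality) fused vector as a node and pool the nodes with softmax attention."""
    nodes = [vec for frames in fused.values() for vec in frames]
    dim = len(nodes[0])
    # Assumed query: the mean of all node vectors.
    query = [sum(v[i] for v in nodes) / len(nodes) for i in range(dim)]
    scores = [sum(q * x for q, x in zip(query, v)) / math.sqrt(dim) for v in nodes]
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [sum(w * v[i] for w, v in zip(weights, nodes)) / total for i in range(dim)]


def classify(feature, class_weights):
    """Pick the label whose (hypothetical) weight vector gives the highest linear score."""
    scores = {
        label: sum(w * x for w, x in zip(ws, feature))
        for label, ws in class_weights.items()
    }
    return max(scores, key=scores.get)


# Two frames, two hypothetical modalities, two-dimensional features.
modal_feats = {
    "image": [[1.0, 0.0], [0.8, 0.2]],
    "audio": [[0.0, 1.0], [0.1, 0.9]],
}
causes = {"image": ["audio"], "audio": ["image"]}  # assumed mutual causal relationship
class_weights = {"normal": [1.0, -1.0], "abnormal": [-1.0, 1.0]}  # made-up weights

fused = fuse_by_causal_relation(modal_feats, causes)
feature = attention_feature(fused)
video_type = classify(feature, class_weights)
```

In a real system each of these stages would be a trained model; the sketch only shows how data could flow from per-frame modal features through fusion and attention pooling to a single video-level decision.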
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210289901.1A CN114387567B (en) | 2022-03-23 | 2022-03-23 | Video data processing method and device, electronic equipment and storage medium |
PCT/CN2023/081690 WO2023179429A1 (en) | 2022-03-23 | 2023-03-15 | Video data processing method and apparatus, electronic device, and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
ZA202307784B (en) | 2024-03-27 |
Family
ID=81206070
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
ZA2023/07784A ZA202307784B (en) | 2022-03-23 | 2023-08-08 | Video data processing method and apparatus, electronic device, and storage medium |
Country Status (3)
Country | Link |
---|---|
CN (1) | CN114387567B (en) |
WO (1) | WO2023179429A1 (en) |
ZA (1) | ZA202307784B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114387567B (en) * | 2022-03-23 | 2022-06-28 | 长视科技股份有限公司 | Video data processing method and device, electronic equipment and storage medium |
CN115101091A (en) * | 2022-05-11 | 2022-09-23 | 上海事凡物联网科技有限公司 | Sound data classification method, terminal and medium based on multi-dimensional feature weighted fusion |
CN115100725B (en) * | 2022-08-23 | 2022-11-22 | 浙江大华技术股份有限公司 | Object recognition method, object recognition apparatus, and computer storage medium |
CN116156298B (en) * | 2023-04-11 | 2023-07-04 | 安徽医科大学 | Endoscopic high-definition video processing system and method based on sense-in-store calculation |
CN117370912B (en) * | 2023-10-25 | 2024-08-27 | 杭州赛创电气安装工程有限公司 | Intelligent supervision system and method for electric power construction site |
CN117876941B (en) * | 2024-03-08 | 2024-07-09 | 杭州阿里云飞天信息技术有限公司 | Target multi-mode model system, construction method, video processing model training method and video processing method |
CN118196579A (en) * | 2024-03-21 | 2024-06-14 | 广东华锐信息科技有限公司 | Multimedia content management and control optimization method based on target recognition |
CN118349619B (en) * | 2024-06-18 | 2024-09-06 | 硕威工程科技股份有限公司 | Multisource geographic information fusion and visual display method and system |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10417498B2 (en) * | 2016-12-30 | 2019-09-17 | Mitsubishi Electric Research Laboratories, Inc. | Method and system for multi-modal fusion model |
CN109344288B (en) * | 2018-09-19 | 2021-09-24 | 电子科技大学 | Video description combining method based on multi-modal feature combining multi-layer attention mechanism |
CN111275085B (en) * | 2020-01-15 | 2022-09-13 | 重庆邮电大学 | Online short video multi-modal emotion recognition method based on attention fusion |
CN111866607B (en) * | 2020-07-30 | 2022-03-11 | 腾讯科技(深圳)有限公司 | Video clip positioning method and device, computer equipment and storage medium |
CN112733764A (en) * | 2021-01-15 | 2021-04-30 | 天津大学 | Method for recognizing video emotion information based on multiple modes |
CN113343922B (en) * | 2021-06-30 | 2024-04-19 | 北京达佳互联信息技术有限公司 | Video identification method, device, electronic equipment and storage medium |
CN113837259B (en) * | 2021-09-17 | 2023-05-30 | 中山大学附属第六医院 | Education video question-answering method and system for graph-note-meaning fusion of modal interaction |
CN114020891A (en) * | 2021-11-05 | 2022-02-08 | 中山大学 | Double-channel semantic positioning multi-granularity attention mutual enhancement video question-answering method and system |
CN114387567B (en) * | 2022-03-23 | 2022-06-28 | 长视科技股份有限公司 | Video data processing method and device, electronic equipment and storage medium |
2022
- 2022-03-23 CN CN202210289901.1A patent/CN114387567B/en active
2023
- 2023-03-15 WO PCT/CN2023/081690 patent/WO2023179429A1/en unknown
- 2023-08-08 ZA ZA2023/07784A patent/ZA202307784B/en unknown
Also Published As
Publication number | Publication date |
---|---|
CN114387567B (en) | 2022-06-28 |
WO2023179429A1 (en) | 2023-09-28 |
CN114387567A (en) | 2022-04-22 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
ZA202307784B (en) | Video data processing method and apparatus, electronic device, and storage medium | |
WO2021017606A1 (en) | Video processing method and apparatus, and electronic device and storage medium | |
US10438077B2 (en) | Face liveness detection method, terminal, server and storage medium | |
EP3792818A1 (en) | Video processing method and device, and storage medium | |
US9928397B2 (en) | Method for identifying a target object in a video file | |
CN106156693B (en) | Robust error correction method based on multi-model representation for face recognition | |
CN111783620A (en) | Expression recognition method, device, equipment and storage medium | |
JP2015529354A (en) | Method and apparatus for face recognition | |
CN113015978B (en) | Processing images to locate novel objects | |
US10319095B2 (en) | Method, an apparatus and a computer program product for video object segmentation | |
WO2019214321A1 (en) | Vehicle damage identification processing method, processing device, client and server | |
CN110866936A (en) | Video labeling method, tracking method, device, computer equipment and storage medium | |
WO2019099205A1 (en) | Generating object embeddings from images | |
WO2020244151A1 (en) | Image processing method and apparatus, terminal, and storage medium | |
Bian et al. | Machine learning-based real-time monitoring system for smart connected worker to improve energy efficiency | |
CN112820071A (en) | Behavior identification method and device | |
CN111738769A (en) | Video processing method and device | |
Jones et al. | Top–down learning of low-level vision tasks | |
CN113033377B (en) | Character position correction method, device, electronic equipment and storage medium | |
CN111310595B (en) | Method and device for generating information | |
Cao et al. | CMAN: Leaning global structure correlation for monocular 3D object detection | |
CN113312951A (en) | Dynamic video target tracking system, related method, device and equipment | |
CN116682049A (en) | Multi-mode gazing target estimation method based on attention mechanism | |
Jin et al. | Keyframe-based dynamic elimination SLAM system using YOLO detection | |
KR20140033667A (en) | Apparatus and method for video edit based on object |