EP4134921A4 - Method for training a video label recommendation model and method for determining a video label - Google Patents
- Publication number
- EP4134921A4 (application number EP22789452.4A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- video label
- recommendation model
- determining
- training
- training video
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/75—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/70—Labelling scene content, e.g. deriving syntactic or semantic representations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7834—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using audio features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7844—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using original textual content or text extracted from visual content or transcript of audio data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7847—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
- G06F16/785—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content using colour or luminescence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7847—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
- G06F16/7854—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content using shape
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7847—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
- G06F16/7857—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content using texture
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/49—Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- Library & Information Science (AREA)
- Software Systems (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Human Computer Interaction (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Medical Informatics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Image Analysis (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
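CN202110754370.4A CN113378784B (zh) | 2021-07-01 | 2021-07-01 | Training method for a video label recommendation model and method for determining a video label |
PCT/CN2022/096229 WO2023273769A1 (zh) | 2021-07-01 | 2022-05-31 | Training method for a video label recommendation model and method for determining a video label |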
Publications (2)
Publication Number | Publication Date |
---|---|
EP4134921A1 (de) | 2023-02-15 |
EP4134921A4 (de) | 2023-11-01 |
Family
ID=84237960
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP22789452.4A (Withdrawn) EP4134921A4 (de) | Method for training a video label recommendation model and method for determining a video label | 2021-07-01 | 2022-05-31 |
Country Status (4)
Country | Link |
---|---|
US (1) | US20240221401A1 (de) |
EP (1) | EP4134921A4 (de) |
JP (1) | JP2023535108A (de) |
KR (1) | KR20220153088A (de) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20230074189A1 (en) * | 2021-08-19 | 2023-03-09 | Fmr Llc | Methods and systems for intelligent text classification with limited or no training data |
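CN116308960B (zh) * | 2023-03-27 | 2023-11-21 | 杭州绿城信息技术有限公司 | Smart park property prevention and control management system based on data analysis, and implementation method thereof |
CN116843998B (zh) * | 2023-08-29 | 2023-11-14 | 四川省分析测试服务中心 | Spectral sample weighting method and system |
CN117726721B (zh) * | 2024-02-08 | 2024-04-30 | 湖南君安科技有限公司 | Image generation method, device, and medium based on theme driving and multimodal fusion |
CN118035491B (zh) * | 2024-04-11 | 2024-07-05 | 北京搜狐新媒体信息技术有限公司 | Training method and usage method for a video label annotation model, and related products |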
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112203122B (zh) * | 2020-10-10 | 2024-01-26 | 腾讯科技(深圳)有限公司 | Artificial-intelligence-based similar video processing method, apparatus, and electronic device |
2022
Date | Office | Application Number | Publication | Status |
---|---|---|---|---|
2022-05-31 | EP | EP22789452.4A | EP4134921A4 (de) | Withdrawn (not active) |
2022-05-31 | JP | JP2022564826A | JP2023535108A (ja) | Pending (active) |
2022-05-31 | US | US17/920,966 | US20240221401A1 (en) | Pending (active) |
2022-05-31 | KR | KR1020227037066A | KR20220153088A (ko) | Unknown |
Non-Patent Citations (3)
Title |
---|
CAO DA ET AL: "Hashtag our stories: Hashtag recommendation for micro-videos via harnessing multiple modalities", KNOWLEDGE-BASED SYSTEMS, ELSEVIER, AMSTERDAM, NL, vol. 203, 8 June 2020 (2020-06-08), XP086223736, ISSN: 0950-7051, [retrieved on 20200608], DOI: 10.1016/J.KNOSYS.2020.106114 * |
See also references of WO2023273769A1 * |
TIAN HAIMAN ET AL: "Multimodal deep representation learning for video classification", WORLD WIDE WEB, BALTZER SCIENCE PUBLISHERS, BUSSUM, NL, vol. 22, no. 3, 3 May 2018 (2018-05-03), pages 1325 - 1341, XP036770680, ISSN: 1386-145X, [retrieved on 20180503], DOI: 10.1007/S11280-018-0548-3 * |
Also Published As
Publication number | Publication date |
---|---|
EP4134921A1 (de) | 2023-02-15 |
KR20220153088A (ko) | 2022-11-17 |
US20240221401A1 (en) | 2024-07-04 |
JP2023535108A (ja) | 2023-08-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP4134921A4 (de) | | Method for training a video label recommendation model and method for determining a video label |
EP4198875A4 (de) | | Image fusion method, and training method and apparatus for an image fusion model |
EP4273746A4 (de) | | Model training method and apparatus, and image retrieval method and apparatus |
EP4181026A4 (de) | | Recommendation model training method and apparatus, recommendation method and apparatus, and computer-readable medium |
EP3940638A4 (de) | | Image region positioning method, model training method, and related apparatus |
EP3862893A4 (de) | | Recommendation model training method, recommendation method, apparatus, and computer-readable medium |
EP3985578A4 (de) | | Method and system for automatically training a machine learning model |
EP4024232A4 (de) | | Text processing model training method, and text processing method and apparatus |
EP4016986A4 (de) | | Method and apparatus for intelligent video recording |
EP3951646A4 (de) | | Method for training a network model for image recognition, and method and apparatus for image recognition |
EP3989111A4 (de) | | Video classification method and apparatus, model training method and apparatus, device, and storage medium |
EP3779774A4 (de) | | Training method for a semantic image segmentation model, and server |
EP3937073A4 (de) | | Video classification method, model training method and apparatus, and storage medium |
EP3964998A4 (de) | | Text processing method, and model training method and apparatus |
GB2596370B (en) | | Model training method and apparatus, and prediction method and apparatus |
EP3876161A4 (de) | | Method and apparatus for training a deep learning model |
EP4009231A4 (de) | | Method, apparatus, and device for labeling video image information, and storage medium |
EP4064284A4 (de) | | Speech detection method, prediction model training method, apparatus, device, and medium |
EP4250189A4 (de) | | Model training method, data processing method, and apparatus |
EP3989121A4 (de) | | Method and apparatus for constructing a video classification model, method and apparatus for video classification, and device and medium |
EP3971772A4 (de) | | Model training method and apparatus, and terminal and storage medium |
EP3951702A4 (de) | | Method for training an image processing model, image processing method, network device, and storage medium |
EP4181020A4 (de) | | Model training method and apparatus |
EP3993320A4 (de) | | Training method, apparatus, and system with a MOS model |
EP3989109A4 (de) | | Method and apparatus for image identification, method and apparatus for training an identification model, and storage medium |
Legal Events
Code | Title | Description |
---|---|---|
STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: UNKNOWN |
STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
PUAI | Public reference made under Article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
17P | Request for examination filed | Effective date: 20221025 |
AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
A4 | Supplementary search report drawn up and despatched | Effective date: 20231005 |
RIC1 | Information provided on IPC code assigned before grant | Ipc: G06N 3/045 20230101ALN20230928BHEP; Ipc: G06V 10/80 20220101ALI20230928BHEP; Ipc: G06V 20/40 20220101AFI20230928BHEP |
STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
18D | Application deemed to be withdrawn | Effective date: 20240504 |