EP4134921A4 - Method for training a video label recommendation model and method for determining a video label - Google Patents

Method for training a video label recommendation model and method for determining a video label

Info

Publication number
EP4134921A4
Authority
EP
European Patent Office
Prior art keywords
video label
recommendation model
determining
training
training video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP22789452.4A
Other languages
English (en)
French (fr)
Other versions
EP4134921A1 (de)
Inventor
Zhi Ye
Xin TANG
Hewei WANG
Li GE
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-07-01
Filing date
2022-05-31
Publication date
2023-11-01
Priority claimed from CN202110754370.4A (CN113378784B)
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Publication of EP4134921A1
Publication of EP4134921A4
Current legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/75 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/70 Labelling scene content, e.g. deriving syntactic or semantic representations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7834 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using audio features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7844 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using original textual content or text extracted from visual content or transcript of audio data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7847 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
    • G06F16/785 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content using colour or luminescence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7847 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
    • G06F16/7854 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content using shape
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7847 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
    • G06F16/7857 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content using texture
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/49 Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Library & Information Science (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Human Computer Interaction (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)
EP22789452.4A 2021-07-01 2022-05-31 Method for training a video label recommendation model and method for determining a video label Withdrawn EP4134921A4 (de)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
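CN202110754370.4A CN113378784B (zh) 2021-07-01 2021-07-01 Method for training a video label recommendation model and method for determining a video label
PCT/CN2022/096229 WO2023273769A1 (zh) 2022-05-31 Method for training a video label recommendation model and method for determining a video label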

Publications (2)

Publication Number Publication Date
EP4134921A1 (de) 2023-02-15
EP4134921A4 (de) 2023-11-01

Family

ID=84237960

Family Applications (1)

Application Number Title Priority Date Filing Date
EP22789452.4A Withdrawn EP4134921A4 (de) 2021-07-01 2022-05-31 Method for training a video label recommendation model and method for determining a video label

Country Status (4)

Country Link
US (1) US20240221401A1 (de)
EP (1) EP4134921A4 (de)
JP (1) JP2023535108A (de)
KR (1) KR20220153088A (de)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230074189A1 (en) * 2021-08-19 2023-03-09 Fmr Llc Methods and systems for intelligent text classification with limited or no training data
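CN116308960B (zh) * 2023-03-27 2023-11-21 Hangzhou Greentown Information Technology Co., Ltd. (杭州绿城信息技术有限公司) Smart park property prevention and control management system based on data analysis, and implementation method thereof
CN116843998B (zh) * 2023-08-29 2023-11-14 Sichuan Analysis and Testing Service Center (四川省分析测试服务中心) Spectral sample weighting method and system
CN117726721B (zh) * 2024-02-08 2024-04-30 Hunan Jun'an Technology Co., Ltd. (湖南君安科技有限公司) Image generation method, device and medium based on theme-driven and multimodal fusion
CN118035491B (zh) * 2024-04-11 2024-07-05 Beijing Sohu New Media Information Technology Co., Ltd. (北京搜狐新媒体信息技术有限公司) Training method and usage method for a video label annotation model, and related products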

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
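CN112203122B (zh) * 2020-10-10 2024-01-26 Tencent Technology (Shenzhen) Co., Ltd. (腾讯科技(深圳)有限公司) Artificial-intelligence-based similar video processing method, apparatus and electronic device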

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CAO DA ET AL: "Hashtag our stories: Hashtag recommendation for micro-videos via harnessing multiple modalities", KNOWLEDGE-BASED SYSTEMS, ELSEVIER, AMSTERDAM, NL, vol. 203, 8 June 2020 (2020-06-08), XP086223736, ISSN: 0950-7051, [retrieved on 20200608], DOI: 10.1016/J.KNOSYS.2020.106114 *
See also references of WO2023273769A1 *
TIAN HAIMAN ET AL: "Multimodal deep representation learning for video classification", WORLD WIDE WEB, BALTZER SCIENCE PUBLISHERS, BUSSUM, NL, vol. 22, no. 3, 3 May 2018 (2018-05-03), pages 1325 - 1341, XP036770680, ISSN: 1386-145X, [retrieved on 20180503], DOI: 10.1007/S11280-018-0548-3 *

Also Published As

Publication number Publication date
EP4134921A1 (de) 2023-02-15
KR20220153088A (ko) 2022-11-17
US20240221401A1 (en) 2024-07-04
JP2023535108A (ja) 2023-08-16

Similar Documents

Publication Publication Date Title
EP4134921A4 (de) Method for training a video label recommendation model and method for determining a video label
EP4198875A4 (de) Image fusion method, and training method and apparatus for an image fusion model
EP4273746A4 (de) Model training method and apparatus, and image retrieval method and apparatus
EP4181026A4 (de) Recommendation model training method and apparatus, recommendation method and apparatus, and computer-readable medium
EP3940638A4 (de) Image region positioning method, model training method, and related apparatus
EP3862893A4 (de) Recommendation model training method, recommendation method, apparatus, and computer-readable medium
EP3985578A4 (de) Method and system for automatically training a machine learning model
EP4024232A4 (de) Text processing model training method, and text processing method and apparatus
EP4016986A4 (de) Method and apparatus for intelligent video recording
EP3951646A4 (de) Method for training a network model for image recognition, and method and apparatus for image recognition
EP3989111A4 (de) Video classification method and apparatus, model training method and apparatus, device, and storage medium
EP3779774A4 (de) Training method for a semantic image segmentation model, and server
EP3937073A4 (de) Method for video classification, method and apparatus for model training, and storage medium
EP3964998A4 (de) Text processing method, and model training method and apparatus
GB2596370B (en) Model training method and apparatus, and prediction method and apparatus
EP3876161A4 (de) Method and apparatus for training a deep learning model
EP4009231A4 (de) Method, apparatus and device for labelling video image information, and storage medium
EP4064284A4 (de) Speech detection method, prediction model training method, apparatus, device, and medium
EP4250189A4 (de) Model training method, data processing method, and apparatus
EP3989121A4 (de) Method and apparatus for constructing a video classification model, method and apparatus for video classification, and device and medium
EP3971772A4 (de) Model training method and apparatus, terminal, and storage medium
EP3951702A4 (de) Method for training an image processing model, image processing method, network device, and storage medium
EP4181020A4 (de) Model training method and apparatus
EP3993320A4 (de) Training method, apparatus and system using a MOS model
EP3989109A4 (de) Image identification method and apparatus, method and apparatus for training an identification model, and storage medium

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: UNKNOWN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20221025

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

A4 Supplementary search report drawn up and despatched

Effective date: 20231005

RIC1 Information provided on ipc code assigned before grant

Ipc: G06N 3/045 20230101ALN20230928BHEP

Ipc: G06V 10/80 20220101ALI20230928BHEP

Ipc: G06V 20/40 20220101AFI20230928BHEP

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20240504