CN113439277B - Dynamic audiovisual segment padding for machine learning - Google Patents

Dynamic audiovisual segment padding for machine learning

Info

Publication number
CN113439277B
Authority
CN
China
Prior art keywords
audiovisual
segment
time interval
unfilled
filler
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202080014830.3A
Other languages
English (en)
Chinese (zh)
Other versions
CN113439277A (zh)
Inventor
A·鲍曼
S·哈梅
G·坎农
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Publication of CN113439277A
Application granted
Publication of CN113439277B
Legal status: Active
Anticipated expiration

Classifications

    • G06F18/2148 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
    • G06N3/045 Combinations of networks
    • G06N3/047 Probabilistic or stochastic networks
    • G06N3/0475 Generative networks
    • G06N3/088 Non-supervised learning, e.g. competitive learning
    • G06N3/09 Supervised learning
    • G06N3/094 Adversarial learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V20/42 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
    • G06V20/47 Detecting features for summarising video content
    • G11B27/031 Electronic editing of digitised analogue information signals, e.g. audio or video signals
    • G11B27/036 Insert-editing
    • H04N21/23418 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • H04N21/251 Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/8456 Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments
    • H04N21/8549 Creating video summaries, e.g. movie trailer
    • G06V2201/10 Recognition assisted with metadata

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Computer Security & Cryptography (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Television Signal Processing For Recording (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Electrically Operated Instructional Devices (AREA)
CN202080014830.3A 2019-02-25 2020-02-25 Dynamic audiovisual segment padding for machine learning Active CN113439277B (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US16/283,912 US10832734B2 (en) 2019-02-25 2019-02-25 Dynamic audiovisual segment padding for machine learning
US16/283,912 2019-02-25
PCT/IB2020/051586 WO2020174383A1 (en) 2019-02-25 2020-02-25 Dynamic audiovisual segment padding for machine learning

Publications (2)

Publication Number Publication Date
CN113439277A (zh) 2021-09-24
CN113439277B (zh) 2025-08-22

Family

ID=72143024

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202080014830.3A Active CN113439277B (zh) 2019-02-25 2020-02-25 Dynamic audiovisual segment padding for machine learning

Country Status (5)

Country Link
US (2) US10832734B2
JP (1) JP7450623B2
CN (1) CN113439277B
GB (1) GB2596463B
WO (1) WO2020174383A1

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10824487B2 (en) 2018-07-17 2020-11-03 Xandr Inc. Real-time data processing pipeline and pacing control systems and methods
US10997464B2 (en) * 2018-11-09 2021-05-04 Adobe Inc. Digital image layout training using wireframe rendering within a generative adversarial network (GAN) system
US10832734B2 (en) * 2019-02-25 2020-11-10 International Business Machines Corporation Dynamic audiovisual segment padding for machine learning
WO2021162935A1 (en) 2020-02-13 2021-08-19 Stats Llc Dynamically predicting shot type using a personalized deep neural network
GB2616012A (en) * 2022-02-23 2023-08-30 Sony Group Corp A method, apparatus and computer program for generating sports game highlight video based on excitement of gameplay
US12093883B2 (en) * 2022-03-10 2024-09-17 International Business Machines Corporation Automated delivery coordination and meeting scheduling for multiple-recipient orders received at a computer system
CN118055199A (zh) 2022-11-17 2024-05-17 Beijing Zitiao Network Technology Co., Ltd. Video editing method and apparatus

Family Cites Families (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5327518A (en) 1991-08-22 1994-07-05 Georgia Tech Research Corporation Audio analysis/synthesis system
US5793888A (en) 1994-11-14 1998-08-11 Massachusetts Institute Of Technology Machine learning apparatus and method for image searching
CA2167748A1 (en) 1995-02-09 1996-08-10 Yoav Freund Apparatus and methods for machine learning hypotheses
US5596159A (en) 1995-11-22 1997-01-21 Invision Interactive, Inc. Software sound synthesis system
US6266068B1 (en) 1998-03-13 2001-07-24 Compaq Computer Corporation Multi-layer image-based rendering for video synthesis
US6513025B1 (en) 1999-12-09 2003-01-28 Teradyne, Inc. Multistage machine learning process
US7024033B2 (en) 2001-12-08 2006-04-04 Microsoft Corp. Method for boosting the performance of machine-learning classifiers
US20030131362A1 (en) 2002-01-09 2003-07-10 Koninklijke Philips Electronics N.V. Method and apparatus for multimodal story segmentation for linking multimedia content
US7142645B2 (en) * 2002-10-04 2006-11-28 Frederick Lowe System and method for generating and distributing personalized media
JP2006058874A (ja) * 2004-08-20 2006-03-02 Mitsubishi Electric Research Laboratories Inc Method for detecting events in multimedia
US8126763B2 (en) * 2005-01-20 2012-02-28 Koninklijke Philips Electronics N.V. Automatic generation of trailers containing product placements
US8326775B2 (en) 2005-10-26 2012-12-04 Cortica Ltd. Signature generation for multimedia deep-content-classification by a large-scale matching system and method thereof
US9218606B2 (en) 2005-10-26 2015-12-22 Cortica, Ltd. System and method for brand monitoring and trend analysis based on deep-content-classification
US9047374B2 (en) * 2007-06-08 2015-06-02 Apple Inc. Assembling video content
US8207989B2 (en) 2008-12-12 2012-06-26 Microsoft Corporation Multi-video synthesis
US9247225B2 (en) * 2012-09-25 2016-01-26 Intel Corporation Video indexing with viewer reaction estimation and visual cue detection
US10068614B2 (en) * 2013-04-26 2018-09-04 Microsoft Technology Licensing, Llc Video service with automated video timeline curation
US10269390B2 (en) 2015-06-11 2019-04-23 David M. DeCaprio Game video processing systems and methods
EP3475920A4 (en) 2016-06-23 2020-01-15 Loomai, Inc. SYSTEMS AND METHODS FOR GENERATING COMPUTER-READY ANIMATION MODELS OF A HUMAN HEAD FROM IMAGES OF DETECTED DATA
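CN107707931B (zh) 2016-08-08 2021-09-10 Alibaba Group Holding Ltd. Method and apparatus for generating explanation data from video data, data synthesis method and apparatus, and electronic device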
US11024009B2 (en) 2016-09-15 2021-06-01 Twitter, Inc. Super resolution using a generative adversarial network
US10074038B2 (en) 2016-11-23 2018-09-11 General Electric Company Deep learning medical systems and methods for image reconstruction and quality evaluation
US10043109B1 (en) 2017-01-23 2018-08-07 A9.Com, Inc. Attribute similarity-based search
US10474881B2 (en) 2017-03-15 2019-11-12 Nec Corporation Video retrieval system based on larger pose face frontalization
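CN107464210B (zh) 2017-07-06 2020-02-21 Zhejiang University of Technology An image style transfer method based on a generative adversarial network
CN107483843B (zh) * 2017-08-16 2019-11-15 Chengdu Pinguo Technology Co., Ltd. Audio-video matching and editing method and apparatus
CN108256627A (zh) 2017-12-29 2018-07-06 Institute of Automation, Chinese Academy of Sciences Audio-visual information mutual generation device and its training system based on a cycle generative adversarial network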
US10635939B2 (en) * 2018-07-06 2020-04-28 Capital One Services, Llc System, method, and computer-accessible medium for evaluating multi-dimensional synthetic data using integrated variants analysis
US10832734B2 (en) 2019-02-25 2020-11-10 International Business Machines Corporation Dynamic audiovisual segment padding for machine learning

Also Published As

Publication number Publication date
JP7450623B2 (ja) 2024-03-15
WO2020174383A1 (en) 2020-09-03
CN113439277A (zh) 2021-09-24
US10832734B2 (en) 2020-11-10
US20210012809A1 (en) 2021-01-14
JP2022521120A (ja) 2022-04-06
US11521655B2 (en) 2022-12-06
GB2596463B (en) 2022-05-11
GB2596463A (en) 2021-12-29
US20200273495A1 (en) 2020-08-27

Similar Documents

Publication Publication Date Title
CN113439277B (zh) Dynamic audiovisual segment padding for machine learning
US11663827B2 (en) Generating a video segment of an action from a video
US20190205652A1 (en) System and Method for Automatic Generation of Sports Media Highlights
US10671895B2 (en) Automated selection of subjectively best image frames from burst captured image sequences
CN107463698B (zh) Method and apparatus for pushing information based on artificial intelligence
US20140143183A1 (en) Hierarchical model for human activity recognition
CN109905772A (zh) Video clip query method, apparatus, computer device, and storage medium
US20180330249A1 (en) Method and apparatus for immediate prediction of performance of media content
CN112245934B (zh) Data analysis method, apparatus, and device for virtual resources in a virtual scene application
US20250315648A1 (en) Systems and methods for agentic operations using multimodal generative models for baseball
CN120568143A (zh) An artificial-intelligence-based video editing processing method and system
Finocchiaro et al. Calisthenics skills temporal video segmentation
Brooks Using machine learning to derive insights from sports location data
US20250292154A1 (en) Machine learning techniques for stoppage time prediction in soccer
US20250316082A1 (en) Systems and methods for agentic operations using multimodal generative models for racing
US20250316085A1 (en) Systems and methods for agentic operations using multimodal generative models for golf
US20250316084A1 (en) Systems and methods for agentic operations using multimodal generative models for cricket
US20250252811A1 (en) Systems and methods for generating an interactive display for player indexing
US20250312649A1 (en) Systems and methods for agentic operations using multimodal generative models for tennis
US20250315700A1 (en) Systems and methods for agentic operations using multimodal generative models for football
US20240342552A1 (en) Defensive and fitness player analysis using remote tracking in sports
US20250010210A1 (en) Frictionless ai-assisted video game messaging system
US20250312676A1 (en) Systems and methods for agentic operations using multimodal generative models for basketball
US20250315647A1 (en) Systems and methods for agentic operations using multimodal generative models for rugby
US20250315661A1 (en) Systems and methods for agentic operations using multimodal generative models for soccer

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant