KR20250018507A - Video frame action detection using gated history - Google Patents

Video frame action detection using gated history

Info

Publication number
KR20250018507A
Authority
KR
South Korea
Prior art keywords
video frame
frames
video
frame
historical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
KR1020247040148A
Other languages
English (en)
Korean (ko)
Inventor
Gaurav Mittal
Ye Yu
Mei Chen
Junwen Chen
Original Assignee
Microsoft Technology Licensing, LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Technology Licensing, LLC
Priority claimed from PCT/US2023/018777 (WO2023235058A1)
Publication of KR20250018507A
Current legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23418 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V20/42 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V20/43 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of news video content
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/44 Event detection
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/23 Recognition of whole body movements, e.g. for sport training
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G06T2207/10021 Stereoscopic video; Stereoscopic image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Medical Informatics (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Human Computer Interaction (AREA)
  • Social Psychology (AREA)
  • Psychiatry (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
KR1020247040148A 2022-06-03 2023-04-17 Video frame action detection using gated history Pending KR20250018507A (ko)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US202263348993P 2022-06-03 2022-06-03
US63/348,993 2022-06-03
US17/852,310 2022-06-28
US17/852,310 US11895343B2 (en) 2022-06-03 2022-06-28 Video frame action detection using gated history
PCT/US2023/018777 WO2023235058A1 (en) 2022-06-03 2023-04-17 Video frame action detection using gated history

Publications (1)

Publication Number Publication Date
KR20250018507A (ko) 2025-02-06

Family

ID=88976208

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020247040148A Pending KR20250018507A (ko) 2022-06-03 2023-04-17 Video frame action detection using gated history

Country Status (4)

Country Link
US (2) US11895343B2
EP (1) EP4533415A1
JP (1) JP2025518664A
KR (1) KR20250018507A

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4266260B1 (en) * 2022-04-20 2026-01-07 Axis AB Encoding of training data for training of a neural network
US12299914B2 (en) * 2022-09-06 2025-05-13 Toyota Research Institute, Inc. Self-supervised training from a teacher network for cost volume based depth estimates
US12187279B2 (en) * 2022-09-29 2025-01-07 Toyota Motor Engineering & Manufacturing North America, Inc. Method and system for personalized car following with transformers and RNNs
US20250029410A1 (en) * 2023-07-19 2025-01-23 University Of Central Florida Research Foundation, Inc. Active Sparse Labeling of Video Frames
US20250077893A1 (en) * 2023-09-01 2025-03-06 Royal Bank Of Canada Meta temporal point processes
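CN118657773B (zh) * 2024-08-20 2024-10-25 浙江啄云智能科技有限公司 Contraband detection and model training method, apparatus, device, medium, and product
CN120953899B (zh) * 2025-10-16 2025-12-26 中国人民解放军海军工程大学 A moving target recognition method based on bidirectional cross-attention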

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
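KR20220051717A (ko) * 2020-10-19 2022-04-26 Electronics and Telecommunications Research Institute System and method for self-supervised-learning-based segmentation and tracking using image patterns in video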

Also Published As

Publication number Publication date
EP4533415A1 (en) 2025-04-09
US20240244279A1 (en) 2024-07-18
US12192543B2 (en) 2025-01-07
US11895343B2 (en) 2024-02-06
JP2025518664A (ja) 2025-06-19
US20230396817A1 (en) 2023-12-07

Similar Documents

Publication Publication Date Title
KR20250018507A (ko) Video frame action detection using gated history
US12033082B2 (en) Maintaining fixed sizes for target objects in frames
JP7147078B2 (ja) Video frame information labeling method, apparatus, device, and computer program
US11423695B2 (en) Face location tracking method, apparatus, and electronic device
US11468680B2 (en) Shuffle, attend, and adapt: video domain adaptation by clip order prediction and clip attention alignment
US20190304102A1 (en) Memory efficient blob based object classification in video analytics
US20200285859A1 (en) Video summary generation method and apparatus, electronic device, and computer storage medium
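JP2022523606A (ja) Gating model for video analysis
CN110622176A (zh) Video partitioning
KR20220011078A (ko) Active interaction method, apparatus, electronic device, and readable recording medium
KR20220120401A (ko) Target inference method for edge artificial intelligence
CN112883817B (zh) Action localization method, apparatus, electronic device, and storage medium
CN121000952A (zh) Video generation method, apparatus, electronic device, storage medium, and program product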
WO2023235058A1 (en) Video frame action detection using gated history
US20240354963A1 (en) Method for image segmentation and system therefor
US12608836B2 (en) Video engagement determination based on statistical positional object tracking
US12299969B2 (en) Efficient vision perception
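CN117455948A (zh) Multi-view pedestrian trajectory extraction and analysis method based on a deep learning algorithm
CN116704405A (zh) Behavior recognition method, electronic device, and storage medium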
Qian et al. edgeVLM: Cloud-edge Collaborative Real-time VLM based on Context Transfer
WO2025020080A1 (en) Spatio-temporal video saliency analysis
EP4555491A1 (en) Bilateral attention transformer in motion-appearance neighboring space for video object segmentation
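CN120877165A (zh) Video object segmentation method, electronic device, storage medium, and program product
CN114419508A (zh) Recognition method, training method, apparatus, device, and storage medium
CN119337192A (zh) Data classification method, apparatus, and storage medium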

Legal Events

Date Code Title Description
PA0105 International application

St.27 status event code: A-0-1-A10-A15-nap-PA0105

PG1501 Laying open of application

St.27 status event code: A-1-1-Q10-Q12-nap-PG1501

P11 Amendment of application requested

Free format text: ST27 STATUS EVENT CODE: A-2-2-P10-P11-NAP-X000 (AS PROVIDED BY THE NATIONAL OFFICE)

P11-X000 Amendment of application requested

St.27 status event code: A-2-2-P10-P11-nap-X000

P13 Application amended

Free format text: ST27 STATUS EVENT CODE: A-2-2-P10-P13-NAP-X000 (AS PROVIDED BY THE NATIONAL OFFICE)

P13-X000 Application amended

St.27 status event code: A-2-2-P10-P13-nap-X000