JP2025518664A - Video frame action detection using gated history - Google Patents

Video frame action detection using gated history

Info

Publication number
JP2025518664A
Authority
JP
Japan
Prior art keywords
past
video frames
frames
video
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP2024564988A
Other languages
English (en)
Japanese (ja)
Other versions
JP2025518664A5
Inventor
Mittal, Gaurav
Yu, Ye
Chen, Mei
Chen, Junwen
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Technology Licensing LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Technology Licensing LLC
Priority claimed from PCT/US2023/018777 (WO2023235058A1)
Publication of JP2025518664A
Publication of JP2025518664A5

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23418 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V20/42 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V20/43 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of news video content
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/44 Event detection
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/23 Recognition of whole body movements, e.g. for sport training
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G06T2207/10021 Stereoscopic video; Stereoscopic image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Medical Informatics (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Human Computer Interaction (AREA)
  • Social Psychology (AREA)
  • Psychiatry (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
JP2024564988A 2022-06-03 2023-04-17 Video frame action detection using gated history Pending JP2025518664A (ja)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US202263348993P 2022-06-03 2022-06-03
US63/348,993 2022-06-03
US17/852,310 2022-06-28
US17/852,310 US11895343B2 (en) 2022-06-03 2022-06-28 Video frame action detection using gated history
PCT/US2023/018777 WO2023235058A1 (en) 2022-06-03 2023-04-17 Video frame action detection using gated history

Publications (2)

Publication Number Publication Date
JP2025518664A 2025-06-19
JP2025518664A5 2026-04-08

Family

ID=88976208

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2024564988A Pending JP2025518664A (ja) 2022-06-03 2023-04-17 Video frame action detection using gated history

Country Status (4)

Country Link
US (2) US11895343B2
EP (1) EP4533415A1
JP (1) JP2025518664A
KR (1) KR20250018507A

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4266260B1 (en) * 2022-04-20 2026-01-07 Axis AB Encoding of training data for training of a neural network
US12299914B2 (en) * 2022-09-06 2025-05-13 Toyota Research Institute, Inc. Self-supervised training from a teacher network for cost volume based depth estimates
US12187279B2 (en) * 2022-09-29 2025-01-07 Toyota Motor Engineering & Manufacturing North America, Inc. Method and system for personalized car following with transformers and RNNs
US20250029410A1 (en) * 2023-07-19 2025-01-23 University Of Central Florida Research Foundation, Inc. Active Sparse Labeling of Video Frames
US20250077893A1 (en) * 2023-09-01 2025-03-06 Royal Bank Of Canada Meta temporal point processes
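CN118657773B (zh) * 2024-08-20 2024-10-25 浙江啄云智能科技有限公司 Contraband detection and model training method, apparatus, device, medium, and product
CN120953899B (zh) * 2025-10-16 2025-12-26 中国人民解放军海军工程大学 Moving target recognition method based on bidirectional cross-attention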

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20220051717A (ko) * 2020-10-19 2022-04-26 한국전자통신연구원 System and method for self-supervised-learning-based segmentation and tracking using image patterns in video

Also Published As

Publication number Publication date
KR20250018507A (ko) 2025-02-06
EP4533415A1 (en) 2025-04-09
US20240244279A1 (en) 2024-07-18
US12192543B2 (en) 2025-01-07
US11895343B2 (en) 2024-02-06
US20230396817A1 (en) 2023-12-07

Similar Documents

Publication Publication Date Title
JP2025518664A (ja) Video frame action detection using gated history
JP7147078B2 (ja) Video frame information labeling method, apparatus, device, and computer program
US11557085B2 (en) Neural network processing for multi-object 3D modeling
JP7556161B2 (ja) Image capture in extended reality environments
US11182592B2 (en) Target object recognition method and apparatus, storage medium, and electronic device
US20200285859A1 (en) Video summary generation method and apparatus, electronic device, and computer storage medium
CN112149615B (zh) Face liveness detection method, apparatus, medium, and electronic device
US20180121733A1 (en) Reducing computational overhead via predictions of subjective quality of automated image sequence processing
CN110622176A (zh) Video partitioning
CN107578017A (zh) Method and apparatus for generating images
WO2022104026A1 (en) Consistency measure for image segmentation processes
US12217474B2 (en) Detection of moment of perception
US11099396B2 (en) Depth map re-projection based on image and pose changes
US20230063229A1 (en) Spoof detection based on challenge response analysis
CN113379877B (zh) Face video generation method, apparatus, electronic device, and storage medium
WO2018005565A1 (en) Automated selection of subjectively best images from burst captured image sequences
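CN109271929B (zh) Detection method and apparatus
CN110688874A (zh) Facial expression recognition method and apparatus, readable storage medium, and electronic device
CN114120456A (zh) Learning concentration detection method, computer device, and readable medium
CN121000952A (zh) Video generation method, apparatus, electronic device, storage medium, and program product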
US12100244B2 (en) Semi-supervised action-actor detection from tracking data in sport
CN112434629B (zh) Online temporal action detection method and device
CN113486717A (zh) Behavior recognition method and apparatus
WO2023235058A1 (en) Video frame action detection using gated history
CN117455948A (zh) Multi-view pedestrian trajectory extraction and analysis method based on a deep learning algorithm

Legal Events

Date Code Title Description
RD03 Notification of appointment of power of attorney

Free format text: JAPANESE INTERMEDIATE CODE: A7423

Effective date: 20250602

RD04 Notification of resignation of power of attorney

Free format text: JAPANESE INTERMEDIATE CODE: A7424

Effective date: 20250603

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20260330

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20260330