BR112023019163A2 - Adaptive use of video models for holistic video understanding

Adaptive use of video models for holistic video understanding

Info

Publication number
BR112023019163A2
Authority
BR
Brazil
Prior art keywords
video
holistic
understanding
machine learning
learning model
Prior art date
Application number
BR112023019163A
Other languages
English (en)
Inventor
Amir Ghodrati
Amirhossein Habibian
Ben Yahia Haitam
Mihir JAIN
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-03-31
Filing date
2022-01-27
Publication date
2023-10-17
Application filed by Qualcomm Inc
Publication of BR112023019163A2

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/042 Knowledge-based neural networks; Logical representations of neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/7715 Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/87 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using selection of the recognition techniques, e.g. of a classifier in a multiple classifier system
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/57 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for processing of video signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Signal Processing (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

Adaptive use of video models for holistic video understanding. Systems and techniques are provided for performing holistic video understanding. For example, a process can include obtaining a first video and determining, using a machine learning model decision engine, a first machine learning model from a set of machine learning models to use for processing at least a portion of the first video. The first machine learning model can be determined based on one or more characteristics of at least the portion of the first video. The process can include processing at least the portion of the first video using the first machine learning model.
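The selection step described in the abstract can be pictured with a minimal Python sketch. Everything named here is a hypothetical stand-in rather than the patented method: `motion_score` is an assumed characteristic of the video portion, the two `VideoModel` entries are placeholder functions instead of trained networks, and `DecisionEngine` uses a fixed threshold where the abstract calls for a machine learning model decision engine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Frame = List[float]     # one frame, flattened to pixel intensities (assumption)
Clip = Sequence[Frame]  # "at least a portion" of a video


def motion_score(clip: Clip) -> float:
    """Hypothetical characteristic: mean absolute change between consecutive frames."""
    diffs = [
        abs(a - b)
        for prev, cur in zip(clip, clip[1:])
        for a, b in zip(prev, cur)
    ]
    return sum(diffs) / len(diffs) if diffs else 0.0


@dataclass(frozen=True)
class VideoModel:
    name: str
    run: Callable[[Clip], str]


def run_per_frame(clip: Clip) -> str:
    # Stand-in for a cheap model that inspects a single frame.
    return f"per-frame result from {len(clip[0])} pixels"


def run_temporal(clip: Clip) -> str:
    # Stand-in for a costlier model that consumes every frame.
    return f"temporal result from {len(clip)} frames"


# The "set of machine learning models" the engine chooses from (placeholders).
MODELS: Dict[str, VideoModel] = {
    "per_frame": VideoModel("per_frame", run_per_frame),
    "temporal": VideoModel("temporal", run_temporal),
}


class DecisionEngine:
    """Picks one model from the set based on a characteristic of the clip.

    Assumption: a fixed threshold replaces the learned decision engine.
    """

    def __init__(self, models: Dict[str, VideoModel], threshold: float) -> None:
        self.models = models
        self.threshold = threshold

    def select(self, clip: Clip) -> VideoModel:
        key = "temporal" if motion_score(clip) > self.threshold else "per_frame"
        return self.models[key]


def process(clip: Clip, engine: DecisionEngine) -> str:
    # Obtain the clip, let the engine choose a model, then run only that model.
    model = engine.select(clip)
    return f"{model.name}: {model.run(clip)}"


if __name__ == "__main__":
    engine = DecisionEngine(MODELS, threshold=0.1)
    still = [[0.5, 0.5], [0.5, 0.5]]
    moving = [[0.0, 0.0], [1.0, 1.0]]
    print(process(still, engine))
    print(process(moving, engine))
```

The point of the structure is that only the selected model runs on a given portion, so a static portion never pays for the costlier model; a learned selector could replace `DecisionEngine.select` without changing `process`.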
Application BR112023019163A, priority date 2021-03-31, filing date 2022-01-27: Adaptive use of video models for holistic video understanding, published as BR112023019163A2 (pt)

Applications Claiming Priority (2)

Application Number | Priority Date | Filing Date | Title
US17/219,460 (US11842540B2, en) | 2021-03-31 | 2021-03-31 | Adaptive use of video models for holistic video understanding
PCT/US2022/014137 (WO2022211891A1, en) | 2021-03-31 | 2022-01-27 | Adaptive use of video models for holistic video understanding

Publications (1)

Publication Number | Publication Date
BR112023019163A2 (pt) | 2023-10-17

Family

ID=80447671

Family Applications (1)

Application Number | Priority Date | Filing Date | Title
BR112023019163A (BR112023019163A2, pt) | 2021-03-31 | 2022-01-27 | Adaptive use of video models for holistic video understanding

Country Status (8)

Country | Link
US (1) | US11842540B2 (pt)
EP (1) | EP4315273A1 (pt)
JP (1) | JP2024511932A (pt)
KR (1) | KR20230163382A (pt)
CN (1) | CN116997938A (pt)
BR (1) | BR112023019163A2 (pt)
TW (1) | TW202240451A (pt)
WO (1) | WO2022211891A1 (pt)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
JP6800453B1 (ja) * | 2020-05-07 | 2020-12-16 | 株式会社 情報システムエンジニアリング | Information processing apparatus and information processing method
US20220414482A1 (en) * | 2021-06-29 | 2022-12-29 | Sap Se | Visual question answering with knowledge graphs
US20240211802A1 (en) * | 2022-12-22 | 2024-06-27 | Lumana Inc. | Hybrid machine learning architecture for visual content processing and uses thereof

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN102982351A (zh) * | 2012-11-15 | 2013-03-20 | 河北省电力公司电力科学研究院 | Method for classifying vibro-acoustic detection data of porcelain insulators based on a BP neural network
KR20180057409A (ko) | 2016-11-22 | 2018-05-30 | 박진수 | Audio-signal-based video classification method and video classification apparatus
WO2018211602A1 (ja) * | 2017-05-16 | 2018-11-22 | 株式会社ソニー・インタラクティブエンタテインメント | Learning device, estimation device, learning method, and program
US10796152B2 (en) * | 2018-09-21 | 2020-10-06 | Ancestry.Com Operations Inc. | Ventral-dorsal neural networks: object detection via selective attention
US20200202167A1 (en) * | 2018-12-20 | 2020-06-25 | Here Global B.V. | Dynamically loaded neural network models
US11663815B2 (en) * | 2020-03-27 | 2023-05-30 | Infosys Limited | System and method for inspection of heat recovery steam generator

Also Published As

Publication number | Publication date
US20220318553A1 (en) | 2022-10-06
EP4315273A1 (en) | 2024-02-07
CN116997938A (zh) | 2023-11-03
KR20230163382A (ko) | 2023-11-30
WO2022211891A1 (en) | 2022-10-06
TW202240451A (zh) | 2022-10-16
JP2024511932A (ja) | 2024-03-18
US11842540B2 (en) | 2023-12-12

Similar Documents

Publication | Title
BR112023019163A2 (pt) | Adaptive use of video models for holistic video understanding
BR112023018094A2 (pt) | Keypoint-based sampling for pose estimation
EP4322533A3 (en) | Checking order of motion candidates in LUT
BR112017022028A2 (pt) | Automatic extraction of commitments and requests from communications and content
BRPI0414332A (pt) | Methods and systems for improving a search ranking using related queries
BR112021009717A2 (pt) | Method and system for providing multidimensional human resource allocation recommendations
GB2580805A | Training data update
BR112018077294A2 (pt) | Systems and methods for identifying matching content
BR112018077322A2 (pt) | Systems and methods for identifying matching content
BRPI0507021A (pt) | Methods of ranking associated members in a social network and computer-readable medium
BR112017019015A2 (pt) | System that facilitates the use of user-entered keywords to search for related clinical concepts, and method for facilitating the use of user-entered keywords to search for related clinical concepts
BR112017021986A2 (pt) | System and method for extracting and sharing application-related user data
BR112022025655A2 (pt) | Accessing media using scene descriptions
BR112022000477A2 (pt) | Method for reducing a data volume of a data stream in a surgical robotic system, data volume modification system for a surgical robotic system, surgical robotic system, and non-transitory computer-readable storage medium
WO2019118469A3 | Methods and systems for management of media content associated with message context on mobile computing devices
BR112022000466A2 (pt) | Acoustic echo cancellation control for distributed audio devices
BR112022011338A2 (pt) | Classification models for analyzing a sample
BR112018008737A8 (pt) | Method for providing filter modules, computer program product, and apparatus for process management
Ethridge et al. | Ex-offenders in rural settings seeking employment.
WO2022090178A3 | Partitioned template matching and symbolic peephole optimization
CN104320460A (zh) | A big data processing method
BRPI0609760C1 (pt) | Method of analyzing virus particles
BR112022000176A2 (pt) | Method and apparatus for performing deblurring using deep learning
BR112024001198A2 (pt) | Model training using federated learning
BR112022004014A2 (pt) | Automatic preprocessing for black-box translation