KR20240153975A - On-device artificial intelligence video search - Google Patents

On-device artificial intelligence video search

Info

Publication number
KR20240153975A
Authority
KR
South Korea
Prior art keywords
video
ann
mobile device
search term
representation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
KR1020247026108A
Other languages
English (en)
Korean (ko)
Inventor
슈브함 디팍 파텔
파완 아수다람 부드화니
샤라스 찬드라 나디팔리
사이쿠마르 콘다파르티
Original Assignee
Qualcomm Incorporated
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qualcomm Incorporated
Publication of KR20240153975A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73 Querying
    • G06F16/732 Query formulation
    • G06F16/7343 Query language or query format
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73 Querying
    • G06F16/738 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/74 Browsing; Visualisation therefor
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7844 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using original textual content or text extracted from visual content or transcript of audio data
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Library & Information Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Image Analysis (AREA)
  • Acoustics & Sound (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
KR1020247026108A (priority date 2022-03-03, filing date 2023-02-16) On-device artificial intelligence video search, Pending, KR20240153975A (ko)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
IN202241011422 2022-03-03
IN202241011422 2022-03-03
PCT/US2023/013252 WO2023167791A1 (en) 2022-03-03 2023-02-16 On-device artificial intelligence video search

Publications (1)

Publication Number Publication Date
KR20240153975A (ko) 2024-10-24

Family ID: 85641112

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020247026108A Pending KR20240153975A (ko) 2022-03-03 2023-02-16 On-device artificial intelligence video search

Country Status (6)

Country Link
US (1) US20250036681A1
EP (1) EP4487223A1
JP (1) JP2025512659A
KR (1) KR20240153975A
CN (1) CN118786423A
WO (1) WO2023167791A1

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20250291845A1 (en) * 2024-03-18 2025-09-18 Rishi Kumar Artificial intelligence assisted streaming video scene selection

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6271892B1 (en) * 1994-06-02 2001-08-07 Lucent Technologies Inc. Method and apparatus for compressing a sequence of information-bearing frames having at least two media
US9785639B2 (en) * 2012-04-27 2017-10-10 Mobitv, Inc. Search-based navigation of media content
US10691737B2 (en) * 2013-02-05 2020-06-23 Intel Corporation Content summarization and/or recommendation apparatus and method
US10331661B2 (en) * 2013-10-23 2019-06-25 At&T Intellectual Property I, L.P. Video content search using captioning data
US20170083623A1 (en) * 2015-09-21 2017-03-23 Qualcomm Incorporated Semantic multisensory embeddings for video search by text
US10678854B1 (en) * 2016-03-11 2020-06-09 Amazon Technologies, Inc. Approximate string matching in search queries to locate quotes
US10963702B1 (en) * 2019-09-10 2021-03-30 Huawei Technologies Co., Ltd. Method and system for video segmentation
US11238093B2 (en) * 2019-10-15 2022-02-01 Adobe Inc. Video retrieval based on encoding temporal relationships among video frames
US11302361B2 (en) * 2019-12-23 2022-04-12 Samsung Electronics Co., Ltd. Apparatus for video searching using multi-modal criteria and method thereof
KR20220167056A (ko) * 2021-06-11 2022-12-20 NCSOFT Corporation Method and apparatus for training a neural network for searching a segment within a video

Also Published As

Publication number Publication date
US20250036681A1 (en) 2025-01-30
WO2023167791A1 (en) 2023-09-07
CN118786423A (zh) 2024-10-15
JP2025512659A (ja) 2025-04-22
EP4487223A1 (en) 2025-01-08

Similar Documents

Publication Publication Date Title
TWI795447B (zh) Video action localization based on attention proposals
US20210005183A1 (en) Orthogonally constrained multi-head attention for speech tasks
JP7817999B2 (ja) Personalized neural network pruning
CN107430703A (zh) Sequential image sampling and storage of fine-tuned features
US20190108400A1 (en) Actor-deformation-invariant action proposals
US12249138B2 (en) Context-driven learning of human-object interactions
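CN113870863A (zh) Voiceprint recognition method and apparatus, storage medium, and electronic device
CN116472560A (zh) Utterance-constrained tracking of visual objects
CN120813950A (zh) Robust test-time adaptation without error accumulation
CN120409657B (zh) Method and system for constructing a person knowledge graph driven by a multimodal large model
JP7806073B2 (ja) Efficient test-time adaptation for improved temporal consistency in video processing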
US20250036681A1 (en) On-device artificial intelligence video search
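CN120917459A (zh) Dynamic class-incremental learning without forgetting
TW202520125A (zh) Hardware-aware efficient architecture for text-to-image diffusion models
KR20260012201A (ko) Using grounded rationales to improve visual reasoning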
US12307214B2 (en) Hybrid language translation on mobile devices
WO2024238024A1 (en) Using grounded rationales to improve visual reasoning
KR20240116711A (ko) Flow-agnostic neural video compression
US20250278629A1 (en) Efficient attention using soft masking and soft channel pruning
WO2025111916A1 (en) Accelerating prompt inferencing of large language models
WO2025107137A1 (en) Pipeline for accelerating first token generation of large language models
US20250252627A1 (en) Temporally consistent and semantics guided text-based video editing generative artificial intelligence (ai) model with improved initialization
US20240005158A1 (en) Model performance linter
WO2025159835A1 (en) Selective parameter-efficient fine-tuning for large-scale models
CN120689792A (zh) Emotion prediction method, apparatus, device, storage medium, and computer program product

Legal Events

Date Code Title Description
PA0105 International application
Patent event date: 20240802
Patent event code: PA01051R01D
Comment text: International Patent Application

PG1501 Laying open of application