CN118786423A - On-device artificial intelligence video search - Google Patents

On-device artificial intelligence video search

Info

Publication number
CN118786423A
Authority
CN
China
Prior art keywords
video
search query
ann
mobile device
representation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202380023890.5A
Other languages
English (en)
Chinese (zh)
Inventor
S·D·帕特尔
P·A·布德瓦尼
S·C·纳迪帕里
S·孔达帕蒂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2022-03-03
Filing date
2023-02-16
Publication date
2024-10-15
Application filed by Qualcomm Inc
Publication of CN118786423A

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73 Querying
    • G06F16/732 Query formulation
    • G06F16/7343 Query language or query format
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73 Querying
    • G06F16/738 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/74 Browsing; Visualisation therefor
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7844 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using original textual content or text extracted from visual content or transcript of audio data
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Library & Information Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Image Analysis (AREA)
  • Acoustics & Sound (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
CN202380023890.5A 2022-03-03 2023-02-16 On-device artificial intelligence video search Pending CN118786423A (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
IN202241011422 2022-03-03
IN202241011422 2022-03-03
PCT/US2023/013252 WO2023167791A1 (en) 2022-03-03 2023-02-16 On-device artificial intelligence video search

Publications (1)

Publication Number Publication Date
CN118786423A (zh) 2024-10-15

Family

ID=85641112

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202380023890.5A Pending CN118786423A (zh) On-device artificial intelligence video search 2022-03-03 2023-02-16

Country Status (6)

Country Link
US (1) US20250036681A1
EP (1) EP4487223A1
JP (1) JP2025512659A
KR (1) KR20240153975A
CN (1) CN118786423A
WO (1) WO2023167791A1

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20250291845A1 (en) * 2024-03-18 2025-09-18 Rishi Kumar Artificial intelligence assisted streaming video scene selection

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6271892B1 (en) * 1994-06-02 2001-08-07 Lucent Technologies Inc. Method and apparatus for compressing a sequence of information-bearing frames having at least two media
US9785639B2 (en) * 2012-04-27 2017-10-10 Mobitv, Inc. Search-based navigation of media content
US10691737B2 (en) * 2013-02-05 2020-06-23 Intel Corporation Content summarization and/or recommendation apparatus and method
US10331661B2 (en) * 2013-10-23 2019-06-25 At&T Intellectual Property I, L.P. Video content search using captioning data
US20170083623A1 (en) * 2015-09-21 2017-03-23 Qualcomm Incorporated Semantic multisensory embeddings for video search by text
US10678854B1 (en) * 2016-03-11 2020-06-09 Amazon Technologies, Inc. Approximate string matching in search queries to locate quotes
US10963702B1 (en) * 2019-09-10 2021-03-30 Huawei Technologies Co., Ltd. Method and system for video segmentation
US11238093B2 (en) * 2019-10-15 2022-02-01 Adobe Inc. Video retrieval based on encoding temporal relationships among video frames
US11302361B2 (en) * 2019-12-23 2022-04-12 Samsung Electronics Co., Ltd. Apparatus for video searching using multi-modal criteria and method thereof
KR20220167056A (ko) * 2021-06-11 2022-12-20 NCSOFT Corporation Method and apparatus for training a neural network to search for a segment within a video

Also Published As

Publication number Publication date
US20250036681A1 (en) 2025-01-30
WO2023167791A1 (en) 2023-09-07
KR20240153975A (ko) 2024-10-24
JP2025512659A (ja) 2025-04-22
EP4487223A1 (en) 2025-01-08

Similar Documents

Publication Publication Date Title
US10275719B2 (en) Hyper-parameter selection for deep convolutional networks
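CN107430703A (zh) Sequential image sampling and storage of fine-tuned features
JP7817999B2 (ja) Personalized neural network pruning
CN118355396A (zh) Trust-region-aware neural network architecture search for knowledge distillation
CN115053265B (zh) Context-driven learning of human-object interactions
CN116472560A (zh) Utterance-constrained tracking of visual objects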
US20240303497A1 (en) Robust test-time adaptation without error accumulation
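CN120409657B (zh) Method and system for constructing a person knowledge graph driven by a multimodal large model
CN120917459A (zh) Dynamic class-incremental learning without forgetting
CN118786423A (zh) On-device artificial intelligence video search
TW202520125A (zh) Hardware-aware efficient architecture for text-to-image diffusion models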
US20250124265A1 (en) Practical activation range restriction for neural network quantization
TW202520130A (zh) Conformal prediction for upper-bounding intrinsic aleatoric uncertainty
TW202449656A (zh) Using grounded rationales to improve visual reasoning
WO2024238024A1 (en) Using grounded rationales to improve visual reasoning
KR20250065594A (ko) Meta-pre-training with augmentations to generalize neural network processing for domain adaptation
US20250278629A1 (en) Efficient attention using soft masking and soft channel pruning
WO2025111916A1 (en) Accelerating prompt inferencing of large language models
WO2025107137A1 (en) Pipeline for accelerating first token generation of large language models
US20260087596A1 (en) Selective adaptation in generative machine learning models for enhancing domain alignment
WO2025054890A1 (en) On-device unified inference-training pipeline of hybrid precision forward-backward propagation by heterogeneous floating point graphics processing unit (gpu) and fixed point digital signal processor (dsp)
WO2025080325A1 (en) Practical activation range restriction for neural network quantization
WO2024186380A1 (en) Robust test-time adaptation without error accumulation
TW202512021A (zh) Efficient adapter-based context switching in an artificial intelligence (AI) acceleration device
WO2025170664A1 (en) Temporally consistent and semantics guided text-based video editing generative artificial intelligence (ai) model with improved initialization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination