KR20240153975A - On-device artificial intelligence video search - Google Patents
On-device artificial intelligence video search
- Publication number
- KR20240153975A
- Authority
- KR
- South Korea
- Prior art keywords
- video
- ann
- mobile device
- search term
- representation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/73—Querying
- G06F16/732—Query formulation
- G06F16/7343—Query language or query format
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/73—Querying
- G06F16/738—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/74—Browsing; Visualisation therefor
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7844—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using original textual content or text extracted from visual content or transcript of audio data
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/167—Audio in a user interface, e.g. using voice commands for navigating, audio feedback
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Multimedia (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Library & Information Science (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- General Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Image Analysis (AREA)
- Acoustics & Sound (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| IN202241011422 | 2022-03-03 | | |
| IN202241011422 | 2022-03-03 | | |
| PCT/US2023/013252 WO2023167791A1 (en) | 2022-03-03 | 2023-02-16 | On-device artificial intelligence video search |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| KR20240153975A (ko) | 2024-10-24 |
Family
ID=85641112
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| KR1020247026108A (Pending; published as KR20240153975A (ko)) | On-device artificial intelligence video search | 2022-03-03 | 2023-02-16 |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US20250036681A1 |
| EP (1) | EP4487223A1 |
| JP (1) | JP2025512659A |
| KR (1) | KR20240153975A |
| CN (1) | CN118786423A |
| WO (1) | WO2023167791A1 |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20250291845A1 (en) * | 2024-03-18 | 2025-09-18 | Rishi Kumar | Artificial intelligence assisted streaming video scene selection |
Family Cites Families (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6271892B1 (en) * | 1994-06-02 | 2001-08-07 | Lucent Technologies Inc. | Method and apparatus for compressing a sequence of information-bearing frames having at least two media |
| US9785639B2 (en) * | 2012-04-27 | 2017-10-10 | Mobitv, Inc. | Search-based navigation of media content |
| US10691737B2 (en) * | 2013-02-05 | 2020-06-23 | Intel Corporation | Content summarization and/or recommendation apparatus and method |
| US10331661B2 (en) * | 2013-10-23 | 2019-06-25 | At&T Intellectual Property I, L.P. | Video content search using captioning data |
| US20170083623A1 (en) * | 2015-09-21 | 2017-03-23 | Qualcomm Incorporated | Semantic multisensory embeddings for video search by text |
| US10678854B1 (en) * | 2016-03-11 | 2020-06-09 | Amazon Technologies, Inc. | Approximate string matching in search queries to locate quotes |
| US10963702B1 (en) * | 2019-09-10 | 2021-03-30 | Huawei Technologies Co., Ltd. | Method and system for video segmentation |
| US11238093B2 (en) * | 2019-10-15 | 2022-02-01 | Adobe Inc. | Video retrieval based on encoding temporal relationships among video frames |
| US11302361B2 (en) * | 2019-12-23 | 2022-04-12 | Samsung Electronics Co., Ltd. | Apparatus for video searching using multi-modal criteria and method thereof |
| KR20220167056A (ko) * | 2021-06-11 | 2022-12-20 | NCSOFT Corporation (주식회사 엔씨소프트) | Method and apparatus for training a neural network for searching for a segment within a video |
-
2023
- 2023-02-16 CN CN202380023890.5A patent/CN118786423A/zh active Pending
- 2023-02-16 US US18/714,516 patent/US20250036681A1/en active Pending
- 2023-02-16 KR KR1020247026108A patent/KR20240153975A/ko active Pending
- 2023-02-16 EP EP23711263.6A patent/EP4487223A1/en active Pending
- 2023-02-16 JP JP2024547596A patent/JP2025512659A/ja active Pending
- 2023-02-16 WO PCT/US2023/013252 patent/WO2023167791A1/en not_active Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| US20250036681A1 (en) | 2025-01-30 |
| WO2023167791A1 (en) | 2023-09-07 |
| CN118786423A (zh) | 2024-10-15 |
| JP2025512659A (ja) | 2025-04-22 |
| EP4487223A1 (en) | 2025-01-08 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| TWI795447B (zh) | Video action localization based on attention proposals | |
| US20210005183A1 (en) | Orthogonally constrained multi-head attention for speech tasks | |
| JP7817999B2 (ja) | Personalized neural network pruning | |
| CN107430703A (zh) | Sequential image sampling and storage of fine-tuned features | |
| US20190108400A1 (en) | Actor-deformation-invariant action proposals | |
| US12249138B2 (en) | Context-driven learning of human-object interactions | |
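| CN113870863A (zh) | Voiceprint recognition method and apparatus, storage medium, and electronic device | |
| CN116472560A (zh) | Utterance-constrained tracking of visual objects | |
| CN120813950A (zh) | Robust test-time adaptation without error accumulation | |
| CN120409657B (zh) | Method and system for constructing a person knowledge graph driven by a multimodal large model | |
| JP7806073B2 (ja) | Efficient test-time adaptation for improved temporal consistency in video processing | |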
| US20250036681A1 (en) | On-device artificial intelligence video search | |
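| CN120917459A (zh) | Dynamic class-incremental learning without forgetting | |
| TW202520125A (zh) | Hardware-aware efficient architecture for text-to-image diffusion models | |
| KR20260012201A (ko) | Using grounded rationales to improve visual reasoning | |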
| US12307214B2 (en) | Hybrid language translation on mobile devices | |
| WO2024238024A1 (en) | Using grounded rationales to improve visual reasoning | |
| KR20240116711A (ko) | Flow-agnostic neural video compression | |
| US20250278629A1 (en) | Efficient attention using soft masking and soft channel pruning | |
| WO2025111916A1 (en) | Accelerating prompt inferencing of large language models | |
| WO2025107137A1 (en) | Pipeline for accelerating first token generation of large language models | |
| US20250252627A1 (en) | Temporally consistent and semantics guided text-based video editing generative artificial intelligence (ai) model with improved initialization | |
| US20240005158A1 (en) | Model performance linter | |
| WO2025159835A1 (en) | Selective parameter-efficient fine-tuning for large-scale models | |
| CN120689792A (zh) | Emotion prediction method, apparatus, device, storage medium, and computer program product |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PA0105 | International application | Patent event date: 20240802; Patent event code: PA01051R01D; Comment text: International Patent Application |
| | PG1501 | Laying open of application | |