KR100883657B1 - 음성 인식 기반의 음악 검색 방법 및 장치 - Google Patents
음성 인식 기반의 음악 검색 방법 및 장치 Download PDFInfo
- Publication number
- KR100883657B1 KR100883657B1 KR1020070008583A KR20070008583A KR100883657B1 KR 100883657 B1 KR100883657 B1 KR 100883657B1 KR 1020070008583 A KR1020070008583 A KR 1020070008583A KR 20070008583 A KR20070008583 A KR 20070008583A KR 100883657 B1 KR100883657 B1 KR 100883657B1
- Authority
- KR
- South Korea
- Prior art keywords
- music
- preference
- model
- search
- user
- Prior art date
Links
- 238000000034 method Methods 0.000 title claims abstract description 31
- 230000000694 effects Effects 0.000 claims abstract description 8
- 239000013598 vector Substances 0.000 claims description 21
- 238000004364 calculation method Methods 0.000 claims description 7
- 239000000203 mixture Substances 0.000 claims description 4
- 230000001755 vocal effect Effects 0.000 claims description 3
- 238000010586 diagram Methods 0.000 description 11
- 239000000284 extract Substances 0.000 description 3
- 238000000605 extraction Methods 0.000 description 3
- 230000008901 benefit Effects 0.000 description 2
- 230000008859 change Effects 0.000 description 2
- 230000006872 improvement Effects 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- MQJKPEGWNLWLTK-UHFFFAOYSA-N Dapsone Chemical compound C1=CC(N)=CC=C1S(=O)(=O)C1=CC=C(N)C=C1 MQJKPEGWNLWLTK-UHFFFAOYSA-N 0.000 description 1
- 230000001174 ascending effect Effects 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 230000001186 cumulative effect Effects 0.000 description 1
- 238000013500 data storage Methods 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 238000002955 isolation Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 229920001690 polydopamine Polymers 0.000 description 1
- 238000007781 pre-processing Methods 0.000 description 1
- 230000008569 process Effects 0.000 description 1
- 230000009467 reduction Effects 0.000 description 1
- 230000003595 spectral effect Effects 0.000 description 1
- 238000009827 uniform distribution Methods 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/63—Querying
- G06F16/632—Query formulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/63—Querying
- G06F16/635—Filtering based on additional data, e.g. user or group profiles
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/68—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Library & Information Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
곡명 | 로그 우도(log likelihood) |
사랑은 얄미운 나비인가봐 | -9732 |
사랑의 불시착 | -9732 |
사랑2 | -9732 |
사랑의 미로 | -9732 |
별, 바람, 햇살, 그리고 사랑 | -9732 |
사라 | -9747 |
.... | .... |
곡명 | 선호도 기반 스코어 |
별, 바람, 햇살 그리고 사랑 | -12522 |
사랑2 | -12524 |
사라 | -12525 |
사랑의 미로 | -12527 |
사랑의 불시착 | -12533 |
... | ... |
Claims (20)
- 삭제
- 음성 인식으로 음악을 검색하는 방법에 있어서,(a) 음향 모델을 이용하여 음성 입력에 대한 검색 스코어를 계산하는 단계;(b) 사용자 선호 모델을 이용하여 음악에 대한 선호도를 계산하고, 상기 계산된 선호도를 상기 검색 스코어에 반영하여 계산하는 단계; 및(c) 상기 선호도가 반영된 검색 스코어에 따라 음악 리스트를 추출하는 단계를 포함하며,상기 (b) 단계는,상기 검색 스코어와 상기 선호도를 선형 결합하여 상기 선호도가 반영된 검색 스코어를 계산하는 것을 특징으로 하는 음성 인식 기반의 음악 검색 방법.
- 제 2 항에 있어서,상기 (a) 단계는,상기 입력 음성의 품질을 모델링하여 저장한 월드 모델을 이용하여 상기 검색 스코어에 반영하는 정도를 계산하는 단계를 더 포함하는 것을 특징으로 하는 음성 인식 기반의 음악 검색 방법.
- 제 3 항에 있어서,상기 월드 모델은 상기 입력 음성의 품질에 대한 가우시안 혼합 모델(GMM)인 것을 특징으로 하는 음성 인식 기반의 음악 검색 방법.
- 제 2 항에 있어서,상기 (a) 단계는,상기 음향 모델의 모노폰(monophone)에 대해서 우도(likelihood)를 계산하여 상기 검색 스코어에 반영하는 정도를 계산하는 단계를 더 포함하는 것을 특징으로 하는 음성 인식 기반의 음악 검색 방법.
- 제 2 항에 있어서,상기 (a) 단계는,입력 음성의 프레임 개수로 정규화하여 상기 검색 스코어를 계산하는 것을 특징으로 하는 음성 인식 기반의 음악 검색 방법.
- 제 2 항에 있어서,상기 (b) 단계는,상기 선호도를 상기 검색 스코어에 반영하는 정도를 조절하는 것을 특징으로 하는 음성 인식 기반의 음악 검색 방법.
- 제 2 항에 있어서,상기 (b) 단계는,다음 수학식 20에 의해 상기 선호도가 반영된 검색 스코어를 계산하는 것을 특징으로 하는 음성 인식 기반의 음악 검색 방법.[수학식 20](여기서, W는 특정 음악이고, w는 W의 부분 명칭 단어이고, x는 음성 입력이고, λw 는 부분 명칭 단어 w에 대한 음향 모델이고, U는 사용자 선호 모델이고, Nframe은 입력 음성 특징벡터의 길이이고, αuser는 음악 선호도의 반영 정도를 의미하는 상수이고, λphone는 발성환경 변화에 대한 영향을 없애주기 위한 모노폰으로 구성된 음향 모델이다.)
- 제 2 항 내지 제 10 항 중 어느 한 항에 따른 방법을 컴퓨터에서 실행시키기 위한 프로그램을 기록한 컴퓨터로 읽을 수 있는 기록매체.
- 삭제
- 음성 인식으로 음악을 검색하는 장치에 있어서,사용자가 선호하는 음악을 모델링하여 저장하는 사용자 선호 모델; 및음향 모델을 이용하여 음성 입력에 대한 검색 스코어를 계산하고, 상기 사용자 선호 모델을 이용하여 음악에 대한 선호도를 계산하고, 상기 계산된 선호도를 상기 검색 스코어에 반영하여 음악 리스트를 추출하는 검색부를 포함하며,상기 검색부는,상기 음향 모델을 이용하여 음성 입력에 대한 검색 스코어를 계산하는 검색 스코어 계산부; 상기 사용자 선호 모델을 이용하여 음악에 대한 선호도를 계산하는 선호도 계산부; 상기 계산된 선호도를 상기 검색 스코어에 반영하여 계산하는 통합 계산부; 및 상기 선호도가 반영된 검색 스코어에 따라 음악 리스트를 추출하는 추출부를 포함하는 것을 특징으로 하는 음성 인식 기반의 음악 검색 장치.
- 제 13 항에 있어서,상기 입력 음성의 품질을 모델링한 월드 모델을 더 포함하고,상기 검색부는,상기 월드 모델을 이용하여 상기 검색 스코어의 반영 정도를 계산하는 반영도 계산부를 더 포함하는 것을 특징으로 하는 음성 인식 기반의 음악 검색 장치.
- 제 14 항에 있어서,상기 반영도 계산부는,상기 음향 모델의 모노폰(monophone)에 대해서 우도(likelihood)를 계산하여 상기 검색 스코어에 반영하는 정도를 계산하는 것을 특징으로 하는 음성 인식 기반의 음악 검색 장치.
- 제 13 항에 있어서,상기 검색부는,다음 수학식 20에 의해 상기 선호도가 반영된 검색 스코어를 계산하는 것을 특징으로 하는 음성 인식 기반의 음악 검색 장치.[수학식 20](여기서, W는 특정 음악이고, w는 W의 부분 명칭 단어이고, x는 음성 입력이고, λw 는 부분 명칭 단어 w에 대한 음향 모델이고, U는 사용자 선호 모델이고, Nframe은 입력 음성 특징벡터의 길이이고, αuser는 음악 선호도의 반영 정도를 의미하는 상수이고, λphone는 발성환경 변화에 대한 영향을 없애주기 위한 모노폰으로 구성된 음향 모델이다.)
- 삭제
- 특징 추출부, 검색부, 음향 모델, 발음 모델, 언어 모델 및 음악 데이터베이스를 포함하는 음성 인식으로 음악을 검색하는 장치에 있어서,사용자가 선호하는 음악을 모델링한 사용자 선호 모델; 및 입력 음성의 품질을 모델링하여 저장한 월드 모델을 포함하고,상기 검색부는,상기 음향 모델을 이용하여 상기 특징 추출부로부터 입력된 음성 특징 벡터에 대한 검색 스코어를 계산하고, 상기 사용자 선호 모델을 이용하여 상기 음악 데이터베이스에 저장된 음악에 대한 사용자 선호도를 계산하고, 상기 월드 모델을 이용하여 상기 검색 스코어에 반영하는 정도를 계산하고, 상기 계산된 선호도를 상기 검색 스코어에 반영하여 입력된 음성에 매칭되는 음악 리스트를 추출하는 것을 특징으로 하는 음성 인식 기반의 음악 검색 장치.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020070008583A KR100883657B1 (ko) | 2007-01-26 | 2007-01-26 | 음성 인식 기반의 음악 검색 방법 및 장치 |
US11/892,137 US20080249770A1 (en) | 2007-01-26 | 2007-08-20 | Method and apparatus for searching for music based on speech recognition |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020070008583A KR100883657B1 (ko) | 2007-01-26 | 2007-01-26 | 음성 인식 기반의 음악 검색 방법 및 장치 |
Publications (2)
Publication Number | Publication Date |
---|---|
KR20080070445A KR20080070445A (ko) | 2008-07-30 |
KR100883657B1 true KR100883657B1 (ko) | 2009-02-18 |
Family
ID=39823195
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020070008583A KR100883657B1 (ko) | 2007-01-26 | 2007-01-26 | 음성 인식 기반의 음악 검색 방법 및 장치 |
Country Status (2)
Country | Link |
---|---|
US (1) | US20080249770A1 (ko) |
KR (1) | KR100883657B1 (ko) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10403267B2 (en) | 2015-01-16 | 2019-09-03 | Samsung Electronics Co., Ltd | Method and device for performing voice recognition using grammar model |
Families Citing this family (200)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU6630800A (en) | 1999-08-13 | 2001-03-13 | Pixo, Inc. | Methods and apparatuses for display and traversing of links in page character array |
US8645137B2 (en) | 2000-03-16 | 2014-02-04 | Apple Inc. | Fast, language-independent method for user authentication by voice |
ITFI20010199A1 (it) | 2001-10-22 | 2003-04-22 | Riccardo Vieri | Sistema e metodo per trasformare in voce comunicazioni testuali ed inviarle con una connessione internet a qualsiasi apparato telefonico |
US8677377B2 (en) | 2005-09-08 | 2014-03-18 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
US7633076B2 (en) | 2005-09-30 | 2009-12-15 | Apple Inc. | Automated response to and sensing of user activity in portable devices |
US9318108B2 (en) | 2010-01-18 | 2016-04-19 | Apple Inc. | Intelligent automated assistant |
US8977255B2 (en) | 2007-04-03 | 2015-03-10 | Apple Inc. | Method and system for operating a multi-function portable electronic device using voice-activation |
US9053089B2 (en) | 2007-10-02 | 2015-06-09 | Apple Inc. | Part-of-speech tagging using latent analogy |
US8620662B2 (en) | 2007-11-20 | 2013-12-31 | Apple Inc. | Context-aware unit selection |
US10002189B2 (en) | 2007-12-20 | 2018-06-19 | Apple Inc. | Method and apparatus for searching using an active ontology |
US9330720B2 (en) | 2008-01-03 | 2016-05-03 | Apple Inc. | Methods and apparatus for altering audio output signals |
US8065143B2 (en) | 2008-02-22 | 2011-11-22 | Apple Inc. | Providing text input using speech data and non-speech data |
US8996376B2 (en) | 2008-04-05 | 2015-03-31 | Apple Inc. | Intelligent text-to-speech conversion |
US8082148B2 (en) * | 2008-04-24 | 2011-12-20 | Nuance Communications, Inc. | Testing a grammar used in speech recognition for reliability in a plurality of operating environments having different background noise |
US10496753B2 (en) | 2010-01-18 | 2019-12-03 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
US8464150B2 (en) | 2008-06-07 | 2013-06-11 | Apple Inc. | Automatic language identification for dynamic text processing |
US20100030549A1 (en) | 2008-07-31 | 2010-02-04 | Lee Michael M | Mobile device having human language translation capability with positional feedback |
US8768702B2 (en) | 2008-09-05 | 2014-07-01 | Apple Inc. | Multi-tiered voice feedback in an electronic device |
US8898568B2 (en) | 2008-09-09 | 2014-11-25 | Apple Inc. | Audio user interface |
US8712776B2 (en) | 2008-09-29 | 2014-04-29 | Apple Inc. | Systems and methods for selective text to speech synthesis |
US8583418B2 (en) | 2008-09-29 | 2013-11-12 | Apple Inc. | Systems and methods of detecting language and natural language strings for text to speech synthesis |
US8676904B2 (en) | 2008-10-02 | 2014-03-18 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
KR101483307B1 (ko) * | 2008-10-21 | 2015-01-15 | 주식회사 케이티 | 대용량 음성인식을 위한 음성인식 처리 장치 및 그 방법 |
WO2010067118A1 (en) | 2008-12-11 | 2010-06-17 | Novauris Technologies Limited | Speech recognition involving a mobile device |
KR101057191B1 (ko) * | 2008-12-30 | 2011-08-16 | 주식회사 하이닉스반도체 | 반도체 소자의 미세 패턴 형성방법 |
US8862252B2 (en) | 2009-01-30 | 2014-10-14 | Apple Inc. | Audio user interface for displayless electronic device |
US8380507B2 (en) | 2009-03-09 | 2013-02-19 | Apple Inc. | Systems and methods for determining the language to use for speech generated by a text to speech engine |
US10241752B2 (en) | 2011-09-30 | 2019-03-26 | Apple Inc. | Interface for a virtual digital assistant |
US10540976B2 (en) | 2009-06-05 | 2020-01-21 | Apple Inc. | Contextual voice commands |
US9858925B2 (en) | 2009-06-05 | 2018-01-02 | Apple Inc. | Using context information to facilitate processing of commands in a virtual assistant |
US10241644B2 (en) | 2011-06-03 | 2019-03-26 | Apple Inc. | Actionable reminder entries |
US20120309363A1 (en) | 2011-06-03 | 2012-12-06 | Apple Inc. | Triggering notifications associated with tasks items that represent tasks to perform |
US9431006B2 (en) | 2009-07-02 | 2016-08-30 | Apple Inc. | Methods and apparatuses for automatic speech recognition |
TW201104465A (en) * | 2009-07-17 | 2011-02-01 | Aibelive Co Ltd | Voice songs searching method |
US8682649B2 (en) | 2009-11-12 | 2014-03-25 | Apple Inc. | Sentiment prediction from textual data |
US20110131040A1 (en) * | 2009-12-01 | 2011-06-02 | Honda Motor Co., Ltd | Multi-mode speech recognition |
US8600743B2 (en) | 2010-01-06 | 2013-12-03 | Apple Inc. | Noise profile determination for voice-related feature |
US8311838B2 (en) | 2010-01-13 | 2012-11-13 | Apple Inc. | Devices and methods for identifying a prompt corresponding to a voice input in a sequence of prompts |
US8381107B2 (en) | 2010-01-13 | 2013-02-19 | Apple Inc. | Adaptive audio feedback system and method |
US10705794B2 (en) | 2010-01-18 | 2020-07-07 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
US10553209B2 (en) | 2010-01-18 | 2020-02-04 | Apple Inc. | Systems and methods for hands-free notification summaries |
US10276170B2 (en) | 2010-01-18 | 2019-04-30 | Apple Inc. | Intelligent automated assistant |
US10679605B2 (en) | 2010-01-18 | 2020-06-09 | Apple Inc. | Hands-free list-reading by intelligent automated assistant |
US8682667B2 (en) * | 2010-02-25 | 2014-03-25 | Apple Inc. | User profiling for selecting user specific voice input processing information |
US20110231189A1 (en) * | 2010-03-19 | 2011-09-22 | Nuance Communications, Inc. | Methods and apparatus for extracting alternate media titles to facilitate speech recognition |
US8639516B2 (en) | 2010-06-04 | 2014-01-28 | Apple Inc. | User-specific noise suppression for voice quality improvements |
US8713021B2 (en) | 2010-07-07 | 2014-04-29 | Apple Inc. | Unsupervised document clustering using latent semantic density analysis |
US8719006B2 (en) | 2010-08-27 | 2014-05-06 | Apple Inc. | Combined statistical and rule-based part-of-speech tagging for text-to-speech synthesis |
US8719014B2 (en) | 2010-09-27 | 2014-05-06 | Apple Inc. | Electronic device with text error correction based on voice recognition data |
US10762293B2 (en) | 2010-12-22 | 2020-09-01 | Apple Inc. | Using parts-of-speech tagging and named entity recognition for spelling correction |
US10515147B2 (en) | 2010-12-22 | 2019-12-24 | Apple Inc. | Using statistical language models for contextual lookup |
US8781836B2 (en) | 2011-02-22 | 2014-07-15 | Apple Inc. | Hearing assistance system for providing consistent human speech |
US9262612B2 (en) | 2011-03-21 | 2016-02-16 | Apple Inc. | Device access using voice authentication |
US10672399B2 (en) | 2011-06-03 | 2020-06-02 | Apple Inc. | Switching between text data and audio data based on a mapping |
US10057736B2 (en) | 2011-06-03 | 2018-08-21 | Apple Inc. | Active transport based notifications |
US8812294B2 (en) | 2011-06-21 | 2014-08-19 | Apple Inc. | Translating phrases from one language into another using an order-based set of declarative rules |
US8706472B2 (en) | 2011-08-11 | 2014-04-22 | Apple Inc. | Method for disambiguating multiple readings in language conversion |
US8994660B2 (en) | 2011-08-29 | 2015-03-31 | Apple Inc. | Text correction processing |
US8762156B2 (en) | 2011-09-28 | 2014-06-24 | Apple Inc. | Speech recognition repair using contextual information |
US9715581B1 (en) * | 2011-11-04 | 2017-07-25 | Christopher Estes | Digital media reproduction and licensing |
US10134385B2 (en) | 2012-03-02 | 2018-11-20 | Apple Inc. | Systems and methods for name pronunciation |
US9483461B2 (en) | 2012-03-06 | 2016-11-01 | Apple Inc. | Handling speech synthesis of content for multiple languages |
US9280610B2 (en) | 2012-05-14 | 2016-03-08 | Apple Inc. | Crowd sourcing information to fulfill user requests |
US10417037B2 (en) | 2012-05-15 | 2019-09-17 | Apple Inc. | Systems and methods for integrating third party services with a digital assistant |
US8775442B2 (en) | 2012-05-15 | 2014-07-08 | Apple Inc. | Semantic search using a single-source semantic model |
US9721563B2 (en) | 2012-06-08 | 2017-08-01 | Apple Inc. | Name recognition system |
WO2013185109A2 (en) | 2012-06-08 | 2013-12-12 | Apple Inc. | Systems and methods for recognizing textual identifiers within a plurality of words |
US9495129B2 (en) | 2012-06-29 | 2016-11-15 | Apple Inc. | Device, method, and user interface for voice-activated navigation and browsing of a document |
US9799328B2 (en) * | 2012-08-03 | 2017-10-24 | Veveo, Inc. | Method for using pauses detected in speech input to assist in interpreting the input during conversational interaction for information retrieval |
US9576574B2 (en) | 2012-09-10 | 2017-02-21 | Apple Inc. | Context-sensitive handling of interruptions by intelligent digital assistant |
US9547647B2 (en) | 2012-09-19 | 2017-01-17 | Apple Inc. | Voice-based media searching |
US8935167B2 (en) | 2012-09-25 | 2015-01-13 | Apple Inc. | Exemplar-based latent perceptual modeling for automatic speech recognition |
EP3809407A1 (en) | 2013-02-07 | 2021-04-21 | Apple Inc. | Voice trigger for a digital assistant |
US9733821B2 (en) | 2013-03-14 | 2017-08-15 | Apple Inc. | Voice control to diagnose inadvertent activation of accessibility features |
US9977779B2 (en) | 2013-03-14 | 2018-05-22 | Apple Inc. | Automatic supplementation of word correction dictionaries |
US10652394B2 (en) | 2013-03-14 | 2020-05-12 | Apple Inc. | System and method for processing voicemail |
US9368114B2 (en) | 2013-03-14 | 2016-06-14 | Apple Inc. | Context-sensitive handling of interruptions |
US10572476B2 (en) | 2013-03-14 | 2020-02-25 | Apple Inc. | Refining a search based on schedule items |
US10642574B2 (en) | 2013-03-14 | 2020-05-05 | Apple Inc. | Device, method, and graphical user interface for outputting captions |
WO2014144579A1 (en) | 2013-03-15 | 2014-09-18 | Apple Inc. | System and method for updating an adaptive speech recognition model |
US10748529B1 (en) | 2013-03-15 | 2020-08-18 | Apple Inc. | Voice activated device for use with a voice-based digital assistant |
KR102057795B1 (ko) | 2013-03-15 | 2019-12-19 | 애플 인크. | 콘텍스트-민감성 방해 처리 |
CN105190607B (zh) | 2013-03-15 | 2018-11-30 | 苹果公司 | 通过智能数字助理的用户培训 |
AU2014233517B2 (en) | 2013-03-15 | 2017-05-25 | Apple Inc. | Training an at least partial voice command system |
US9582608B2 (en) | 2013-06-07 | 2017-02-28 | Apple Inc. | Unified ranking with entropy-weighted information for phrase-based semantic auto-completion |
WO2014197336A1 (en) | 2013-06-07 | 2014-12-11 | Apple Inc. | System and method for detecting errors in interactions with a voice-based digital assistant |
WO2014197334A2 (en) | 2013-06-07 | 2014-12-11 | Apple Inc. | System and method for user-specified pronunciation of words for speech synthesis and recognition |
WO2014197335A1 (en) | 2013-06-08 | 2014-12-11 | Apple Inc. | Interpreting and acting upon commands that involve sharing information with remote devices |
US10176167B2 (en) | 2013-06-09 | 2019-01-08 | Apple Inc. | System and method for inferring user intent from speech inputs |
KR101922663B1 (ko) | 2013-06-09 | 2018-11-28 | 애플 인크. | 디지털 어시스턴트의 둘 이상의 인스턴스들에 걸친 대화 지속성을 가능하게 하기 위한 디바이스, 방법 및 그래픽 사용자 인터페이스 |
JP2016521948A (ja) | 2013-06-13 | 2016-07-25 | アップル インコーポレイテッド | 音声コマンドによって開始される緊急電話のためのシステム及び方法 |
AU2014306221B2 (en) | 2013-08-06 | 2017-04-06 | Apple Inc. | Auto-activating smart responses based on activities from remote devices |
US10296160B2 (en) | 2013-12-06 | 2019-05-21 | Apple Inc. | Method for extracting salient dialog usage from live data |
US9620105B2 (en) | 2014-05-15 | 2017-04-11 | Apple Inc. | Analyzing audio input for efficient speech and music recognition |
US10592095B2 (en) | 2014-05-23 | 2020-03-17 | Apple Inc. | Instantaneous speaking of content on touch devices |
US9502031B2 (en) | 2014-05-27 | 2016-11-22 | Apple Inc. | Method for supporting dynamic grammars in WFST-based ASR |
US9633004B2 (en) | 2014-05-30 | 2017-04-25 | Apple Inc. | Better resolution when referencing to concepts |
US10289433B2 (en) | 2014-05-30 | 2019-05-14 | Apple Inc. | Domain specific language for encoding assistant dialog |
US9430463B2 (en) | 2014-05-30 | 2016-08-30 | Apple Inc. | Exemplar-based natural language processing |
US9715875B2 (en) | 2014-05-30 | 2017-07-25 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
US9760559B2 (en) | 2014-05-30 | 2017-09-12 | Apple Inc. | Predictive text input |
US9966065B2 (en) | 2014-05-30 | 2018-05-08 | Apple Inc. | Multi-command single utterance input method |
US9785630B2 (en) | 2014-05-30 | 2017-10-10 | Apple Inc. | Text prediction using combined word N-gram and unigram language models |
US10170123B2 (en) | 2014-05-30 | 2019-01-01 | Apple Inc. | Intelligent assistant for home automation |
US9842101B2 (en) | 2014-05-30 | 2017-12-12 | Apple Inc. | Predictive conversion of language input |
US10078631B2 (en) | 2014-05-30 | 2018-09-18 | Apple Inc. | Entropy-guided text prediction using combined word and character n-gram language models |
US9734193B2 (en) | 2014-05-30 | 2017-08-15 | Apple Inc. | Determining domain salience ranking from ambiguous words in natural speech |
US9338493B2 (en) | 2014-06-30 | 2016-05-10 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US10659851B2 (en) | 2014-06-30 | 2020-05-19 | Apple Inc. | Real-time digital assistant knowledge updates |
US10446141B2 (en) | 2014-08-28 | 2019-10-15 | Apple Inc. | Automatic speech recognition based on user feedback |
US9818400B2 (en) | 2014-09-11 | 2017-11-14 | Apple Inc. | Method and apparatus for discovering trending terms in speech requests |
US10789041B2 (en) | 2014-09-12 | 2020-09-29 | Apple Inc. | Dynamic thresholds for always listening speech trigger |
US9646609B2 (en) | 2014-09-30 | 2017-05-09 | Apple Inc. | Caching apparatus for serving phonetic pronunciations |
US9668121B2 (en) | 2014-09-30 | 2017-05-30 | Apple Inc. | Social reminders |
US9886432B2 (en) | 2014-09-30 | 2018-02-06 | Apple Inc. | Parsimonious handling of word inflection via categorical stem + suffix N-gram language models |
US10127911B2 (en) | 2014-09-30 | 2018-11-13 | Apple Inc. | Speaker identification and unsupervised speaker adaptation techniques |
US10074360B2 (en) | 2014-09-30 | 2018-09-11 | Apple Inc. | Providing an indication of the suitability of speech recognition |
US11182431B2 (en) * | 2014-10-03 | 2021-11-23 | Disney Enterprises, Inc. | Voice searching metadata through media content |
US10552013B2 (en) | 2014-12-02 | 2020-02-04 | Apple Inc. | Data detection |
US9711141B2 (en) | 2014-12-09 | 2017-07-18 | Apple Inc. | Disambiguating heteronyms in speech synthesis |
US9865280B2 (en) | 2015-03-06 | 2018-01-09 | Apple Inc. | Structured dictation using intelligent automated assistants |
US10152299B2 (en) | 2015-03-06 | 2018-12-11 | Apple Inc. | Reducing response latency of intelligent automated assistants |
US9886953B2 (en) | 2015-03-08 | 2018-02-06 | Apple Inc. | Virtual assistant activation |
US9721566B2 (en) | 2015-03-08 | 2017-08-01 | Apple Inc. | Competing devices responding to voice triggers |
US10567477B2 (en) | 2015-03-08 | 2020-02-18 | Apple Inc. | Virtual assistant continuity |
US9899019B2 (en) | 2015-03-18 | 2018-02-20 | Apple Inc. | Systems and methods for structured stem and suffix language models |
US9842105B2 (en) | 2015-04-16 | 2017-12-12 | Apple Inc. | Parsimonious continuous-space phrase representations for natural language processing |
US10083688B2 (en) | 2015-05-27 | 2018-09-25 | Apple Inc. | Device voice control for selecting a displayed affordance |
US10127220B2 (en) | 2015-06-04 | 2018-11-13 | Apple Inc. | Language identification from short strings |
US9578173B2 (en) | 2015-06-05 | 2017-02-21 | Apple Inc. | Virtual assistant aided communication with 3rd party service in a communication session |
US10101822B2 (en) | 2015-06-05 | 2018-10-16 | Apple Inc. | Language input correction |
US10255907B2 (en) | 2015-06-07 | 2019-04-09 | Apple Inc. | Automatic accent detection using acoustic models |
US11025565B2 (en) | 2015-06-07 | 2021-06-01 | Apple Inc. | Personalized prediction of responses for instant messaging |
US10186254B2 (en) | 2015-06-07 | 2019-01-22 | Apple Inc. | Context-based endpoint detection |
KR102413692B1 (ko) * | 2015-07-24 | 2022-06-27 | 삼성전자주식회사 | 음성 인식을 위한 음향 점수 계산 장치 및 방법, 음성 인식 장치 및 방법, 전자 장치 |
US10671428B2 (en) | 2015-09-08 | 2020-06-02 | Apple Inc. | Distributed personal assistant |
US10747498B2 (en) | 2015-09-08 | 2020-08-18 | Apple Inc. | Zero latency digital assistant |
US9697820B2 (en) | 2015-09-24 | 2017-07-04 | Apple Inc. | Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks |
US11010550B2 (en) | 2015-09-29 | 2021-05-18 | Apple Inc. | Unified language modeling framework for word prediction, auto-completion and auto-correction |
US10366158B2 (en) | 2015-09-29 | 2019-07-30 | Apple Inc. | Efficient word encoding for recurrent neural network language models |
US11587559B2 (en) | 2015-09-30 | 2023-02-21 | Apple Inc. | Intelligent device identification |
US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US10049668B2 (en) | 2015-12-02 | 2018-08-14 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
US10446143B2 (en) | 2016-03-14 | 2019-10-15 | Apple Inc. | Identification of voice inputs providing credentials |
DE102016204183A1 (de) * | 2016-03-15 | 2017-09-21 | Bayerische Motoren Werke Aktiengesellschaft | Verfahren zur Musikauswahl mittels Gesten- und Sprachsteuerung |
US9934775B2 (en) | 2016-05-26 | 2018-04-03 | Apple Inc. | Unit-selection text-to-speech synthesis based on predicted concatenation parameters |
US9972304B2 (en) | 2016-06-03 | 2018-05-15 | Apple Inc. | Privacy preserving distributed evaluation framework for embedded personalized systems |
US10249300B2 (en) | 2016-06-06 | 2019-04-02 | Apple Inc. | Intelligent list reading |
US10049663B2 (en) | 2016-06-08 | 2018-08-14 | Apple, Inc. | Intelligent automated assistant for media exploration |
DK179309B1 (en) | 2016-06-09 | 2018-04-23 | Apple Inc | Intelligent automated assistant in a home environment |
US10586535B2 (en) | 2016-06-10 | 2020-03-10 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
US10192552B2 (en) | 2016-06-10 | 2019-01-29 | Apple Inc. | Digital assistant providing whispered speech |
US10490187B2 (en) | 2016-06-10 | 2019-11-26 | Apple Inc. | Digital assistant providing automated status report |
US10067938B2 (en) | 2016-06-10 | 2018-09-04 | Apple Inc. | Multilingual word prediction |
US10509862B2 (en) | 2016-06-10 | 2019-12-17 | Apple Inc. | Dynamic phrase expansion of language input |
DK201670540A1 (en) | 2016-06-11 | 2018-01-08 | Apple Inc | Application integration with a digital assistant |
DK179415B1 (en) | 2016-06-11 | 2018-06-14 | Apple Inc | Intelligent device arbitration and control |
DK179343B1 (en) | 2016-06-11 | 2018-05-14 | Apple Inc | Intelligent task discovery |
DK179049B1 (en) | 2016-06-11 | 2017-09-18 | Apple Inc | Data driven natural language event detection and classification |
US10474753B2 (en) | 2016-09-07 | 2019-11-12 | Apple Inc. | Language identification using recurrent neural networks |
US10043516B2 (en) | 2016-09-23 | 2018-08-07 | Apple Inc. | Intelligent automated assistant |
US11281993B2 (en) | 2016-12-05 | 2022-03-22 | Apple Inc. | Model and ensemble compression for metric learning |
US10593346B2 (en) | 2016-12-22 | 2020-03-17 | Apple Inc. | Rank-reduced token representation for automatic speech recognition |
US11204787B2 (en) | 2017-01-09 | 2021-12-21 | Apple Inc. | Application integration with a digital assistant |
DK201770383A1 (en) | 2017-05-09 | 2018-12-14 | Apple Inc. | USER INTERFACE FOR CORRECTING RECOGNITION ERRORS |
US10417266B2 (en) | 2017-05-09 | 2019-09-17 | Apple Inc. | Context-aware ranking of intelligent response suggestions |
DK201770439A1 (en) | 2017-05-11 | 2018-12-13 | Apple Inc. | Offline personal assistant |
US10395654B2 (en) | 2017-05-11 | 2019-08-27 | Apple Inc. | Text normalization based on a data-driven learning network |
US10726832B2 (en) | 2017-05-11 | 2020-07-28 | Apple Inc. | Maintaining privacy of personal information |
DK179496B1 (en) | 2017-05-12 | 2019-01-15 | Apple Inc. | USER-SPECIFIC Acoustic Models |
US11301477B2 (en) | 2017-05-12 | 2022-04-12 | Apple Inc. | Feedback analysis of a digital assistant |
DK201770429A1 (en) | 2017-05-12 | 2018-12-14 | Apple Inc. | LOW-LATENCY INTELLIGENT AUTOMATED ASSISTANT |
DK179745B1 (en) | 2017-05-12 | 2019-05-01 | Apple Inc. | SYNCHRONIZATION AND TASK DELEGATION OF A DIGITAL ASSISTANT |
DK201770432A1 (en) | 2017-05-15 | 2018-12-21 | Apple Inc. | Hierarchical belief states for digital assistants |
DK201770431A1 (en) | 2017-05-15 | 2018-12-20 | Apple Inc. | Optimizing dialogue policy decisions for digital assistants using implicit feedback |
US10403278B2 (en) | 2017-05-16 | 2019-09-03 | Apple Inc. | Methods and systems for phonetic matching in digital assistant services |
DK179549B1 (en) | 2017-05-16 | 2019-02-12 | Apple Inc. | FAR-FIELD EXTENSION FOR DIGITAL ASSISTANT SERVICES |
US10303715B2 (en) | 2017-05-16 | 2019-05-28 | Apple Inc. | Intelligent automated assistant for media exploration |
US10311144B2 (en) | 2017-05-16 | 2019-06-04 | Apple Inc. | Emoji word sense disambiguation |
US10657328B2 (en) | 2017-06-02 | 2020-05-19 | Apple Inc. | Multi-task recurrent neural network architecture for efficient morphology handling in neural language modeling |
US10445429B2 (en) | 2017-09-21 | 2019-10-15 | Apple Inc. | Natural language understanding using vocabularies with compressed serialized tries |
US10755051B2 (en) | 2017-09-29 | 2020-08-25 | Apple Inc. | Rule-based natural language processing |
US10636424B2 (en) | 2017-11-30 | 2020-04-28 | Apple Inc. | Multi-turn canned dialog |
US10733982B2 (en) | 2018-01-08 | 2020-08-04 | Apple Inc. | Multi-directional dialog |
US10733375B2 (en) | 2018-01-31 | 2020-08-04 | Apple Inc. | Knowledge-based framework for improving natural language understanding |
US10789959B2 (en) | 2018-03-02 | 2020-09-29 | Apple Inc. | Training speaker recognition models for digital assistants |
US10592604B2 (en) | 2018-03-12 | 2020-03-17 | Apple Inc. | Inverse text normalization for automatic speech recognition |
US10818288B2 (en) | 2018-03-26 | 2020-10-27 | Apple Inc. | Natural assistant interaction |
US10909331B2 (en) | 2018-03-30 | 2021-02-02 | Apple Inc. | Implicit identification of translation payload with neural machine translation |
US10928918B2 (en) | 2018-05-07 | 2021-02-23 | Apple Inc. | Raise to speak |
US11145294B2 (en) | 2018-05-07 | 2021-10-12 | Apple Inc. | Intelligent automated assistant for delivering content from user experiences |
US10984780B2 (en) | 2018-05-21 | 2021-04-20 | Apple Inc. | Global semantic word embeddings using bi-directional recurrent neural networks |
DK180639B1 (en) | 2018-06-01 | 2021-11-04 | Apple Inc | DISABILITY OF ATTENTION-ATTENTIVE VIRTUAL ASSISTANT |
US10892996B2 (en) | 2018-06-01 | 2021-01-12 | Apple Inc. | Variable latency device coordination |
DK179822B1 (da) | 2018-06-01 | 2019-07-12 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
US11386266B2 (en) | 2018-06-01 | 2022-07-12 | Apple Inc. | Text correction |
DK201870355A1 (en) | 2018-06-01 | 2019-12-16 | Apple Inc. | VIRTUAL ASSISTANT OPERATION IN MULTI-DEVICE ENVIRONMENTS |
US10496705B1 (en) | 2018-06-03 | 2019-12-03 | Apple Inc. | Accelerated task performance |
CN112836080B (zh) * | 2021-02-05 | 2023-09-12 | 小叶子(北京)科技有限公司 | 一种通过音频查找曲谱的方法及系统 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH11242496A (ja) | 1998-02-26 | 1999-09-07 | Kobe Steel Ltd | 情報再生装置 |
KR20010099450A (ko) * | 2001-09-28 | 2001-11-09 | 오진근 | 음악파일 재생장치 |
KR20030059503A (ko) * | 2001-12-29 | 2003-07-10 | 한국전자통신연구원 | 사용자별 선호도에 따른 맞춤형 음악 서비스 시스템 및 방법 |
KR20070080299A (ko) * | 2006-02-07 | 2007-08-10 | 삼성전자주식회사 | 사용자 의도 자동 해석 기반의 음악 추천 방법 및 그 장치 |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6233555B1 (en) * | 1997-11-25 | 2001-05-15 | At&T Corporation | Method and apparatus for speaker identification using mixture discriminant analysis to develop speaker models |
US7246060B2 (en) * | 2001-11-06 | 2007-07-17 | Microsoft Corporation | Natural input recognition system and method using a contextual mapping engine and adaptive user bias |
AUPS270902A0 (en) * | 2002-05-31 | 2002-06-20 | Canon Kabushiki Kaisha | Robust detection and classification of objects in audio using limited training data |
US7617511B2 (en) * | 2002-05-31 | 2009-11-10 | Microsoft Corporation | Entering programming preferences while browsing an electronic programming guide |
JP2004163590A (ja) * | 2002-11-12 | 2004-06-10 | Denso Corp | 再生装置及びプログラム |
US7844464B2 (en) * | 2005-07-22 | 2010-11-30 | Multimodal Technologies, Inc. | Content-based audio playback emphasis |
US7302468B2 (en) * | 2004-11-01 | 2007-11-27 | Motorola Inc. | Local area preference determination system and method |
-
2007
- 2007-01-26 KR KR1020070008583A patent/KR100883657B1/ko not_active IP Right Cessation
- 2007-08-20 US US11/892,137 patent/US20080249770A1/en not_active Abandoned
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH11242496A (ja) | 1998-02-26 | 1999-09-07 | Kobe Steel Ltd | 情報再生装置 |
KR20010099450A (ko) * | 2001-09-28 | 2001-11-09 | 오진근 | 음악파일 재생장치 |
KR20030059503A (ko) * | 2001-12-29 | 2003-07-10 | 한국전자통신연구원 | 사용자별 선호도에 따른 맞춤형 음악 서비스 시스템 및 방법 |
KR20070080299A (ko) * | 2006-02-07 | 2007-08-10 | 삼성전자주식회사 | 사용자 의도 자동 해석 기반의 음악 추천 방법 및 그 장치 |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10403267B2 (en) | 2015-01-16 | 2019-09-03 | Samsung Electronics Co., Ltd | Method and device for performing voice recognition using grammar model |
US10706838B2 (en) | 2015-01-16 | 2020-07-07 | Samsung Electronics Co., Ltd. | Method and device for performing voice recognition using grammar model |
US10964310B2 (en) | 2015-01-16 | 2021-03-30 | Samsung Electronics Co., Ltd. | Method and device for performing voice recognition using grammar model |
USRE49762E1 (en) | 2015-01-16 | 2023-12-19 | Samsung Electronics Co., Ltd. | Method and device for performing voice recognition using grammar model |
Also Published As
Publication number | Publication date |
---|---|
US20080249770A1 (en) | 2008-10-09 |
KR20080070445A (ko) | 2008-07-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR100883657B1 (ko) | 음성 인식 기반의 음악 검색 방법 및 장치 | |
US9934777B1 (en) | Customized speech processing language models | |
Karpagavalli et al. | A review on automatic speech recognition architecture and approaches | |
US10210862B1 (en) | Lattice decoding and result confirmation using recurrent neural networks | |
US10121467B1 (en) | Automatic speech recognition incorporating word usage information | |
JP4195428B2 (ja) | 多数の音声特徴を利用する音声認識 | |
US9477753B2 (en) | Classifier-based system combination for spoken term detection | |
US6163768A (en) | Non-interactive enrollment in speech recognition | |
US9640175B2 (en) | Pronunciation learning from user correction | |
EP0867859B1 (en) | Speech recognition language models | |
US8244522B2 (en) | Language understanding device | |
US20080189106A1 (en) | Multi-Stage Speech Recognition System | |
EP0874353A2 (en) | Pronunciation generation in speech recognition | |
JP2003036093A (ja) | 音声入力検索システム | |
WO2002035519A1 (en) | Speech recognition using word-in-phrase command | |
Liao et al. | Uncertainty decoding for noise robust speech recognition | |
Aggarwal et al. | Integration of multiple acoustic and language models for improved Hindi speech recognition system | |
Deligne et al. | A robust high accuracy speech recognition system for mobile applications | |
JP2007240589A (ja) | 音声認識信頼度推定装置、その方法、およびプログラム | |
KR100480790B1 (ko) | 양방향 n-그램 언어모델을 이용한 연속 음성인식방법 및장치 | |
Shen et al. | Automatic selection of phonetically distributed sentence sets for speaker adaptation with application to large vocabulary Mandarin speech recognition | |
Yapanel et al. | Robust digit recognition in noise: an evaluation using the AURORA corpus. | |
Chen et al. | Improved spoken term detection by feature space pseudo-relevance feedback | |
Li et al. | Partially speaker-dependent automatic speech recognition using deep neural networks | |
JP2001109491A (ja) | 連続音声認識装置および方法 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A201 | Request for examination | ||
E902 | Notification of reason for refusal | ||
E902 | Notification of reason for refusal | ||
E701 | Decision to grant or registration of patent right | ||
GRNT | Written decision to grant | ||
FPAY | Annual fee payment |
Payment date: 20130130 Year of fee payment: 5 |
|
FPAY | Annual fee payment |
Payment date: 20140128 Year of fee payment: 6 |
|
FPAY | Annual fee payment |
Payment date: 20150129 Year of fee payment: 7 |
|
FPAY | Annual fee payment |
Payment date: 20160128 Year of fee payment: 8 |
|
FPAY | Annual fee payment |
Payment date: 20170125 Year of fee payment: 9 |
|
FPAY | Annual fee payment |
Payment date: 20180130 Year of fee payment: 10 |
|
LAPS | Lapse due to unpaid annual fee |