JP6965331B2 - Voice recognition system - Google Patents
Voice recognition system
- Publication number
- JP6965331B2 (application number JP2019227504A)
- Authority
- JP
- Japan
- Prior art keywords
- voice input
- contexts
- user
- current voice
- context
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/63—Querying
- G06F16/632—Query formulation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/04—Segmentation; Word boundary detection
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
- G10L15/197—Probabilistic grammars, e.g. word n-grams
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/081—Search algorithms, e.g. Baum-Welch or Viterbi
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/085—Methods for reducing search complexity, pruning
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Multimedia (AREA)
- Acoustics & Sound (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Artificial Intelligence (AREA)
- Theoretical Computer Science (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Probability & Statistics with Applications (AREA)
- User Interface Of Digital Computer (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Machine Translation (AREA)
Description
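100 Voice recognition system
110 Voice input
111 First portion
112 Second portion
120 User device
140 Speech recognition engine
142 Speech decoder
144 Context module
146 Context adjustment module
148 Context
150 Search query
160 Search engine
170 Search results
180 Network
210 Context
220 Context
310 Voice input
311 Portion
312 Portion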
Claims (16)
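- A method comprising:
receiving, at an automatic speech recognition (ASR) system, a current voice input from a user, the current voice input being associated with at least two contexts, each context of the at least two contexts having a respective weight indicating a likelihood that the voice input is associated with the respective context;
generating, by the ASR system, an intermediate recognition result of the current voice input from the user;
adjusting, by the ASR system, the respective weights of the at least two contexts based on the intermediate recognition result, wherein adjusting the respective weights of the at least two contexts based on the intermediate recognition result comprises:
determining a most relevant one of the at least two contexts by identifying a particular keyword in the intermediate recognition result; and
increasing the weight of the most relevant one of the at least two contexts; and
transcribing, by the ASR system, the current voice input using a language model, the language model biasing a transcription of the voice input toward one of the at least two contexts based on the adjusted weights.
- The method of claim 1, wherein the language model comprises an N-gram model.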
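- The method of claim 1, wherein adjusting the respective weights of the at least two contexts associated with the current voice input comprises boosting a respective base weight for at least one of the at least two contexts.
- The method of claim 1, wherein the current voice input from the user is configured to launch a software application to perform an action using the transcription of the current voice input.
- The method of claim 1, further comprising providing the transcription of the current voice input to a dialog system that interacts with the user.
- The method of claim 1, wherein at least one of the at least two contexts associated with the current voice input is based on one or more previous voice inputs from the user within a past period of time of the current voice input.
- The method of claim 1, wherein at least one of the at least two contexts comprises named entities associated with a particular category.
- The method of claim 1, wherein the ASR system resides on a server in communication with a computing device associated with the user, the computing device being configured to capture the current voice input spoken by the user and to transmit the captured voice input to the ASR system.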
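- An automatic speech recognition (ASR) system comprising:
data processing hardware; and
memory hardware in communication with the data processing hardware and storing instructions that, when executed on the data processing hardware, cause the data processing hardware to perform operations comprising:
receiving a current voice input from a user, the current voice input being associated with at least two contexts, each context of the at least two contexts having a respective weight indicating a likelihood that the voice input is associated with the respective context;
generating an intermediate recognition result of the current voice input from the user;
adjusting the respective weights of the at least two contexts based on the intermediate recognition result, wherein adjusting the respective weights of the at least two contexts based on the intermediate recognition result comprises:
determining a most relevant one of the at least two contexts by identifying a particular keyword in the intermediate recognition result; and
increasing the weight of the most relevant one of the at least two contexts; and
transcribing the current voice input using a language model, the language model biasing a transcription of the voice input toward one of the at least two contexts based on the adjusted weights.
- The ASR system of claim 9, wherein the language model comprises an N-gram model.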
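- The ASR system of claim 10, wherein adjusting the respective weights of the at least two contexts associated with the current voice input comprises boosting a respective base weight for at least one of the at least two contexts.
- The ASR system of claim 9, wherein the current voice input from the user is configured to launch a software application to perform an action using the transcription of the current voice input.
- The ASR system of claim 9, wherein the operations further comprise providing the transcription of the current voice input to a dialog system that interacts with the user.
- The ASR system of claim 9, wherein at least one of the at least two contexts associated with the current voice input is based on one or more previous voice inputs from the user within a past period of time of the current voice input.
- The ASR system of claim 9, wherein at least one of the at least two contexts comprises named entities associated with a particular category.
- The ASR system of claim 9, wherein the data processing hardware and the memory hardware reside on a server in communication with a computing device associated with the user, the computing device being configured to capture the current voice input spoken by the user and to transmit the current voice input to the ASR system.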
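The steps recited in the independent claims (per-context weights, an intermediate recognition result, a keyword-driven weight increase, and a biased transcription) can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Context` class, the `boost` factor, the renormalization of weights, the multiplicative rescoring rule, and the example names and scores are all assumptions made for illustration, and a list of scored candidate strings stands in for a real language model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """One candidate context. All fields are illustrative assumptions."""
    name: str
    keywords: frozenset  # hypothetical trigger words that signal this context
    entities: frozenset  # hypothetical named entities that belong to this context
    weight: float        # likelihood that the voice input relates to this context


def adjust_weights(contexts, intermediate_result, boost=3.0):
    """Raise the weight of the context whose keywords match the intermediate result."""
    tokens = set(intermediate_result.lower().split())
    hits = {c.name: len(tokens & c.keywords) for c in contexts}
    weights = {c.name: c.weight for c in contexts}
    best = max(hits, key=hits.get)  # most relevant context by keyword count
    if hits[best] > 0:              # leave weights alone if no keyword was seen
        weights[best] *= boost
    total = sum(weights.values())   # assumes at least one positive weight
    return {name: w / total for name, w in weights.items()}


def transcribe(candidates, contexts, weights):
    """Pick the candidate with the best context-biased score.

    `candidates` is a list of (text, base_score) pairs standing in for
    language-model hypotheses.
    """
    def biased_score(candidate):
        text, base_score = candidate
        bonus = sum(
            weights[c.name]
            for c in contexts
            if any(entity in text.lower() for entity in c.entities)
        )
        return base_score * (1.0 + bonus)

    return max(candidates, key=biased_score)[0]


# Hypothetical data: two contexts with equal starting weights.
contexts = [
    Context("tennis", frozenset({"tennis"}), frozenset({"roger federer"}), 0.5),
    Context("basketball", frozenset({"basketball"}), frozenset({"roger bederer"}), 0.5),
]
candidates = [
    ("how many wins does tennis player roger bederer have", 0.55),
    ("how many wins does tennis player roger federer have", 0.45),
]
weights = adjust_weights(contexts, "how many wins does tennis player")
print(weights)
print(transcribe(candidates, contexts, weights))
```

In this toy run the keyword "tennis" in the intermediate result raises the tennis context's weight, which is enough for the lower-scored candidate containing the tennis entity to overtake the other one. With the unadjusted equal weights the higher base score would win instead.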
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2021137163A JP2021182168A (ja) | 2016-01-06 | 2021-08-25 | Voice recognition system |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/989,642 US10049666B2 (en) | 2016-01-06 | 2016-01-06 | Voice recognition system |
US14/989,642 | 2016-01-06 | ||
JP2018534820A JP6637604B2 (ja) | 2016-01-06 | 2016-11-30 | Voice recognition system |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2018534820A Division JP6637604B2 (ja) | 2016-01-06 | 2016-11-30 | Voice recognition system |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2021137163A Division JP2021182168A (ja) | 2016-01-06 | 2021-08-25 | Voice recognition system |
Publications (2)
Publication Number | Publication Date |
---|---|
JP2020042313A JP2020042313A (ja) | 2020-03-19 |
JP6965331B2 JP6965331B2 (ja) | 2021-11-10 |
Family
ID=57589199
Family Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2018534820A Active JP6637604B2 (ja) | 2016-01-06 | 2016-11-30 | Voice recognition system |
JP2019227504A Active JP6965331B2 (ja) | 2016-01-06 | 2019-12-17 | Voice recognition system |
JP2021137163A Pending JP2021182168A (ja) | 2016-01-06 | 2021-08-25 | Voice recognition system |
JP2023084794A Pending JP2023099706A (ja) | 2016-01-06 | 2023-05-23 | Voice recognition system |
Family Applications Before (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2018534820A Active JP6637604B2 (ja) | 2016-01-06 | 2016-11-30 | Voice recognition system |
Family Applications After (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2021137163A Pending JP2021182168A (ja) | 2016-01-06 | 2021-08-25 | Voice recognition system |
JP2023084794A Pending JP2023099706A (ja) | 2016-01-06 | 2023-05-23 | Voice recognition system |
Country Status (7)
Country | Link |
---|---|
US (5) | US10049666B2 (ja) |
EP (2) | EP3822965A1 (ja) |
JP (4) | JP6637604B2 (ja) |
KR (2) | KR102268087B1 (ja) |
CN (2) | CN112992146A (ja) |
DE (2) | DE102016125831B4 (ja) |
WO (1) | WO2017119965A1 (ja) |
Families Citing this family (74)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10049666B2 (en) * | 2016-01-06 | 2018-08-14 | Google Llc | Voice recognition system |
US10264030B2 (en) | 2016-02-22 | 2019-04-16 | Sonos, Inc. | Networked microphone device control |
US9820039B2 (en) | 2016-02-22 | 2017-11-14 | Sonos, Inc. | Default playback devices |
US10095470B2 (en) | 2016-02-22 | 2018-10-09 | Sonos, Inc. | Audio response playback |
US10142754B2 (en) | 2016-02-22 | 2018-11-27 | Sonos, Inc. | Sensor on moving component of transducer |
US9947316B2 (en) | 2016-02-22 | 2018-04-17 | Sonos, Inc. | Voice control of a media playback system |
US9965247B2 (en) | 2016-02-22 | 2018-05-08 | Sonos, Inc. | Voice controlled media playback system based on user profile |
US10509626B2 (en) | 2016-02-22 | 2019-12-17 | Sonos, Inc | Handling of loss of pairing between networked devices |
US9978390B2 (en) | 2016-06-09 | 2018-05-22 | Sonos, Inc. | Dynamic player selection for audio signal processing |
US10152969B2 (en) | 2016-07-15 | 2018-12-11 | Sonos, Inc. | Voice detection by multiple devices |
US10134399B2 (en) | 2016-07-15 | 2018-11-20 | Sonos, Inc. | Contextualization of voice inputs |
US10115400B2 (en) | 2016-08-05 | 2018-10-30 | Sonos, Inc. | Multiple voice services |
US9693164B1 (en) | 2016-08-05 | 2017-06-27 | Sonos, Inc. | Determining direction of networked microphone device relative to audio playback device |
US9794720B1 (en) | 2016-09-22 | 2017-10-17 | Sonos, Inc. | Acoustic position measurement |
US9942678B1 (en) | 2016-09-27 | 2018-04-10 | Sonos, Inc. | Audio playback settings for voice interaction |
US9743204B1 (en) | 2016-09-30 | 2017-08-22 | Sonos, Inc. | Multi-orientation playback device microphones |
US10181323B2 (en) | 2016-10-19 | 2019-01-15 | Sonos, Inc. | Arbitration-based voice recognition |
CN108447471B (zh) * | 2017-02-15 | 2021-09-10 | 腾讯科技(深圳)有限公司 | Speech recognition method and speech recognition device |
US11276395B1 (en) * | 2017-03-10 | 2022-03-15 | Amazon Technologies, Inc. | Voice-based parameter assignment for voice-capturing devices |
US11183181B2 (en) | 2017-03-27 | 2021-11-23 | Sonos, Inc. | Systems and methods of multiple voice services |
US10475449B2 (en) | 2017-08-07 | 2019-11-12 | Sonos, Inc. | Wake-word detection suppression |
US10048930B1 (en) | 2017-09-08 | 2018-08-14 | Sonos, Inc. | Dynamic computation of system response volume |
US10446165B2 (en) | 2017-09-27 | 2019-10-15 | Sonos, Inc. | Robust short-time fourier transform acoustic echo cancellation during audio playback |
US10051366B1 (en) | 2017-09-28 | 2018-08-14 | Sonos, Inc. | Three-dimensional beam forming with a microphone array |
CN107644095A (zh) * | 2017-09-28 | 2018-01-30 | 百度在线网络技术(北京)有限公司 | Method and apparatus for searching information |
US10621981B2 (en) | 2017-09-28 | 2020-04-14 | Sonos, Inc. | Tone interference cancellation |
US10482868B2 (en) | 2017-09-28 | 2019-11-19 | Sonos, Inc. | Multi-channel acoustic echo cancellation |
US10466962B2 (en) | 2017-09-29 | 2019-11-05 | Sonos, Inc. | Media playback system with voice assistance |
US10880650B2 (en) | 2017-12-10 | 2020-12-29 | Sonos, Inc. | Network microphone devices with automatic do not disturb actuation capabilities |
US10818290B2 (en) | 2017-12-11 | 2020-10-27 | Sonos, Inc. | Home graph |
CN108182943B (zh) * | 2017-12-29 | 2021-03-26 | 北京奇艺世纪科技有限公司 | Smart device control method and apparatus, and smart device |
WO2019152722A1 (en) | 2018-01-31 | 2019-08-08 | Sonos, Inc. | Device designation of playback and network microphone device arrangements |
US11175880B2 (en) | 2018-05-10 | 2021-11-16 | Sonos, Inc. | Systems and methods for voice-assisted media content selection |
US10847178B2 (en) | 2018-05-18 | 2020-11-24 | Sonos, Inc. | Linear filtering for noise-suppressed speech detection |
US10959029B2 (en) | 2018-05-25 | 2021-03-23 | Sonos, Inc. | Determining and adapting to changes in microphone performance of playback devices |
US10629205B2 (en) * | 2018-06-12 | 2020-04-21 | International Business Machines Corporation | Identifying an accurate transcription from probabilistic inputs |
US10681460B2 (en) | 2018-06-28 | 2020-06-09 | Sonos, Inc. | Systems and methods for associating playback devices with voice assistant services |
US10461710B1 (en) | 2018-08-28 | 2019-10-29 | Sonos, Inc. | Media playback system with maximum volume setting |
US11076035B2 (en) | 2018-08-28 | 2021-07-27 | Sonos, Inc. | Do not disturb feature for audio notifications |
US10587430B1 (en) | 2018-09-14 | 2020-03-10 | Sonos, Inc. | Networked devices, systems, and methods for associating playback devices based on sound codes |
US10878811B2 (en) | 2018-09-14 | 2020-12-29 | Sonos, Inc. | Networked devices, systems, and methods for intelligently deactivating wake-word engines |
US11024331B2 (en) | 2018-09-21 | 2021-06-01 | Sonos, Inc. | Voice detection optimization using sound metadata |
US10811015B2 (en) | 2018-09-25 | 2020-10-20 | Sonos, Inc. | Voice detection optimization based on selected voice assistant service |
US11100923B2 (en) | 2018-09-28 | 2021-08-24 | Sonos, Inc. | Systems and methods for selective wake word detection using neural network models |
US10692518B2 (en) | 2018-09-29 | 2020-06-23 | Sonos, Inc. | Linear filtering for noise-suppressed speech detection via multiple network microphone devices |
US11627012B2 (en) | 2018-10-09 | 2023-04-11 | NewTekSol, LLC | Home automation management system |
US11899519B2 (en) | 2018-10-23 | 2024-02-13 | Sonos, Inc. | Multiple stage network microphone device with reduced power consumption and processing load |
EP3654249A1 (en) | 2018-11-15 | 2020-05-20 | Snips | Dilated convolutions and gating for efficient keyword spotting |
EP4276816A3 (en) * | 2018-11-30 | 2024-03-06 | Google LLC | Speech processing |
US11183183B2 (en) | 2018-12-07 | 2021-11-23 | Sonos, Inc. | Systems and methods of operating media playback systems having multiple voice assistant services |
US11132989B2 (en) | 2018-12-13 | 2021-09-28 | Sonos, Inc. | Networked microphone devices, systems, and methods of localized arbitration |
US10602268B1 (en) | 2018-12-20 | 2020-03-24 | Sonos, Inc. | Optimization of network microphone devices using noise classification |
US11315556B2 (en) | 2019-02-08 | 2022-04-26 | Sonos, Inc. | Devices, systems, and methods for distributed voice processing by transmitting sound data associated with a wake word to an appropriate device for identification |
US10867604B2 (en) | 2019-02-08 | 2020-12-15 | Sonos, Inc. | Devices, systems, and methods for distributed voice processing |
US11120794B2 (en) | 2019-05-03 | 2021-09-14 | Sonos, Inc. | Voice assistant persistence across multiple network microphone devices |
US11361756B2 (en) | 2019-06-12 | 2022-06-14 | Sonos, Inc. | Conditional wake word eventing based on environment |
US11200894B2 (en) | 2019-06-12 | 2021-12-14 | Sonos, Inc. | Network microphone device with command keyword eventing |
US10586540B1 (en) | 2019-06-12 | 2020-03-10 | Sonos, Inc. | Network microphone device with command keyword conditioning |
US11138969B2 (en) | 2019-07-31 | 2021-10-05 | Sonos, Inc. | Locally distributed keyword detection |
US10871943B1 (en) | 2019-07-31 | 2020-12-22 | Sonos, Inc. | Noise classification for event detection |
US11138975B2 (en) | 2019-07-31 | 2021-10-05 | Sonos, Inc. | Locally distributed keyword detection |
WO2021040092A1 (ko) | 2019-08-29 | 2021-03-04 | 엘지전자 주식회사 | Method and apparatus for providing a voice recognition service |
US11189286B2 (en) | 2019-10-22 | 2021-11-30 | Sonos, Inc. | VAS toggle based on device orientation |
US11610588B1 (en) * | 2019-10-28 | 2023-03-21 | Meta Platforms, Inc. | Generating contextually relevant text transcripts of voice recordings within a message thread |
US11200900B2 (en) | 2019-12-20 | 2021-12-14 | Sonos, Inc. | Offline voice control |
US11562740B2 (en) | 2020-01-07 | 2023-01-24 | Sonos, Inc. | Voice verification for media playback |
US11556307B2 (en) | 2020-01-31 | 2023-01-17 | Sonos, Inc. | Local voice data processing |
US11308958B2 (en) | 2020-02-07 | 2022-04-19 | Sonos, Inc. | Localized wakeword verification |
US11727919B2 (en) | 2020-05-20 | 2023-08-15 | Sonos, Inc. | Memory allocation for keyword spotting engines |
US11308962B2 (en) | 2020-05-20 | 2022-04-19 | Sonos, Inc. | Input detection windowing |
US11482224B2 (en) | 2020-05-20 | 2022-10-25 | Sonos, Inc. | Command keywords with input detection windowing |
US11698771B2 (en) | 2020-08-25 | 2023-07-11 | Sonos, Inc. | Vocal guidance engines for playback devices |
US11984123B2 (en) | 2020-11-12 | 2024-05-14 | Sonos, Inc. | Network device interaction by range |
US11551700B2 (en) | 2021-01-25 | 2023-01-10 | Sonos, Inc. | Systems and methods for power-efficient keyword detection |
Family Cites Families (58)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6444211B2 (en) | 1991-04-03 | 2002-09-03 | Connaught Laboratories, Inc. | Purification of a pertussis outer membrane protein |
JP3254994B2 (ja) * | 1995-03-01 | 2002-02-12 | セイコーエプソン株式会社 | Speech recognition dialogue device and speech recognition dialogue processing method |
US5986650A (en) | 1996-07-03 | 1999-11-16 | News America Publications, Inc. | Electronic television program guide schedule system and method with scan feature |
US6418431B1 (en) | 1998-03-30 | 2002-07-09 | Microsoft Corporation | Information retrieval and speech recognition based on language models |
US6360201B1 (en) * | 1999-06-08 | 2002-03-19 | International Business Machines Corp. | Method and apparatus for activating and deactivating auxiliary topic libraries in a speech dictation system |
US6513006B2 (en) * | 1999-08-26 | 2003-01-28 | Matsushita Electronic Industrial Co., Ltd. | Automatic control of household activity using speech recognition and natural language |
US20020157116A1 (en) * | 2000-07-28 | 2002-10-24 | Koninklijke Philips Electronics N.V. | Context and content based information processing for multimedia segmentation and indexing |
JP3581648B2 (ja) * | 2000-11-27 | 2004-10-27 | キヤノン株式会社 | Speech recognition system, information processing apparatus, control methods therefor, and program |
AU2003278431A1 (en) * | 2002-11-22 | 2004-06-18 | Koninklijke Philips Electronics N.V. | Speech recognition device and method |
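KR20040055417A (ko) * | 2002-12-21 | 2004-06-26 | 한국전자통신연구원 | Apparatus and method for conversational continuous speech recognition |
JP3923513B2 (ja) * | 2004-06-08 | 2007-06-06 | 松下電器産業株式会社 | Speech recognition device and speech recognition method |
JP2006050568A (ja) | 2004-07-06 | 2006-02-16 | Ricoh Co Ltd | Image processing apparatus, program, and image processing method |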
US7433819B2 (en) * | 2004-09-10 | 2008-10-07 | Scientific Learning Corporation | Assessing fluency based on elapsed time |
US7195999B2 (en) | 2005-07-07 | 2007-03-27 | Micron Technology, Inc. | Metal-substituted transistor gates |
US7590536B2 (en) * | 2005-10-07 | 2009-09-15 | Nuance Communications, Inc. | Voice language model adjustment based on user affinity |
KR100755677B1 (ko) * | 2005-11-02 | 2007-09-05 | 삼성전자주식회사 | Apparatus and method for conversational speech recognition using topic domain detection |
EP2050456A4 (en) | 2006-08-09 | 2013-01-23 | Mitsubishi Tanabe Pharma Corp | COMPRESSED |
CN101266793B (zh) | 2007-03-14 | 2011-02-02 | 财团法人工业技术研究院 | Apparatus and method for reducing recognition errors through context relations between dialogue turns |
US8788267B2 (en) * | 2009-09-10 | 2014-07-22 | Mitsubishi Electric Research Laboratories, Inc. | Multi-purpose contextual control |
TWI403663B (zh) | 2010-07-20 | 2013-08-01 | Foxsemicon Integrated Tech Inc | LED light-emitting device |
US8417530B1 (en) | 2010-08-20 | 2013-04-09 | Google Inc. | Accent-influenced search results |
IL209008A (en) * | 2010-10-31 | 2015-09-24 | Verint Systems Ltd | A system and method for analyzing ip traffic of targets |
US8352245B1 (en) | 2010-12-30 | 2013-01-08 | Google Inc. | Adjusting language models |
US8296142B2 (en) * | 2011-01-21 | 2012-10-23 | Google Inc. | Speech recognition using dock context |
US9159324B2 (en) | 2011-07-01 | 2015-10-13 | Qualcomm Incorporated | Identifying people that are proximate to a mobile device user via social graphs, speech models, and user context |
US8650031B1 (en) * | 2011-07-31 | 2014-02-11 | Nuance Communications, Inc. | Accuracy improvement of spoken queries transcription using co-occurrence information |
CA2791277C (en) * | 2011-09-30 | 2019-01-15 | Apple Inc. | Using context information to facilitate processing of commands in a virtual assistant |
US9256396B2 (en) * | 2011-10-10 | 2016-02-09 | Microsoft Technology Licensing, Llc | Speech recognition for context switching |
US8909512B2 (en) * | 2011-11-01 | 2014-12-09 | Google Inc. | Enhanced stability prediction for incrementally generated speech recognition hypotheses based on an age of a hypothesis |
US9152223B2 (en) | 2011-11-04 | 2015-10-06 | International Business Machines Corporation | Mobile device with multiple security domains |
US8805684B1 (en) | 2012-05-31 | 2014-08-12 | Google Inc. | Distributed speaker adaptation |
US8571859B1 (en) | 2012-05-31 | 2013-10-29 | Google Inc. | Multi-stage speaker adaptation |
US8515750B1 (en) | 2012-06-05 | 2013-08-20 | Google Inc. | Realtime acoustic adaptation using stability measures |
US9043205B2 (en) | 2012-06-21 | 2015-05-26 | Google Inc. | Dynamic language model |
US20140011465A1 (en) | 2012-07-05 | 2014-01-09 | Delphi Technologies, Inc. | Molded conductive plastic antenna |
US9380833B2 (en) | 2012-07-12 | 2016-07-05 | Diana Irving | Shoe insert |
US8880398B1 (en) | 2012-07-13 | 2014-11-04 | Google Inc. | Localized speech recognition with offload |
WO2014039106A1 (en) * | 2012-09-10 | 2014-03-13 | Google Inc. | Answering questions using environmental context |
US8484017B1 (en) * | 2012-09-10 | 2013-07-09 | Google Inc. | Identifying media content |
US20140122069A1 (en) * | 2012-10-30 | 2014-05-01 | International Business Machines Corporation | Automatic Speech Recognition Accuracy Improvement Through Utilization of Context Analysis |
EP2912567A4 (en) * | 2012-12-11 | 2016-05-18 | Nuance Communications Inc | SYSTEM AND METHODS FOR VIRTUAL AGENT RECOMMENDATION FOR MULTIPLE PEOPLE |
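CN103871424A (zh) * | 2012-12-13 | 2014-06-18 | 上海八方视界网络科技有限公司 | Online speaker clustering analysis method based on the Bayesian information criterion |
CN103064936B (zh) * | 2012-12-24 | 2018-03-30 | 北京百度网讯科技有限公司 | Method and apparatus for extracting and analyzing image information based on voice input |
JP2014240940A (ja) * | 2013-06-12 | 2014-12-25 | 株式会社東芝 | Transcription support apparatus, method, and program |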
US20150005801A1 (en) | 2013-06-27 | 2015-01-01 | Covidien Lp | Microcatheter system |
PL3033819T3 (pl) | 2013-08-15 | 2019-07-31 | Fontem Holdings 4 B.V. | Method, system and device for switchless detection and charging |
EP2862164B1 (en) * | 2013-08-23 | 2017-05-31 | Nuance Communications, Inc. | Multiple pass automatic speech recognition |
US10565984B2 (en) | 2013-11-15 | 2020-02-18 | Intel Corporation | System and method for maintaining speech recognition dynamic dictionary |
EP3107274B1 (en) * | 2014-02-13 | 2020-12-16 | Nec Corporation | Communication device, communication system, and communication method |
US9679558B2 (en) * | 2014-05-15 | 2017-06-13 | Microsoft Technology Licensing, Llc | Language modeling for conversational understanding domains using semantic web resources |
US20150340024A1 (en) * | 2014-05-23 | 2015-11-26 | Google Inc. | Language Modeling Using Entities |
US20160018085A1 (en) | 2014-07-18 | 2016-01-21 | Soraa, Inc. | Compound light control lens field |
US10628483B1 (en) * | 2014-08-07 | 2020-04-21 | Amazon Technologies, Inc. | Entity resolution with ranking |
US9552816B2 (en) * | 2014-12-19 | 2017-01-24 | Amazon Technologies, Inc. | Application focus in speech-based systems |
US9805713B2 (en) * | 2015-03-13 | 2017-10-31 | Google Inc. | Addressing missing features in models |
US9966073B2 (en) * | 2015-05-27 | 2018-05-08 | Google Llc | Context-sensitive dynamic update of voice to text model in a voice-enabled electronic device |
US10491967B1 (en) * | 2015-12-21 | 2019-11-26 | Amazon Technologies, Inc. | Integrating a live streaming video service with external computing systems |
US10049666B2 (en) * | 2016-01-06 | 2018-08-14 | Google Llc | Voice recognition system |
-
2016
- 2016-01-06 US US14/989,642 patent/US10049666B2/en active Active
- 2016-11-30 WO PCT/US2016/064092 patent/WO2017119965A1/en active Application Filing
- 2016-11-30 JP JP2018534820A patent/JP6637604B2/ja active Active
- 2016-11-30 EP EP20217773.9A patent/EP3822965A1/en active Pending
- 2016-11-30 KR KR1020207024503A patent/KR102268087B1/ko active IP Right Grant
- 2016-11-30 EP EP16816078.6A patent/EP3378061B1/en active Active
- 2016-11-30 KR KR1020187019323A patent/KR102150509B1/ko active IP Right Grant
- 2016-12-23 CN CN202110154554.7A patent/CN112992146A/zh active Pending
- 2016-12-23 CN CN201611207951.1A patent/CN107039040B/zh active Active
- 2016-12-29 DE DE102016125831.8A patent/DE102016125831B4/de active Active
- 2016-12-29 DE DE202016008203.6U patent/DE202016008203U1/de active Active
-
2018
- 2018-03-02 US US15/910,872 patent/US10269354B2/en active Active
-
2019
- 2019-03-14 US US16/353,441 patent/US10643617B2/en active Active
- 2019-12-17 JP JP2019227504A patent/JP6965331B2/ja active Active
-
2020
- 2020-04-01 US US16/837,250 patent/US11410660B2/en active Active
-
2021
- 2021-08-25 JP JP2021137163A patent/JP2021182168A/ja active Pending
-
2022
- 2022-07-11 US US17/811,605 patent/US11996103B2/en active Active
-
2023
- 2023-05-23 JP JP2023084794A patent/JP2023099706A/ja active Pending
Also Published As
Publication number | Publication date |
---|---|
KR102150509B1 (ko) | 2020-09-01 |
CN112992146A (zh) | 2021-06-18 |
WO2017119965A1 (en) | 2017-07-13 |
EP3378061B1 (en) | 2021-01-06 |
DE102016125831A1 (de) | 2017-07-06 |
DE102016125831B4 (de) | 2022-02-03 |
EP3822965A1 (en) | 2021-05-19 |
US11410660B2 (en) | 2022-08-09 |
JP2023099706A (ja) | 2023-07-13 |
US20190214012A1 (en) | 2019-07-11 |
KR20200103876A (ko) | 2020-09-02 |
JP2019504358A (ja) | 2019-02-14 |
JP2020042313A (ja) | 2020-03-19 |
US11996103B2 (en) | 2024-05-28 |
KR102268087B1 (ko) | 2021-06-22 |
US10643617B2 (en) | 2020-05-05 |
US20200227046A1 (en) | 2020-07-16 |
US20220343915A1 (en) | 2022-10-27 |
US20170193999A1 (en) | 2017-07-06 |
DE202016008203U1 (de) | 2017-04-27 |
JP2021182168A (ja) | 2021-11-25 |
EP3378061A1 (en) | 2018-09-26 |
US10049666B2 (en) | 2018-08-14 |
US10269354B2 (en) | 2019-04-23 |
CN107039040A (zh) | 2017-08-11 |
CN107039040B (zh) | 2021-02-12 |
US20180190293A1 (en) | 2018-07-05 |
JP6637604B2 (ja) | 2020-01-29 |
KR20180091056A (ko) | 2018-08-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6965331B2 (ja) | Voice recognition system | |
US11398236B2 (en) | Intent-specific automatic speech recognition result generation | |
JP6535349B2 (ja) | Contextual interpretation in natural language processing using previous dialog acts | |
US10964312B2 (en) | Generation of predictive natural language processing models | |
US8417530B1 (en) | Accent-influenced search results | |
JP6726354B2 (ja) | Acoustic model training using corrected terms | |
US20130132079A1 (en) | Interactive speech recognition | |
US9922650B1 (en) | Intent-specific automatic speech recognition result generation | |
US10152298B1 (en) | Confidence estimation based on frequency | |
US11289075B1 (en) | Routing of natural language inputs to speech processing applications | |
US20170200455A1 (en) | Suggested query constructor for voice actions | |
US11043215B2 (en) | Method and system for generating textual representation of user spoken utterance | |
US11756538B1 (en) | Lower latency speech processing | |
US20240185842A1 (en) | Interactive decoding of words from phoneme score distributions | |
US11380308B1 (en) | Natural language processing | |
US11893994B1 (en) | Processing optimization using machine learning |
Legal Events
Code | Title | Description |
---|---|---|
A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 20191218 |
A977 | Report on retrieval | Free format text: JAPANESE INTERMEDIATE CODE: A971007; Effective date: 20201125 |
A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131; Effective date: 20201130 |
A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20210113 |
A02 | Decision of refusal | Free format text: JAPANESE INTERMEDIATE CODE: A02; Effective date: 20210621 |
A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20210825 |
C60 | Trial request (containing other claim documents, opposition documents) | Free format text: JAPANESE INTERMEDIATE CODE: C60; Effective date: 20210825 |
A911 | Transfer to examiner for re-examination before appeal (zenchi) | Free format text: JAPANESE INTERMEDIATE CODE: A911; Effective date: 20210906 |
C21 | Notice of transfer of a case for reconsideration by examiners before appeal proceedings | Free format text: JAPANESE INTERMEDIATE CODE: C21; Effective date: 20210913 |
TRDD | Decision of grant or rejection written | |
A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01; Effective date: 20210927 |
A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61; Effective date: 20211020 |
R150 | Certificate of patent or registration of utility model | Ref document number: 6965331; Country of ref document: JP; Free format text: JAPANESE INTERMEDIATE CODE: R150 |