JP2022040183A - Synthesized voice selection for computational agents - Google Patents
Synthesized voice selection for computational agents
- Publication number
- JP2022040183A
- Authority
- JP
- Japan
- Prior art keywords
- agent
- assistant
- module
- computing device
- utterance
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000000034 method Methods 0.000 claims abstract description 56
- 238000004891 communication Methods 0.000 claims description 49
- 230000004044 response Effects 0.000 claims description 6
- 239000003795 chemical substances by application Substances 0.000 description 772
- 230000009471 action Effects 0.000 description 27
- 235000013550 pizza Nutrition 0.000 description 27
- 230000006870 function Effects 0.000 description 14
- 238000005516 engineering process Methods 0.000 description 12
- 230000008569 process Effects 0.000 description 12
- 235000013305 food Nutrition 0.000 description 10
- 230000003993 interaction Effects 0.000 description 9
- 238000010586 diagram Methods 0.000 description 8
- 238000012545 processing Methods 0.000 description 8
- 235000013351 cheese Nutrition 0.000 description 7
- 230000000875 corresponding effect Effects 0.000 description 7
- 230000003287 optical effect Effects 0.000 description 5
- 230000001413 cellular effect Effects 0.000 description 4
- 239000000843 powder Substances 0.000 description 4
- 238000012552 review Methods 0.000 description 4
- 231100000735 select agent Toxicity 0.000 description 4
- 230000008901 benefit Effects 0.000 description 3
- 238000006243 chemical reaction Methods 0.000 description 3
- 238000013500 data storage Methods 0.000 description 3
- 230000033001 locomotion Effects 0.000 description 3
- 235000019995 prosecco Nutrition 0.000 description 3
- UIIMBOGNXHQVGW-UHFFFAOYSA-M Sodium bicarbonate Chemical compound [Na+].OC([O-])=O UIIMBOGNXHQVGW-UHFFFAOYSA-M 0.000 description 2
- 230000003044 adaptive effect Effects 0.000 description 2
- 238000004458 analytical method Methods 0.000 description 2
- 238000013475 authorization Methods 0.000 description 2
- 230000006399 behavior Effects 0.000 description 2
- 150000001875 compounds Chemical class 0.000 description 2
- 238000004590 computer program Methods 0.000 description 2
- 238000011982 device technology Methods 0.000 description 2
- 239000000835 fiber Substances 0.000 description 2
- 239000004973 liquid crystal related substance Substances 0.000 description 2
- 230000007774 longterm Effects 0.000 description 2
- 238000013507 mapping Methods 0.000 description 2
- 239000000203 mixture Substances 0.000 description 2
- 238000004806 packaging method and process Methods 0.000 description 2
- 238000012549 training Methods 0.000 description 2
- 238000012546 transfer Methods 0.000 description 2
- 241001137251 Corvidae Species 0.000 description 1
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 1
- VYPSYNLAJGMNEJ-UHFFFAOYSA-N Silicium dioxide Chemical compound O=[Si]=O VYPSYNLAJGMNEJ-UHFFFAOYSA-N 0.000 description 1
- 238000003491 array Methods 0.000 description 1
- 238000013473 artificial intelligence Methods 0.000 description 1
- 235000015278 beef Nutrition 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 239000000969 carrier Substances 0.000 description 1
- 238000004138 cluster model Methods 0.000 description 1
- 238000012790 confirmation Methods 0.000 description 1
- 230000002596 correlated effect Effects 0.000 description 1
- 230000005611 electricity Effects 0.000 description 1
- 230000002996 emotional effect Effects 0.000 description 1
- 239000008103 glucose Substances 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 230000001939 inductive effect Effects 0.000 description 1
- 238000010801 machine learning Methods 0.000 description 1
- 239000011159 matrix material Substances 0.000 description 1
- 235000016337 monopotassium tartrate Nutrition 0.000 description 1
- AVTYONGGKAJVTE-OLXYHTOASA-L potassium L-tartrate Chemical compound [K+].[K+].[O-]C(=O)[C@H](O)[C@@H](O)C([O-])=O AVTYONGGKAJVTE-OLXYHTOASA-L 0.000 description 1
- 229910000030 sodium bicarbonate Inorganic materials 0.000 description 1
- 235000017557 sodium bicarbonate Nutrition 0.000 description 1
- 230000005236 sound signal Effects 0.000 description 1
- 230000003068 static effect Effects 0.000 description 1
- 238000010897 surface acoustic wave method Methods 0.000 description 1
- 230000002194 synthesizing effect Effects 0.000 description 1
- 230000008685 targeting Effects 0.000 description 1
- 230000001755 vocal effect Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/04—Details of speech synthesis systems, e.g. synthesiser structure or memory management
- G10L13/047—Architecture of speech synthesisers
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/167—Audio in a user interface, e.g. using voice commands for navigating, audio feedback
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5055—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering software capabilities, i.e. software resources associated or available to the machine
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0631—Resource planning, allocation, distributing or scheduling for enterprises or organisations
- G06Q10/06311—Scheduling, planning or task assignment for a person or group
- G06Q10/063112—Skill-based matching of a person or a group to a task
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/01—Customer relationship services
- G06Q30/015—Providing customer assistance, e.g. assisting a customer within a business location or via helpdesk
- G06Q30/016—After-sales
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0282—Rating or review of business operators or products
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0623—Item investigation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0631—Item recommendations
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/04—Details of speech synthesis systems, e.g. synthesiser structure or memory management
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/04—Segmentation; Word boundary detection
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/5017—Task decomposition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Health & Medical Sciences (AREA)
- Acoustics & Sound (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Strategic Management (AREA)
- Finance (AREA)
- Accounting & Taxation (AREA)
- Human Resources & Organizations (AREA)
- Economics (AREA)
- Development Economics (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- Databases & Information Systems (AREA)
- Entrepreneurship & Innovation (AREA)
- General Health & Medical Sciences (AREA)
- Educational Administration (AREA)
- Game Theory and Decision Science (AREA)
- Data Mining & Analysis (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Tourism & Hospitality (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
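[Solution] An example method includes: receiving, by a computational assistant executing at one or more processors, a representation of an utterance spoken at a computing device; selecting, based on the utterance, an agent from a plurality of agents, the plurality of agents including one or more first-party agents and a plurality of third-party agents; in response to determining that the selected agent includes a first-party agent, selecting a reserved voice from a plurality of voices; and outputting audio data synthesized using the selected voice to satisfy the utterance.
[Selected drawing] FIG. 1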
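The flow summarized in the abstract can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the `Agent` record, the keyword-overlap heuristic in `select_agent`, the voice identifiers, and the string returned in place of real speech synthesis. The only point is that a first-party agent is given the reserved voice, while any other agent is given a different one.

```python
from dataclasses import dataclass

# Hypothetical voice identifiers; the patent text does not name any voices.
RESERVED_VOICE = "voice_reserved"
OTHER_VOICES = ["voice_a", "voice_b"]


@dataclass(frozen=True)
class Agent:
    name: str
    first_party: bool
    keywords: tuple  # toy stand-in for whatever the agent index stores


def select_agent(utterance, agents):
    """Pick the agent whose keywords overlap the utterance most (assumed heuristic)."""
    words = set(utterance.lower().split())
    return max(agents, key=lambda a: len(words & set(a.keywords)))


def select_voice(agent):
    """First-party agents get the reserved voice; others get a non-reserved one."""
    if agent.first_party:
        return RESERVED_VOICE
    return OTHER_VOICES[0]


def respond(utterance, agents):
    """Select an agent, select its voice, and return a placeholder for synthesized audio."""
    agent = select_agent(utterance, agents)
    voice = select_voice(agent)
    return {"agent": agent.name, "voice": voice, "audio": f"[{voice}] {utterance}"}
```

Under these assumptions, calling `respond` with an utterance that matches a first-party agent yields the reserved voice, and an utterance routed to a third-party agent yields one of the other voices.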
Description
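This application claims the benefit of U.S. Provisional Application No. 62/403,665, filed October 3, 2016, the entire contents of which are incorporated herein by reference.
… the entry point of an agent may be a resource address of the agent or some other argument.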
110 Computing device
112 User interface device (UID)
114 User interface
120 User interface (UI) module
122 Assistant module
122A Local assistant module
122B Remote assistant module
124 Agent index
124A Agent index
124B Agent index
128 Module
128a Local 3P agent module
128b 3P agent module, remote 3P agent module
128Ab-128Nb Modules
130 Network
170 3P agent server system
180 Search server system
182 Search module
210 Computing device
212 User interface device (UID)
220 UI module
222 Assistant module
224 Agent index
227 Agent selection module
228 Local 3P agent module
230 Context module
240 Processor
282 Search module
322 Assistant module
324 Agent index
327 Agent selection module
330 Context module
331 Agent accuracy module
340 Processor
342 Communication unit
348 Storage device
350 Communication channel
360 Assistant server system
382 Search module
428 3P agent module
440 Processor
442 Communication unit
448 Storage component, storage device
450 Communication channel
470 3P agent server system
Claims (8)
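selecting, based on the utterance, an agent from a plurality of agents, the plurality of agents including one or more first-party agents and a plurality of third-party agents;
in response to determining that the selected agent includes a first-party agent, selecting a reserved voice from a plurality of voices; and
outputting, for playback by one or more speakers of the computing device, audio data synthesized using the selected voice to satisfy the utterance;
a method including the above steps.
receiving a representation of a second utterance spoken at the computing device;
selecting, based on the second utterance, a second agent from the plurality of agents;
in response to determining that the selected second agent includes a third-party agent, selecting a voice from the plurality of voices other than the reserved voice; and
outputting audio data synthesized using the selected voice to satisfy the second utterance;
the method of claim 1, further including the above steps.
outputting, using a voice from the plurality of voices other than the reserved voice, synthesized audio data representing a first subset of the search results;
further including the above step, wherein outputting the synthesized audio data using the selected voice to satisfy the utterance
includes outputting, using the reserved voice, synthesized audio data representing a second subset of the search results; the method of claim 1 or claim 2.
at least one memory including instructions that, when executed, cause the at least one processor to perform the method of any one of claims 1 to 3;
a computing device including the above.
at least one processor; and
at least one memory including instructions that, when executed, cause the at least one processor to perform the method of any one of claims 1 to 3;
a computing system including the above.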
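As an illustration of the dependent claims, the sketch below shows a third-party agent receiving a voice other than the reserved one, and a list of search results being spoken in two subsets with different voices. The voice names, the function names, and the index-based split between the first and second subsets are assumptions made for this example; the claim text shown here does not say how the subsets are determined.

```python
# Hypothetical voice inventory; "reserved" stands for the reserved voice.
RESERVED_VOICE = "reserved"
VOICES = ["reserved", "voice_a", "voice_b"]


def pick_non_reserved_voice(voices=VOICES, reserved=RESERVED_VOICE):
    """Return the first available voice that is not the reserved voice."""
    for voice in voices:
        if voice != reserved:
            return voice
    raise ValueError("no non-reserved voice available")


def voice_for_agent(is_first_party, voices=VOICES, reserved=RESERVED_VOICE):
    """Reserved voice for a first-party agent, a different voice for a third-party agent."""
    if is_first_party:
        return reserved
    return pick_non_reserved_voice(voices, reserved)


def speak_search_results(results, split, voices=VOICES, reserved=RESERVED_VOICE):
    """Pair each result with a voice: the first subset (results[:split]) uses a
    non-reserved voice, the second subset (results[split:]) uses the reserved voice.
    The index split is an assumption made for illustration only."""
    other = pick_non_reserved_voice(voices, reserved)
    first, second = results[:split], results[split:]
    return [(other, r) for r in first] + [(reserved, r) for r in second]
```

Each returned pair is a placeholder for a synthesis request; a real system would pass the text and the chosen voice to a speech synthesizer.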
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201662403665P | 2016-10-03 | 2016-10-03 | |
US62/403,665 | 2016-10-03 | ||
JP2020109771A JP7005694B2 (ja) | 2016-10-03 | 2020-06-25 | Synthesized voice selection for computational agents |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2020109771A Division JP7005694B2 (ja) | 2016-10-03 | 2020-06-25 | Synthesized voice selection for computational agents |
Publications (2)
Publication Number | Publication Date |
---|---|
JP2022040183A (ja) | 2022-03-10 |
JP7108122B2 (ja) | 2022-07-27 |
Family
ID=60043416
Family Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2019517918A Active JP6882463B2 (ja) | 2016-10-03 | 2017-09-29 | Synthesized voice selection for computational agents |
JP2020109771A Active JP7005694B2 (ja) | 2016-10-03 | 2020-06-25 | Synthesized voice selection for computational agents |
JP2021214388A Active JP7108122B2 (ja) | 2016-10-03 | 2021-12-28 | Synthesized voice selection for computational agents |
Family Applications Before (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2019517918A Active JP6882463B2 (ja) | 2016-10-03 | 2017-09-29 | Synthesized voice selection for computational agents |
JP2020109771A Active JP7005694B2 (ja) | 2016-10-03 | 2020-06-25 | Synthesized voice selection for computational agents |
Country Status (6)
Country | Link |
---|---|
US (1) | US20230274205A1 (ja) |
EP (2) | EP4109375A1 (ja) |
JP (3) | JP6882463B2 (ja) |
KR (1) | KR20190054174A (ja) |
CN (3) | CN109804428B (ja) |
WO (3) | WO2018067403A1 (ja) |
Families Citing this family (77)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9965247B2 (en) | 2016-02-22 | 2018-05-08 | Sonos, Inc. | Voice controlled media playback system based on user profile |
US9947316B2 (en) | 2016-02-22 | 2018-04-17 | Sonos, Inc. | Voice control of a media playback system |
US10095470B2 (en) | 2016-02-22 | 2018-10-09 | Sonos, Inc. | Audio response playback |
US9811314B2 (en) | 2016-02-22 | 2017-11-07 | Sonos, Inc. | Metadata exchange involving a networked playback system and a networked microphone system |
US10264030B2 (en) | 2016-02-22 | 2019-04-16 | Sonos, Inc. | Networked microphone device control |
US9772817B2 (en) | 2016-02-22 | 2017-09-26 | Sonos, Inc. | Room-corrected voice detection |
US9978390B2 (en) | 2016-06-09 | 2018-05-22 | Sonos, Inc. | Dynamic player selection for audio signal processing |
US10152969B2 (en) | 2016-07-15 | 2018-12-11 | Sonos, Inc. | Voice detection by multiple devices |
US10134399B2 (en) | 2016-07-15 | 2018-11-20 | Sonos, Inc. | Contextualization of voice inputs |
US10115400B2 (en) | 2016-08-05 | 2018-10-30 | Sonos, Inc. | Multiple voice services |
US9942678B1 (en) | 2016-09-27 | 2018-04-10 | Sonos, Inc. | Audio playback settings for voice interaction |
US9743204B1 (en) | 2016-09-30 | 2017-08-22 | Sonos, Inc. | Multi-orientation playback device microphones |
US10181323B2 (en) | 2016-10-19 | 2019-01-15 | Sonos, Inc. | Arbitration-based voice recognition |
US10347247B2 (en) * | 2016-12-30 | 2019-07-09 | Google Llc | Modulation of packetized audio signals |
US11183181B2 (en) | 2017-03-27 | 2021-11-23 | Sonos, Inc. | Systems and methods of multiple voice services |
US10475449B2 (en) | 2017-08-07 | 2019-11-12 | Sonos, Inc. | Wake-word detection suppression |
US10048930B1 (en) | 2017-09-08 | 2018-08-14 | Sonos, Inc. | Dynamic computation of system response volume |
US10446165B2 (en) | 2017-09-27 | 2019-10-15 | Sonos, Inc. | Robust short-time fourier transform acoustic echo cancellation during audio playback |
US10621981B2 (en) | 2017-09-28 | 2020-04-14 | Sonos, Inc. | Tone interference cancellation |
US10051366B1 (en) | 2017-09-28 | 2018-08-14 | Sonos, Inc. | Three-dimensional beam forming with a microphone array |
US10482868B2 (en) | 2017-09-28 | 2019-11-19 | Sonos, Inc. | Multi-channel acoustic echo cancellation |
US10466962B2 (en) | 2017-09-29 | 2019-11-05 | Sonos, Inc. | Media playback system with voice assistance |
US10880650B2 (en) | 2017-12-10 | 2020-12-29 | Sonos, Inc. | Network microphone devices with automatic do not disturb actuation capabilities |
US10818290B2 (en) | 2017-12-11 | 2020-10-27 | Sonos, Inc. | Home graph |
JP2019109567A (ja) * | 2017-12-15 | 2019-07-04 | オンキヨー株式会社 | Electronic device and control program for electronic device |
WO2019152722A1 (en) | 2018-01-31 | 2019-08-08 | Sonos, Inc. | Device designation of playback and network microphone device arrangements |
US20190327330A1 (en) | 2018-04-20 | 2019-10-24 | Facebook, Inc. | Building Customized User Profiles Based on Conversational Data |
US11307880B2 (en) | 2018-04-20 | 2022-04-19 | Meta Platforms, Inc. | Assisting users with personalized and contextual communication content |
US11886473B2 (en) | 2018-04-20 | 2024-01-30 | Meta Platforms, Inc. | Intent identification for agent matching by assistant systems |
US11715042B1 (en) | 2018-04-20 | 2023-08-01 | Meta Platforms Technologies, Llc | Interpretability of deep reinforcement learning models in assistant systems |
US11676220B2 (en) | 2018-04-20 | 2023-06-13 | Meta Platforms, Inc. | Processing multimodal user input for assistant systems |
US11175880B2 (en) | 2018-05-10 | 2021-11-16 | Sonos, Inc. | Systems and methods for voice-assisted media content selection |
US10847178B2 (en) | 2018-05-18 | 2020-11-24 | Sonos, Inc. | Linear filtering for noise-suppressed speech detection |
US10959029B2 (en) | 2018-05-25 | 2021-03-23 | Sonos, Inc. | Determining and adapting to changes in microphone performance of playback devices |
US10681460B2 (en) | 2018-06-28 | 2020-06-09 | Sonos, Inc. | Systems and methods for associating playback devices with voice assistant services |
US10461710B1 (en) | 2018-08-28 | 2019-10-29 | Sonos, Inc. | Media playback system with maximum volume setting |
US11076035B2 (en) | 2018-08-28 | 2021-07-27 | Sonos, Inc. | Do not disturb feature for audio notifications |
US10878811B2 (en) | 2018-09-14 | 2020-12-29 | Sonos, Inc. | Networked devices, systems, and methods for intelligently deactivating wake-word engines |
US10587430B1 (en) | 2018-09-14 | 2020-03-10 | Sonos, Inc. | Networked devices, systems, and methods for associating playback devices based on sound codes |
US11024331B2 (en) | 2018-09-21 | 2021-06-01 | Sonos, Inc. | Voice detection optimization using sound metadata |
US10811015B2 (en) | 2018-09-25 | 2020-10-20 | Sonos, Inc. | Voice detection optimization based on selected voice assistant service |
US11100923B2 (en) | 2018-09-28 | 2021-08-24 | Sonos, Inc. | Systems and methods for selective wake word detection using neural network models |
US10692518B2 (en) | 2018-09-29 | 2020-06-23 | Sonos, Inc. | Linear filtering for noise-suppressed speech detection via multiple network microphone devices |
US11899519B2 (en) | 2018-10-23 | 2024-02-13 | Sonos, Inc. | Multiple stage network microphone device with reduced power consumption and processing load |
EP3654249A1 (en) | 2018-11-15 | 2020-05-20 | Snips | Dilated convolutions and gating for efficient keyword spotting |
US11183183B2 (en) | 2018-12-07 | 2021-11-23 | Sonos, Inc. | Systems and methods of operating media playback systems having multiple voice assistant services |
US11132989B2 (en) | 2018-12-13 | 2021-09-28 | Sonos, Inc. | Networked microphone devices, systems, and methods of localized arbitration |
US10602268B1 (en) | 2018-12-20 | 2020-03-24 | Sonos, Inc. | Optimization of network microphone devices using noise classification |
DE112019006677T5 (de) * | 2019-01-16 | 2021-11-04 | Sony Group Corporation | Response processing device and response processing method |
US11315556B2 (en) | 2019-02-08 | 2022-04-26 | Sonos, Inc. | Devices, systems, and methods for distributed voice processing by transmitting sound data associated with a wake word to an appropriate device for identification |
US10867604B2 (en) | 2019-02-08 | 2020-12-15 | Sonos, Inc. | Devices, systems, and methods for distributed voice processing |
JP7280066B2 (ja) * | 2019-03-07 | 2023-05-23 | 本田技研工業株式会社 | Agent device, control method for agent device, and program |
JP7274901B2 (ja) * | 2019-03-25 | 2023-05-17 | 本田技研工業株式会社 | Agent device, control method for agent device, and program |
US11120794B2 (en) | 2019-05-03 | 2021-09-14 | Sonos, Inc. | Voice assistant persistence across multiple network microphone devices |
US10586540B1 (en) | 2019-06-12 | 2020-03-10 | Sonos, Inc. | Network microphone device with command keyword conditioning |
US11361756B2 (en) | 2019-06-12 | 2022-06-14 | Sonos, Inc. | Conditional wake word eventing based on environment |
US11200894B2 (en) | 2019-06-12 | 2021-12-14 | Sonos, Inc. | Network microphone device with command keyword eventing |
US10871943B1 (en) | 2019-07-31 | 2020-12-22 | Sonos, Inc. | Noise classification for event detection |
US11138969B2 (en) | 2019-07-31 | 2021-10-05 | Sonos, Inc. | Locally distributed keyword detection |
US11138975B2 (en) | 2019-07-31 | 2021-10-05 | Sonos, Inc. | Locally distributed keyword detection |
US11126446B2 (en) * | 2019-10-15 | 2021-09-21 | Microsoft Technology Licensing, Llc | Contextual extensible skills framework across surfaces |
US11189286B2 (en) | 2019-10-22 | 2021-11-30 | Sonos, Inc. | VAS toggle based on device orientation |
KR102135859B1 (ko) * | 2019-10-24 | 2020-07-20 | 주식회사 유니온플레이스 | Device for providing a personalized virtual assistant |
US11200900B2 (en) | 2019-12-20 | 2021-12-14 | Sonos, Inc. | Offline voice control |
US11562740B2 (en) | 2020-01-07 | 2023-01-24 | Sonos, Inc. | Voice verification for media playback |
US11556307B2 (en) | 2020-01-31 | 2023-01-17 | Sonos, Inc. | Local voice data processing |
US11308958B2 (en) | 2020-02-07 | 2022-04-19 | Sonos, Inc. | Localized wakeword verification |
JP7318587B2 (ja) * | 2020-05-18 | 2023-08-01 | トヨタ自動車株式会社 | Agent control device |
US11308962B2 (en) | 2020-05-20 | 2022-04-19 | Sonos, Inc. | Input detection windowing |
US11727919B2 (en) | 2020-05-20 | 2023-08-15 | Sonos, Inc. | Memory allocation for keyword spotting engines |
US11482224B2 (en) | 2020-05-20 | 2022-10-25 | Sonos, Inc. | Command keywords with input detection windowing |
US11698771B2 (en) | 2020-08-25 | 2023-07-11 | Sonos, Inc. | Vocal guidance engines for playback devices |
US11984123B2 (en) | 2020-11-12 | 2024-05-14 | Sonos, Inc. | Network device interaction by range |
US11551700B2 (en) | 2021-01-25 | 2023-01-10 | Sonos, Inc. | Systems and methods for power-efficient keyword detection |
EP4040433A1 (de) * | 2021-02-04 | 2022-08-10 | Deutsche Telekom AG | Dynamic generation of a chain of function modules of a virtual assistant |
US20230169963A1 (en) * | 2021-11-30 | 2023-06-01 | Google Llc | Selectively masking query content to provide to a secondary digital assistant |
US20240070632A1 (en) * | 2022-08-24 | 2024-02-29 | Truist Bank | Virtual assistant transfers |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2015528140A (ja) * | 2012-05-15 | 2015-09-24 | アップル インコーポレイテッド | Systems and methods for integrating third-party services with a digital assistant |
Family Cites Families (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5727159A (en) * | 1996-04-10 | 1998-03-10 | Kikinis; Dan | System in which a Proxy-Server translates information received from the Internet into a form/format readily usable by low power portable computers |
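JP3270356B2 (ja) * | 1996-12-04 | 2002-04-02 | 株式会社ジャストシステム | Utterance document creation device, utterance document creation method, and computer-readable recording medium storing a program for causing a computer to execute an utterance document creation procedure |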
US6851115B1 (en) * | 1999-01-05 | 2005-02-01 | Sri International | Software-based architecture for communication and cooperation among distributed electronic agents |
US7036128B1 (en) * | 1999-01-05 | 2006-04-25 | Sri International Offices | Using a community of distributed electronic agents to support a highly mobile, ambient computing environment |
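JP2003295890A (ja) | 2002-04-04 | 2003-10-15 | Nec Corp | Speech recognition dialogue selection device, speech recognition dialogue system, speech recognition dialogue selection method, and program |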
US7398209B2 (en) * | 2002-06-03 | 2008-07-08 | Voicebox Technologies, Inc. | Systems and methods for responding to natural language speech utterance |
US6834265B2 (en) * | 2002-12-13 | 2004-12-21 | Motorola, Inc. | Method and apparatus for selective speech recognition |
US8561069B2 (en) * | 2002-12-19 | 2013-10-15 | Fujitsu Limited | Task computing |
US7444327B2 (en) * | 2004-01-09 | 2008-10-28 | Microsoft Corporation | System and method for automated optimization of search result relevance |
WO2011082340A1 (en) * | 2009-12-31 | 2011-07-07 | Volt Delta Resources, Llc | Method and system for processing multiple speech recognition results from a single utterance |
US10560541B2 (en) * | 2010-05-26 | 2020-02-11 | Sap Se | Service delivery management for brokered service delivery |
US10073927B2 (en) * | 2010-11-16 | 2018-09-11 | Microsoft Technology Licensing, Llc | Registration for system level search user interface |
CN102594652B (zh) * | 2011-01-13 | 2015-04-08 | 华为技术有限公司 | Virtual machine migration method, switch, and virtual machine system |
WO2013190963A1 (ja) * | 2012-06-18 | 2013-12-27 | エイディシーテクノロジー株式会社 | Voice response device |
US9313332B1 (en) * | 2012-11-28 | 2016-04-12 | Angel.Com Incorporated | Routing user communications to agents |
US20140222512A1 (en) * | 2013-02-01 | 2014-08-07 | Goodsnitch, Inc. | Receiving, tracking and analyzing business intelligence data |
US9741339B2 (en) * | 2013-06-28 | 2017-08-22 | Google Inc. | Data driven word pronunciation learning and scoring with crowd sourcing based on the word's phonemes pronunciation scores |
US9305554B2 (en) * | 2013-07-17 | 2016-04-05 | Samsung Electronics Co., Ltd. | Multi-level speech recognition |
US9418663B2 (en) * | 2014-07-31 | 2016-08-16 | Google Inc. | Conversational agent with a particular spoken style of speech |
US9986097B2 (en) * | 2014-11-05 | 2018-05-29 | Avaya Inc. | System and method for selecting an agent in an enterprise |
US10192549B2 (en) * | 2014-11-28 | 2019-01-29 | Microsoft Technology Licensing, Llc | Extending digital personal assistant action providers |
US9508339B2 (en) * | 2015-01-30 | 2016-11-29 | Microsoft Technology Licensing, Llc | Updating language understanding classifier models for a digital personal assistant based on crowd-sourcing |
US9336268B1 (en) * | 2015-04-08 | 2016-05-10 | Pearson Education, Inc. | Relativistic sentiment analyzer |
-
2017
- 2017-09-29 WO PCT/US2017/054462 patent/WO2018067403A1/en unknown
- 2017-09-29 EP EP22190358.6A patent/EP4109375A1/en active Pending
- 2017-09-29 CN CN201780061508.4A patent/CN109804428B/zh active Active
- 2017-09-29 CN CN201780061554.4A patent/CN109844855B/zh active Active
- 2017-09-29 WO PCT/US2017/054467 patent/WO2018067404A1/en active Application Filing
- 2017-09-29 KR KR1020197012708A patent/KR20190054174A/ko not_active Application Discontinuation
- 2017-09-29 EP EP17784147.5A patent/EP3504705B1/en active Active
- 2017-09-29 WO PCT/US2017/054456 patent/WO2018067402A1/en active Application Filing
- 2017-09-29 JP JP2019517918A patent/JP6882463B2/ja active Active
- 2017-09-29 CN CN202010767926.9A patent/CN112071302A/zh active Pending
-
2020
- 2020-06-25 JP JP2020109771A patent/JP7005694B2/ja active Active
-
2021
- 2021-12-28 JP JP2021214388A patent/JP7108122B2/ja active Active
-
2023
- 2023-04-17 US US18/135,579 patent/US20230274205A1/en active Pending
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2015528140A (ja) * | 2012-05-15 | 2015-09-24 | アップル インコーポレイテッド | Systems and methods for integrating third-party services with a digital assistant |
Also Published As
Publication number | Publication date |
---|---|
JP7108122B2 (ja) | 2022-07-27 |
EP4109375A1 (en) | 2022-12-28 |
CN109844855B (zh) | 2023-12-05 |
JP6882463B2 (ja) | 2021-06-02 |
JP2020173462A (ja) | 2020-10-22 |
US20230274205A1 (en) | 2023-08-31 |
CN109804428A (zh) | 2019-05-24 |
JP2019535037A (ja) | 2019-12-05 |
WO2018067402A1 (en) | 2018-04-12 |
CN109804428B (zh) | 2020-08-21 |
EP3504705B1 (en) | 2022-09-21 |
KR20190054174A (ko) | 2019-05-21 |
WO2018067404A1 (en) | 2018-04-12 |
WO2018067403A1 (en) | 2018-04-12 |
JP7005694B2 (ja) | 2022-01-21 |
EP3504705A1 (en) | 2019-07-03 |
CN109844855A (zh) | 2019-06-04 |
CN112071302A (zh) | 2020-12-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP7005694B2 (ja) | Synthesized voice selection for computational agents | |
US10854188B2 (en) | Synthesized voice selection for computational agents | |
US10853747B2 (en) | Selection of computational agent for task performance | |
JP6953559B2 (ja) | Delayed responses by computational assistant | |
JP7121052B2 (ja) | Determining an agent for performing an action based at least in part on image data | |
JP7118056B2 (ja) | Virtual assistant personalization | |
US11663535B2 (en) | Multi computational agent performance of tasks | |
EP3590087A1 (en) | Smart setup of assistant services |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
2022-01-19 | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523 |
2022-01-19 | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621 |
2022-01-19 | A871 | Explanation of circumstances concerning accelerated examination | Free format text: JAPANESE INTERMEDIATE CODE: A871 |
2022-02-14 | A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131 |
2022-04-05 | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523 |
| TRDD | Decision of grant or rejection written | |
2022-06-20 | A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01 |
2022-07-14 | A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61 |
| R150 | Certificate of patent or registration of utility model | Ref document number: 7108122; Country of ref document: JP; Free format text: JAPANESE INTERMEDIATE CODE: R150 |