JP2019530888A - Speaker verification - Google Patents
Speaker verification
- Publication number
- JP2019530888A (application JP2019500442A)
- Authority
- JP
- Japan
- Prior art keywords
- user
- vector
- neural network
- utterance
- language
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/22—Interactive procedures; Man-machine interfaces
- G10L17/24—Interactive procedures; Man-machine interfaces the user being prompted to utter a password or a predefined phrase
- G10L17/02—Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
- G10L17/06—Decision making techniques; Pattern matching strategies
- G10L17/08—Use of distortion metrics or a particular distance between probe pattern and reference templates
- G10L17/14—Use of phonemic categorisation or speech recognition prior to speaker recognition or verification
- G10L17/18—Artificial neural networks; Connectionist approaches
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Business, Economics & Management (AREA)
- Game Theory and Decision Science (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- User Interface Of Digital Computer (AREA)
- Machine Translation (AREA)
Abstract
Description
105a hotword
105b audio
110 user device
111 microphone
113 "Speaker Identity Verified"
115 audio greeting
115a hotword
115b audio
120 user device
121 microphone
123 "Shuohuazhe de shenfen yanzheng"
125 audio greeting
130 network
140 server
150 neural network
180 speaker verification model
200 system
210 user device
210a, 210b training utterances
211 microphone
213a first set of training data
213b second set of training data
214a training utterance vector
215a, 215b language ID
230 network
240 server
250 neural network
252 input layer
254a, 254b, 254c hidden layers
256 output layer
258 portion
260a output
260b output
270 comparator
272 output of the comparison module
280 language-independent speaker verification model
305, 310, 315, 320 one-hot language vectors
400 system
402 user
410a hotword
410b audio
414 acoustic feature vector
415 language ID
430 reference vector
440 comparison module
450 verification module
460 message
500 process
Claims (20)
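A system comprising one or more storage devices storing instructions operable to cause performance of operations comprising:
receiving, by a user device, audio data representing an utterance of a user;
providing, to a neural network stored on the user device, a set of input data derived from the audio data and from a language identifier or location identifier associated with the user device, the neural network having parameters trained using speech data representing speech in different languages or different dialects;
generating, based on output of the neural network produced in response to receiving the set of input data, a speaker representation indicative of characteristics of the voice of the user;
determining, based on the speaker representation and a second representation, that the utterance is an utterance of the user; and
providing the user access to the user device based on determining that the utterance is an utterance of the user.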
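The claim states that the input data is derived from both the audio data and a language or location identifier, but it does not say how the two are combined. One plausible reading, sketched below in Python, appends a one-hot language vector to the acoustic feature vector. The language inventory `LANGUAGES`, the helper names `one_hot_language` and `build_input_vector`, and the use of plain concatenation are all assumptions made for illustration, not details taken from the claims.

```python
from typing import List, Sequence

# Hypothetical language inventory; the claims only require "a language
# identifier or location identifier" and do not enumerate languages.
LANGUAGES = ["en-US", "zh-CN", "ja-JP", "ko-KR"]


def one_hot_language(language_id: str,
                     languages: Sequence[str] = LANGUAGES) -> List[float]:
    """Encode a language identifier as a one-hot vector."""
    if language_id not in languages:
        raise ValueError(f"unknown language identifier: {language_id}")
    return [1.0 if lang == language_id else 0.0 for lang in languages]


def build_input_vector(acoustic_features: Sequence[float],
                       language_id: str) -> List[float]:
    """Concatenate acoustic features with the one-hot language vector.

    Concatenation is an assumed way of deriving the input data from both
    sources; other combinations would also fit the claim wording.
    """
    return list(acoustic_features) + one_hot_language(language_id)
```

With the four hypothetical languages above, `build_input_vector([0.1, 0.2], "zh-CN")` yields `[0.1, 0.2, 0.0, 1.0, 0.0, 0.0]`.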
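The system of claim 2, further comprising:
providing the generated input vector to the neural network; and
generating, based on output of the neural network produced in response to receiving the input vector, a speaker representation indicative of the characteristics of the voice of the user.
The system of claim 2, further comprising:
providing the generated input vector to the neural network; and
generating, based on output of the neural network produced in response to receiving the input vector, a speaker representation indicative of the characteristics of the voice of the user.
The system of claim 2, further comprising:
providing the generated input vector to the neural network; and
generating, based on output of the neural network produced in response to receiving the input vector, a speaker representation indicative of the characteristics of the voice of the user.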
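A computer-implemented method comprising:
in response to receiving (i) particular audio data corresponding to a particular utterance of a user and (ii) data indicating a particular language spoken by the user, providing, for output, an indication that the language-independent speaker verification model has determined that the particular audio data likely includes the utterance of a hotword designated for the particular language spoken by the user.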
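A method comprising:
providing, to a neural network stored on the user device, a set of input data derived from the audio data and from a language identifier or location identifier associated with the user device, the neural network having parameters trained using speech data representing speech in different languages or dialects;
generating, based on output of the neural network produced in response to receiving the set of input data, a speaker representation indicative of characteristics of the voice of the user;
determining, based on the speaker representation and a second representation, that the utterance is an utterance of the user; and
providing the user access to the user device based on determining that the utterance is an utterance of the user.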
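The flow of the method steps (network output, speaker representation, comparison with a second representation, access decision) can be sketched end to end as follows. This is a minimal illustration under stated assumptions: `verify_and_unlock` and `toy_network` are hypothetical names, `toy_network` is an untrained stand-in for the multilingual network, and Euclidean distance with a fixed `max_distance` is an assumed decision rule that the claims do not specify.

```python
import math
from typing import Callable, List, Sequence


def verify_and_unlock(
    input_data: Sequence[float],
    network: Callable[[Sequence[float]], List[float]],
    second_representation: Sequence[float],
    max_distance: float,
) -> bool:
    """Return True (grant access) if the utterance is judged to be the user's."""
    # Speaker representation generated from the network output.
    speaker_representation = network(input_data)
    if len(speaker_representation) != len(second_representation):
        raise ValueError("representations must have the same length")
    # Euclidean distance is an assumption; the claims name no metric.
    distance = math.sqrt(
        sum((s - r) ** 2
            for s, r in zip(speaker_representation, second_representation))
    )
    return distance <= max_distance


def toy_network(input_data: Sequence[float]) -> List[float]:
    """Untrained stand-in for the multilingual network; illustration only."""
    return [
        math.tanh(sum(input_data)),
        math.tanh(input_data[0] - input_data[-1]),
    ]
```

In this sketch the second representation would be produced at enrollment by running the same network on the user's enrollment input, so a later utterance that maps close to it unlocks the device and one that maps far away does not.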
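The method of claim 14, further comprising:
providing the generated input vector to the neural network; and
generating, based on output of the neural network produced in response to receiving the input vector, a speaker representation indicative of the characteristics of the voice of the user.
The method of claim 14, further comprising:
providing the generated input vector to the neural network; and
generating, based on output of the neural network produced in response to receiving the input vector, a speaker representation indicative of the characteristics of the voice of the user.
The method of claim 14, further comprising:
providing the generated input vector to the neural network; and
generating, based on output of the neural network produced in response to receiving the input vector, a speaker representation indicative of the characteristics of the voice of the user.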
The method of any one of claims 13 to 18, comprising determining a distance between the first representation and the second representation.
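The claim requires a distance between the two representations without naming a metric. As one hedged example, the Python sketch below uses cosine distance and accepts the speaker when the distance falls at or below a threshold. The function names `cosine_distance` and `is_same_speaker` and the default threshold of 0.3 are assumptions chosen for illustration.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def is_same_speaker(first_representation: Sequence[float],
                    second_representation: Sequence[float],
                    threshold: float = 0.3) -> bool:
    """Accept when the distance is at or below an assumed threshold."""
    return cosine_distance(first_representation, second_representation) <= threshold
```

A smaller distance means the two representations point in more similar directions; in practice the threshold would be tuned on held-out data to balance false accepts against false rejects.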
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/211,317 US20180018973A1 (en) | 2016-07-15 | 2016-07-15 | Speaker verification |
US15/211,317 | 2016-07-15 | ||
PCT/US2017/040906 WO2018013401A1 (en) | 2016-07-15 | 2017-07-06 | Speaker verification |
Publications (2)
Publication Number | Publication Date |
---|---|
JP6561219B1 (ja) | 2019-08-14 |
JP2019530888A (ja) | 2019-10-24 |
Family
ID=59366524
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2019500442A Active JP6561219B1 (ja) | 2016-07-15 | 2017-07-06 | Speaker verification |
Country Status (7)
Country | Link |
---|---|
US (4) | US20180018973A1 (ja) |
EP (2) | EP3373294B1 (ja) |
JP (1) | JP6561219B1 (ja) |
KR (1) | KR102109874B1 (ja) |
CN (1) | CN108140386B (ja) |
RU (1) | RU2697736C1 (ja) |
WO (1) | WO2018013401A1 (ja) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2021135313A (ja) * | 2020-02-21 | 2021-09-13 | 日本電信電話株式会社 | Verification device, verification method, and verification program |
JP2021527840A (ja) * | 2018-10-10 | 2021-10-14 | テンセント・テクノロジー・(シェンジェン)・カンパニー・リミテッド | Voiceprint identification method, model training method, server, and computer program |
JP2022539674A (ja) * | 2019-12-04 | 2022-09-13 | グーグル エルエルシー | Speaker recognition using speaker-specific speech models |
Families Citing this family (62)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8977255B2 (en) | 2007-04-03 | 2015-03-10 | Apple Inc. | Method and system for operating a multi-function portable electronic device using voice-activation |
BR112015018905B1 (pt) | 2013-02-07 | 2022-02-22 | Apple Inc | Method for operating a voice-activation feature, computer-readable storage medium, and electronic device |
US9338493B2 (en) | 2014-06-30 | 2016-05-10 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US11676608B2 (en) * | 2021-04-02 | 2023-06-13 | Google Llc | Speaker verification using co-location information |
CN106469040B (zh) * | 2015-08-19 | 2019-06-21 | 华为终端有限公司 | Communication method, server, and device |
US10747498B2 (en) | 2015-09-08 | 2020-08-18 | Apple Inc. | Zero latency digital assistant |
US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US20180018973A1 (en) | 2016-07-15 | 2018-01-18 | Google Inc. | Speaker verification |
CN106251859B (zh) * | 2016-07-22 | 2019-05-31 | 百度在线网络技术(北京)有限公司 | Speech recognition processing method and apparatus |
CN111971742A (zh) * | 2016-11-10 | 2020-11-20 | 赛轮思软件技术(北京)有限公司 | Techniques for language-independent wake word detection |
US11276395B1 (en) * | 2017-03-10 | 2022-03-15 | Amazon Technologies, Inc. | Voice-based parameter assignment for voice-capturing devices |
DK179496B1 (en) | 2017-05-12 | 2019-01-15 | Apple Inc. | User-specific acoustic models |
DK201770427A1 (en) | 2017-05-12 | 2018-12-20 | Apple Inc. | Low-latency intelligent automated assistant |
AU2017425736A1 (en) * | 2017-07-31 | 2020-01-23 | Beijing Didi Infinity Technology And Development Co., Ltd. | System and method for language-based service hailing |
US11817103B2 (en) * | 2017-09-15 | 2023-11-14 | Nec Corporation | Pattern recognition apparatus, pattern recognition method, and storage medium |
CN108305615B (zh) * | 2017-10-23 | 2020-06-16 | 腾讯科技(深圳)有限公司 | Object recognition method and device, storage medium, and terminal |
US10916252B2 (en) | 2017-11-10 | 2021-02-09 | Nvidia Corporation | Accelerated data transfer for latency reduction and real-time processing |
KR102486395B1 (ko) * | 2017-11-23 | 2023-01-10 | 삼성전자주식회사 | Neural network device for speaker recognition, and method of operating the same |
US10593321B2 (en) * | 2017-12-15 | 2020-03-17 | Mitsubishi Electric Research Laboratories, Inc. | Method and apparatus for multi-lingual end-to-end speech recognition |
US10783873B1 (en) * | 2017-12-15 | 2020-09-22 | Educational Testing Service | Native language identification with time delay deep neural networks trained separately on native and non-native english corpora |
CN111630934B (zh) * | 2018-01-22 | 2023-10-13 | 诺基亚技术有限公司 | Privacy-preserving voiceprint authentication apparatus and method |
CN108597525B (zh) * | 2018-04-25 | 2019-05-03 | 四川远鉴科技有限公司 | Speech voiceprint modeling method and apparatus |
US10928918B2 (en) | 2018-05-07 | 2021-02-23 | Apple Inc. | Raise to speak |
US11152006B2 (en) * | 2018-05-07 | 2021-10-19 | Microsoft Technology Licensing, Llc | Voice identification enrollment |
GB2573809B (en) | 2018-05-18 | 2020-11-04 | Emotech Ltd | Speaker Recognition |
JP6980603B2 (ja) * | 2018-06-21 | 2021-12-15 | 株式会社東芝 | Speaker model creation system, recognition system, program, and control device |
US10991379B2 (en) * | 2018-06-22 | 2021-04-27 | Babblelabs Llc | Data driven audio enhancement |
CN110634489B (zh) * | 2018-06-25 | 2022-01-14 | 科大讯飞股份有限公司 | Voiceprint confirmation method, apparatus, device, and readable storage medium |
KR20200011796A (ko) * | 2018-07-25 | 2020-02-04 | 엘지전자 주식회사 | Speech recognition system |
CN110874875B (zh) * | 2018-08-13 | 2021-01-29 | 珠海格力电器股份有限公司 | Door lock control method and apparatus |
US10978059B2 (en) * | 2018-09-25 | 2021-04-13 | Google Llc | Speaker diarization using speaker embedding(s) and trained generative model |
US11462215B2 (en) | 2018-09-28 | 2022-10-04 | Apple Inc. | Multi-modal inputs for voice commands |
KR102623246B1 (ko) * | 2018-10-12 | 2024-01-11 | 삼성전자주식회사 | Electronic device, method for controlling the electronic device, and computer-readable medium |
US11144542B2 (en) * | 2018-11-01 | 2021-10-12 | Visa International Service Association | Natural language processing system |
US11031017B2 (en) * | 2019-01-08 | 2021-06-08 | Google Llc | Fully supervised speaker diarization |
TW202029181A (zh) * | 2019-01-28 | 2020-08-01 | 正崴精密工業股份有限公司 | Method and apparatus for waking a specific target using speech recognition |
US10978069B1 (en) * | 2019-03-18 | 2021-04-13 | Amazon Technologies, Inc. | Word selection for natural language interface |
US11948582B2 (en) * | 2019-03-25 | 2024-04-02 | Omilia Natural Language Solutions Ltd. | Systems and methods for speaker verification |
WO2020223122A1 (en) * | 2019-04-30 | 2020-11-05 | Walmart Apollo, Llc | Systems and methods for processing retail facility-related information requests of retail facility workers |
US11132992B2 (en) | 2019-05-05 | 2021-09-28 | Microsoft Technology Licensing, Llc | On-device custom wake word detection |
US11222622B2 (en) | 2019-05-05 | 2022-01-11 | Microsoft Technology Licensing, Llc | Wake word selection assistance architectures and methods |
US11158305B2 (en) * | 2019-05-05 | 2021-10-26 | Microsoft Technology Licensing, Llc | Online verification of custom wake word |
US11468890B2 (en) | 2019-06-01 | 2022-10-11 | Apple Inc. | Methods and user interfaces for voice-based control of electronic devices |
US11031013B1 (en) | 2019-06-17 | 2021-06-08 | Express Scripts Strategic Development, Inc. | Task completion based on speech analysis |
CN110400562B (zh) * | 2019-06-24 | 2022-03-22 | 歌尔科技有限公司 | Interaction processing method, apparatus, device, and audio device |
CN110415679B (zh) * | 2019-07-25 | 2021-12-17 | 北京百度网讯科技有限公司 | Speech error correction method, apparatus, device, and storage medium |
CN110379433B (zh) * | 2019-08-02 | 2021-10-08 | 清华大学 | Identity verification method, apparatus, computer device, and storage medium |
RU2723902C1 (ru) * | 2020-02-15 | 2020-06-18 | Илья Владимирович Редкокашин | Method for verifying voice biometric data |
CN111370003B (zh) * | 2020-02-27 | 2023-05-30 | 杭州雄迈集成电路技术股份有限公司 | Voiceprint comparison method based on a Siamese neural network |
US11443748B2 (en) * | 2020-03-03 | 2022-09-13 | International Business Machines Corporation | Metric learning of speaker diarization |
US11651767B2 (en) | 2020-03-03 | 2023-05-16 | International Business Machines Corporation | Metric learning of speaker diarization |
US20210287681A1 (en) * | 2020-03-16 | 2021-09-16 | Fidelity Information Services, Llc | Systems and methods for contactless authentication using voice recognition |
JPWO2021187146A1 (ja) * | 2020-03-16 | 2021-09-23 | ||
US11061543B1 (en) | 2020-05-11 | 2021-07-13 | Apple Inc. | Providing relevant data items based on context |
US11508380B2 (en) * | 2020-05-26 | 2022-11-22 | Apple Inc. | Personalized voices for text messaging |
US11490204B2 (en) | 2020-07-20 | 2022-11-01 | Apple Inc. | Multi-device audio adjustment coordination |
US11438683B2 (en) | 2020-07-21 | 2022-09-06 | Apple Inc. | User identification using headphones |
KR102277422B1 (ko) * | 2020-07-24 | 2021-07-19 | 이종엽 | Method for voice verification and restriction of a voice terminal |
US11676572B2 (en) * | 2021-03-03 | 2023-06-13 | Google Llc | Instantaneous learning in text-to-speech during dialog |
US11776550B2 (en) * | 2021-03-09 | 2023-10-03 | Qualcomm Incorporated | Device operation based on dynamic classifier |
US11798562B2 (en) * | 2021-05-16 | 2023-10-24 | Google Llc | Attentive scoring function for speaker identification |
US20230137652A1 (en) * | 2021-11-01 | 2023-05-04 | Pindrop Security, Inc. | Cross-lingual speaker recognition |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150127336A1 (en) * | 2013-11-04 | 2015-05-07 | Google Inc. | Speaker verification using neural networks |
Family Cites Families (159)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4799262A (en) | 1985-06-27 | 1989-01-17 | Kurzweil Applied Intelligence, Inc. | Speech recognition |
US4868867A (en) | 1987-04-06 | 1989-09-19 | Voicecraft Inc. | Vector excitation speech or audio coder for transmission or storage |
JP2733955B2 (ja) | 1988-05-18 | 1998-03-30 | 日本電気株式会社 | Adaptive speech recognition device |
US5465318A (en) | 1991-03-28 | 1995-11-07 | Kurzweil Applied Intelligence, Inc. | Method for generating a speech recognition model for a non-vocabulary utterance |
JP2979711B2 (ja) | 1991-04-24 | 1999-11-15 | 日本電気株式会社 | Pattern recognition method and standard pattern learning method |
US5680508A (en) | 1991-05-03 | 1997-10-21 | Itt Corporation | Enhancement of speech coding in background noise for low-rate speech coder |
EP0576765A1 (en) | 1992-06-30 | 1994-01-05 | International Business Machines Corporation | Method for coding digital data using vector quantizing techniques and device for implementing said method |
US5636325A (en) | 1992-11-13 | 1997-06-03 | International Business Machines Corporation | Speech synthesis and analysis of dialects |
US5627939A (en) | 1993-09-03 | 1997-05-06 | Microsoft Corporation | Speech recognition system and method employing data compression |
US5689616A (en) * | 1993-11-19 | 1997-11-18 | Itt Corporation | Automatic language identification/verification system |
US5509103A (en) | 1994-06-03 | 1996-04-16 | Motorola, Inc. | Method of training neural networks used for speech recognition |
US5542006A (en) | 1994-06-21 | 1996-07-30 | Eastman Kodak Company | Neural network based character position detector for use in optical character recognition |
US5729656A (en) | 1994-11-30 | 1998-03-17 | International Business Machines Corporation | Reduction of search space in speech recognition using phone boundaries and phone ranking |
US5839103A (en) * | 1995-06-07 | 1998-11-17 | Rutgers, The State University Of New Jersey | Speaker verification system using decision fusion logic |
US6067517A (en) | 1996-02-02 | 2000-05-23 | International Business Machines Corporation | Transcription of speech data with segments from acoustically dissimilar environments |
US5729694A (en) | 1996-02-06 | 1998-03-17 | The Regents Of The University Of California | Speech coding, reconstruction and recognition using acoustics and electromagnetic waves |
US5745872A (en) | 1996-05-07 | 1998-04-28 | Texas Instruments Incorporated | Method and system for compensating speech signals using vector quantization codebook adaptation |
US6038528A (en) | 1996-07-17 | 2000-03-14 | T-Netix, Inc. | Robust speech processing with affine transform replicated data |
EP0954854A4 (en) * | 1996-11-22 | 2000-07-19 | T Netix Inc | PARTIAL VALUE-BASED SPEAKER VERIFICATION BY UNIFYING DIFFERENT CLASSIFIERS USING CHANNEL, ASSOCIATION, MODEL AND THRESHOLD ADAPTATION |
US6260013B1 (en) | 1997-03-14 | 2001-07-10 | Lernout & Hauspie Speech Products N.V. | Speech recognition system employing discriminatively trained models |
KR100238189B1 (ko) | 1997-10-16 | 2000-01-15 | 윤종용 | Multi-language TTS device and multi-language TTS processing method |
WO1999023643A1 (en) * | 1997-11-03 | 1999-05-14 | T-Netix, Inc. | Model adaptation system and method for speaker verification |
US6188982B1 (en) | 1997-12-01 | 2001-02-13 | Industrial Technology Research Institute | On-line background noise adaptation of parallel model combination HMM with discriminative learning using weighted HMM for noisy speech recognition |
US6397179B2 (en) | 1997-12-24 | 2002-05-28 | Nortel Networks Limited | Search optimization system and method for continuous speech recognition |
US6381569B1 (en) | 1998-02-04 | 2002-04-30 | Qualcomm Incorporated | Noise-compensated speech recognition templates |
US6434520B1 (en) | 1999-04-16 | 2002-08-13 | International Business Machines Corporation | System and method for indexing and querying audio archives |
US6665644B1 (en) | 1999-08-10 | 2003-12-16 | International Business Machines Corporation | Conversational data mining |
GB9927528D0 (en) | 1999-11-23 | 2000-01-19 | Ibm | Automatic language identification |
DE10018134A1 (de) | 2000-04-12 | 2001-10-18 | Siemens Ag | Method and apparatus for determining prosodic markers |
US6631348B1 (en) | 2000-08-08 | 2003-10-07 | Intel Corporation | Dynamic speech recognition pattern switching for enhanced speech recognition accuracy |
DE10047172C1 (de) | 2000-09-22 | 2001-11-29 | Siemens Ag | Method for speech processing |
US6876966B1 (en) | 2000-10-16 | 2005-04-05 | Microsoft Corporation | Pattern recognition training method and apparatus using inserted noise followed by noise reduction |
JP4244514B2 (ja) | 2000-10-23 | 2009-03-25 | セイコーエプソン株式会社 | Speech recognition method and speech recognition device |
US7280969B2 (en) | 2000-12-07 | 2007-10-09 | International Business Machines Corporation | Method and apparatus for producing natural sounding pitch contours in a speech synthesizer |
GB2370401A (en) | 2000-12-19 | 2002-06-26 | Nokia Mobile Phones Ltd | Speech recognition |
US7062442B2 (en) | 2001-02-23 | 2006-06-13 | Popcatcher Ab | Method and arrangement for search and recording of media signals |
GB2375673A (en) | 2001-05-14 | 2002-11-20 | Salgen Systems Ltd | Image compression method using a table of hash values corresponding to motion vectors |
GB2375935A (en) | 2001-05-22 | 2002-11-27 | Motorola Inc | Speech quality indication |
GB0113581D0 (en) | 2001-06-04 | 2001-07-25 | Hewlett Packard Co | Speech synthesis apparatus |
US7668718B2 (en) | 2001-07-17 | 2010-02-23 | Custom Speech Usa, Inc. | Synchronized pattern recognition source data processed by manual or automatic means for creation of shared speaker-dependent speech user profile |
US20030033143A1 (en) | 2001-08-13 | 2003-02-13 | Hagai Aronowitz | Decreasing noise sensitivity in speech processing under adverse conditions |
US7571095B2 (en) | 2001-08-15 | 2009-08-04 | Sri International | Method and apparatus for recognizing speech in a noisy environment |
US7043431B2 (en) | 2001-08-31 | 2006-05-09 | Nokia Corporation | Multilingual speech recognition system using text derived recognition models |
US6950796B2 (en) | 2001-11-05 | 2005-09-27 | Motorola, Inc. | Speech recognition by dynamical noise model adaptation |
WO2004003887A2 (en) | 2002-06-28 | 2004-01-08 | Conceptual Speech, Llc | Multi-phoneme streamer and knowledge representation speech recognition system and method |
US20040024582A1 (en) | 2002-07-03 | 2004-02-05 | Scott Shepard | Systems and methods for aiding human translation |
US6756821B2 (en) * | 2002-07-23 | 2004-06-29 | Broadcom | High speed differential signaling logic gate and applications thereof |
JP4352790B2 (ja) | 2002-10-31 | 2009-10-28 | セイコーエプソン株式会社 | Acoustic model creation method, speech recognition device, and vehicle having a speech recognition device |
US7593842B2 (en) | 2002-12-10 | 2009-09-22 | Leslie Rousseau | Device and method for translating language |
US20040111272A1 (en) | 2002-12-10 | 2004-06-10 | International Business Machines Corporation | Multimodal speech-to-speech language translation and display |
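KR100486735B1 (ko) | 2003-02-28 | 2005-05-03 | 삼성전자주식회사 | Method of constructing an optimal-partition classification neural network, and automatic labeling method and apparatus using the optimal-partition classification neural network |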
US7571097B2 (en) | 2003-03-13 | 2009-08-04 | Microsoft Corporation | Method for training of subspace coded gaussian models |
US8849185B2 (en) | 2003-04-15 | 2014-09-30 | Ipventure, Inc. | Hybrid audio delivery system and method therefor |
US7275032B2 (en) | 2003-04-25 | 2007-09-25 | Bvoice Corporation | Telephone call handling center where operators utilize synthesized voices generated or modified to exhibit or omit prescribed speech characteristics |
JP2004325897A (ja) | 2003-04-25 | 2004-11-18 | Pioneer Electronic Corp | Speech recognition device and speech recognition method |
US7499857B2 (en) | 2003-05-15 | 2009-03-03 | Microsoft Corporation | Adaptation of compressed acoustic models |
US20040260550A1 (en) | 2003-06-20 | 2004-12-23 | Burges Chris J.C. | Audio processing system and method for classifying speakers in audio data |
JP4548646B2 (ja) | 2003-09-12 | 2010-09-22 | 株式会社エヌ・ティ・ティ・ドコモ | Noise adaptation system for speech models, noise adaptation method, and speech recognition noise adaptation program |
US20050144003A1 (en) | 2003-12-08 | 2005-06-30 | Nokia Corporation | Multi-lingual speech synthesis |
FR2865846A1 (fr) | 2004-02-02 | 2005-08-05 | France Telecom | Speech synthesis system |
FR2867598B1 (fr) | 2004-03-12 | 2006-05-26 | Thales Sa | Method for automatic real-time language identification in an audio signal, and device for implementing it |
US20050228673A1 (en) | 2004-03-30 | 2005-10-13 | Nefian Ara V | Techniques for separating and evaluating audio and video source data |
FR2868586A1 (fr) | 2004-03-31 | 2005-10-07 | France Telecom | Improved method and system for converting a voice signal |
US20050267755A1 (en) | 2004-05-27 | 2005-12-01 | Nokia Corporation | Arrangement for speech recognition |
US7406408B1 (en) | 2004-08-24 | 2008-07-29 | The United States Of America As Represented By The Director, National Security Agency | Method of recognizing phones in speech of any language |
US7418383B2 (en) | 2004-09-03 | 2008-08-26 | Microsoft Corporation | Noise robust speech recognition with a switching linear dynamic model |
EP1854095A1 (en) | 2005-02-15 | 2007-11-14 | BBN Technologies Corp. | Speech analyzing system with adaptive noise codebook |
US20060253272A1 (en) | 2005-05-06 | 2006-11-09 | International Business Machines Corporation | Voice prompts for use in speech-to-speech translation system |
US8073696B2 (en) | 2005-05-18 | 2011-12-06 | Panasonic Corporation | Voice synthesis device |
EP1889255A1 (en) | 2005-05-24 | 2008-02-20 | Loquendo S.p.A. | Automatic text-independent, language-independent speaker voice-print creation and speaker recognition |
US20070088552A1 (en) | 2005-10-17 | 2007-04-19 | Nokia Corporation | Method and a device for speech recognition |
US20070118372A1 (en) | 2005-11-23 | 2007-05-24 | General Electric Company | System and method for generating closed captions |
US7991770B2 (en) | 2005-11-29 | 2011-08-02 | Google Inc. | Detecting repeating content in broadcast media |
US7539616B2 (en) * | 2006-02-20 | 2009-05-26 | Microsoft Corporation | Speaker authentication using adapted background models |
US20080004858A1 (en) | 2006-06-29 | 2008-01-03 | International Business Machines Corporation | Apparatus and method for integrated phrase-based and free-form speech-to-speech translation |
US7996222B2 (en) | 2006-09-29 | 2011-08-09 | Nokia Corporation | Prosody conversion |
CN101166017B (zh) | 2006-10-20 | 2011-12-07 | 松下电器产业株式会社 | Automatic noise compensation method and apparatus for sound-producing equipment |
CA2676380C (en) | 2007-01-23 | 2015-11-24 | Infoture, Inc. | System and method for detection and analysis of speech |
US7848924B2 (en) | 2007-04-17 | 2010-12-07 | Nokia Corporation | Method, apparatus and computer program product for providing voice conversion using temporal dynamic features |
US20080300875A1 (en) | 2007-06-04 | 2008-12-04 | Texas Instruments Incorporated | Efficient Speech Recognition with Cluster Methods |
CN101359473A (zh) | 2007-07-30 | 2009-02-04 | 国际商业机器公司 | Method and apparatus for automatic voice conversion |
GB2453366B (en) | 2007-10-04 | 2011-04-06 | Toshiba Res Europ Ltd | Automatic speech recognition method and apparatus |
JP4944241B2 (ja) * | 2008-03-14 | 2012-05-30 | 名古屋油化株式会社 | Release sheet and molded article |
US8615397B2 (en) | 2008-04-04 | 2013-12-24 | Intuit Inc. | Identifying audio content using distorted target patterns |
WO2009129315A1 (en) | 2008-04-15 | 2009-10-22 | Mobile Technologies, Llc | System and methods for maintaining speech-to-speech translation in the field |
CN101562013B (zh) * | 2008-04-15 | 2013-05-22 | 联芯科技有限公司 | Method and apparatus for automatic speech recognition |
US8374873B2 (en) | 2008-08-12 | 2013-02-12 | Morphism, Llc | Training and applying prosody models |
WO2010025460A1 (en) | 2008-08-29 | 2010-03-04 | O3 Technologies, Llc | System and method for speech-to-speech translation |
US8239195B2 (en) | 2008-09-23 | 2012-08-07 | Microsoft Corporation | Adapting a compressed model for use in speech recognition |
US8332223B2 (en) * | 2008-10-24 | 2012-12-11 | Nuance Communications, Inc. | Speaker verification methods and apparatus |
US9059991B2 (en) | 2008-12-31 | 2015-06-16 | Bce Inc. | System and method for unlocking a device |
US20100198577A1 (en) | 2009-02-03 | 2010-08-05 | Microsoft Corporation | State mapping for cross-language speaker adaptation |
US8380507B2 (en) | 2009-03-09 | 2013-02-19 | Apple Inc. | Systems and methods for determining the language to use for speech generated by a text to speech engine |
EP2406787B1 (en) | 2009-03-11 | 2014-05-14 | Google, Inc. | Audio classification for information retrieval using sparse features |
US9009039B2 (en) | 2009-06-12 | 2015-04-14 | Microsoft Technology Licensing, Llc | Noise adaptive training for speech recognition |
US20110238407A1 (en) | 2009-08-31 | 2011-09-29 | O3 Technologies, Llc | Systems and methods for speech-to-speech translation |
US8886531B2 (en) | 2010-01-13 | 2014-11-11 | Rovi Technologies Corporation | Apparatus and method for generating an audio fingerprint and using a two-stage query |
US8700394B2 (en) | 2010-03-24 | 2014-04-15 | Microsoft Corporation | Acoustic model adaptation using splines |
US8234111B2 (en) | 2010-06-14 | 2012-07-31 | Google Inc. | Speech and noise models for speech recognition |
US20110313762A1 (en) | 2010-06-20 | 2011-12-22 | International Business Machines Corporation | Speech output with confidence indication |
US8725506B2 (en) | 2010-06-30 | 2014-05-13 | Intel Corporation | Speech audio processing |
ES2540995T3 (es) | 2010-08-24 | 2015-07-15 | Veovox Sa | System and method for recognizing a user voice command in a noisy environment |
US8782012B2 (en) | 2010-08-27 | 2014-07-15 | International Business Machines Corporation | Network analysis |
US8972253B2 (en) | 2010-09-15 | 2015-03-03 | Microsoft Technology Licensing, Llc | Deep belief network for large vocabulary continuous speech recognition |
EP2431969B1 (de) | 2010-09-15 | 2013-04-03 | Svox AG | Speech recognition with low computational effort and reduced quantization error |
US9318114B2 (en) | 2010-11-24 | 2016-04-19 | At&T Intellectual Property I, L.P. | System and method for generating challenge utterances for speaker verification |
US20120143604A1 (en) | 2010-12-07 | 2012-06-07 | Rita Singh | Method for Restoring Spectral Components in Denoised Speech Signals |
TWI413105B (zh) | 2010-12-30 | 2013-10-21 | Ind Tech Res Inst | Multilingual text-to-speech synthesis system and method |
US9286886B2 (en) | 2011-01-24 | 2016-03-15 | Nuance Communications, Inc. | Methods and apparatus for predicting prosody in speech synthesis |
US8594993B2 (en) | 2011-04-04 | 2013-11-26 | Microsoft Corporation | Frame mapping approach for cross-lingual voice transformation |
US8260615B1 (en) | 2011-04-25 | 2012-09-04 | Google Inc. | Cross-lingual initialization of language models |
WO2012089288A1 (en) | 2011-06-06 | 2012-07-05 | Bridge Mediatech, S.L. | Method and system for robust audio hashing |
US8768707B2 (en) * | 2011-09-27 | 2014-07-01 | Sensory Incorporated | Background speech recognition assistant using speaker verification |
US9235799B2 (en) | 2011-11-26 | 2016-01-12 | Microsoft Technology Licensing, Llc | Discriminative pretraining of deep neural networks |
CN103562993B (zh) * | 2011-12-16 | 2015-05-27 | 华为技术有限公司 | Speaker recognition method and device |
US9137600B2 (en) | 2012-02-16 | 2015-09-15 | 2236008 Ontario Inc. | System and method for dynamic residual noise shaping |
US9042867B2 (en) | 2012-02-24 | 2015-05-26 | Agnitio S.L. | System and method for speaker recognition on mobile devices |
JP5875414B2 (ja) | 2012-03-07 | 2016-03-02 | インターナショナル・ビジネス・マシーンズ・コーポレーション (International Business Machines Corporation) | Noise suppression method, program, and device |
US9524730B2 (en) | 2012-03-30 | 2016-12-20 | Ohio State Innovation Foundation | Monaural speech filter |
US9368104B2 (en) | 2012-04-30 | 2016-06-14 | Src, Inc. | System and method for synthesizing human speech using multiple speakers and context |
US20130297299A1 (en) | 2012-05-07 | 2013-11-07 | Board Of Trustees Of Michigan State University | Sparse Auditory Reproducing Kernel (SPARK) Features for Noise-Robust Speech and Speaker Recognition |
US9489950B2 (en) | 2012-05-31 | 2016-11-08 | Agency For Science, Technology And Research | Method and system for dual scoring for text-dependent speaker verification |
US9123338B1 (en) | 2012-06-01 | 2015-09-01 | Google Inc. | Background audio identification for speech disambiguation |
US9704068B2 (en) | 2012-06-22 | 2017-07-11 | Google Inc. | System and method for labelling aerial images |
US9536528B2 (en) * | 2012-07-03 | 2017-01-03 | Google Inc. | Determining hotword suitability |
US9153230B2 (en) * | 2012-10-23 | 2015-10-06 | Google Inc. | Mobile speech recognition hardware accelerator |
US9336771B2 (en) | 2012-11-01 | 2016-05-10 | Google Inc. | Speech recognition using non-parametric models |
US9477925B2 (en) | 2012-11-20 | 2016-10-25 | Microsoft Technology Licensing, Llc | Deep neural networks training for speech and pattern recognition |
US9263036B1 (en) | 2012-11-29 | 2016-02-16 | Google Inc. | System and method for speech recognition using deep recurrent neural networks |
US20140156575A1 (en) | 2012-11-30 | 2014-06-05 | Nuance Communications, Inc. | Method and Apparatus of Processing Data Using Deep Belief Networks Employing Low-Rank Matrix Factorization |
US9230550B2 (en) * | 2013-01-10 | 2016-01-05 | Sensory, Incorporated | Speaker verification and identification using artificial neural network-based sub-phonetic unit discrimination |
US9502038B2 (en) * | 2013-01-28 | 2016-11-22 | Tencent Technology (Shenzhen) Company Limited | Method and device for voiceprint recognition |
US9454958B2 (en) | 2013-03-07 | 2016-09-27 | Microsoft Technology Licensing, Llc | Exploiting heterogeneous data in deep neural network-based speech recognition systems |
US9361885B2 (en) | 2013-03-12 | 2016-06-07 | Nuance Communications, Inc. | Methods and apparatus for detecting a voice command |
US9728184B2 (en) | 2013-06-18 | 2017-08-08 | Microsoft Technology Licensing, Llc | Restructuring deep neural network acoustic models |
JP5734354B2 (ja) * | 2013-06-26 | 2015-06-17 | ファナック株式会社 | Tool clamping device |
US9311915B2 (en) | 2013-07-31 | 2016-04-12 | Google Inc. | Context-based speech recognition |
US9679258B2 (en) | 2013-10-08 | 2017-06-13 | Google Inc. | Methods and apparatus for reinforcement learning |
US9620145B2 (en) | 2013-11-01 | 2017-04-11 | Google Inc. | Context-dependent state tying using a neural network |
US9514753B2 (en) * | 2013-11-04 | 2016-12-06 | Google Inc. | Speaker identification using hash-based indexing |
US9715660B2 (en) | 2013-11-04 | 2017-07-25 | Google Inc. | Transfer learning for deep neural network based hotword detection |
CN104700831B (zh) * | 2013-12-05 | 2018-03-06 | 国际商业机器公司 | Method and apparatus for analyzing speech features of an audio file |
US8965112B1 (en) | 2013-12-09 | 2015-02-24 | Google Inc. | Sequence transcription with deep neural networks |
US9195656B2 (en) | 2013-12-30 | 2015-11-24 | Google Inc. | Multilingual prosody generation |
US9589564B2 (en) | 2014-02-05 | 2017-03-07 | Google Inc. | Multiple speech locale-specific hotword classifiers for selection of a speech locale |
US20150228277A1 (en) | 2014-02-11 | 2015-08-13 | Malaspina Labs (Barbados), Inc. | Voiced Sound Pattern Detection |
US10102848B2 (en) | 2014-02-28 | 2018-10-16 | Google Llc | Hotwords presentation framework |
US9412358B2 (en) * | 2014-05-13 | 2016-08-09 | At&T Intellectual Property I, L.P. | System and method for data-driven socially customized models for language generation |
US9728185B2 (en) | 2014-05-22 | 2017-08-08 | Google Inc. | Recognizing speech using neural networks |
US20150364129A1 (en) | 2014-06-17 | 2015-12-17 | Google Inc. | Language Identification |
CN104008751A (zh) * | 2014-06-18 | 2014-08-27 | 周婷婷 | Speaker recognition method based on a BP neural network |
CN104168270B (zh) * | 2014-07-31 | 2016-01-13 | 腾讯科技(深圳)有限公司 | Identity verification method, server, client, and system |
US9378731B2 (en) | 2014-09-25 | 2016-06-28 | Google Inc. | Acoustic model training corpus selection |
US9299347B1 (en) | 2014-10-22 | 2016-03-29 | Google Inc. | Speech recognition using associative mapping |
US9418656B2 (en) * | 2014-10-29 | 2016-08-16 | Google Inc. | Multi-stage hotword detection |
CN104732978B (zh) * | 2015-03-12 | 2018-05-08 | 上海交通大学 | Text-dependent speaker recognition method based on joint deep learning |
EP3067884B1 (en) * | 2015-03-13 | 2019-05-08 | Samsung Electronics Co., Ltd. | Speech recognition system and speech recognition method thereof |
US9978374B2 (en) * | 2015-09-04 | 2018-05-22 | Google Llc | Neural networks for speaker verification |
US20180018973A1 (en) | 2016-07-15 | 2018-01-18 | Google Inc. | Speaker verification |
2016
- 2016-07-15: US application US15/211,317, publication US20180018973A1, status: Abandoned (not active)
2017
- 2017-07-06: WO application PCT/US2017/040906, publication WO2018013401A1, status: Application Filing (active)
- 2017-07-06: CN application CN201780003481.3A, publication CN108140386B, status: Active (active)
- 2017-07-06: JP application JP2019500442A, publication JP6561219B1, status: Active (active)
- 2017-07-06: EP application EP18165912.9A, publication EP3373294B1, status: Active (active)
- 2017-07-06: RU application RU2018112272A, publication RU2697736C1, status: active
- 2017-07-06: KR application KR1020187009479A, publication KR102109874B1, status: IP Right Grant (active)
- 2017-07-06: EP application EP17740860.6A, publication EP3345181B1, status: Active (active)
2018
- 2018-06-01: US application US15/995,480, publication US10403291B2, status: Active (active)
2019
- 2019-08-30: US application US16/557,390, publication US11017784B2, status: Active (active)
2021
- 2021-05-04: US application US17/307,704, publication US11594230B2, status: Active (active)
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150127336A1 (en) * | 2013-11-04 | 2015-05-07 | Google Inc. | Speaker verification using neural networks |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2021527840A (ja) * | 2018-10-10 | 2021-10-14 | テンセント・テクノロジー・(シェンジェン)・カンパニー・リミテッド | Voiceprint identification method, model training method, server, and computer program |
US11508381B2 (en) | 2018-10-10 | 2022-11-22 | Tencent Technology (Shenzhen) Company Limited | Voiceprint recognition method, model training method, and server |
JP2022539674A (ja) * | 2019-12-04 | 2022-09-13 | グーグル エルエルシー | Speaker recognition using a speaker-dependent speech model |
JP7371135B2 (ja) | 2019-12-04 | 2023-10-30 | グーグル エルエルシー | Speaker recognition using a speaker-dependent speech model |
US11854533B2 (en) | 2019-12-04 | 2023-12-26 | Google Llc | Speaker awareness using speaker dependent speech model(s) |
JP2021135313A (ja) * | 2020-02-21 | 2021-09-13 | 日本電信電話株式会社 | Verification device, verification method, and verification program |
JP7388239B2 (ja) | 2020-02-21 | 2023-11-29 | 日本電信電話株式会社 | Verification device, verification method, and verification program |
Also Published As
Publication number | Publication date |
---|---|
RU2697736C1 (ru) | 2019-08-19 |
US20190385619A1 (en) | 2019-12-19 |
CN108140386A (zh) | 2018-06-08 |
KR20180050365A (ko) | 2018-05-14 |
JP6561219B1 (ja) | 2019-08-14 |
WO2018013401A1 (en) | 2018-01-18 |
CN108140386B (zh) | 2021-11-23 |
KR102109874B1 (ko) | 2020-05-12 |
EP3373294A1 (en) | 2018-09-12 |
US20210256981A1 (en) | 2021-08-19 |
US20180018973A1 (en) | 2018-01-18 |
EP3345181A1 (en) | 2018-07-11 |
US20180277124A1 (en) | 2018-09-27 |
EP3345181B1 (en) | 2019-01-09 |
US11017784B2 (en) | 2021-05-25 |
US11594230B2 (en) | 2023-02-28 |
EP3373294B1 (en) | 2019-12-18 |
US10403291B2 (en) | 2019-09-03 |
Similar Documents
Publication | Title |
---|---|
JP6561219B1 (ja) | Speaker verification |
US10255922B1 (en) | Speaker identification using a text-independent model and a text-dependent model |
US10476872B2 (en) | Joint speaker authentication and key phrase identification |
US10567515B1 (en) | Speech processing performed with respect to first and second user profiles in a dialog session |
US10446141B2 (en) | Automatic speech recognition based on user feedback |
US9147400B2 (en) | Method and apparatus for generating speaker-specific spoken passwords |
US10715604B1 (en) | Remote system processing based on a previously identified user |
US20170236520A1 (en) | Generating Models for Text-Dependent Speaker Verification |
KR20160011709A (ko) | Method, apparatus, and system for payment verification |
CN112309406A (zh) | Voiceprint registration method, apparatus, and computer-readable storage medium |
US11416593B2 (en) | Electronic device, control method for electronic device, and control program for electronic device |
JP7339116B2 (ja) | Voice authentication device, voice authentication system, and voice authentication method |
JP2005512246A (ja) | Method and system for non-intrusively verifying a speaker using behavior models |
KR20230156145A (ko) | Hybrid multilingual text-dependent and text-independent speaker verification |
JP4245948B2 (ja) | Voice authentication device, voice authentication method, and voice authentication program |
AU2017101551B4 (en) | Improving automatic speech recognition based on user feedback |
Legal Events
Code | Title | Description |
---|---|---|
A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20190208 |
A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 20190208 |
A871 | Explanation of circumstances concerning accelerated examination | Free format text: JAPANESE INTERMEDIATE CODE: A871; Effective date: 20190208 |
A975 | Report on accelerated examination | Free format text: JAPANESE INTERMEDIATE CODE: A971005; Effective date: 20190527 |
TRDD | Decision of grant or rejection written | |
A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01; Effective date: 20190624 |
A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61; Effective date: 20190722 |
R150 | Certificate of patent or registration of utility model | Ref document number: 6561219; Country of ref document: JP; Free format text: JAPANESE INTERMEDIATE CODE: R150 |
R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |