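JP2020504413A - Method for providing a personalized voice recognition service using an artificial intelligence automatic speaker identification method, and service providing server used therein - Google Patents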
- Publication number
- JP2020504413A
- Authority
- JP
- Japan
- Prior art keywords
- service
- speaker
- service providing
- voice
- providing server
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/22—Interactive procedures; Man-machine interfaces
- G10L17/24—Interactive procedures; Man-machine interfaces the user being prompted to utter a password or a predefined phrase
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/167—Audio in a user interface, e.g. using voice commands for navigating, audio feedback
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/18—Artificial neural networks; Connectionist approaches
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/225—Feedback of the input speech
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Computational Linguistics (AREA)
- Theoretical Computer Science (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Telephonic Communication Services (AREA)
Abstract
Description
Claims (2)
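- A method of providing a personalized voice recognition service, the method comprising:
(a) receiving, by a service providing server, a service provision request message containing a speaker's voice from a user terminal;
(b) analyzing, by the service providing server, the voice contained in the service provision request message to identify the speaker of the voice;
(c) generating, by the service providing server, a control command required to provide a customized service for the speaker, based on speaker identification information; and
(d) transmitting, by the service providing server, the generated control command to an external electronic device.
- A service providing server comprising:
a receiving unit that receives, from a user terminal, a service provision request message containing a speaker's voice;
a speaker identification unit that analyzes the voice contained in the service provision request message to identify the speaker of the voice;
a determination unit that generates, based on speaker identification information generated by the speaker identification unit, a control command required to provide a customized service for the speaker; and
a transmitting unit that transmits the control command to an external electronic device.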
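The four claimed steps (receive the request, identify the speaker, generate a control command, transmit it) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the claims do not say how the speaker is identified, so a nearest-match cosine similarity over precomputed voice feature vectors stands in for that step. Every class, field, and parameter name here, along with the threshold value and the example commands, is a hypothetical chosen for illustration.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class ServiceRequest:
    """Service provision request message carrying the speaker's voice."""
    terminal_id: str
    voice_features: List[float]  # assumption: a precomputed feature vector


@dataclass
class ControlCommand:
    """Control command destined for an external electronic device."""
    device_id: str
    action: str
    speaker_id: str


def cosine_similarity(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ServiceProvidingServer:
    def __init__(
        self,
        enrolled: Dict[str, List[float]],          # speaker id -> reference vector
        preferences: Dict[str, Tuple[str, str]],   # speaker id -> (device, action)
        send: Callable[[ControlCommand], None],    # transport to the external device
        threshold: float = 0.8,                    # assumed acceptance threshold
    ) -> None:
        self.enrolled = enrolled
        self.preferences = preferences
        self.send = send
        self.threshold = threshold

    def identify_speaker(self, features: List[float]) -> Optional[str]:
        """Step (b): return the best-matching enrolled speaker, or None."""
        best_id, best_score = None, self.threshold
        for speaker_id, reference in self.enrolled.items():
            score = cosine_similarity(features, reference)
            if score >= best_score:
                best_id, best_score = speaker_id, score
        return best_id

    def generate_command(self, speaker_id: str) -> ControlCommand:
        """Step (c): build the command for this speaker's customized service."""
        device_id, action = self.preferences[speaker_id]
        return ControlCommand(device_id, action, speaker_id)

    def handle(self, request: ServiceRequest) -> Optional[ControlCommand]:
        """Steps (a) to (d) for one incoming request message."""
        speaker_id = self.identify_speaker(request.voice_features)
        if speaker_id is None:
            return None  # unknown speaker: nothing is sent
        command = self.generate_command(speaker_id)
        self.send(command)  # step (d)
        return command


if __name__ == "__main__":
    outbox: List[ControlCommand] = []
    server = ServiceProvidingServer(
        enrolled={"alice": [1.0, 0.0], "bob": [0.0, 1.0]},
        preferences={
            "alice": ("thermostat", "set_temperature:22"),
            "bob": ("thermostat", "set_temperature:25"),
        },
        send=outbox.append,
    )
    print(server.handle(ServiceRequest("terminal-1", [0.9, 0.1])))
```

The four methods loosely mirror the receiving unit, speaker identification unit, determination unit, and transmitting unit of the server claim. Passing the transport in as a callable keeps the sketch independent of any particular network protocol.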
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
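KR1020170004094A KR101883301B1 (ko) | 2017-01-11 | 2017-01-11 | Method for providing a personalized voice recognition service using an artificial intelligence automatic speaker identification method, and service providing server used therein |
KR10-2017-0004094 | 2017-01-11 | ||
PCT/KR2017/003807 WO2018131752A1 (ko) | 2017-01-11 | 2017-04-07 | Method for providing a personalized voice recognition service using an artificial intelligence automatic speaker identification method, and service providing server used therein |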
Publications (2)
Publication Number | Publication Date |
---|---|
JP2020504413A (ja) | 2020-02-06 |
JP6909311B2 (ja) | 2021-07-28 |
Family
ID=62839511
Family Applications (1)
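Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2019558316A Active JP6909311B2 (ja) | 2017-01-11 | 2017-04-07 | Method for providing a personalized voice recognition service using an artificial intelligence automatic speaker identification method, and service providing server used therein |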
Country Status (4)
Country | Link |
---|---|
US (1) | US11087768B2 (ja) |
JP (1) | JP6909311B2 (ja) |
KR (1) | KR101883301B1 (ja) |
WO (1) | WO2018131752A1 (ja) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101883301B1 (ko) * | 2017-01-11 | 2018-07-30 | (주)파워보이스 | Method for providing a personalized voice recognition service using an artificial intelligence automatic speaker identification method, and service providing server used therein |
US10258295B2 (en) | 2017-05-09 | 2019-04-16 | LifePod Solutions, Inc. | Voice controlled assistance for monitoring adverse events of a user and/or coordinating emergency actions such as caregiver communication |
KR102574903B1 (ko) * | 2018-08-08 | 2023-09-05 | 삼성전자주식회사 | Electronic device supporting personalized device connection and method thereof |
CN109102803A (zh) * | 2018-08-09 | 2018-12-28 | 珠海格力电器股份有限公司 | Control method and apparatus for a home appliance, storage medium, and electronic device |
CN109117235B (zh) | 2018-08-24 | 2019-11-05 | 腾讯科技(深圳)有限公司 | Service data processing method, apparatus, and related device |
KR102275873B1 (ko) | 2018-12-18 | 2021-07-12 | 한국전자기술연구원 | Speaker recognition apparatus and method |
KR20200098025A (ko) | 2019-02-11 | 2020-08-20 | 삼성전자주식회사 | Electronic device and control method thereof |
US11468886B2 (en) | 2019-03-12 | 2022-10-11 | Lg Electronics Inc. | Artificial intelligence apparatus for performing voice control using voice extraction filter and method for the same |
US11404062B1 (en) | 2021-07-26 | 2022-08-02 | LifePod Solutions, Inc. | Systems and methods for managing voice environments and voice routines |
US11410655B1 (en) | 2021-07-26 | 2022-08-09 | LifePod Solutions, Inc. | Systems and methods for managing voice environments and voice routines |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5774858A (en) * | 1995-10-23 | 1998-06-30 | Taubkin; Vladimir L. | Speech analysis method of protecting a vehicle from unauthorized accessing and controlling |
JP2004032685A (ja) * | 2002-03-07 | 2004-01-29 | Matsushita Electric Ind Co Ltd | Protected resource access system and protected resource access method using computer telephony |
JP2005086768A (ja) * | 2003-09-11 | 2005-03-31 | Toshiba Corp | Control device, control method, and program |
US20100131273A1 (en) * | 2008-11-26 | 2010-05-27 | Almog Aley-Raz | Device, system, and method of liveness detection utilizing voice biometrics |
US20130325473A1 (en) * | 2012-05-31 | 2013-12-05 | Agency For Science, Technology And Research | Method and system for dual scoring for text-dependent speaker verification |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20050023941A (ko) * | 2003-09-03 | 2005-03-10 | 삼성전자주식회사 | A/V device providing an individualized service through voice recognition and speaker recognition, and method thereof |
KR20080023033A (ko) | 2006-09-08 | 2008-03-12 | 한국전자통신연구원 | Method and apparatus for speaker recognition using a wireless microphone in an intelligent robot service system |
KR101330328B1 (ko) | 2010-12-14 | 2013-11-15 | 한국전자통신연구원 | Voice recognition method and system therefor |
KR20140119968A (ko) * | 2013-04-01 | 2014-10-13 | 삼성전자주식회사 | Content service method and system |
TWI508057B (zh) * | 2013-07-15 | 2015-11-11 | Chunghwa Picture Tubes Ltd | Voice recognition system and method |
US20150025888A1 (en) * | 2013-07-22 | 2015-01-22 | Nuance Communications, Inc. | Speaker recognition and voice tagging for improved service |
JP6054283B2 (ja) * | 2013-11-27 | 2016-12-27 | シャープ株式会社 | Voice recognition terminal, server, server control method, voice recognition system, voice recognition terminal control program, server control program, and voice recognition terminal control method |
WO2016018111A1 (en) * | 2014-07-31 | 2016-02-04 | Samsung Electronics Co., Ltd. | Message service providing device and method of providing content via the same |
KR102249392B1 (ko) * | 2014-09-02 | 2021-05-07 | 현대모비스 주식회사 | Apparatus and method for controlling vehicle devices for a user-customized service |
KR102383791B1 (ko) * | 2014-12-11 | 2022-04-08 | 삼성전자주식회사 | Providing a personal assistant service in an electronic device |
JP6084654B2 (ja) * | 2015-06-04 | 2017-02-22 | シャープ株式会社 | Voice recognition device, voice recognition system, terminal used in the voice recognition system, and method for generating a speaker identification model |
KR101883301B1 (ko) * | 2017-01-11 | 2018-07-30 | (주)파워보이스 | Method for providing a personalized voice recognition service using an artificial intelligence automatic speaker identification method, and service providing server used therein |
-
2017
- 2017-01-11 KR KR1020170004094A patent/KR101883301B1/ko active IP Right Grant
- 2017-04-07 JP JP2019558316A patent/JP6909311B2/ja active Active
- 2017-04-07 US US16/477,330 patent/US11087768B2/en active Active
- 2017-04-07 WO PCT/KR2017/003807 patent/WO2018131752A1/ko active Application Filing
Also Published As
Publication number | Publication date |
---|---|
US11087768B2 (en) | 2021-08-10 |
WO2018131752A1 (ko) | 2018-07-19 |
US20190378518A1 (en) | 2019-12-12 |
KR20180082783A (ko) | 2018-07-19 |
KR101883301B1 (ko) | 2018-07-30 |
JP6909311B2 (ja) | 2021-07-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
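JP6909311B2 (ja) | | Method for providing a personalized voice recognition service using an artificial intelligence automatic speaker identification method, and service providing server used therein |
JP6906048B2 (ja) | | Multi-user personalization at a voice interface device |
US11670297B2 (en) | | Device leadership negotiation among voice interface devices |
KR102299239B1 (ko) | | Private domain for virtual assistant systems on common devices |
CN110800044B (zh) | | Utterance permission management for a voice assistant system |
KR102213637B1 (ko) | | Encapsulation and synchronization of state interactions between devices |
EP3520100B1 (en) | | Noise mitigation for a voice interface device |
CN112136102B (zh) | | Information processing apparatus, information processing method, and information processing system |
US11582110B2 (en) | | Techniques for sharing device capabilities over a network of user devices |
US20220335938A1 (en) | | Techniques for communication between hub device and multiple endpoints |
CN117136352A (zh) | | Techniques for communication between a hub device and multiple endpoints |
JP2020173388A (ja) | | Voice input device, voice operation system, voice operation method, and program |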
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621 Effective date: 20190716 |
| A977 | Report on retrieval | Free format text: JAPANESE INTERMEDIATE CODE: A971007 Effective date: 20200911 |
| A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131 Effective date: 20200923 |
| A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523 Effective date: 20201222 |
| RD02 | Notification of acceptance of power of attorney | Free format text: JAPANESE INTERMEDIATE CODE: A7422 Effective date: 20201222 |
| A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A821 Effective date: 20201222 |
| TRDD | Decision of grant or rejection written | |
| A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01 Effective date: 20210602 |
| A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61 Effective date: 20210702 |
| R150 | Certificate of patent or registration of utility model | Ref document number: 6909311 Country of ref document: JP Free format text: JAPANESE INTERMEDIATE CODE: R150 |
| S111 | Request for change of ownership or part of ownership | Free format text: JAPANESE INTERMEDIATE CODE: R313113 |
| R350 | Written notification of registration of transfer | Free format text: JAPANESE INTERMEDIATE CODE: R350 |