KR20030007793A - Speech processing apparatus - Google Patents
Speech processing apparatus
- Publication number
- KR20030007793A (application KR1020027016297A)
- Authority
- KR
- South Korea
- Prior art keywords
- cluster
- voice
- word
- speech
- dictionary
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Links

| ID | Name | Type | Query match | Sections | Count |
|---|---|---|---|---|---|
| 238000012545 | processing | Methods | 0.000 | title, claims, abstract, description | 101 |
| 238000001514 | detection method | Methods | 0.000 | claims, abstract, description | 72 |
| 239000013598 | vector | Substances | 0.000 | claims, description | 139 |
| 238000000034 | method | Methods | 0.000 | claims, description | 97 |
| 230000008569 | process | Effects | 0.000 | claims, description | 40 |
| 238000004364 | calculation method | Methods | 0.000 | claims, description | 16 |
| 238000000605 | extraction | Methods | 0.000 | claims, description | 15 |
| 239000000284 | extract | Substances | 0.000 | claims, description | 5 |
| 238000003672 | processing method | Methods | 0.000 | claims, description | 2 |
| 238000012423 | maintenance | Methods | 0.000 | abstract, description | 24 |
| 230000009471 | action | Effects | 0.000 | description | 37 |
| 230000007246 | mechanism | Effects | 0.000 | description | 37 |
| 230000006399 | behavior | Effects | 0.000 | description | 30 |
| 230000008451 | emotion | Effects | 0.000 | description | 19 |
| 230000007704 | transition | Effects | 0.000 | description | 17 |
| 210000003128 | head | Anatomy | 0.000 | description | 15 |
| 238000010586 | diagram | Methods | 0.000 | description | 8 |
| 230000008859 | change | Effects | 0.000 | description | 5 |
| 238000012217 | deletion | Methods | 0.000 | description | 5 |
| 230000037430 | deletion | Effects | 0.000 | description | 5 |
| 101100139878 | Schizosaccharomyces pombe (strain 972 / ATCC 24843) ran1 gene | Proteins | 0.000 | description | 4 |
| 230000015572 | biosynthetic process | Effects | 0.000 | description | 4 |
| 238000006243 | chemical reaction | Methods | 0.000 | description | 4 |
| 230000006870 | function | Effects | 0.000 | description | 4 |
| 230000004044 | response | Effects | 0.000 | description | 4 |
| 230000005236 | sound signal | Effects | 0.000 | description | 4 |
| 238000003786 | synthesis reaction | Methods | 0.000 | description | 4 |
| 238000004891 | communication | Methods | 0.000 | description | 3 |
| 230000002996 | emotional effect | Effects | 0.000 | description | 3 |
| 208000000044 | Amnesia | Diseases | 0.000 | description | 2 |
| 208000026139 | Memory disease | Diseases | 0.000 | description | 2 |
| 238000004422 | calculation algorithm | Methods | 0.000 | description | 2 |
| 239000004973 | liquid crystal related substance | Substances | 0.000 | description | 2 |
| 230000006984 | memory degeneration | Effects | 0.000 | description | 2 |
| 208000023060 | memory loss | Diseases | 0.000 | description | 2 |
| 239000004065 | semiconductor | Substances | 0.000 | description | 2 |
| 238000004088 | simulation | Methods | 0.000 | description | 2 |
| 210000002784 | stomach | Anatomy | 0.000 | description | 2 |
| 206010011469 | Crying | Diseases | 0.000 | description | 1 |
| 241001465754 | Metazoa | Species | 0.000 | description | 1 |
| 238000004458 | analytical method | Methods | 0.000 | description | 1 |
| 230000036528 | appetite | Effects | 0.000 | description | 1 |
| 235000019789 | appetite | Nutrition | 0.000 | description | 1 |
| 238000005452 | bending | Methods | 0.000 | description | 1 |
| 230000005540 | biological transmission | Effects | 0.000 | description | 1 |
| 230000015556 | catabolic process | Effects | 0.000 | description | 1 |
| 230000001186 | cumulative effect | Effects | 0.000 | description | 1 |
| 238000006731 | degradation reaction | Methods | 0.000 | description | 1 |
| 230000001934 | delays | Effects | 0.000 | description | 1 |
| 230000002542 | deteriorative effect | Effects | 0.000 | description | 1 |
| 230000004927 | fusion | Effects | 0.000 | description | 1 |
| 230000010365 | information processing | Effects | 0.000 | description | 1 |
| 238000009434 | installation | Methods | 0.000 | description | 1 |
| 239000011159 | matrix material | Substances | 0.000 | description | 1 |
| 238000012544 | monitoring process | Methods | 0.000 | description | 1 |
| 230000003287 | optical effect | Effects | 0.000 | description | 1 |
| 238000005192 | partition | Methods | 0.000 | description | 1 |
| 230000037081 | physical activity | Effects | 0.000 | description | 1 |
| 239000013641 | positive control | Substances | 0.000 | description | 1 |
| 238000010079 | rubber tapping | Methods | 0.000 | description | 1 |
| 230000015541 | sensory perception of touch | Effects | 0.000 | description | 1 |
| 238000001228 | spectrum | Methods | 0.000 | description | 1 |

Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
  - G10L15/04—Segmentation; Word boundary detection
  - G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    - G10L15/063—Training
      - G10L2015/0631—Creating reference templates; Clustering
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Machine Translation (AREA)
- Manipulator (AREA)
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JPJP-P-2001-00097843 | 2001-03-30 | ||
| JP2001097843 | 2001-03-30 | ||
| JPJP-P-2002-00069603 | 2002-03-14 | ||
| JP2002069603A JP2002358095A (ja) | 2001-03-30 | 2002-03-14 | Speech processing apparatus and speech processing method, and program and recording medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| KR20030007793A (ko) | 2003-01-23 |
Family
ID=26612647
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR1020027016297A (Withdrawn), KR20030007793A (ko) | 2001-03-30 | 2002-04-01 | Speech processing apparatus |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US7228276B2 (en) |
| EP (1) | EP1376536A1 (en) |
| JP (1) | JP2002358095A (ja) |
| KR (1) | KR20030007793A (ko) |
| CN (1) | CN1462428A (zh) |
| WO (1) | WO2002080141A1 (en) |
Families Citing this family (58)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20070265834A1 (en) * | 2001-09-06 | 2007-11-15 | Einat Melnick | In-context analysis |
| US7398209B2 (en) | 2002-06-03 | 2008-07-08 | Voicebox Technologies, Inc. | Systems and methods for responding to natural language speech utterance |
| US7693720B2 (en) | 2002-07-15 | 2010-04-06 | Voicebox Technologies, Inc. | Mobile systems and methods for responding to natural language speech utterance |
| JP4392581B2 (ja) * | 2003-02-20 | 2010-01-06 | ソニー株式会社 | Language processing apparatus and language processing method, and program and recording medium |
| US7813928B2 (en) | 2004-06-10 | 2010-10-12 | Panasonic Corporation | Speech recognition device, speech recognition method, and program |
| US7110949B2 (en) * | 2004-09-13 | 2006-09-19 | At&T Knowledge Ventures, L.P. | System and method for analysis and adjustment of speech-enabled systems |
| US7634406B2 (en) * | 2004-12-10 | 2009-12-15 | Microsoft Corporation | System and method for identifying semantic intent from acoustic information |
| US7729478B1 (en) * | 2005-04-12 | 2010-06-01 | Avaya Inc. | Change speed of voicemail playback depending on context |
| CN101185115B (zh) * | 2005-05-27 | 2011-07-20 | 松下电器产业株式会社 | Speech editing device and method, and speech recognition device and method |
| US7640160B2 (en) | 2005-08-05 | 2009-12-29 | Voicebox Technologies, Inc. | Systems and methods for responding to natural language speech utterance |
| US7620549B2 (en) | 2005-08-10 | 2009-11-17 | Voicebox Technologies, Inc. | System and method of supporting adaptive misrecognition in conversational speech |
| US7949529B2 (en) | 2005-08-29 | 2011-05-24 | Voicebox Technologies, Inc. | Mobile systems and methods of supporting natural language human-machine interactions |
| WO2007027989A2 (en) | 2005-08-31 | 2007-03-08 | Voicebox Technologies, Inc. | Dynamic speech sharpening |
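| KR100717385B1 (ko) * | 2006-02-09 | 2007-05-11 | 삼성전자주식회사 | Method and system for measuring recognition confidence using the lexical distance between recognition candidates |
| JP2007286356A (ja) * | 2006-04-17 | 2007-11-01 | Funai Electric Co Ltd | Electronic device |
| JPWO2007138875A1 (ja) * | 2006-05-31 | 2009-10-01 | 日本電気株式会社 | System, method, and program for creating a word dictionary and language model for speech recognition, and speech recognition system |
| JP4181590B2 (ja) * | 2006-08-30 | 2008-11-19 | 株式会社東芝 | Interface apparatus and interface processing method |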
| US8073681B2 (en) | 2006-10-16 | 2011-12-06 | Voicebox Technologies, Inc. | System and method for a cooperative conversational voice user interface |
| US7818176B2 (en) | 2007-02-06 | 2010-10-19 | Voicebox Technologies, Inc. | System and method for selecting and presenting advertisements based on natural language processing of voice-based input |
| DE102007033472A1 (de) * | 2007-07-18 | 2009-01-29 | Siemens Ag | Method for speech recognition |
| JP5386692B2 (ja) * | 2007-08-31 | 2014-01-15 | 独立行政法人情報通信研究機構 | Interactive learning device |
| US8140335B2 (en) | 2007-12-11 | 2012-03-20 | Voicebox Technologies, Inc. | System and method for providing a natural language voice user interface in an integrated voice navigation services environment |
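| JP2009157119A (ja) * | 2007-12-27 | 2009-07-16 | Univ Of Ryukyus | Method for automatically acquiring spoken words |
| JP5454469B2 (ja) | 2008-05-09 | 2014-03-26 | 富士通株式会社 | Speech recognition dictionary creation support device, processing program, and processing method |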
| US8589161B2 (en) | 2008-05-27 | 2013-11-19 | Voicebox Technologies, Inc. | System and method for an integrated, multi-modal, multi-device natural language voice services environment |
| US9305548B2 (en) | 2008-05-27 | 2016-04-05 | Voicebox Technologies Corporation | System and method for an integrated, multi-modal, multi-device natural language voice services environment |
| US8326637B2 (en) | 2009-02-20 | 2012-12-04 | Voicebox Technologies, Inc. | System and method for processing multi-modal device interactions in a natural language voice services environment |
| US8064290B2 (en) * | 2009-04-28 | 2011-11-22 | Luidia, Inc. | Digital transcription system utilizing small aperture acoustical sensors |
| US9171541B2 (en) | 2009-11-10 | 2015-10-27 | Voicebox Technologies Corporation | System and method for hybrid processing in a natural language voice services environment |
| US9502025B2 (en) | 2009-11-10 | 2016-11-22 | Voicebox Technologies Corporation | System and method for providing a natural language content dedication service |
| US8645136B2 (en) * | 2010-07-20 | 2014-02-04 | Intellisist, Inc. | System and method for efficiently reducing transcription error using hybrid voice transcription |
| WO2012075640A1 (en) * | 2010-12-10 | 2012-06-14 | Panasonic Corporation | Modeling device and method for speaker recognition, and speaker recognition system |
| US9117444B2 (en) | 2012-05-29 | 2015-08-25 | Nuance Communications, Inc. | Methods and apparatus for performing transformation techniques for data clustering and/or classification |
| CN103219007A (zh) * | 2013-03-27 | 2013-07-24 | 谢东来 | Speech recognition method and device |
| US9697828B1 (en) * | 2014-06-20 | 2017-07-04 | Amazon Technologies, Inc. | Keyword detection modeling using contextual and environmental information |
| KR102246900B1 (ko) * | 2014-07-29 | 2021-04-30 | 삼성전자주식회사 | Electronic device and speech recognition method thereof |
| CN107003996A (zh) | 2014-09-16 | 2017-08-01 | 声钰科技 | Voice commerce |
| US9898459B2 (en) | 2014-09-16 | 2018-02-20 | Voicebox Technologies Corporation | Integration of domain information into state transitions of a finite state transducer for natural language processing |
| US9747896B2 (en) | 2014-10-15 | 2017-08-29 | Voicebox Technologies Corporation | System and method for providing follow-up responses to prior natural language inputs of a user |
| US10614799B2 (en) | 2014-11-26 | 2020-04-07 | Voicebox Technologies Corporation | System and method of providing intent predictions for an utterance prior to a system detection of an end of the utterance |
| US10431214B2 (en) | 2014-11-26 | 2019-10-01 | Voicebox Technologies Corporation | System and method of determining a domain and/or an action related to a natural language input |
| CN107112007B (zh) * | 2014-12-24 | 2020-08-07 | 三菱电机株式会社 | Speech recognition device and speech recognition method |
| US10515150B2 (en) * | 2015-07-14 | 2019-12-24 | Genesys Telecommunications Laboratories, Inc. | Data driven speech enabled self-help systems and methods of operating thereof |
| US10455088B2 (en) | 2015-10-21 | 2019-10-22 | Genesys Telecommunications Laboratories, Inc. | Dialogue flow optimization and personalization |
| US10382623B2 (en) | 2015-10-21 | 2019-08-13 | Genesys Telecommunications Laboratories, Inc. | Data-driven dialogue enabled self-help systems |
| CN106935239A (zh) * | 2015-12-29 | 2017-07-07 | 阿里巴巴集团控股有限公司 | Method and device for constructing a pronunciation dictionary |
| US10331784B2 (en) | 2016-07-29 | 2019-06-25 | Voicebox Technologies Corporation | System and method of disambiguating natural language processing requests |
| US20180268844A1 (en) * | 2017-03-14 | 2018-09-20 | Otosense Inc. | Syntactic system for sound recognition |
| US20180254054A1 (en) * | 2017-03-02 | 2018-09-06 | Otosense Inc. | Sound-recognition system based on a sound language and associated annotations |
| JP6711343B2 (ja) * | 2017-12-05 | 2020-06-17 | カシオ計算機株式会社 | Speech processing device, speech processing method, and program |
| JP7000268B2 (ja) * | 2018-07-18 | 2022-01-19 | 株式会社東芝 | Information processing device, information processing method, and program |
| US10854109B2 (en) | 2018-10-31 | 2020-12-01 | Sony Interactive Entertainment Inc. | Color accommodation for on-demand accessibility |
| US11375293B2 (en) | 2018-10-31 | 2022-06-28 | Sony Interactive Entertainment Inc. | Textual annotation of acoustic effects |
| US10977872B2 (en) | 2018-10-31 | 2021-04-13 | Sony Interactive Entertainment Inc. | Graphical style modification for video games using machine learning |
| US11636673B2 (en) | 2018-10-31 | 2023-04-25 | Sony Interactive Entertainment Inc. | Scene annotation using machine learning |
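| KR20220094400A (ko) * | 2020-12-29 | 2022-07-06 | 현대자동차주식회사 | Dialogue system, vehicle having the same, and method of controlling the dialogue system |
| CN115171702B (zh) * | 2022-05-30 | 2024-09-24 | 青岛海尔科技有限公司 | Digital twin voiceprint feature processing method, storage medium, and electronic device |
| CN119495304B (zh) * | 2024-11-06 | 2025-12-12 | 深圳前海微众银行股份有限公司 | Speech recognition model fine-tuning method, electronic device, storage medium, and program product |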
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPS5745680A (en) | 1980-08-30 | 1982-03-15 | Fujitsu Ltd | Pattern recognition device |
| JPS6125199A (ja) | 1984-07-14 | 1986-02-04 | 日本電気株式会社 | Speech recognition system |
| US6243680B1 (en) * | 1998-06-15 | 2001-06-05 | Nortel Networks Limited | Method and apparatus for obtaining a transcription of phrases through text and spoken utterances |
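| KR100277694B1 (ko) * | 1998-11-11 | 2001-01-15 | 정선종 | Method for automatically generating a pronunciation dictionary in a speech recognition system |
| JP2002160185A (ja) | 2000-03-31 | 2002-06-04 | Sony Corp | Robot device, behavior control method for robot device, external force detection device, and external force detection method |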
2002
- 2002-03-14: JP application JP2002069603A, published as JP2002358095A (ja); not active, Abandoned
- 2002-04-01: EP application EP02708744A, published as EP1376536A1 (en); not active, Withdrawn
- 2002-04-01: WO application PCT/JP2002/003248, published as WO2002080141A1 (ja); not active, Ceased
- 2002-04-01: US application US10/296,797, published as US7228276B2 (en); not active, Expired - Fee Related
- 2002-04-01: KR application KR1020027016297A, published as KR20030007793A (ko); not active, Withdrawn
- 2002-04-01: CN application CN02801646A, published as CN1462428A (zh); active, Pending
Also Published As
| Publication number | Publication date |
|---|---|
| CN1462428A (zh) | 2003-12-17 |
| US20040030552A1 (en) | 2004-02-12 |
| WO2002080141A1 (en) | 2002-10-10 |
| EP1376536A1 (en) | 2004-01-02 |
| JP2002358095A (ja) | 2002-12-13 |
| US7228276B2 (en) | 2007-06-05 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
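| KR20030007793A (ko) | | Speech processing apparatus |
| CN1855224B (zh) | | Information processing device and information processing method |
| EP1107227B1 (en) | | Voice processing |
| Glass | | A probabilistic framework for segment-based speech recognition |
| JP4296714B2 (ja) | | Robot control device and robot control method, recording medium, and program |
| Livescu et al. | | Subword modeling for automatic speech recognition: Past, present, and emerging approaches |
| US20030163320A1 (en) | | Voice synthesis device |
| US20230186905A1 (en) | | System and method for tone recognition in spoken languages |
| KR20010062767A (ko) | | Information processing apparatus, information processing method, and storage medium |
| US20010032075A1 (en) | | Speech recognition method, apparatus and storage medium |
| Lee et al. | | Audio-to-visual conversion using hidden Markov models |
| KR101153078B1 (ko) | | Hidden conditional random field models for speech classification and speech recognition |
| JP2001188779A (ja) | | Information processing apparatus and method, and recording medium |
| JP4600736B2 (ja) | | Robot control device and method, recording medium, and program |
| JP4587009B2 (ja) | | Robot control device, robot control method, and recording medium |
| JP2001154693A (ja) | | Robot control device, robot control method, and recording medium |
| JP4706893B2 (ja) | | Speech recognition apparatus and method, and program and recording medium |
| JP2002268663A (ja) | | Speech synthesis apparatus and speech synthesis method, and program and recording medium |
| JP2004309523A (ja) | | Motion pattern sharing system for robot devices, motion pattern sharing method for robot devices, and robot device |
| JP2004170756A (ja) | | Robot control device and method, recording medium, and program |
| JP2002258886A (ja) | | Speech synthesis apparatus and speech synthesis method, and program and recording medium |
| JP4639533B2 (ja) | | Speech recognition apparatus and speech recognition method, and program and recording medium |
| JP2003271181A (ja) | | Information processing apparatus and information processing method, and recording medium and program |
| JP4742415B2 (ja) | | Robot control device, robot control method, and recording medium |
| Aarnio | | Speech recognition with hidden Markov models in visual communication |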
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PA0105 | International application | Patent event date: 20021129; Patent event code: PA01051R01D; Comment text: International Patent Application |
| | PG1501 | Laying open of application | |
| | PC1203 | Withdrawal of no request for examination | |
| | WITN | Application deemed withdrawn, e.g. because no request for examination was filed or no examination fee was paid | |