JP4391701B2 - System and method for segmentation and recognition of speech signals - Google Patents
System and method for segmentation and recognition of speech signals
- Publication number
- JP4391701B2 (application JP2000592818A)
- Authority
- JP
- Japan
- Prior art keywords
- cluster
- speech
- signal
- merged
- clusters
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Name | Type | Sections | Count |
---|---|---|---|
method | Methods | title, claims, abstract, description | 49 |
segmentation | Effects | title, description | 3 |
spectral | Effects | claims, abstract, description | 36 |
sound signal | Effects | claims, description | 27 |
spectrum | Methods | claims, description | 12 |
processing | Methods | claims, description | 2 |
temporal | Effects | description | 4 |
transformation | Effects | description | 3 |
compression | Effects | description | 2 |
compression | Methods | description | 2 |
diagram | Methods | description | 2 |
engineering process | Methods | description | 2 |
training | Methods | description | 2 |
CMP group | Chemical group | description | 1 |
Thermoanaerobacterales | Species | description | 1 |
conjugated microporous polymer | Substances | description | 1 |
dispersion | Substances | description | 1 |
investigation | Methods | description | 1 |
modification | Methods | description | 1 |
modification | Effects | description | 1 |
myeloid progenitor cell | Anatomy | description | 1 |
normalization | Methods | description | 1 |
pattern recognition | Methods | description | 1 |
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/04—Segmentation; Word boundary detection
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Telephonic Communication Services (AREA)
- Machine Translation (AREA)
- Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
- Mobile Radio Communication Systems (AREA)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/225,891 | 1999-01-04 | | |
US09/225,891 (US6278972B1, en) | 1999-01-04 | 1999-01-04 | System and method for segmentation and recognition of speech signals |
PCT/US1999/031308 (WO2000041164A1, en) | | 1999-12-29 | System and method for segmentation and recognition of speech signals |
Publications (3)
Publication Number | Publication Date |
---|---|
JP2002534718A (ja) | 2002-10-15 |
JP2002534718A5 | 2007-08-02 |
JP4391701B2 (ja) | 2009-12-24 |
Family
ID=22846699
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2000592818A (JP4391701B2, Expired - Fee Related) | 1999-01-04 | 1999-12-29 | System and method for segmentation and recognition of speech signals |
Country Status (10)
Families Citing this family (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6735563B1 (en) * | 2000-07-13 | 2004-05-11 | Qualcomm, Inc. | Method and apparatus for constructing voice templates for a speaker-independent voice recognition system |
US20030154181A1 (en) * | 2002-01-25 | 2003-08-14 | Nec Usa, Inc. | Document clustering with cluster refinement and model selection capabilities |
US7299173B2 (en) * | 2002-01-30 | 2007-11-20 | Motorola Inc. | Method and apparatus for speech detection using time-frequency variance |
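KR100880480B1 (ko) * | 2002-02-21 | 2009-01-28 | 엘지전자 주식회사 | Method and system for real-time music/speech discrimination in digital audio signals |
KR100435440B1 (ko) * | 2002-03-18 | 2004-06-10 | 정희석 | Apparatus and method for generating variable-length codebooks to improve inter-speaker discrimination, and speaker recognition apparatus and method using a codebook combination scheme based thereon |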
US7050973B2 (en) * | 2002-04-22 | 2006-05-23 | Intel Corporation | Speaker recognition using dynamic time warp template spotting |
DE10220524B4 (de) * | 2002-05-08 | 2006-08-10 | Sap Ag | Method and system for processing speech data and for recognizing a language |
DE10220521B4 (de) * | 2002-05-08 | 2005-11-24 | Sap Ag | Method and system for processing speech data and classifying conversations |
DE10220520A1 (de) * | 2002-05-08 | 2003-11-20 | Sap Ag | Method for recognizing speech information |
DE10220522B4 (de) * | 2002-05-08 | 2005-11-17 | Sap Ag | Method and system for processing speech data by means of speech recognition and frequency analysis |
EP1361740A1 (de) * | 2002-05-08 | 2003-11-12 | Sap Ag | Method and system for processing speech information of a dialogue |
EP1363271A1 (de) * | 2002-05-08 | 2003-11-19 | Sap Ag | Method and system for processing and storing speech information of a dialogue |
US7509257B2 (en) * | 2002-12-24 | 2009-03-24 | Marvell International Ltd. | Method and apparatus for adapting reference templates |
US8219391B2 (en) * | 2005-02-15 | 2012-07-10 | Raytheon Bbn Technologies Corp. | Speech analyzing system with speech codebook |
BRPI0707135A2 (pt) * | 2006-01-18 | 2011-04-19 | Lg Electronics Inc. | Apparatus and method for encoding and decoding a signal |
US20080189109A1 (en) * | 2007-02-05 | 2008-08-07 | Microsoft Corporation | Segmentation posterior based boundary point determination |
CN101998289B (zh) * | 2009-08-19 | 2015-01-28 | 中兴通讯股份有限公司 | Method and device for controlling a sound playback device during a trunking terminal call |
US20130151248A1 (en) * | 2011-12-08 | 2013-06-13 | Forrest Baker, IV | Apparatus, System, and Method For Distinguishing Voice in a Communication Stream |
CA2898677C (en) * | 2013-01-29 | 2017-12-05 | Stefan Dohla | Low-frequency emphasis for lpc-based coding in frequency domain |
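CN105989849B (zh) * | 2015-06-03 | 2019-12-03 | 乐融致新电子科技(天津)有限公司 | Speech enhancement method, speech recognition method, clustering method, and device |
CN105161094A (zh) * | 2015-06-26 | 2015-12-16 | 徐信 | System and method for manually adjusting segmentation points in speech audio segmentation |
CN111785296B (zh) * | 2020-05-26 | 2022-06-10 | 浙江大学 | Method for identifying music segment boundaries based on repeated melodies |
CN115580682B (zh) * | 2022-12-07 | 2023-04-28 | 北京云迹科技股份有限公司 | Method and device for determining the connection and hang-up times of robot-placed phone calls |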
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
NL8503304A (nl) * | 1985-11-29 | 1987-06-16 | Philips Nv | Method and device for segmenting an electrical signal derived from an acoustic signal, for example a speech signal |
CN1013525B (zh) | 1988-11-16 | 1991-08-14 | 中国科学院声学研究所 | Method and device for speaker-dependent and speaker-independent real-time speech recognition |
EP0706172A1 (en) * | 1994-10-04 | 1996-04-10 | Hughes Aircraft Company | Low bit rate speech encoder and decoder |
US6314392B1 (en) | 1996-09-20 | 2001-11-06 | Digital Equipment Corporation | Method and apparatus for clustering-based signal segmentation |
Date | Country | Application | Publication | Status |
---|---|---|---|---|
1999-01-04 | US | US09/225,891 | US6278972B1 (en) | Not active: Expired - Lifetime |
1999-12-29 | CN | CNB998153230A | CN1173333C (zh) | Not active: Expired - Fee Related |
1999-12-29 | WO | PCT/US1999/031308 | WO2000041164A1 (en) | Active: IP Right Grant |
1999-12-29 | JP | JP2000592818A | JP4391701B2 (ja) | Not active: Expired - Fee Related |
1999-12-29 | EP | EP99967799A | EP1141939B1 (en) | Not active: Expired - Lifetime |
1999-12-29 | AT | AT99967799T | ATE323932T1 (de) | Not active: IP Right Cessation |
1999-12-29 | DE | DE69930961T | DE69930961T2 (de) | Not active: Expired - Lifetime |
1999-12-29 | AU | AU24015/00A | AU2401500A (en) | Not active: Abandoned |
1999-12-29 | KR | KR1020017008529A | KR100699622B1 (ko) | Not active: IP Right Cessation |
2002-07-31 | HK | HK02105630.3A | HK1044063B (zh) | Not active: IP Right Cessation |
Also Published As
Publication number | Publication date |
---|---|
US6278972B1 (en) | 2001-08-21 |
KR20010089769A (ko) | 2001-10-08 |
JP2002534718A (ja) | 2002-10-15 |
HK1044063B (zh) | 2005-05-20 |
DE69930961D1 (de) | 2006-05-24 |
CN1348580A (zh) | 2002-05-08 |
AU2401500A (en) | 2000-07-24 |
HK1044063A1 (en) | 2002-10-04 |
EP1141939A1 (en) | 2001-10-10 |
CN1173333C (zh) | 2004-10-27 |
ATE323932T1 (de) | 2006-05-15 |
DE69930961T2 (de) | 2007-01-04 |
KR100699622B1 (ko) | 2007-03-23 |
EP1141939B1 (en) | 2006-04-19 |
WO2000041164A1 (en) | 2000-07-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP4391701B2 (ja) | | System and method for segmentation and recognition of speech signals |
US5327521A (en) | | Speech transformation system |
JP4218982B2 (ja) | | Speech processing |
US8280724B2 (en) | | Speech synthesis using complex spectral modeling |
Thakur et al. | | Speech recognition using euclidean distance |
JPH07334184A (ja) | | Acoustic category mean value calculation device and adaptation device |
CN100365704C (zh) | | Speech synthesis method and speech synthesis device |
US20230317056A1 (en) | | Audio generator and methods for generating an audio signal and training an audio generator |
JPH0612089A (ja) | | Speech recognition method |
CN110663080A (zh) | | Method and device for dynamically modifying voice timbre by frequency shifting of spectral-envelope formants |
US20020065649A1 (en) | | Mel-frequency linear prediction speech recognition apparatus and method |
Prasad et al. | | Speech features extraction techniques for robust emotional speech analysis/recognition |
JP3266157B2 (ja) | | Speech enhancement device |
Mazumder et al. | | Feature extraction techniques for speech processing: A review |
JP2003524795A (ja) | | Method and apparatus for testing the integrity of the user interface of a speech-enabled device |
US11270721B2 (en) | | Systems and methods of pre-processing of speech signals for improved speech recognition |
JPH1097274A (ja) | | Speaker recognition method and device |
JPH07121197A (ja) | | Learning-type speech recognition method |
JP4603727B2 (ja) | | Acoustic signal analysis method and device |
Marković et al. | | Recognition of normal and whispered speech based on RASTA filtering and DTW algorithm |
CN112951256B (zh) | | Speech processing method and device |
Marković et al. | | Recognition of Whispered Speech Based on PLP Features and DTW Algorithm |
Kalamani et al. | | Comparison Of Cepstral And Mel Frequency Cepstral Coefficients For Various Clean And Noisy Speech Signals |
Marinozzi et al. | | Digital speech algorithms for speaker de-identification |
CN114038474A (zh) | | Audio synthesis method, terminal device, and computer-readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 20061226 |
| A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20070426 |
| TRDD | Decision of grant or rejection written | |
| A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01; Effective date: 20090908 |
| A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01 |
| A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61; Effective date: 20091008 |
| FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20121016; Year of fee payment: 3 |
| R150 | Certificate of patent or registration of utility model | Free format text: JAPANESE INTERMEDIATE CODE: R150 |
| LAPS | Cancellation because of no payment of annual fees | |