CN1173333C - System and method for segmentation and recognition of speech signals - Google Patents
System and method for segmentation and recognition of speech signals
- Publication number
- CN1173333C (application CNB998153230A / CN99815323A)
- Authority
- CN
- China
- Prior art keywords
- cluster
- signal
- speech
- merging
- frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Links
- 238000000034 method Methods 0.000 title claims abstract description 52
- 230000011218 segmentation Effects 0.000 title claims description 36
- 230000003595 spectral effect Effects 0.000 claims abstract description 8
- 238000001228 spectrum Methods 0.000 claims description 45
- 230000014509 gene expression Effects 0.000 claims description 8
- 230000015572 biosynthetic process Effects 0.000 claims description 7
- 238000006243 chemical reaction Methods 0.000 claims description 5
- 230000000875 corresponding effect Effects 0.000 claims 3
- 230000002596 correlated effect Effects 0.000 claims 2
- 230000009466 transformation Effects 0.000 claims 2
- 238000010586 diagram Methods 0.000 description 7
- 238000005516 engineering process Methods 0.000 description 7
- 238000012549 training Methods 0.000 description 4
- 238000004364 calculation method Methods 0.000 description 2
- 230000006835 compression Effects 0.000 description 2
- 238000007906 compression Methods 0.000 description 2
- 238000004321 preservation Methods 0.000 description 2
- IERHLVCPSMICTF-XVFCMESISA-N CMP group Chemical group P(=O)(O)(O)OC[C@@H]1[C@H]([C@H]([C@@H](O1)N1C(=O)N=C(N)C=C1)O)O IERHLVCPSMICTF-XVFCMESISA-N 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 230000002950 deficient Effects 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 238000005538 encapsulation Methods 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 230000005039 memory span Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000010606 normalization Methods 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/04—Segmentation; Word boundary detection
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Telephonic Communication Services (AREA)
- Machine Translation (AREA)
- Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
- Mobile Radio Communication Systems (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/225,891 | 1999-01-04 | ||
US09/225,891 US6278972B1 (en) | 1999-01-04 | 1999-01-04 | System and method for segmentation and recognition of speech signals |
Publications (2)
Publication Number | Publication Date |
---|---|
CN1348580A (zh) | 2002-05-08 |
CN1173333C (zh) | 2004-10-27 |
Family
ID=22846699
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB998153230A (Expired - Fee Related), CN1173333C (zh) | 1999-01-04 | 1999-12-29 | System and method for segmentation and recognition of speech signals |
Country Status (10)
Families Citing this family (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6735563B1 (en) * | 2000-07-13 | 2004-05-11 | Qualcomm, Inc. | Method and apparatus for constructing voice templates for a speaker-independent voice recognition system |
US20030154181A1 (en) * | 2002-01-25 | 2003-08-14 | Nec Usa, Inc. | Document clustering with cluster refinement and model selection capabilities |
US7299173B2 (en) * | 2002-01-30 | 2007-11-20 | Motorola Inc. | Method and apparatus for speech detection using time-frequency variance |
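KR100880480B1 (ko) * | 2002-02-21 | 2009-01-28 | 엘지전자 주식회사 | Method and system for real-time music/speech discrimination of digital audio signals |
KR100435440B1 (ko) * | 2002-03-18 | 2004-06-10 | 정희석 | Apparatus and method for generating variable-length codebooks to improve discrimination between speakers, and codebook-combination speaker recognition apparatus and method using the same |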
US7050973B2 (en) * | 2002-04-22 | 2006-05-23 | Intel Corporation | Speaker recognition using dynamic time warp template spotting |
DE10220524B4 (de) * | 2002-05-08 | 2006-08-10 | Sap Ag | Method and system for processing speech data and recognizing a language |
DE10220521B4 (de) * | 2002-05-08 | 2005-11-24 | Sap Ag | Method and system for processing speech data and classifying conversations |
DE10220520A1 (de) * | 2002-05-08 | 2003-11-20 | Sap Ag | Method for recognizing speech information |
DE10220522B4 (de) * | 2002-05-08 | 2005-11-17 | Sap Ag | Method and system for processing speech data by means of speech recognition and frequency analysis |
EP1361740A1 (de) * | 2002-05-08 | 2003-11-12 | Sap Ag | Method and system for processing speech information of a dialogue |
EP1363271A1 (de) * | 2002-05-08 | 2003-11-19 | Sap Ag | Method and system for processing and storing speech information of a dialogue |
US7509257B2 (en) * | 2002-12-24 | 2009-03-24 | Marvell International Ltd. | Method and apparatus for adapting reference templates |
US8219391B2 (en) * | 2005-02-15 | 2012-07-10 | Raytheon Bbn Technologies Corp. | Speech analyzing system with speech codebook |
BRPI0707135A2 (pt) * | 2006-01-18 | 2011-04-19 | Lg Electronics Inc. | Apparatus and method for encoding and decoding a signal |
US20080189109A1 (en) * | 2007-02-05 | 2008-08-07 | Microsoft Corporation | Segmentation posterior based boundary point determination |
CN101998289B (zh) * | 2009-08-19 | 2015-01-28 | 中兴通讯股份有限公司 | Method and device for controlling a sound playback device during a trunking terminal call |
US20130151248A1 (en) * | 2011-12-08 | 2013-06-13 | Forrest Baker, IV | Apparatus, System, and Method For Distinguishing Voice in a Communication Stream |
CA2898677C (en) * | 2013-01-29 | 2017-12-05 | Stefan Dohla | Low-frequency emphasis for lpc-based coding in frequency domain |
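CN105989849B (zh) * | 2015-06-03 | 2019-12-03 | 乐融致新电子科技(天津)有限公司 | Speech enhancement method, speech recognition method, clustering method, and devices |
CN105161094A (zh) * | 2015-06-26 | 2015-12-16 | 徐信 | System and method for manually adjusting cut points in speech audio segmentation |
CN111785296B (zh) * | 2020-05-26 | 2022-06-10 | 浙江大学 | Method for identifying music segment boundaries based on repeated melodies |
CN115580682B (zh) * | 2022-12-07 | 2023-04-28 | 北京云迹科技股份有限公司 | Method and device for determining the connect and hang-up times of a phone call placed by a robot |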
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
NL8503304A (nl) * | 1985-11-29 | 1987-06-16 | Philips Nv | Method and device for segmenting an electrical signal derived from an acoustic signal, for example a speech signal. |
CN1013525B (zh) | 1988-11-16 | 1991-08-14 | 中国科学院声学研究所 | Method and apparatus for speaker-dependent and speaker-independent real-time speech recognition |
EP0706172A1 (en) * | 1994-10-04 | 1996-04-10 | Hughes Aircraft Company | Low bit rate speech encoder and decoder |
US6314392B1 (en) | 1996-09-20 | 2001-11-06 | Digital Equipment Corporation | Method and apparatus for clustering-based signal segmentation |
- 1999
  - 1999-01-04: US, application US09/225,891, patent US6278972B1 (en), not active (Expired - Lifetime)
  - 1999-12-29: CN, application CNB998153230A, patent CN1173333C (zh), not active (Expired - Fee Related)
  - 1999-12-29: WO, application PCT/US1999/031308, patent WO2000041164A1 (en), active (IP Right Grant)
  - 1999-12-29: JP, application JP2000592818A, patent JP4391701B2 (ja), not active (Expired - Fee Related)
  - 1999-12-29: EP, application EP99967799A, patent EP1141939B1 (en), not active (Expired - Lifetime)
  - 1999-12-29: AT, application AT99967799T, patent ATE323932T1 (de), not active (IP Right Cessation)
  - 1999-12-29: DE, application DE69930961T, patent DE69930961T2 (de), not active (Expired - Lifetime)
  - 1999-12-29: AU, application AU24015/00A, patent AU2401500A (en), not active (Abandoned)
  - 1999-12-29: KR, application KR1020017008529A, patent KR100699622B1 (ko), not active (IP Right Cessation)
- 2002
  - 2002-07-31: HK, application HK02105630.3A, patent HK1044063B (zh), not active (IP Right Cessation)
Also Published As
Publication number | Publication date |
---|---|
JP4391701B2 (ja) | 2009-12-24 |
US6278972B1 (en) | 2001-08-21 |
KR20010089769A (ko) | 2001-10-08 |
JP2002534718A (ja) | 2002-10-15 |
HK1044063B (zh) | 2005-05-20 |
DE69930961D1 (de) | 2006-05-24 |
CN1348580A (zh) | 2002-05-08 |
AU2401500A (en) | 2000-07-24 |
HK1044063A1 (en) | 2002-10-04 |
EP1141939A1 (en) | 2001-10-10 |
ATE323932T1 (de) | 2006-05-15 |
DE69930961T2 (de) | 2007-01-04 |
KR100699622B1 (ko) | 2007-03-23 |
EP1141939B1 (en) | 2006-04-19 |
WO2000041164A1 (en) | 2000-07-13 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN1173333C (zh) | System and method for segmentation and recognition of speech signals | |
US5749068A (en) | Speech recognition apparatus and method in noisy circumstances | |
CN107481728B (zh) | Background sound elimination method, apparatus, and terminal device | |
AU712412B2 (en) | Speech processing | |
US5327521A (en) | Speech transformation system | |
JP2692581B2 (ja) | Acoustic category mean value calculation device and adaptation device | |
US5749073A (en) | System for automatically morphing audio information | |
US4570232A (en) | Speech recognition apparatus | |
US5459815A (en) | Speech recognition method using time-frequency masking mechanism | |
WO1999040571A1 (en) | System and method for noise-compensated speech recognition | |
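JPH08234788A (ja) | Bias equalization method and apparatus for speech recognition | |
JP3219093B2 (ja) | Method and apparatus for synthesizing speech without using external voicing or pitch information | |
CN112466297B (zh) | Speech recognition method based on a time-domain convolutional encoder-decoder network | |
JPH0638199B2 (ja) | Speech recognition device | |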
EP1693826A1 (en) | Vocal tract resonance tracking using a nonlinear predictor and a target-guided temporal constraint | |
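JPH10105187A (ja) | Clustering-based signal segmentation method | |
JP2759267B2 (ja) | Method and apparatus for synthesizing speech from speech recognition templates | |
JP3266157B2 (ja) | Speech enhancement device | |
JPH10149191A (ja) | Model adaptation method, apparatus, and storage medium therefor | |
JPH07199997A (ja) | Method for processing speech signals in a speech signal processing system, and method for reducing processing time in that processing | |
JPH10254473A (ja) | Voice conversion method and voice conversion device | |
JP4603727B2 (ja) | Acoustic signal analysis method and apparatus | |
JPH07121197A (ja) | Learning-type speech recognition method | |
RU2271578C2 (ru) | Method for recognizing spoken control commands | |
JPH06214592A (ja) | Method for creating noise-robust phoneme models | |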
Legal Events
Code | Title | Description |
---|---|---|
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
C06 | Publication | |
PB01 | Publication | |
C14 | Grant of patent or utility model | |
GR01 | Patent grant | |
C17 | Cessation of patent right | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 2004-10-27. Termination date: 2011-12-29 |