CA3004700C - Method for forming the excitation signal for a glottal pulse model based parametric speech synthesis system - Google Patents
- Publication number
- CA3004700C (application CA3004700A)
- Authority
- CA
- Canada
- Prior art keywords
- speech
- band
- sub
- glottal
- signal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/75—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 for modelling vocal tract parameters
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US2015/054122 WO2017061985A1 (en) | 2015-10-06 | 2015-10-06 | Method for forming the excitation signal for a glottal pulse model based parametric speech synthesis system |
Publications (2)
Publication Number | Publication Date |
---|---|
CA3004700A1 (en) | 2017-04-13 |
CA3004700C (en) | 2021-03-23 |
Family
ID=58488102
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA3004700A, Active, CA3004700C (en) | 2015-10-06 | 2015-10-06 | Method for forming the excitation signal for a glottal pulse model based parametric speech synthesis system |
Country Status (6)
Country | Link |
---|---|
EP (1) | EP3363015A4 (de) |
KR (1) | KR20180078252A (ko) |
CN (1) | CN108369803B (zh) |
AU (1) | AU2015411306A1 (en) |
CA (1) | CA3004700C (en) |
WO (1) | WO2017061985A1 (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3857541B1 (de) * | 2018-09-30 | 2023-07-19 | Microsoft Technology Licensing, LLC | Generation of speech waveforms |
CN109767755A (zh) * | 2019-03-01 | 2019-05-17 | 广州多益网络股份有限公司 | Speech synthesis method and system |
CN111862931B (zh) * | 2020-05-08 | 2024-09-24 | 北京嘀嘀无限科技发展有限公司 | Speech generation method and device |
CN112365875B (zh) * | 2020-11-18 | 2021-09-10 | 北京百度网讯科技有限公司 | Speech synthesis method, apparatus, vocoder and electronic device |
CN113571079A (zh) * | 2021-02-08 | 2021-10-29 | 腾讯科技(深圳)有限公司 | Speech enhancement method, apparatus, device and storage medium |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6070140A (en) * | 1995-06-05 | 2000-05-30 | Tran; Bao Q. | Speech recognizer |
US5937384A (en) * | 1996-05-01 | 1999-08-10 | Microsoft Corporation | Method and system for speech recognition using continuous density hidden Markov models |
US20020116196A1 (en) * | 1998-11-12 | 2002-08-22 | Tran Bao Q. | Speech recognizer |
US6970820B2 (en) * | 2001-02-26 | 2005-11-29 | Matsushita Electric Industrial Co., Ltd. | Voice personalization of speech synthesizer |
WO2003019527A1 (fr) * | 2001-08-31 | 2003-03-06 | Kabushiki Kaisha Kenwood | Method and apparatus for generating a pitch-assigned signal, and method and apparatus for compressing/decompressing and synthesizing a speech signal using it |
ATE456130T1 (de) * | 2007-10-29 | 2010-02-15 | Harman Becker Automotive Sys | Partial speech reconstruction |
CA2724753A1 (en) * | 2008-05-30 | 2009-12-03 | Nokia Corporation | Method, apparatus and computer program product for providing improved speech synthesis |
PL2242045T3 (pl) * | 2009-04-16 | 2013-02-28 | Univ Mons | Method for speech coding and synthesis |
CN102231275B (zh) * | 2011-06-01 | 2013-10-16 | 北京宇音天下科技有限公司 | Embedded speech synthesis method based on weighted mixed excitation |
CN102270449A (zh) * | 2011-08-10 | 2011-12-07 | 歌尔声学股份有限公司 | Parametric speech synthesis method and system |
US20130080172A1 (en) * | 2011-09-22 | 2013-03-28 | General Motors Llc | Objective evaluation of synthesized speech attributes |
US10453479B2 (en) * | 2011-09-23 | 2019-10-22 | Lessac Technologies, Inc. | Methods for aligning expressive speech utterances with text and systems therefor |
GB2508417B (en) * | 2012-11-30 | 2017-02-08 | Toshiba Res Europe Ltd | A speech processing system |
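TWI573129B (zh) * | 2013-02-05 | 2017-03-01 | 國立交通大學 | Coded stream generating device, prosodic information encoding device, prosodic structure analysis device, and speech synthesis device and method |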
2015
- 2015-10-06: EP application EP15905930.2A, patent EP3363015A4 (de), not active (Ceased)
- 2015-10-06: CN application CN201580085103.5A, patent CN108369803B (zh), active (Active)
- 2015-10-06: AU application AU2015411306A, patent AU2015411306A1 (en), not active (Abandoned)
- 2015-10-06: KR application KR1020187012944A, patent KR20180078252A (ko), not active (Application Discontinuation)
- 2015-10-06: CA application CA3004700A, patent CA3004700C (en), active (Active)
- 2015-10-06: WO application PCT/US2015/054122, patent WO2017061985A1 (en), active (Application Filing)
Also Published As
Publication number | Publication date |
---|---|
WO2017061985A1 (en) | 2017-04-13 |
KR20180078252A (ko) | 2018-07-09 |
AU2015411306A1 (en) | 2018-05-24 |
CN108369803A (zh) | 2018-08-03 |
CA3004700A1 (en) | 2017-04-13 |
CN108369803B (zh) | 2023-04-04 |
EP3363015A4 (de) | 2019-06-12 |
EP3363015A1 (de) | 2018-08-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10621969B2 (en) | Method for forming the excitation signal for a glottal pulse model based parametric speech synthesis system | |
DK2579249T3 (en) | PARAMETER SPEECH SYNTHESIS PROCEDURE AND SYSTEM | |
CA3004700C (en) | Method for forming the excitation signal for a glottal pulse model based parametric speech synthesis system | |
AU2020227065B2 (en) | Method for forming the excitation signal for a glottal pulse model based parametric speech synthesis system | |
US10014007B2 (en) | Method for forming the excitation signal for a glottal pulse model based parametric speech synthesis system | |
CN110459202A (zh) | Prosody annotation method, apparatus, device and medium | |
Silva et al. | Spoken digit recognition in portuguese using line spectral frequencies | |
CN116994553A (zh) | Training method for a speech synthesis model, speech synthesis method, apparatus and device | |
Kadyan et al. | Prosody features based low resource Punjabi children ASR and T-NT classifier using data augmentation | |
CN117935789A (zh) | Speech recognition method and system, device, and storage medium | |
US10446133B2 (en) | Multi-stream spectral representation for statistical parametric speech synthesis | |
Bansal et al. | A novel AFM signal model for parametric representation of speech phonemes | |
EP3113180B1 (de) | Method for performing audio inpainting on a speech signal and apparatus for performing audio inpainting on a speech signal | |
CN102231275B (zh) | Embedded speech synthesis method based on weighted mixed excitation | |
CN111862931B (zh) | Speech generation method and device | |
Hua | Do WaveNets Dream of Acoustic Waves? | |
CN115631744A (zh) | Two-stage multi-speaker fundamental frequency trajectory extraction method | |
Thomas et al. | Hilbert Envelope Based Spectro-Temporal Features for Phoneme Recognition in Telephone Speech | |
Lamkadam et al. | Comparative study and improvement of acoustic vectors extractors: Multiple streams applied to the recognition of Arabic numerals | |
Ye | Efficient Approaches for Voice Change and Voice Conversion Systems | |
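CN113096625A (zh) | Multi-person Buddhist music generation method, apparatus, device and storage medium | |
CN113345415A (zh) | Speech synthesis method, apparatus, device and storage medium | |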
Jinachitra | Robust structured voice extraction for flexible expressive resynthesis | |
HERMUS | Exponential Sinusoidal Speech Compression for Corpus-Based Text-to-Speech Synthesis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | EEER | Examination request | Effective date: 20180508 |