CN105654939B - Speech synthesis method based on sound vector text features - Google Patents
- Publication number
- CN105654939B
- Authority
- CN
- China
- Prior art keywords
- text
- module
- sound
- vector
- sound vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
- G10L17/00—Speaker identification or verification techniques
- G10L17/02—Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
- G10L17/04—Training, enrolment or model building
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Electrically Operated Instructional Devices (AREA)
Claims (4)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610000677.4A CN105654939B (zh) | 2016-01-04 | 2016-01-04 | Speech synthesis method based on sound vector text features |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610000677.4A CN105654939B (zh) | 2016-01-04 | 2016-01-04 | Speech synthesis method based on sound vector text features |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105654939A (zh) | 2016-06-08 |
CN105654939B (zh) | 2019-09-13 |
Family
ID=56490413
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610000677.4A Active CN105654939B (zh) | Speech synthesis method based on sound vector text features | 2016-01-04 | 2016-01-04 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105654939B (zh) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
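CN107515850A (zh) * | 2016-06-15 | 2017-12-26 | 阿里巴巴集团控股有限公司 | Method, apparatus and system for determining the pronunciation of polyphonic characters |
CN106227721B (zh) * | 2016-08-08 | 2019-02-01 | 中国科学院自动化研究所 | Chinese prosodic hierarchy prediction system |
CN106328139A (zh) * | 2016-09-14 | 2017-01-11 | 努比亚技术有限公司 | Voice interaction method and system |
CN106776501A (zh) * | 2016-12-13 | 2017-05-31 | 深圳爱拼信息科技有限公司 | Method and server for automatically correcting wrongly written characters in text |
CN106971709B (zh) | 2017-04-19 | 2021-10-15 | 腾讯科技(上海)有限公司 | Statistical parametric model building method and apparatus, and speech synthesis method and apparatus |
CN107729313B (zh) * | 2017-09-25 | 2021-09-17 | 百度在线网络技术(北京)有限公司 | Method and apparatus for discriminating the pronunciation of polyphonic characters based on a deep neural network |
CN108665901B (zh) * | 2018-05-04 | 2020-06-30 | 广州国音科技有限公司 | Phoneme/syllable extraction method and apparatus |
CN109036371B (zh) * | 2018-07-19 | 2020-12-18 | 北京光年无限科技有限公司 | Audio data generation method and system for speech synthesis |
CN109119067B (zh) * | 2018-11-19 | 2020-11-27 | 苏州思必驰信息科技有限公司 | Speech synthesis method and apparatus |
CN109754778B (zh) * | 2019-01-17 | 2023-05-30 | 平安科技(深圳)有限公司 | Speech synthesis method for text, apparatus and computer device |
CN110189744A (zh) * | 2019-04-09 | 2019-08-30 | 阿里巴巴集团控股有限公司 | Text processing method, apparatus and electronic device |
CN110136692B (zh) * | 2019-04-30 | 2021-12-14 | 北京小米移动软件有限公司 | Speech synthesis method, apparatus, device and storage medium |
CN112750419B (zh) * | 2020-12-31 | 2024-02-13 | 科大讯飞股份有限公司 | Speech synthesis method, apparatus, electronic device and storage medium |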
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
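CN1731509A (zh) * | 2005-09-02 | 2006-02-08 | 清华大学 | Mobile speech synthesis method |
CN101178896A (zh) * | 2007-12-06 | 2008-05-14 | 安徽科大讯飞信息科技股份有限公司 | Unit-selection speech synthesis method based on an acoustic statistical model |
CN102270449A (zh) * | 2011-08-10 | 2011-12-07 | 歌尔声学股份有限公司 | Parametric speech synthesis method and system |
CN102496363A (zh) * | 2011-11-11 | 2012-06-13 | 北京宇音天下科技有限公司 | Tone correction method for Chinese speech synthesis |
CN104217713A (zh) * | 2014-07-15 | 2014-12-17 | 西北师范大学 | Chinese-Tibetan bilingual speech synthesis method and apparatus |
JP2015036788A (ja) * | 2013-08-14 | 2015-02-23 | 直也 内野 | Foreign-language pronunciation learning device |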
Also Published As
Publication number | Publication date |
---|---|
CN105654939A (zh) | 2016-06-08 |
Similar Documents
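Publication | Title | Publication Date |
---|---|---|
CN105654939B (zh) | Speech synthesis method based on sound vector text features | |
Zhang et al. | Transfer learning from speech synthesis to voice conversion with non-parallel training data | |
CN103065620B (zh) | Method for receiving text entered by a user on a mobile phone or web page and synthesizing it into a personalized voice in real time | |
CN101064104B (zh) | Emotional speech generation method based on voice conversion | |
CN110136691B (zh) | Speech synthesis model training method, apparatus, electronic device and storage medium | |
CN112863483A (zh) | Speech synthesis apparatus supporting multi-speaker styles and language switching, with controllable prosody | |
CN106971709A (zh) | Statistical parametric model building method and apparatus, and speech synthesis method and apparatus | |
CN1835075B (zh) | Speech synthesis method combining natural sample selection with acoustic parameter modeling | |
CN102201234B (zh) | Speech synthesis method based on automatic tone labeling and prediction | |
CN106128450A (zh) | Method and system for Chinese-Tibetan bilingual cross-lingual voice conversion | |
CN110060701A (zh) | Many-to-many voice conversion method based on vawgan-ac | |
CN102938252B (zh) | Chinese tone recognition system and method combining prosodic and articulatory features | |
CN111210803B (zh) | System and method for training cloned timbre and prosody based on Bottle neck features | |
CN102426834B (zh) | Method for testing the prosody level of spoken English | |
CN106057192A (zh) | Real-time voice conversion method and apparatus | |
CN105390133A (zh) | Implementation method for a Tibetan ttvs system | |
CN112037758A (zh) | Speech synthesis method and apparatus | |
Indumathi et al. | Survey on speech synthesis | |
CN109036376A (zh) | Southern Min (Minnan) speech synthesis method | |
TWI503813B (zh) | Prosodic information generation device with controllable speaking rate, and speaking-rate-dependent hierarchical prosody module | |
CN113257221B (zh) | Speech model training method based on front-end design, and speech synthesis method | |
Choi et al. | A melody-unsupervision model for singing voice synthesis | |
CN110556092A (zh) | Speech synthesis method and apparatus, storage medium, and electronic device | |
CN112242134A (zh) | Speech synthesis method and apparatus | |
CN116913244A (zh) | Speech synthesis method, device and medium |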
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information | ||
CB02 | Change of applicant information |
Address after: 310000 Room 1105, 11/F, Building 4, No. 9, Jiuhuan Road, Jianggan District, Hangzhou City, Zhejiang Province; Applicant after: Limit element (Hangzhou) intelligent Polytron Technologies Inc.; Address before: 100089 Floor 1-312-316, No. 1 Building, 35 Shangdi East Road, Haidian District, Beijing; Applicant before: Limit element (Beijing) smart Polytron Technologies Inc.
Address after: 100089 Floor 1-312-316, No. 1 Building, 35 Shangdi East Road, Haidian District, Beijing; Applicant after: Limit element (Beijing) smart Polytron Technologies Inc.; Address before: 100089 Floor 1-312-316, No. 1 Building, 35 Shangdi East Road, Haidian District, Beijing; Applicant before: Limit Yuan (Beijing) Intelligent Technology Co.,Ltd.
Address after: 100089 Floor 1-312-316, No. 1 Building, 35 Shangdi East Road, Haidian District, Beijing; Applicant after: Limit Yuan (Beijing) Intelligent Technology Co.,Ltd.; Address before: 100085 Block 318, Yiquanhui Office Building, 35 Shangdi East Road, Haidian District, Beijing; Applicant before: BEIJING TIMES RUILANG TECHNOLOGY Co.,Ltd.
GR01 | Patent grant | ||
CP01 | Change in the name or title of a patent holder | ||
Address after: 310000 Room 1105, 11/F, Building 4, No. 9, Jiuhuan Road, Jianggan District, Hangzhou City, Zhejiang Province; Patentee after: Zhongke extreme element (Hangzhou) Intelligent Technology Co.,Ltd.; Address before: 310000 Room 1105, 11/F, Building 4, No. 9, Jiuhuan Road, Jianggan District, Hangzhou City, Zhejiang Province; Patentee before: Limit element (Hangzhou) intelligent Polytron Technologies Inc.