CN101399044B - Voice conversion method and system - Google Patents
Voice conversion method and system
- Publication number
- CN101399044B
- Authority
- CN
- China
- Prior art keywords
- spectrum
- speech
- voice
- conversion
- frequency
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Concepts (machine-extracted)
- 238000006243 chemical reaction Methods 0.000 title claims abstract description 123
- 238000000034 method Methods 0.000 title claims abstract description 49
- 238000001228 spectrum Methods 0.000 claims abstract description 139
- 238000004458 analytical method Methods 0.000 claims abstract description 21
- 238000005452 bending Methods 0.000 claims abstract description 18
- 230000003595 spectral effect Effects 0.000 claims description 52
- 230000033764 rhythmic process Effects 0.000 claims description 28
- 238000012546 transfer Methods 0.000 claims description 16
- 238000009499 grossing Methods 0.000 claims description 10
- 230000008569 process Effects 0.000 claims description 10
- 238000005516 engineering process Methods 0.000 abstract description 11
- 238000004590 computer program Methods 0.000 abstract description 3
- 230000006870 function Effects 0.000 description 34
- 238000012549 training Methods 0.000 description 30
- 230000014509 gene expression Effects 0.000 description 8
- 230000007704 transition Effects 0.000 description 7
- 230000008901 benefit Effects 0.000 description 5
- 238000013507 mapping Methods 0.000 description 5
- 239000000463 material Substances 0.000 description 5
- 230000001755 vocal effect Effects 0.000 description 5
- 238000010586 diagram Methods 0.000 description 4
- 238000012545 processing Methods 0.000 description 4
- 238000001831 conversion spectrum Methods 0.000 description 3
- 238000010606 normalization Methods 0.000 description 3
- 230000000694 effects Effects 0.000 description 2
- 230000005284 excitation Effects 0.000 description 2
- 238000000050 ionisation spectroscopy Methods 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000003860 storage Methods 0.000 description 2
- 230000009466 transformation Effects 0.000 description 2
- 238000007476 Maximum Likelihood Methods 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 230000015572 biosynthetic process Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 230000036541 health Effects 0.000 description 1
- 238000011835 investigation Methods 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 230000008447 perception Effects 0.000 description 1
- 238000002360 preparation method Methods 0.000 description 1
- 238000003672 processing method Methods 0.000 description 1
- 238000013139 quantization Methods 0.000 description 1
- 230000009467 reduction Effects 0.000 description 1
- 230000001105 regulatory effect Effects 0.000 description 1
- 230000011218 segmentation Effects 0.000 description 1
- 238000004088 simulation Methods 0.000 description 1
- 238000003786 synthesis reaction Methods 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- 238000005303 weighing Methods 0.000 description 1
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
- G10L2021/0135—Voice conversion or morphing
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Electrically Operated Instructional Devices (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Claims (12)
Priority Applications (2)
Application Number | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN200710163066.2A | CN101399044B (zh) | 2007-09-29 | 2007-09-29 | Voice conversion method and system |
US12/240,148 | US8234110B2 (en) | 2007-09-29 | 2008-09-29 | Voice conversion method and system |
Applications Claiming Priority (1)
Application Number | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN200710163066.2A | CN101399044B (zh) | 2007-09-29 | 2007-09-29 | Voice conversion method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101399044A (zh) | 2009-04-01 |
CN101399044B (zh) | 2013-09-04 |
Family
ID=40509376
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN200710163066.2A | Expired - Fee Related | CN101399044B (zh) | 2007-09-29 | 2007-09-29 | Voice conversion method and system |
Country Status (2)
Country | Link |
---|---|
US (1) | US8234110B2 (en) |
CN (1) | CN101399044B (zh) |
Families Citing this family (35)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1569200A1 (en) * | 2004-02-26 | 2005-08-31 | Sony International (Europe) GmbH | Identification of the presence of speech in digital audio data |
CN101727904B (zh) * | 2008-10-31 | 2013-04-24 | International Business Machines Corporation | Speech translation method and apparatus |
US8645140B2 (en) * | 2009-02-25 | 2014-02-04 | Blackberry Limited | Electronic device and method of associating a voice font with a contact for text-to-speech conversion at the electronic device |
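CN101751922B (zh) * | 2009-07-22 | 2011-12-07 | Institute of Automation, Chinese Academy of Sciences | Text-independent voice conversion system based on hidden Markov model state mapping |
CN102063899B (zh) * | 2010-10-27 | 2012-05-23 | Nanjing University of Posts and Telecommunications | Voice conversion method under non-parallel text conditions |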
GB2489473B (en) * | 2011-03-29 | 2013-09-18 | Toshiba Res Europ Ltd | A voice conversion method and system |
US8260615B1 (en) * | 2011-04-25 | 2012-09-04 | Google Inc. | Cross-lingual initialization of language models |
US9984700B2 (en) * | 2011-11-09 | 2018-05-29 | Speech Morphing Systems, Inc. | Method for exemplary voice morphing |
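JP5846043B2 (ja) * | 2012-05-18 | 2016-01-20 | Yamaha Corporation | Voice processing device |
CN102723077B (zh) * | 2012-06-18 | 2014-07-09 | Beijing Language and Culture University | Speech synthesis method and device for Chinese language teaching |
CN102982809B (zh) * | 2012-12-11 | 2014-12-10 | University of Science and Technology of China | Speaker voice conversion method |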
US20150179167A1 (en) * | 2013-12-19 | 2015-06-25 | Kirill Chekhter | Phoneme signature candidates for speech recognition |
CN103730121B (zh) * | 2013-12-24 | 2016-08-24 | Sun Yat-sen University | Method and device for recognizing disguised voices |
US9438195B2 (en) | 2014-05-23 | 2016-09-06 | Apple Inc. | Variable equalization |
US9613620B2 (en) | 2014-07-03 | 2017-04-04 | Google Inc. | Methods and systems for voice conversion |
CN104464725B (zh) * | 2014-12-30 | 2017-09-05 | Fujian Kaimi Network Technology Co., Ltd. (福建凯米网络科技有限公司) | Method and device for singing imitation |
US9620140B1 (en) | 2016-01-12 | 2017-04-11 | Raytheon Company | Voice pitch modification to increase command and control operator situational awareness |
JP6646001B2 (ja) * | 2017-03-22 | 2020-02-14 | Toshiba Corporation | Speech processing device, speech processing method, and program |
US10622002B2 (en) * | 2017-05-24 | 2020-04-14 | Modulate, Inc. | System and method for creating timbres |
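CN107705802B (zh) * | 2017-09-11 | 2021-01-29 | Xiamen Meitu Technology Co., Ltd. (厦门美图之家科技有限公司) | Voice conversion method and apparatus, electronic device, and readable storage medium |
CN107507619B (zh) * | 2017-09-11 | 2021-08-20 | Xiamen Meitu Technology Co., Ltd. (厦门美图之家科技有限公司) | Voice conversion method and apparatus, electronic device, and readable storage medium |
CN107731241B (zh) * | 2017-09-29 | 2021-05-07 | Guangzhou Kugou Computer Technology Co., Ltd. | Method, apparatus, and storage medium for processing audio signals |
CN107958672A (zh) * | 2017-12-12 | 2018-04-24 | Guangzhou Kugou Computer Technology Co., Ltd. | Method and apparatus for obtaining pitch waveform data |
JP7040258B2 (ja) * | 2018-04-25 | 2022-03-23 | Nippon Telegraph and Telephone Corporation | Pronunciation conversion device, method therefor, and program |
IT201800005283A1 (it) * | 2018-05-11 | 2019-11-11 | | Vocal timbre remodulator |
CN108847249B (zh) * | 2018-05-30 | 2020-06-05 | Suzhou AISpeech Information Technology Co., Ltd. (苏州思必驰信息科技有限公司) | Voice conversion optimization method and system |
CN109616131B (zh) * | 2018-11-12 | 2023-07-07 | Nanjing Nanda Electronic Intelligent Service Robot Research Institute Co., Ltd. (南京南大电子智慧型服务机器人研究院有限公司) | Digital real-time voice changing method |
TWI754804B (zh) * | 2019-03-28 | 2022-02-11 | National Chung Cheng University | System and method for improving the intelligibility of dysarthric speech |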
US11538485B2 (en) | 2019-08-14 | 2022-12-27 | Modulate, Inc. | Generation and detection of watermark for real-time voice conversion |
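CN111402856B (zh) * | 2020-03-23 | 2023-04-14 | Beijing ByteDance Network Technology Co., Ltd. | Speech processing method and apparatus, readable medium, and electronic device |
CN111462769B (zh) * | 2020-03-30 | 2023-10-27 | Shenzhen Dadan Shusheng Technology Co., Ltd. (深圳市达旦数生科技有限公司) | End-to-end accent conversion method |
CN111916093A (zh) * | 2020-07-31 | 2020-11-10 | Tencent Music Entertainment Technology (Shenzhen) Co., Ltd. | Audio processing method and device |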
US11996117B2 (en) | 2020-10-08 | 2024-05-28 | Modulate, Inc. | Multi-stage adaptive system for content moderation |
CN113421576B (zh) * | 2021-06-29 | 2024-05-24 | Ping An Technology (Shenzhen) Co., Ltd. | Voice conversion method, apparatus, device, and storage medium |
US20230298607A1 (en) * | 2022-03-15 | 2023-09-21 | Soundhound, Inc. | System and method for voice unidentifiable morphing |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1993018505A1 (en) * | 1992-03-02 | 1993-09-16 | The Walt Disney Company | Voice transformation system |
US6240384B1 (en) * | 1995-12-04 | 2001-05-29 | Kabushiki Kaisha Toshiba | Speech synthesis method |
WO1998035340A2 (en) * | 1997-01-27 | 1998-08-13 | Entropic Research Laboratory, Inc. | Voice conversion system and methodology |
US6336092B1 (en) * | 1997-04-28 | 2002-01-01 | Ivl Technologies Ltd | Targeted vocal transformation |
JP3631657B2 (ja) * | 2000-04-03 | 2005-03-23 | Sharp Corporation | Voice quality conversion device, voice quality conversion method, and program recording medium |
US7277554B2 (en) * | 2001-08-08 | 2007-10-02 | Gn Resound North America Corporation | Dynamic range compression using digital frequency warping |
DE602005026778D1 (de) | 2004-01-16 | 2011-04-21 | Scansoft Inc | Corpus-based speech synthesis based on segment recombination |
- 2007-09-29: CN application CN200710163066.2A (patent CN101399044B, zh), not active, Expired - Fee Related
- 2008-09-29: US application US12/240,148 (patent US8234110B2, en), Active
Also Published As
Publication number | Publication date |
---|---|
US20090089063A1 (en) | 2009-04-02 |
CN101399044A (zh) | 2009-04-01 |
US8234110B2 (en) | 2012-07-31 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN101399044B (zh) | Voice conversion method and system | |
Arslan | Speaker transformation algorithm using segmental codebooks (STASC) | |
US7996222B2 (en) | Prosody conversion | |
Erro et al. | Voice conversion based on weighted frequency warping | |
Ye et al. | Quality-enhanced voice morphing using maximum likelihood transformations | |
EP2881947B1 (en) | Spectral envelope and group delay inference system and voice signal synthesis system for voice analysis/synthesis | |
US10692484B1 (en) | Text-to-speech (TTS) processing | |
JP2007249212A (ja) | Method, computer program, and processor for text-to-speech synthesis | |
Choi et al. | Korean singing voice synthesis based on auto-regressive boundary equilibrium GAN | |
Nguyen et al. | High quality voice conversion using prosodic and high-resolution spectral features | |
Lee | Statistical approach for voice personality transformation | |
Ben Othmane et al. | Enhancement of esophageal speech obtained by a voice conversion technique using time dilated fourier cepstra | |
Panda et al. | A waveform concatenation technique for text-to-speech synthesis | |
JP2018084604A (ja) | Cross-lingual speech synthesis model training device, cross-lingual speech synthesis device, cross-lingual speech synthesis model training method, and program | |
Nose et al. | Speaker-independent HMM-based voice conversion using adaptive quantization of the fundamental frequency | |
Lee et al. | A segmental speech coder based on a concatenative TTS | |
JP3281266B2 (ja) | Speech synthesis method and apparatus | |
Al-Radhi et al. | Continuous vocoder applied in deep neural network based voice conversion | |
Shuang et al. | Voice conversion by combining frequency warping with unit selection | |
Tamura et al. | One sentence voice adaptation using GMM-based frequency-warping and shift with a sub-band basis spectrum model | |
Shuang et al. | A novel voice conversion system based on codebook mapping with phoneme-tied weighting | |
Wen et al. | Pitch-scaled spectrum based excitation model for HMM-based speech synthesis | |
Lachhab et al. | A preliminary study on improving the recognition of esophageal speech using a hybrid system based on statistical voice conversion | |
Levy-Leshem et al. | Taco-VC: A single speaker tacotron based voice conversion with limited data | |
Chunwijitra et al. | A tone-modeling technique using a quantized F0 context to improve tone correctness in average-voice-based speech synthesis | |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
ASS | Succession or assignment of patent right | Owner name: NIUAOSI COMMUNICATIONS LIMITED. Free format text: FORMER OWNER: INTERNATIONAL BUSINESS MACHINES CORP. Effective date: 20090925 |
C41 | Transfer of patent application or patent right or utility model | |
TA01 | Transfer of patent application right | Effective date of registration: 20090925. Address after: Massachusetts, USA. Applicant after: NIUAOSI COMMUNICATIONS LIMITED. Address before: Armonk, New York. Applicant before: International Business Machines Corp. |
C14 | Grant of patent or utility model | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20130904. Termination date: 20200929 |