CN101901598A - Humming synthesis method and system - Google Patents

Humming synthesis method and system


Publication number
CN101901598A
Authority
CN
China
Prior art keywords
syllable
parameters
base frequency
text
song
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2010102234975A
Other languages
Chinese (zh)
Inventor
李健
张连毅
武卫东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
JIETONG HUASHENG SPEECH TECHNOLOGY Co Ltd
Beijing Sinovoice Technology Co Ltd
Original Assignee
JIETONG HUASHENG SPEECH TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by JIETONG HUASHENG SPEECH TECHNOLOGY Co Ltd
Priority to CN2010102234975A
Publication of CN101901598A
Status: Pending

Landscapes

  • Auxiliary Devices For Music (AREA)

Abstract

The invention provides a humming synthesis method and system. The method comprises the following steps: receiving text input by a user; performing text analysis to obtain a syllable sequence corresponding to the text and the syllable name of each syllable in the sequence; for each syllable in the sequence, planning, according to its syllable name and context in combination with a statistical parameter model, to obtain corresponding duration, fundamental frequency, and spectrum parameters; adjusting the planned duration and fundamental frequency parameters according to a song template selected by the user and the number of syllables in the sequence, wherein the song template stores the duration and fundamental frequency parameters of syllables; performing interpolation adjustment on the spectrum parameters of the corresponding syllables according to the adjusted duration parameters; and obtaining speech data with a synthesizer according to the duration, fundamental frequency, and spectrum parameters of each syllable in the sequence. The method and system can output speech data carrying the rhythm and melody of a song.

Description

Humming synthesis method and system
Technical field
The present invention relates to the field of speech synthesis, and in particular to a humming synthesis method and system.
Background Art
Speech synthesis, also known as text-to-speech (TTS) technology, can convert arbitrary text into fluent, standard spoken audio.
Existing speech synthesis methods pre-record a sound library and then build a speech synthesis system on top of it. The intonation and rhythm of the synthesized speech depend on the sound library; that is, the synthesized audio sounds as if the recorded speaker is talking.
In some entertainment applications, however, users wish to adjust the intonation and rhythm of the synthesized speech, for example, to have the text "sung" with the intonation of a song.
In short, the technical problem urgently awaiting a solution by those skilled in the art is: how to synthesize speech that carries the intonation and rhythm of a song.
Summary of the invention
The technical problem to be solved by the present invention is to provide a humming synthesis method and system for outputting speech data that carries the rhythm and melody of a song.
To address the above problem, the invention discloses a humming synthesis method, comprising:
receiving text input by a user;
performing text analysis to obtain a syllable sequence corresponding to the text, and the syllable name of each syllable in the syllable sequence;
for each syllable in the syllable sequence, planning, according to its syllable name and context in combination with a statistical parameter model, to obtain corresponding duration, fundamental frequency, and spectrum parameters;
adjusting the planned duration and fundamental frequency parameters according to a song template selected by the user and the number of syllables in the syllable sequence, wherein the song template stores the duration and fundamental frequency parameters of syllables;
performing interpolation adjustment on the spectrum parameters of the corresponding syllables according to the adjusted duration parameters;
obtaining, with a synthesizer, the speech data corresponding to the syllable sequence according to the duration, fundamental frequency, and spectrum parameters of each syllable in the sequence.
Preferably, the step of adjusting the duration and fundamental frequency parameters comprises:
obtaining the number of syllables in the syllable sequence;
extracting, from the song template, duration and fundamental frequency parameters corresponding to that number of syllables, and overwriting the planned duration and fundamental frequency parameters with them.
Preferably, the text analysis step comprises:
performing word segmentation on the text;
converting numeric characters in the text into words;
performing prosody prediction on the converted text according to the word segmentation result;
converting the text into a syllable sequence according to the prosody prediction result, and obtaining the syllable name of each syllable in the sequence based on a syllable mapping table.
Preferably, the song template is generated as follows:
for a song sample, extracting the duration and fundamental frequency parameters of each syllable therein;
saving the duration and fundamental frequency parameters into the song template.
Preferably, the song sample comprises an a cappella song sample.
In another aspect, the invention also discloses a humming synthesis system, comprising:
an interface module for receiving text input by a user;
a text analysis module for performing text analysis to obtain a syllable sequence corresponding to the text and the syllable name of each syllable in the sequence;
a parameter planning module for planning, for each syllable in the syllable sequence, according to its syllable name and context in combination with a statistical parameter model, to obtain corresponding duration, fundamental frequency, and spectrum parameters;
a first parameter adjustment module for adjusting the planned duration and fundamental frequency parameters according to a song template selected by the user and the number of syllables in the syllable sequence, wherein the song template stores the duration and fundamental frequency parameters of syllables;
a second parameter adjustment module for performing interpolation adjustment on the spectrum parameters of the corresponding syllables according to the adjusted duration parameters;
a synthesis module for obtaining, with a synthesizer, the speech data corresponding to the syllable sequence according to the duration, fundamental frequency, and spectrum parameters of each syllable in the sequence.
Preferably, the first parameter adjustment module comprises:
an acquiring unit for obtaining the number of syllables in the syllable sequence;
an adjustment unit for extracting parameter information corresponding to that number of syllables from the song template, overwriting the planned duration and fundamental frequency parameters, and interpolating the spectrum parameters according to the planned duration.
Preferably, the text analysis module comprises:
a word segmentation unit for performing word segmentation on the text;
a numeric character conversion unit for converting numeric characters in the text into words;
a prosody prediction unit for performing prosody prediction on the converted text according to the word segmentation result;
a syllable conversion unit for converting the text into a syllable sequence according to the prosody prediction result, and obtaining the syllable name of each syllable in the sequence based on a syllable mapping table.
Preferably, the system further comprises a song template generation module, which comprises:
an extraction unit for extracting, from a song sample, the duration and fundamental frequency parameters of each syllable therein;
a saving unit for saving the duration and fundamental frequency parameters into the song template.
Preferably, the song sample comprises an a cappella song sample.
Compared with the prior art, the present invention has the following advantages:
The invention stores duration and fundamental frequency parameters in the song template on a per-syllable basis, and the song template can be named according to rhythm-identifying cues such as the song title or melody. The user can therefore select a suitable song template according to personal habits, application scenarios, and other practical needs; the planned duration and fundamental frequency parameters are adjusted accordingly, and the speech data for the user's input text is finally obtained through parametric synthesis. Among the speech parameters, the duration and fundamental frequency parameters jointly determine the rhythm and melody, while the spectrum parameters determine the timbre, i.e., the voice characteristics of the speaker. The invention can thus combine the duration and fundamental frequency parameters of the song template with the spectrum parameters of the sound-library speaker, yielding a humming speech stream whose timbre is that of the sound-library speaker and whose intonation and rhythm follow the song and its melody.
Brief Description of the Drawings
Fig. 1 is a flowchart of an embodiment of a humming synthesis method according to the present invention;
Fig. 2 is a structural diagram of an embodiment of a humming synthesis system according to the present invention.
Detailed Description
To make the above objects, features, and advantages of the present invention more apparent, the invention is described in further detail below with reference to the drawings and specific embodiments.
One of the core ideas of the embodiments of the invention is to generate a song template based on duration and fundamental frequency parameters and, when a user inputs text, to adjust the planned duration and fundamental frequency parameters according to the song template, then obtain the speech data of the text with a synthesizer. Among the speech parameters, the duration and fundamental frequency parameters jointly determine the rhythm and melody, while the spectrum parameters determine the timbre, i.e., the voice characteristics of the speaker; combining the duration and fundamental frequency parameters of the song template with the spectrum parameters of the sound-library speaker therefore yields a humming speech stream whose timbre is that of the sound-library speaker and whose intonation and rhythm follow the song's melody.
Referring to Fig. 1, a flowchart of an embodiment of a humming synthesis method according to the present invention is shown, which may specifically comprise:
Step 101: receiving text input by a user.
The text input by the user may comprise words and numeric characters, where the words may be Chinese characters, Japanese, Korean, English, and so on, or a combination of several of these, such as mixed Chinese and English. The present invention places no limit on the specific text; the examples below mainly use Chinese characters.
Step 102: performing text analysis to obtain the syllable sequence corresponding to the text and the syllable name of each syllable in the sequence.
The text analysis step is described below using the concrete text 北京在2008-8-8举行了盛大的奥运会开幕式 ("Beijing held the grand Olympic Games opening ceremony on 2008-8-8") as an example. It may specifically comprise:
Sub-step A1: performing word segmentation on the text.
Segmentation result: 北京 / 在 / 2008-8-8 / 举行 / 了 / 盛大 / 的 / 奥运会 / 开幕式 (Beijing / on / 2008-8-8 / held / [aspect particle] / grand / [attributive particle] / Olympic Games / opening ceremony)
Sub-step A2: converting the numeric characters in the text into words.
In this example, the numeric character conversion turns "2008-8-8" into 二零零八年八月八日 ("August 8, 2008"), so the text after conversion is 北京在二零零八年八月八日举行了盛大的奥运会开幕式.
Sub-step A3: performing prosody prediction on the converted text according to the word segmentation result.
Prosody prediction result: the converted text annotated with prosodic boundaries, for example a pause boundary before 举行了 ("held").
Sub-step A4: converting the text into a syllable sequence according to the prosody prediction result, and obtaining the syllable name of each syllable in the sequence based on the syllable mapping table.
Syllable sequence: bei3 jing1 zai4 er4 ling2 ling2 ba1 nian2 ba1 yue4 ba1 ri4 ju3 xing2 le5 sheng4 da4 de5 ao4 yun4 hui4 kai1 mu4 shi4
Here the digits 1-5 denote the tones: first, second, third, and fourth tone, and the neutral tone, respectively. In practice, the syllable name of a Chinese character can be obtained by querying a character-to-syllable mapping table; for example, "bei3" in the sequence above is a syllable name.
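The mapping-table lookup of sub-step A4 can be sketched as follows. The table entries are a tiny hypothetical excerpt and the function name is ours, not the patent's; a real table covers the full character set and uses the prosody prediction result to disambiguate polyphonic characters.

```python
# Minimal sketch of sub-step A4: converting text to a syllable sequence via a
# character-to-syllable mapping table. The entries below are a hypothetical
# excerpt; a real table would be far larger and context-aware.
SYLLABLE_TABLE = {
    "北": "bei3", "京": "jing1", "在": "zai4",
    "举": "ju3", "行": "xing2", "了": "le5",
}

def text_to_syllables(text):
    """Look up the syllable name of each character; skip unknown characters."""
    return [SYLLABLE_TABLE[ch] for ch in text if ch in SYLLABLE_TABLE]

print(text_to_syllables("北京在"))  # ['bei3', 'jing1', 'zai4']
```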
Step 103: for each syllable in the syllable sequence, planning, according to its syllable name and context, in combination with the statistical parameter model, to obtain corresponding duration, fundamental frequency, and spectrum parameters.
The context mainly refers to the positional information of the syllable, which may include sentence-initial, sentence-medial, and sentence-final. In the example above, the context of "shi4" is sentence-final, while the context of "er4" is sentence-medial.
In practice, the statistical parameter model can be obtained by offline training; it stores the parameters corresponding to each syllable under different contexts.
For example, offline, a first statistical model is trained for the duration parameters, a second for the fundamental frequency parameters, and a third for the spectrum parameters; during online planning, the duration, fundamental frequency, and spectrum parameters corresponding to a syllable can then be obtained directly from the three statistical models.
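The online lookup just described can be sketched as follows, with the three offline-trained models reduced to plain lookup tables keyed by (syllable name, context). All names and numeric values here are made-up placeholders, not data from the patent.

```python
# Sketch of step 103's online planning: three "models" as lookup tables.
# Durations in ms, F0 in Hz, spectrum as short placeholder vectors.
duration_model = {("bei3", "initial"): 220, ("jing1", "medial"): 180}
f0_model       = {("bei3", "initial"): 210.0, ("jing1", "medial"): 195.0}
spectrum_model = {("bei3", "initial"): [0.1, 0.2], ("jing1", "medial"): [0.3, 0.1]}

def plan_parameters(syllable, context):
    """Return (duration, F0, spectrum) for a syllable in a given context."""
    key = (syllable, context)
    return duration_model[key], f0_model[key], spectrum_model[key]

dur, f0, spec = plan_parameters("bei3", "initial")
```

A trained model would of course interpolate or generate parameters for unseen contexts rather than fail on a missing key; the table stands in only for the direct-lookup behavior the text describes.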
Step 104: adjusting the planned duration and fundamental frequency parameters according to the song template selected by the user and the number of syllables in the syllable sequence, wherein the song template stores the duration and fundamental frequency parameters of syllables.
In practice, the song template can be built by the following offline steps:
Sub-step A1: for a song sample, extracting the duration and fundamental frequency parameters of each syllable therein.
Sub-step A2: saving the duration and fundamental frequency parameters into the song template.
Since an ordinary song consists of both vocals and accompanying music, and the sound production of musical instruments differs greatly from that of humans, extraction from such recordings produces many deviations; the present invention therefore preferentially uses a cappella song samples.
Among the speech parameters, the duration parameter is the articulation time of each syllable and can be determined from the waveform file; the fundamental frequency parameter is the vibration frequency of the sound wave, and can be extracted by first detecting the period of the waveform and then taking its reciprocal.
In specific implementations, mature tools can be used to extract the duration and fundamental frequency parameters from the song sample automatically; the invention places no limit on the specific extraction method.
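The period-then-reciprocal extraction described above can be sketched as follows, assuming NumPy. This toy autocorrelation estimator stands in for the "mature tools" the text refers to and is far less robust than a production pitch extractor.

```python
import numpy as np

# Sketch of F0 extraction: find the waveform period via autocorrelation,
# then take its reciprocal (F0 = sample_rate / period_in_samples).
def estimate_f0(signal, sample_rate, fmin=50.0, fmax=500.0):
    sig = signal - np.mean(signal)
    corr = np.correlate(sig, sig, mode="full")[len(sig) - 1:]
    lo = int(sample_rate / fmax)            # shortest plausible period (samples)
    hi = int(sample_rate / fmin)            # longest plausible period (samples)
    lag = lo + int(np.argmax(corr[lo:hi]))  # best-matching period
    return sample_rate / lag                # reciprocal of the period

sr = 4000
t = np.arange(sr) / sr
f0 = estimate_f0(np.sin(2 * np.pi * 100.0 * t), sr)  # 100 Hz test tone
```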
In addition, the invention generally generates one song template per song sample, where the song sample may be a complete song or a song fragment. For the user's convenience in selecting a template, each song template can be named; for example, the naming rule can be the song title: 大约在冬季 ("Probably in Winter"), 月亮代表我的心 ("The Moon Represents My Heart"), 春天的故事 ("The Story of Spring"), and so on.
When the user inputs text, the system can present the song templates built offline as options, and the user can select a suitable song template according to personal habits, application scenarios, and other practical needs.
Specifically, step 104 can be realized by the following sub-steps:
Sub-step B1: obtaining the number of syllables in the syllable sequence.
Sub-step B2: extracting, from the song template, duration and fundamental frequency parameters corresponding to that number of syllables, and overwriting the planned duration and fundamental frequency parameters with them.
Suppose the number of syllables in the obtained syllable sequence is N and the number of syllables in the song template is M, where M and N are natural numbers. The adjustment step of the present invention mainly has two cases:
Case 1: M ≥ N.
In this case, the duration and fundamental frequency parameters of the first N syllables can be taken directly from the song template.
Case 2: M < N.
In this case, the duration and fundamental frequency parameters of the M syllables in the song template can be reused cyclically. Suppose the syllable indices in the song template are 1, 2, ..., M, and suppose N > 2M; then the template indices corresponding to the finally obtained duration and fundamental frequency parameters are: 1, 2, ..., M, 1, 2, ..., M, 1, 2, ..., continuing until N parameters have been obtained.
Here, overwriting the planned duration and fundamental frequency parameters means replacing the original duration and fundamental frequency parameters with those from the song template.
In practice, the overwrite can be performed immediately after extracting the duration and fundamental frequency parameters of a single syllable, followed by extraction and overwriting for the next syllable; alternatively, the overwrite can be performed once after the parameters of all N syllables have been extracted. The invention places no limit on the specific order of operations.
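Both cases of sub-step B2 reduce to a single cyclic-index rule, sketched below (the function name is ours, not the patent's): taking the first N template entries when M ≥ N is just the first N steps of cycling through the M entries.

```python
# Sketch of sub-step B2: which song-template syllable supplies the duration
# and F0 parameters for each of the N syllables in the input sequence.
def template_indices(n_syllables, template_size):
    """Return the 1-based template index used for each target syllable."""
    return [(i % template_size) + 1 for i in range(n_syllables)]

# Case 1, M >= N: the first N template entries are used directly.
print(template_indices(3, 5))   # [1, 2, 3]
# Case 2, M < N: the M entries are reused cyclically.
print(template_indices(7, 3))   # [1, 2, 3, 1, 2, 3, 1]
```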
Step 105: performing interpolation adjustment on the spectrum parameters of the corresponding syllables according to the adjusted duration parameters.
The precondition for speech synthesis with a synthesizer is a one-to-one correspondence between the fundamental frequency parameters and the spectrum parameters, i.e., each fundamental frequency parameter must correspond to exactly one spectrum parameter. This step therefore adjusts the spectrum parameters so that they match the fundamental frequency parameters obtained in step 104, in preparation for the next synthesis step.
The adjustment process is illustrated below with a concrete example.
Suppose step 103 plans a duration of 400 ms for a syllable in the sequence, with 1000 samples per second, i.e., a sampling frequency of 1000 Hz; by calculation, the number of fundamental frequency parameters and spectrum parameters is then 400.
Suppose step 104, according to the song template selected by the user and the number of syllables in the sequence, adjusts the duration to 500 ms; the number of fundamental frequency parameters then becomes 500.
This step accordingly interpolates the 400 spectrum parameters from step 103 into 500 spectrum parameters.
There are many interpolation methods, for example linear interpolation, non-linear interpolation, two-point interpolation, multi-point interpolation, and so on; those skilled in the art can adopt any of them as needed, and the invention places no limit on this.
For example, with two-point linear interpolation, the interpolation formula can be Qs = (a·Q1 + b·Q2 + u1)/(a + b), where Q1 and Q2 are the spectrum parameters of the two known spectrum parameter points 1 and 2 (which may be original spectrum parameter points from step 103, or new spectrum parameter points already obtained in this step), a and b are natural numbers representing the weights that points 1 and 2 contribute to the point S to be interpolated, and 0 < u1 < a + b.
In summary, this step interpolates M spectrum parameters into N, to satisfy the requirement that each fundamental frequency parameter corresponds to one spectrum parameter, where the value of M can be obtained from step 103, the value of N from step 104, and M and N are natural numbers.
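The M-to-N resampling just summarized can be sketched as follows, assuming NumPy; `np.interp` stands in for the patent's two-point formula, and the 400-to-500 sizes mirror the example above while the spectrum values themselves are made up.

```python
import numpy as np

# Sketch of step 105: linearly interpolate M planned spectrum parameters
# into N, so each of the N F0 parameters has a matching spectrum parameter.
def resample_spectrum(spectrum, n_target):
    m = len(spectrum)
    old_positions = np.linspace(0.0, 1.0, m)        # M known points
    new_positions = np.linspace(0.0, 1.0, n_target)  # N target points
    return np.interp(new_positions, old_positions, spectrum)

planned = np.sin(np.linspace(0, np.pi, 400))   # 400 planned spectrum values
adjusted = resample_spectrum(planned, 500)     # 500 values after adjustment
```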
Step 106: obtaining, with a synthesizer, the speech data corresponding to the syllable sequence according to the duration, fundamental frequency, and spectrum parameters of each syllable in the sequence.
Because parametric synthesis offers advantages such as a wide adjustment range and strong voice plasticity, it has been widely used in speech synthesis. In practice, an LPC (linear predictive coding) filter can be employed as the synthesizer; the invention places no limit on the specific synthesizer.
Since the duration and fundamental frequency parameters of the song template have been incorporated, the synthesized speech data has the same melody and rhythm as the song.
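A minimal sketch of LPC-style synthesis as described in this step, assuming NumPy: an impulse train at the fundamental frequency excites an all-pole filter whose coefficients play the role of the spectrum parameters. The coefficient, F0, and duration values are made-up placeholders, not data from the patent.

```python
import numpy as np

# Sketch of step 106: excite an all-pole (LPC-style) filter with an
# impulse train at the fundamental frequency.
def lpc_synthesize(f0_hz, duration_s, lpc_coeffs, sample_rate=8000):
    n = int(duration_s * sample_rate)
    period = int(sample_rate / f0_hz)
    excitation = np.zeros(n)
    excitation[::period] = 1.0             # impulse train at F0
    out = np.zeros(n)
    for i in range(n):                     # all-pole IIR filtering:
        acc = excitation[i]                # y[i] = x[i] - sum_k a_k * y[i-k]
        for k, a in enumerate(lpc_coeffs, start=1):
            if i - k >= 0:
                acc -= a * out[i - k]
        out[i] = acc
    return out

speech = lpc_synthesize(200.0, 0.1, [-0.9])   # 100 ms at 200 Hz, one pole
```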
Referring to Fig. 2, a structural diagram of an embodiment of a humming synthesis system according to the present invention is shown, which may specifically comprise:
an interface module 201 for receiving text input by a user;
a text analysis module 202 for performing text analysis to obtain a syllable sequence corresponding to the text and the syllable name of each syllable in the sequence;
a parameter planning module 203 for planning, for each syllable in the syllable sequence, according to its syllable name and context in combination with a statistical parameter model, to obtain corresponding duration, fundamental frequency, and spectrum parameters;
a first parameter adjustment module 204 for adjusting the planned duration and fundamental frequency parameters according to a song template selected by the user and the number of syllables in the syllable sequence, wherein the song template stores the duration and fundamental frequency parameters of syllables;
a second parameter adjustment module 205 for performing interpolation adjustment on the spectrum parameters of the corresponding syllables according to the adjusted duration parameters;
a synthesis module 206 for obtaining, with a synthesizer, the speech data corresponding to the syllable sequence according to the duration, fundamental frequency, and spectrum parameters of each syllable in the sequence.
In practice, the text analysis module 202 may further comprise:
a word segmentation unit C1 for performing word segmentation on the text;
a numeric character processing unit C2 for converting numeric characters in the text into words;
a prosody prediction unit C3 for performing prosody prediction on the converted text according to the word segmentation result;
a syllable conversion unit C4 for converting the text into a syllable sequence according to the prosody prediction result, and obtaining the syllable name of each syllable in the sequence based on a syllable mapping table.
The present invention can build the song template with the following offline song template generation module, which may specifically comprise:
an extraction unit D1 for extracting, from a song sample, the duration and fundamental frequency parameters of each syllable therein;
a saving unit D2 for saving the duration and fundamental frequency parameters, together with the corresponding sampling frequency, into the song template.
Since an ordinary song consists of both vocals and accompanying music, and the sound production of musical instruments differs greatly from that of humans, extraction from such recordings produces many deviations; the present invention therefore preferentially uses a cappella song samples.
When the user inputs text, the system can present the song templates built offline as options, and the user can select a suitable song template according to personal habits, application scenarios, and other practical needs.
Specifically, the first parameter adjustment module 204 can comprise the following units:
an acquiring unit E1 for obtaining the number of syllables in the syllable sequence;
an adjustment unit E2 for extracting parameter information corresponding to that number of syllables from the song template and overwriting the planned duration and fundamental frequency parameters with it.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the identical or similar parts, the embodiments may be cross-referenced. Since the system embodiment is substantially similar to the method embodiment, its description is relatively brief; for relevant details, refer to the description of the method embodiment.
The present invention can be applied to various computer terminals and digital mobile devices to convert any text received by or input into the system into a speech stream carrying the rhythm and melody of a song.
The humming synthesis method and system provided by the present invention have been described in detail above. Specific examples have been used herein to set forth the principles and embodiments of the invention, and the above descriptions of the embodiments are intended only to help in understanding the method of the invention and its core ideas. Meanwhile, those of ordinary skill in the art may, following the ideas of the invention, make changes to the specific embodiments and the scope of application. In summary, this specification should not be construed as limiting the invention.

Claims (10)

1. A humming synthesis method, characterized by comprising:
receiving text input by a user;
performing text analysis to obtain a syllable sequence corresponding to the text, and the syllable name of each syllable in the syllable sequence;
for each syllable in the syllable sequence, planning, according to its syllable name and context in combination with a statistical parameter model, to obtain corresponding duration, fundamental frequency, and spectrum parameters;
adjusting the planned duration and fundamental frequency parameters according to a song template selected by the user and the number of syllables in the syllable sequence, wherein the song template stores the duration and fundamental frequency parameters of syllables;
performing interpolation adjustment on the spectrum parameters of the corresponding syllables according to the adjusted duration parameters;
obtaining, with a synthesizer, the speech data corresponding to the syllable sequence according to the duration, fundamental frequency, and spectrum parameters of each syllable in the sequence.
2. The method of claim 1, characterized in that the step of adjusting the duration and fundamental frequency parameters comprises:
obtaining the number of syllables in the syllable sequence;
extracting, from the song template, duration and fundamental frequency parameters corresponding to that number of syllables, and overwriting the planned duration and fundamental frequency parameters with them.
3. The method of claim 1, characterized in that the text analysis step comprises:
performing word segmentation on the text;
converting numeric characters in the text into words;
performing prosody prediction on the converted text according to the word segmentation result;
converting the text into a syllable sequence according to the prosody prediction result, and obtaining the syllable name of each syllable in the sequence based on a syllable mapping table.
4. The method of claim 1, characterized in that the song template is generated as follows:
for a song sample, extracting the duration and fundamental frequency parameters of each syllable therein;
saving the duration and fundamental frequency parameters into the song template.
5. The method of claim 4, characterized in that the song sample comprises an a cappella song sample.
6. a humming synthesis system is characterized in that, comprising:
Interface module is used to receive the text that the user imports;
Text analysis model is used to carry out text analyzing, obtains the syllable sequence corresponding with described text, and, the syllable title of each syllable in this syllable sequence;
The parametric programming module is used at each syllable of described syllable sequence, and according to its syllable title and context environmental, in conjunction with the statistical parameter model, planning obtains corresponding time length parameter, base frequency parameters and spectrum parameter;
First parameter adjustment module, be used for according to the song template of user's selection and the syllable number of described syllable sequence, duration parameters, base frequency parameters that described planning obtains are adjusted, wherein, articulatory duration parameters of storage and base frequency parameters in the described song template;
Second parameter adjustment module is used for according to adjusted duration parameters the spectrum parameter of corresponding syllables being carried out the interpolation adjustment;
Synthesis module is used for duration parameters, base frequency parameters and spectrum parameter according to described each syllable of syllable sequence, utilizes compositor to obtain the speech data corresponding with described syllable sequence.
7. The system as claimed in claim 6, characterized in that the first parameter adjustment module comprises:
an acquiring unit, configured to obtain the number of syllables in the syllable sequence;
an adjustment unit, configured to extract parameter information corresponding to the syllable count from the song template, overwrite the planned duration parameters and base frequency parameters with it, and interpolate the spectrum parameters according to the planned durations.
8. The system as claimed in claim 6, characterized in that the text analysis module comprises:
a word segmentation unit, configured to perform word segmentation on the text;
a numeric character conversion unit, configured to convert numeric characters in the text into words;
a prosody prediction unit, configured to perform prosody prediction on the text after numeric character conversion according to the word segmentation result;
a syllable conversion unit, configured to convert the text into a syllable sequence according to the prosody prediction result and, based on a syllable mapping table, obtain the syllable name of each syllable in the syllable sequence.
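A toy illustration of the units of claim 8: word segmentation is assumed already done, and both lookup tables below are invented placeholders, not the patent's actual syllable mapping table:

```python
# Illustrative digit-to-word and syllable-name tables (placeholders only).
DIGIT_WORDS = {"0": "ling", "1": "yi", "2": "er", "3": "san"}
SYLLABLE_TABLE = {"ni": "ni3", "hao": "hao3", "er": "er4", "ling": "ling2"}

def analyze(tokens):
    """tokens: pre-segmented text (the word segmentation unit's output).
    Expands digits into words, then looks up each unit's syllable name."""
    expanded = []
    for tok in tokens:
        if tok.isdigit():                        # numeric character conversion
            expanded.extend(DIGIT_WORDS[ch] for ch in tok)
        else:
            expanded.append(tok)
    # A real system would run prosody prediction here to mark phrase breaks;
    # this toy goes straight to the syllable mapping table.
    return [(unit, SYLLABLE_TABLE.get(unit, unit)) for unit in expanded]

result = analyze(["ni", "hao", "2"])
```

The returned (unit, syllable-name) pairs correspond to the syllable sequence and syllable names that the parameter planning module of claim 6 consumes.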
9. The system as claimed in claim 6, characterized in that it further comprises a song template generation module, and the song template generation module comprises:
an extraction unit, configured to extract, from a song sample, the duration parameters and base frequency parameters of each syllable;
a saving unit, configured to save the duration parameters and base frequency parameters to the song template.
10. The system as claimed in claim 9, characterized in that the song sample comprises an a cappella song sample.
CN2010102234975A 2010-06-30 2010-06-30 Humming synthesis method and system Pending CN101901598A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010102234975A CN101901598A (en) 2010-06-30 2010-06-30 Humming synthesis method and system

Publications (1)

Publication Number Publication Date
CN101901598A true CN101901598A (en) 2010-12-01

Family

ID=43227091

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010102234975A Pending CN101901598A (en) 2010-06-30 2010-06-30 Humming synthesis method and system

Country Status (1)

Country Link
CN (1) CN101901598A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11184490A (en) * 1997-12-25 1999-07-09 Nippon Telegr & Teleph Corp <Ntt> Singing synthesizing method by rule voice synthesis
CN1661674A (en) * 2004-01-23 2005-08-31 雅马哈株式会社 Singing generator and portable communication terminal having singing generation function
EP1256932B1 (en) * 2001-05-11 2006-05-10 Sony France S.A. Method and apparatus for synthesising an emotion conveyed on a sound
CN101308652A (en) * 2008-07-17 2008-11-19 安徽科大讯飞信息科技股份有限公司 Synthesizing method of personalized singing voice
WO2010031437A1 (en) * 2008-09-19 2010-03-25 Asociacion Centro De Tecnologias De Interaccion Visual Y Comunicaciones Vicomtech Method and system of voice conversion

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103915093A (en) * 2012-12-31 2014-07-09 安徽科大讯飞信息科技股份有限公司 Method and device for realizing voice singing
CN103915093B (en) * 2012-12-31 2019-07-30 科大讯飞股份有限公司 A kind of method and apparatus for realizing singing of voice
CN103456295B (en) * 2013-08-05 2016-05-18 科大讯飞股份有限公司 Sing synthetic middle base frequency parameters and generate method and system
CN103456295A (en) * 2013-08-05 2013-12-18 安徽科大讯飞信息科技股份有限公司 Method and system for generating fundamental frequency parameters in singing synthesis
CN104391980B (en) * 2014-12-08 2019-03-08 百度在线网络技术(北京)有限公司 The method and apparatus for generating song
CN104391980A (en) * 2014-12-08 2015-03-04 百度在线网络技术(北京)有限公司 Song generating method and device
CN105740394A (en) * 2016-01-27 2016-07-06 广州酷狗计算机科技有限公司 Music generation method, terminal, and server
WO2017190674A1 (en) * 2016-05-04 2017-11-09 腾讯科技(深圳)有限公司 Method and device for processing audio data, and computer storage medium
CN105788589B (en) * 2016-05-04 2021-07-06 腾讯科技(深圳)有限公司 Audio data processing method and device
CN105788589A (en) * 2016-05-04 2016-07-20 腾讯科技(深圳)有限公司 Audio data processing method and device
US10789290B2 (en) 2016-05-04 2020-09-29 Tencent Technology (Shenzhen) Company Limited Audio data processing method and apparatus, and computer storage medium
CN107749301B (en) * 2017-09-18 2021-03-09 得理电子(上海)有限公司 Tone sample reconstruction method and system, storage medium and terminal device
CN107749301A (en) * 2017-09-18 2018-03-02 得理电子(上海)有限公司 A kind of tone color sample reconstructing method and system, storage medium and terminal device
CN107705782A (en) * 2017-09-29 2018-02-16 百度在线网络技术(北京)有限公司 Method and apparatus for determining phoneme pronunciation duration
CN108053814A (en) * 2017-11-06 2018-05-18 芋头科技(杭州)有限公司 A kind of speech synthesis system and method for analog subscriber song
CN108053814B (en) * 2017-11-06 2023-10-13 芋头科技(杭州)有限公司 Speech synthesis system and method for simulating singing voice of user
CN109801618A (en) * 2017-11-16 2019-05-24 深圳市腾讯计算机系统有限公司 A kind of generation method and device of audio-frequency information
CN108257609A (en) * 2017-12-05 2018-07-06 北京小唱科技有限公司 The modified method of audio content and its intelligent apparatus
CN108877753B (en) * 2018-06-15 2020-01-21 百度在线网络技术(北京)有限公司 Music synthesis method and system, terminal and computer readable storage medium
US10971125B2 (en) 2018-06-15 2021-04-06 Baidu Online Network Technology (Beijing) Co., Ltd. Music synthesis method, system, terminal and computer-readable storage medium
CN108877753A (en) * 2018-06-15 2018-11-23 百度在线网络技术(北京)有限公司 Music synthesis method and system, terminal and computer readable storage medium
CN110047462A (en) * 2019-01-31 2019-07-23 北京捷通华声科技股份有限公司 A kind of phoneme synthesizing method, device and electronic equipment
CN110047462B (en) * 2019-01-31 2021-08-13 北京捷通华声科技股份有限公司 Voice synthesis method and device and electronic equipment
CN112270917A (en) * 2020-10-20 2021-01-26 网易(杭州)网络有限公司 Voice synthesis method and device, electronic equipment and readable storage medium
CN112270917B (en) * 2020-10-20 2024-06-04 网易(杭州)网络有限公司 Speech synthesis method, device, electronic equipment and readable storage medium

Similar Documents

Publication Publication Date Title
CN101901598A (en) Humming synthesis method and system
US20220013106A1 (en) Multi-speaker neural text-to-speech synthesis
KR101120710B1 (en) Front-end architecture for a multilingual text-to-speech system
KR20220004737A (en) Multilingual speech synthesis and cross-language speech replication
EP2704092A2 (en) System for creating musical content using a client terminal
Latorre et al. New approach to the polyglot speech generation by means of an HMM-based speaker adaptable synthesizer
CN101578659A (en) Voice tone converting device and voice tone converting method
Santra et al. Development of GUI for text-to-speech recognition using natural language processing
JP4829477B2 (en) Voice quality conversion device, voice quality conversion method, and voice quality conversion program
JP2006517037A (en) Prosodic simulated word synthesis method and apparatus
CN102543081A (en) Controllable rhythm re-estimation system and method and computer program product
CN112102811B (en) Optimization method and device for synthesized voice and electronic equipment
CN112037755B (en) Voice synthesis method and device based on timbre clone and electronic equipment
Macon et al. Concatenation-based midi-to-singing voice synthesis
KR20200069264A (en) System for outputing User-Customizable voice and Driving Method thereof
JP2014062970A (en) Voice synthesis, device, and program
JP2001034280A (en) Electronic mail receiving device and electronic mail system
CN112242134A (en) Speech synthesis method and device
CN115762471A (en) Voice synthesis method, device, equipment and storage medium
CN113539236B (en) Speech synthesis method and device
CN113724684A (en) Voice synthesis method and system for air traffic control instruction
Li et al. A lyrics to singing voice synthesis system with variable timbre
CN1979636B (en) Method for converting phonetic symbol to speech
JP2021148942A (en) Voice quality conversion system and voice quality conversion method
Kiruthiga et al. Annotating Speech Corpus for Prosody Modeling in Indian Language Text to Speech Systems

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20101201