JP2006119655A5

Info

Publication number: JP2006119655A5
Application number: JP2005336272A
Authority: JP
Filing date: 2005-11-21
Publication date: 2007-04-12
Anticipated expiration: 2021-03-09

Claims

1. A speech synthesizer comprising:
storage means for storing voice feature quantities at a specific time, indexed by phoneme and pitch;
template storage means for storing templates representing temporal changes in the voice feature quantities and the pitch, indexed by phoneme and pitch;
input means for inputting voice information for speech synthesis, including at least a pitch and a phoneme;
reading means for reading a voice feature quantity and a template from the storage means and the template storage means, respectively, according to the input voice information; and
speech synthesis means for applying the read template to the read voice feature quantity and to the pitch included in the input voice information, and synthesizing speech based on the voice feature quantity and the pitch after the application,
wherein, when the pitch included in the input voice information exceeds the highest index value in the storage means, the reading means obtains a pitch difference by subtracting the pitch of the highest index stored in the storage means from the pitch included in the input voice information, and outputs to the speech synthesis means a feature quantity obtained by adding the pitch difference to the feature quantity read from the storage means using the highest pitch index.
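The out-of-range rule at the end of this claim can be sketched in a few lines. This is only an illustration under stated assumptions, not the patented implementation: the table `STORE`, the function name `read_feature_high`, the use of a single scalar feature, and the idea that pitch and feature share a unit so they can be added directly are all hypothetical. The claim does not say how in-range pitches are looked up, so the sketch falls back to the nearest stored index.

```python
# Hypothetical store: one scalar feature indexed by (phoneme, pitch).
# All values are invented for illustration only.
STORE = {
    ("a", 200.0): 650.0,
    ("a", 400.0): 700.0,
}

def read_feature_high(phoneme, pitch, store=STORE):
    """Read a feature, extrapolating above the highest stored pitch index."""
    pitches = sorted(p for (ph, p) in store if ph == phoneme)
    highest = pitches[-1]
    if pitch > highest:
        # Pitch difference: input pitch minus the highest index pitch.
        diff = pitch - highest
        # Add the full difference to the feature stored at the highest index.
        return store[(phoneme, highest)] + diff
    # Assumption: otherwise use the nearest stored pitch index.
    nearest = min(pitches, key=lambda p: abs(p - pitch))
    return store[(phoneme, nearest)]
```

With the invented table above, a request at pitch 450.0 lies 50.0 above the highest index (400.0), so the stored value 700.0 is shifted up by 50.0.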

2. A speech synthesizer comprising:
storage means for storing voice feature quantities at a specific time, indexed by phoneme and pitch;
template storage means for storing templates representing temporal changes in the voice feature quantities and the pitch, indexed by phoneme and pitch;
input means for inputting voice information for speech synthesis, including at least a pitch and a phoneme;
reading means for reading a voice feature quantity and a template from the storage means and the template storage means, respectively, according to the input voice information; and
speech synthesis means for applying the read template to the read voice feature quantity and to the pitch included in the input voice information, and synthesizing speech based on the voice feature quantity and the pitch after the application,
wherein, when the pitch included in the input voice information is lower than the lowest index value in the storage means, the reading means obtains a pitch difference by subtracting the pitch of the lowest index stored in the storage means from the pitch included in the input voice information, and outputs to the speech synthesis means a feature quantity obtained by adding a specified ratio of the pitch difference to the feature quantity read from the storage means using the lowest pitch index.
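The low-pitch rule differs from the high-pitch case in that only a specified ratio of the pitch difference is added. A minimal sketch follows, assuming a scalar feature in the same unit as the pitch; the function name `read_feature_low`, the `ratio` parameter and its default of 0.5, and the example table are hypothetical and do not come from the claim.

```python
def read_feature_low(phoneme, pitch, store, ratio=0.5):
    """Read a feature, extrapolating below the lowest stored pitch index."""
    pitches = sorted(p for (ph, p) in store if ph == phoneme)
    lowest = pitches[0]
    if pitch < lowest:
        # Pitch difference: input pitch minus the lowest index pitch (negative).
        diff = pitch - lowest
        # Add only the specified ratio of the difference.
        return store[(phoneme, lowest)] + ratio * diff
    # Assumption: otherwise use the nearest stored pitch index.
    nearest = min(pitches, key=lambda p: abs(p - pitch))
    return store[(phoneme, nearest)]

# Invented example data, for illustration only.
example_store = {("a", 200.0): 650.0, ("a", 400.0): 700.0}
low_value = read_feature_low("a", 100.0, example_store, ratio=0.5)
```

Here the input pitch is 100.0 below the lowest index, so with a ratio of 0.5 the stored value 650.0 is lowered by 50.0.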

3. The speech synthesizer according to claim 2, wherein the voice feature quantities stored in the storage means include an excitation resonance, and the reading means corrects and outputs the excitation resonance so that the bandwidth of the excitation resonance becomes narrower as the pitch included in the input voice information becomes lower.
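The claim only requires that the bandwidth shrink as the pitch falls; it does not give a formula. One possible monotonic correction is sketched below, assuming linear scaling by the ratio of the input pitch to the lowest stored pitch. The function name `correct_bandwidth` and the scaling law are assumptions chosen for illustration.

```python
def correct_bandwidth(bandwidth, pitch, lowest_pitch):
    """Narrow the excitation-resonance bandwidth for pitches below the lowest index.

    Assumes pitch and lowest_pitch are positive and in the same unit.
    """
    if pitch >= lowest_pitch:
        # Assumption: no correction when the pitch is within the stored range.
        return bandwidth
    # Assumed linear law: the lower the pitch, the narrower the bandwidth.
    return bandwidth * (pitch / lowest_pitch)
```

Any other strictly decreasing mapping would satisfy the wording of the claim equally well; the linear form is used here only because it is the simplest to check.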

4. The speech synthesizer according to claim 2, wherein the voice feature quantities stored in the storage means include a formant, and the reading means corrects and outputs the formant so that the amplitude of the formant increases as the pitch included in the input voice information decreases.
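As with the bandwidth correction, the claim fixes only the direction of the change: formant amplitude goes up as pitch goes down. The sketch below assumes the amplitude is held in decibels and is raised by a fixed gain per octave below the lowest stored pitch. The function name `correct_formant_amplitude`, the decibel representation, and the `gain_db_per_octave` parameter are hypothetical.

```python
import math

def correct_formant_amplitude(amp_db, pitch, lowest_pitch, gain_db_per_octave=3.0):
    """Raise the formant amplitude for pitches below the lowest stored index.

    Assumes pitch and lowest_pitch are positive and in the same unit.
    """
    if pitch >= lowest_pitch:
        # Assumption: no correction when the pitch is within the stored range.
        return amp_db
    # Number of octaves the input pitch lies below the lowest index.
    octaves_below = math.log2(lowest_pitch / pitch)
    # Assumed law: add a fixed gain for each octave below the lowest index.
    return amp_db + gain_db_per_octave * octaves_below
```

For example, an input one octave below the lowest index gains exactly `gain_db_per_octave` decibels under this assumed law.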