JP2015087740A5 - Speech synthesis apparatus and speech synthesis method - Google Patents
- Publication number
- JP2015087740A5
- Authority
- JP
- Japan
- Prior art keywords
- pitch
- section
- speech synthesizer
- changed
- relationship
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Description
Consider a dialogue system that combines the speech synthesis technology described above with a spoken dialogue system, so that in response to a user's spoken question the system retrieves data and outputs it by speech synthesis. A known problem with such systems is that the synthesized speech can sound unnatural to the user; specifically, it can at times sound unmistakably machine-like.
The present invention has been made in view of these circumstances, and one of its objects is to provide a speech synthesis apparatus and a speech synthesis method that give the user a natural impression, specifically, that can convey a favorable or unfavorable impression to the user.
Claims (8)
1. A speech synthesis apparatus comprising:
a voice input unit that receives a question as a voice signal;
a pitch analysis unit that analyzes the pitch of a specific first section of the question;
an acquisition unit that acquires an answer to the question; and
a speech synthesis unit that changes the pitch of a specific second section of the acquired answer so that it has a predetermined relationship with the pitch of the first section, and outputs the result,
wherein the first section is the ending of the question, and
the second section is the beginning or ending of the answer.
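The adjustment claimed above can be sketched as a small frequency computation: estimate the pitch at the question's ending, then place the answer's pitch at a fixed interval relative to it. This is an illustrative sketch only; the function names and the choice of a fixed semitone offset are assumptions, not taken from the patent.

```python
# Illustrative sketch of the claimed pitch adjustment (names and the
# concrete interval are assumptions, not from the patent text).

def shift_semitones(freq_hz: float, semitones: float) -> float:
    """Transpose a frequency by a number of equal-tempered semitones."""
    return freq_hz * 2.0 ** (semitones / 12.0)

def answer_pitch(question_end_hz: float, interval_semitones: int = -7) -> float:
    """Pitch for the answer's second section, held in a fixed
    relationship (here, seven semitones below) to the pitch
    analyzed at the question's ending."""
    return shift_semitones(question_end_hz, interval_semitones)
```

For a question ending near A4 (440 Hz), this yields an answer pitch near D4 (about 293.7 Hz).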
2. The speech synthesis apparatus according to claim 1, wherein the predetermined relationship is a consonant interval excluding perfect unison.
3. The speech synthesis apparatus according to claim 1, wherein the pitch of the second section is changed so as to lie within one octave above or below the pitch of the first section, excluding the identical pitch, and is output.
4. The speech synthesis apparatus according to claim 2 or 3, wherein the pitch of the second section is changed to a pitch in a consonant-interval relationship a fifth below the pitch of the first section, and is output.
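A pitch a fifth below a reference can be computed either as the just-intonation frequency ratio 2:3 or as seven equal-tempered semitones down; the two differ by about 0.1%. Both formulas are standard music theory, shown here as an illustration rather than as the patent's method.

```python
# Two standard ways to compute a perfect fifth below a reference pitch.

def fifth_below_just(freq_hz: float) -> float:
    """Just intonation: a fifth below is a 2:3 frequency ratio."""
    return freq_hz * 2.0 / 3.0

def fifth_below_equal(freq_hz: float) -> float:
    """Equal temperament: seven semitones down."""
    return freq_hz * 2.0 ** (-7.0 / 12.0)
```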
5. The speech synthesis apparatus according to claim 2 or 3, wherein, when the pitch of the second section is to be changed to a pitch having the predetermined relationship with the pitch of the first section,
if the intended pitch is lower than a predetermined threshold pitch, it is further shifted one octave up, or
if the intended pitch is higher than a predetermined threshold pitch, it is further shifted one octave down.
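The threshold correction in claim 5 amounts to folding an out-of-range target pitch back toward the usable register by one octave. The sketch below is an assumption-laden illustration; the threshold values and the use of separate low and high thresholds are illustrative choices, not specified values.

```python
# Sketch of the claimed range correction: a target pitch below the low
# threshold is shifted one octave up; one above the high threshold is
# shifted one octave down. Threshold values are illustrative.

def correct_octave(target_hz: float,
                   low_hz: float = 110.0,
                   high_hz: float = 440.0) -> float:
    if target_hz < low_hz:
        return target_hz * 2.0   # one octave up
    if target_hz > high_hz:
        return target_hz / 2.0   # one octave down
    return target_hz             # already in range
```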
6. The speech synthesis apparatus according to claim 2 or 3, wherein, when the pitch of the second section is to be changed to a pitch having the predetermined relationship with the pitch of the first section, if a predetermined attribute is set, the pitch having the predetermined relationship is further shifted one octave up or down.
7. The speech synthesis apparatus according to claim 1, having a first mode and a second mode as operation modes, wherein the speech synthesis unit:
if the operation mode is the first mode, changes the pitch of the second section to a pitch in a consonant-interval relationship, excluding perfect unison, with the pitch of the first section, and outputs it; and
if the operation mode is the second mode, changes the pitch of the second section to a pitch in a dissonant-interval relationship with the pitch of the first section, and outputs it.
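The two modes can be sketched as selecting a semitone offset from a consonant or a dissonant interval table. The tables below follow the ordinary music-theory classification of intervals within an octave and are my assumption; the patent does not enumerate specific intervals, and the concrete offsets returned are example choices.

```python
# Interval tables (semitone offsets within +/- one octave, unison
# excluded) per the common consonant/dissonant classification; these
# are assumed for illustration, not listed in the patent.
CONSONANT_SEMITONES = [-12, -9, -8, -7, -5, -4, -3, 3, 4, 5, 7, 8, 9, 12]
DISSONANT_SEMITONES = [-11, -10, -6, -2, -1, 1, 2, 6, 10, 11]

def interval_for_mode(mode: str) -> int:
    """Return a semitone offset for the answer's pitch: a perfect
    fifth below in the first mode, a tritone in the second mode,
    as example choices from each table."""
    if mode == "first":
        return -7   # perfect fifth below: consonant
    return 6        # tritone: dissonant
```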
8. A speech synthesis method in which a computer:
acquires an answer to a question input as a voice signal;
analyzes the pitch of a specific first section of the question; and
changes the pitch of a specific second section of the acquired answer so that it has a predetermined relationship with the pitch of the first section, and outputs the result,
wherein the first section is the ending of the question, and
the second section is the beginning or ending of the answer.
Priority Applications (8)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2014048636A JP5954348B2 (en) | 2013-05-31 | 2014-03-12 | Speech synthesis apparatus and speech synthesis method |
US14/892,624 US9685152B2 (en) | 2013-05-31 | 2014-06-02 | Technology for responding to remarks using speech synthesis |
EP14803435.8A EP3007165B1 (en) | 2013-05-31 | 2014-06-02 | Technology for responding to remarks using speech synthesis |
PCT/JP2014/064631 WO2014192959A1 (en) | 2013-05-31 | 2014-06-02 | Technology for responding to remarks using speech synthesis |
CN201910272063.5A CN109887485A (en) | 2013-05-31 | 2014-06-02 | The technology responded to language is synthesized using speech |
CN201480031099.XA CN105247609B (en) | 2013-05-31 | 2014-06-02 | The method and device responded to language is synthesized using speech |
EP18178496.8A EP3399521B1 (en) | 2013-05-31 | 2014-06-02 | Technology for responding to remarks using speech synthesis |
US15/375,984 US10490181B2 (en) | 2013-05-31 | 2016-12-12 | Technology for responding to remarks using speech synthesis |
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2013115111 | 2013-05-31 | ||
JP2013198217 | 2013-09-25 | ||
JP2014048636A JP5954348B2 (en) | 2013-05-31 | 2014-03-12 | Speech synthesis apparatus and speech synthesis method |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP2016088798A Division JP6323491B2 (en) | 2013-05-31 | 2016-04-27 | Speech synthesis apparatus and speech synthesis method |
Publications (3)
Publication Number | Publication Date |
---|---|
JP2015087740A JP2015087740A (en) | 2015-05-07 |
JP2015087740A5 true JP2015087740A5 (en) | 2016-04-14 |
JP5954348B2 JP5954348B2 (en) | 2016-07-20 |
Family
ID=53050546
Family Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP2014048636A Expired - Fee Related JP5954348B2 (en) | 2013-05-31 | 2014-03-12 | Speech synthesis apparatus and speech synthesis method |
JP2016088798A Expired - Fee Related JP6323491B2 (en) | 2013-05-31 | 2016-04-27 | Speech synthesis apparatus and speech synthesis method |
JP2018076522A Expired - Fee Related JP6566076B2 (en) | 2013-05-31 | 2018-04-12 | Speech synthesis method and program |
Family Applications After (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP2016088798A Expired - Fee Related JP6323491B2 (en) | 2013-05-31 | 2016-04-27 | Speech synthesis apparatus and speech synthesis method |
JP2018076522A Expired - Fee Related JP6566076B2 (en) | 2013-05-31 | 2018-04-12 | Speech synthesis method and program |
Country Status (1)
Country | Link |
---|---|
JP (3) | JP5954348B2 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2017021125A (en) * | 2015-07-09 | 2017-01-26 | ヤマハ株式会社 | Voice interactive apparatus |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS57122497A (en) * | 1980-12-30 | 1982-07-30 | Tokuko Ikegami | Voice input/output apparatus |
JPS62115199A (en) * | 1985-11-14 | 1987-05-26 | 日本電気株式会社 | Voice responder |
JP3350293B2 (en) * | 1994-08-09 | 2002-11-25 | 株式会社東芝 | Dialogue processing device and dialogue processing method |
JP3437064B2 (en) * | 1997-08-25 | 2003-08-18 | シャープ株式会社 | Speech synthesizer |
JPH11175082A (en) * | 1997-12-10 | 1999-07-02 | Toshiba Corp | Voice interaction device and voice synthesizing method for voice interaction |
JP2001005487A (en) * | 1999-06-18 | 2001-01-12 | Mitsubishi Electric Corp | Voice recognition device |
JP3578961B2 (en) * | 2000-02-29 | 2004-10-20 | 日本電信電話株式会社 | Speech synthesis method and apparatus |
JP2003271194A (en) * | 2002-03-14 | 2003-09-25 | Canon Inc | Voice interaction device and controlling method thereof |
US8768701B2 (en) * | 2003-01-24 | 2014-07-01 | Nuance Communications, Inc. | Prosodic mimic method and apparatus |
US7280968B2 (en) * | 2003-03-25 | 2007-10-09 | International Business Machines Corporation | Synthetically generated speech responses including prosodic characteristics of speech inputs |
JP2005208162A (en) * | 2004-01-20 | 2005-08-04 | Canon Inc | Device and method for generating sound information, speech synthesis device and method therefor, and control program |
JP5750839B2 (en) * | 2010-06-14 | 2015-07-22 | 日産自動車株式会社 | Audio information presentation apparatus and audio information presentation method |
US20120234158A1 (en) * | 2011-03-15 | 2012-09-20 | Agency For Science, Technology And Research | Auto-synchronous vocal harmonizer |
WO2013187610A1 (en) * | 2012-06-15 | 2013-12-19 | Samsung Electronics Co., Ltd. | Terminal apparatus and control method thereof |
- 2014-03-12 JP JP2014048636A patent/JP5954348B2/en not_active Expired - Fee Related
- 2016-04-27 JP JP2016088798A patent/JP6323491B2/en not_active Expired - Fee Related
- 2018-04-12 JP JP2018076522A patent/JP6566076B2/en not_active Expired - Fee Related
Similar Documents
Publication | Publication Date | Title |
---|---|---|
MX2016013015A (en) | Methods and systems of handling a dialog with a robot. | |
WO2017085714A3 (en) | Virtual assistant for generating personal suggestions to a user based on intonation analysis of the user | |
JP2013102411A5 (en) | ||
WO2015017687A3 (en) | Systems and methods for producing predictive images | |
MX2018004828A (en) | Apparatus and method for generating a filtered audio signal realizing elevation rendering. | |
WO2017029488A3 (en) | Methods of generating personalized 3d head models or 3d body models | |
RU2016124468A (en) | CONTROL DEVICE, METHOD OF MANAGEMENT AND COMPUTER PROGRAM | |
JP2015092654A5 (en) | ||
WO2018081607A3 (en) | Methods of systems of generating virtual multi-dimensional models using image analysis | |
EP2703951A3 (en) | Sound to haptic effect conversion system using mapping | |
WO2017072754A3 (en) | A system and method for computer-assisted instruction of a music language | |
MX2019001216A (en) | Systems and methods for executing a supplemental function for a natural language query. | |
MX2020006034A (en) | Computer-implemented method and system for producing orthopedic care. | |
WO2015114216A3 (en) | Audio segments analysis to determine danceability of a music and for video and pictures synchronisaton to the music. | |
JP2018501575A5 (en) | ||
Thoret et al. | Seeing circles and drawing ellipses: when sound biases reproduction of visual motion | |
PH12018500750A1 (en) | Game apparatus and recording medium | |
EP3012831A3 (en) | Musical drumhead with tonal modification | |
WO2017106610A8 (en) | Method and system for providing automated localized feedback for an extracted component of an electronic document file | |
JP2015087740A5 (en) | Speech synthesis apparatus and speech synthesis method | |
JP2016136284A5 (en) | Speech synthesis apparatus and speech synthesis method | |
KR20130067839A (en) | Apparatus and method for generating motion effect data | |
Karlin et al. | The articulatory tone-bearing unit: Gestural coordination of lexical tone in Thai | |
WO2016029045A3 (en) | Lexical dialect analysis system | |
JP2020076844A5 (en) | Sound processing method, sound processing system and program |