JPS63210898A

JPS63210898A - Voice synthesizer

Info

Publication number: JPS63210898A
Application number: JP62043121A
Authority: JP
Inventors: 利光蓑輪
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1987-02-27
Filing date: 1987-02-27
Publication date: 1988-09-01

Abstract

(57) [Abstract] This publication contains application data filed before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION

(Field of Industrial Application) The speech synthesis device of the present invention is used for reading back character strings entered into a word processor or the like, as a reading aid for blind people, and for similar purposes.

(Prior Art) Fig. 3 shows the configuration of a conventional speech synthesis device. Reference numeral 1 denotes a vocal tract parameter file storing vocal tract parameters that express the characteristics of the vocal tract in terms of resonance and anti-resonance. The vocal tract parameters consist of formant frequency and bandwidth information obtained by analyzing speech approximately every 10 ms, LSP parameters obtained by converting the spectrum into a line spectrum, and the like. 2 denotes a vocal tract parameter combining unit which, when a character string is input, selects the vocal tract parameters of the syllables contained in the character string from the vocal tract parameter file 1, arranges them in time order, and inserts vocal tract parameters calculated by interpolation into the interpolation interval between each pair of syllables, thereby joining the syllables. 3 denotes an amplifier calculation unit which, when amplifier control information is input, determines the amplitude of the pulse train and the white noise. 4 denotes an intonation calculation unit which, when intonation control information is input, determines the pulse interval of the pulse train (hereinafter called the pitch). 5 denotes a pulse train generation unit which outputs a pulse train based on the amplitude determined by the amplifier calculation unit 3 and the pulse interval determined by the intonation calculation unit. 6 denotes a white noise generation unit which outputs white noise based on the amplitude determined by the amplifier calculation unit 3. 7 denotes an acoustic calculation unit which obtains the desired speech signal as the wave transmitted from the lips by calculating the transmitted and reflected waves in the vocal tract when the pulse train and the white noise are input to the vocal tract; the acoustic calculation unit 7 is implemented as a digital computer. 8 denotes a D/A converter which converts the digital speech signal output from the acoustic calculation unit 7 into an analog speech signal, and 9 denotes a speaker driven by the analog speech signal.
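
The two excitation sources described above (pulse train generation unit 5 and white noise generation unit 6) can be illustrated with a minimal Python sketch. The function names, the sample-based pitch interval, and the choice of uniform noise are assumptions made for illustration; the text does not specify how the units are implemented.

```python
import random

def pulse_train(amplitude, pitch_interval, n_samples):
    """Impulses of the given amplitude every pitch_interval samples, zeros elsewhere."""
    return [amplitude if n % pitch_interval == 0 else 0.0 for n in range(n_samples)]

def white_noise(amplitude, n_samples, seed=0):
    """Uniform white noise in [-amplitude, amplitude]; seeded for repeatability."""
    rng = random.Random(seed)
    return [rng.uniform(-amplitude, amplitude) for _ in range(n_samples)]

# Hypothetical values: unit amplitude, one pulse every 4 samples.
voiced_source = pulse_train(1.0, 4, 10)
unvoiced_source = white_noise(0.5, 10)
```

A shorter pitch interval gives a higher perceived pitch, which is why the intonation calculation unit controls this interval.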

In the conventional example configured in this way, when a character string, amplifier control information, and intonation control information are input, the intonation calculation unit 4 assigns a pitch value calculated from the tone component (see Fig. 4(a)) and the accent component (see Fig. 4(b)) to approximately the center of the vowel of each syllable, and then linearly interpolates between these spectral stationary points. This pitch value is hereinafter called the control point pitch (see Fig. 4(c)).
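
The conventional scheme, with one control point per syllable and straight lines between them, can be sketched in Python as follows. The function name, the millisecond time axis, and the sample values are hypothetical, and holding the pitch constant outside the first and last control points is an assumption of this sketch.

```python
from bisect import bisect_right

def linear_pitch_contour(control_points, frame_times):
    """Piecewise-linear pitch between one control point per syllable.

    control_points: list of (time_ms, pitch_hz), sorted by time,
                    one entry per syllable (placed at the vowel center).
    frame_times:    times (ms) at which a pitch value is wanted.
    """
    times = [t for t, _ in control_points]
    contour = []
    for t in frame_times:
        if t <= times[0]:
            # Assumption: hold the first pitch before the first control point.
            contour.append(control_points[0][1])
        elif t >= times[-1]:
            # Assumption: hold the last pitch after the last control point.
            contour.append(control_points[-1][1])
        else:
            j = bisect_right(times, t)
            (t0, p0), (t1, p1) = control_points[j - 1], control_points[j]
            contour.append(p0 + (p1 - p0) * (t - t0) / (t1 - t0))
    return contour

# Hypothetical control points for three syllables.
points = [(0, 100.0), (100, 120.0), (200, 80.0)]
contour = linear_pitch_contour(points, [50, 150])
```

Because each syllable contributes only one point, any pitch movement inside a syllable is lost.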

(Problem to be Solved by the Invention) However, because the conventional pitch generation method assigns only one spectral stationary point to each syllable, it cannot express the pitch drop in the voiced consonant part of syllables such as "ma" and "ba". Nor, as shown for example in Figs. 4(a), (b), and (c), can it express the pitch drop in the sentence "Kare ga imasu" ("He is here") between the syllable "ga" of "kare ga" and the syllable "i" of "imasu".

The present invention has been made in view of this problem, and its object is to provide a speech synthesis device that can express complex pitch changes like those of natural speech.

(Means for Solving the Problems) The present invention calculates the control point pitch between a plurality of spectral stationary points formed within a syllable based on the tone component and the accent component of the intonation control information, and then interpolates between those spectral stationary points using a nonlinear interpolation function determined by the positions of the spectral stationary points.

(Operation) According to the present invention, a plurality of spectral stationary points are provided for each syllable, so complex pitch changes can be expressed within each syllable. In addition, adjacent spectral stationary points are interpolated with a nonlinear interpolation function, so abrupt pitch changes between syllables can also be expressed.

(Embodiment) An embodiment of the present invention is described in detail below with reference to the drawings.

Fig. 1 shows the configuration of an embodiment of the present invention. Components bearing the same reference numerals as in Fig. 3 are the same parts. 10 denotes an intonation calculation unit consisting of a tone/accent component file 11, an intra-syllable pitch file 12, a control point pitch calculation unit 13, an interpolation function file 14, and an interpolation calculation unit 15.

In this embodiment configured in this way, when an input character string and intonation control information are input, the control point pitch calculation unit 13 first reads, based on the intonation control information, the tone start-point frequency F, the tone slope dF, and the accent component value f(i, T) (the pitch at the center of the vowel) from the tone/accent component file 11. For each syllable it also reads the intra-syllable control point pitch correction coefficient g(k, c) from the intra-syllable pitch file 12, and calculates the control point pitch f_{i,k} by the following equation.

where
f_{i,k}: the k-th control point pitch within the i-th input character;
N: the number of moras in the input character string;
f(i, T): the accent component value at the i-th mora of a type-T accent;
g(k, c): the pitch correction coefficient of the k-th control point within the syllable of input character c, with g(k, c) = 1.0 at the vowel center.

The number and positions of the spectral stationary points within a syllable are determined when the vocal tract parameter file 1 is created. That is, the spectral change is calculated when each syllable is analyzed. With a spectral change threshold of 1 dB or less and a duration threshold of 20 ms or more, the center of each interval in which the spectral change is small and the duration is long is taken as a spectral stationary point position; this yields two to three stationary point positions for syllables with a voiced consonant. The center of the vowel is always taken as a spectral stationary point position.
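
The threshold rule described above can be sketched in Python as a search for runs of frames whose spectral change stays at or below 1 dB for at least 20 ms. The function name, the per-frame input format, and the 10 ms frame period are assumptions for illustration; the text does not say how the spectral change is measured.

```python
def find_stationary_points(spectral_change_db, frame_ms=10.0,
                           max_change_db=1.0, min_duration_ms=20.0):
    """Return center times (ms) of intervals with small, sustained spectral change.

    spectral_change_db: per-frame spectral change in dB (hypothetical input).
    """
    centers = []
    start = None
    # A sentinel above the threshold closes a run that reaches the last frame.
    for idx, change in enumerate(list(spectral_change_db) + [float("inf")]):
        if change <= max_change_db:
            if start is None:
                start = idx  # a low-change run begins here
        elif start is not None:
            duration = (idx - start) * frame_ms
            if duration >= min_duration_ms:
                # The run covers frames start..idx-1; take its midpoint in time.
                centers.append((start + idx) / 2.0 * frame_ms)
            start = None
    return centers

# Hypothetical per-frame spectral change values for one syllable.
changes = [3.0, 0.5, 0.4, 0.6, 2.5, 0.3, 4.0, 0.2, 0.1]
points = find_stationary_points(changes)
```

This sketch does not force the vowel center to be a stationary point; a fuller version would add that position explicitly, as the description requires.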

Next, the interpolation calculation unit 15 interpolates between these spectral stationary points by evaluating an interpolation function read from the interpolation function file 14 according to the position information of the spectral stationary points.

There are five types of position information: sentence beginning, within a syllable, between syllables within a phrase (bunsetsu), between phrases, and sentence end. For each type, the average pitch change pattern observed in natural speech is stored as a table and used as the interpolation function.
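
A table-driven nonlinear interpolation of this kind can be sketched in Python as follows. The five keys mirror the five position types, but the table values, the normalization to the range 0 to 1, and all names are hypothetical; the actual contents of the interpolation function file are not given in the text.

```python
# Hypothetical shape tables: each maps normalized time 0..1 to a weight 0..1
# (0 = pitch of the left control point, 1 = pitch of the right control point).
INTERPOLATION_TABLES = {
    "sentence_start":    [0.0, 0.4, 0.7, 0.9, 1.0],
    "within_syllable":   [0.0, 0.25, 0.5, 0.75, 1.0],
    "between_syllables": [0.0, 0.1, 0.5, 0.9, 1.0],
    "between_phrases":   [0.0, 0.6, 0.85, 0.95, 1.0],
    "sentence_end":      [0.0, 0.05, 0.2, 0.5, 1.0],
}

def table_weight(table, x):
    """Look up the table at normalized position x, linearly between entries."""
    x = min(max(x, 0.0), 1.0)
    pos = x * (len(table) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(table) - 1)
    return table[lo] + (table[hi] - table[lo]) * (pos - lo)

def interpolate_pitch(p_left, p_right, position_type, n_frames):
    """Pitch values for n_frames points from the left to the right control point."""
    table = INTERPOLATION_TABLES[position_type]
    if n_frames == 1:
        return [p_left]
    return [p_left + (p_right - p_left) * table_weight(table, i / (n_frames - 1))
            for i in range(n_frames)]

# Hypothetical example: a fast pitch drop from 120 Hz to 80 Hz between phrases.
drop = interpolate_pitch(120.0, 80.0, "between_phrases", 5)
```

With a front-loaded table such as the hypothetical "between_phrases" entry, most of the pitch change happens early in the interval, which a straight line between the two control points cannot reproduce.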

(Effects of the Invention) As described above, according to the present invention, pitch is controlled at a plurality of positions within a syllable, and a nonlinear interpolation function that simulates the pitch changes in natural speech is used to interpolate between them. As a result, pitch changes within a syllable and abrupt pitch changes between phrases can be expressed as they are in natural speech.

[Brief Description of the Drawings]

Fig. 1 is a block diagram of an embodiment of the present invention; Fig. 2 is a block diagram of the intonation calculation unit in the embodiment; Fig. 3 is a block diagram of a conventional speech synthesis device; and Fig. 4 is an explanatory diagram of the change in pitch over time under the conventional pitch calculation method.

1: vocal tract parameter file; 2: vocal tract parameter combining unit; 3: amplifier calculation unit; 5: pulse train generation unit; 6: white noise generation unit; 7: acoustic calculation unit; 8: D/A converter; 9: speaker; 10: intonation calculation unit; 11: tone/accent component file; 12: intra-syllable pitch file; 13: control point pitch calculation unit; 14: interpolation function file; 15: interpolation calculation unit.

Patent applicant: Matsushita Electric Industrial Co., Ltd.

Claims

[Claims]

(1) A speech synthesis device that vocalizes an input character string, characterized in that the control point pitch between a plurality of spectral stationary points formed within each syllable is calculated based on the tone component and the accent component of intonation control information.

(2) A speech synthesis device that vocalizes an input character string, characterized in that the intervals between a plurality of spectral stationary points formed within each syllable are interpolated using a nonlinear interpolation function determined by the positions of the spectral stationary points.