JPS61166600A

JPS61166600A - Voice synthesizer

Info

Publication number: JPS61166600A
Application number: JP60007744A
Authority: JP
Inventors: 大橋　秀紀
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 1985-01-19
Filing date: 1985-01-19
Publication date: 1986-07-28
Anticipated expiration: 2012-08-20
Also published as: JP2642617B2

Abstract

(57) [Summary] This publication contains application data filed before electronic filing was introduced, so no abstract data is recorded.

Description

[Detailed Description of the Invention]
(a) Field of industrial application
The present invention relates to a speech synthesis device that synthesizes speech.

(b) Prior art
One conventional speech synthesis device, as proposed in Japanese Patent Application No. 59-2291, handles Japanese by storing syllable parameters such as LSP coefficients and PARCOR coefficients in a ROM (read-only memory). These parameters correspond to approximately 100 syllables, each of which combines a consonant part, a transition part leading from the consonant part to the vowel part, and a steady-state vowel part. The device resynthesizes speech in units of words and sentences by connecting the syllable parameters read from the ROM. This synthesis method is generally known as synthesis by rule.

Such a speech synthesis device can resynthesize any Japanese utterance as long as syllable parameters for approximately 100 syllables are stored in memory. However, because the output speech is built from the syllable data stored in the ROM, the device can only resynthesize speech that has the characteristics of that data. It therefore could not output synthesized speech with the characteristics of different speakers.

(c) Problems to be solved by the invention
The present invention has been made in view of the above points, and provides a speech synthesis device that uses the synthesis-by-rule method.

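(d) Means for solving the problems
The speech synthesis device of the present invention is provided with a feature parameter extraction unit that takes speech input from outside and extracts feature parameters indicating individuality, and a syllable parameter generation unit that uses the feature parameters obtained by the extraction unit to generate syllable parameters to which personal information has been added.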

(e) Operation
The speech synthesis device of the present invention stores, in a ROM (read-only memory), syllable parameters in which a consonant part, a transition part leading from the consonant part to the vowel part, and a steady-state vowel part are combined. Steady-state vowel parameters indicating individuality are input through a voice input unit and a speech analysis unit and stored in a buffer memory. The device has a syllable parameter generation unit that joins these steady-state vowel parameters to the consonant part and the consonant-to-vowel transition part cut out of the above syllable parameters, producing syllable parameters for new syllable units to which personal information has been added, and a syllable parameter buffer memory that stores them. Speech is synthesized from the resulting syllable parameter sequence and the pitch parameters matched to it.

(f) Embodiment
FIG. 1 shows an embodiment of the speech synthesis device of the present invention. In the figure, (10) is a microphone for voice input.

(11) is a parameter extraction circuit that analyzes the steady-state vowel input from the voice input microphone (10) and extracts speech feature parameters such as LSP parameters and PARCOR parameters. (12) is a steady-state vowel buffer memory (RAM) that temporarily stores the feature parameters of the steady-state vowel extracted by the parameter extraction circuit (11).

(4) is a standard syllable parameter memory in which the various syllable parameters, each combining a consonant part, a transition part leading from the consonant part to the vowel part, and a steady-state vowel part, are stored at assigned addresses. (6) is a syllable parameter generation unit. From each syllable parameter in the standard syllable parameter memory (4) it extracts only the consonant part and the transition part leading from the consonant part to the vowel, and combines the extracted data with the steady-state vowel held in the steady-state vowel buffer memory (12). The result is a new syllable parameter of the form (consonant part) + (transition part) + (input steady-state vowel), which carries the individuality information of the person who input the steady-state vowel. (5) is a user syllable parameter memory (RAM) that stores the syllable parameters with individuality created by the syllable parameter generation unit.

(1) is a keyboard on which character keys are arranged, and (2) is a decoder that receives key operation signals from the keyboard (1) and converts them into syllable-unit character signals corresponding to the keys. (3) is a syllable address table that links the character signals from the decoder (2) to the syllable addresses in the standard syllable parameter memory (4) and the user syllable parameter memory (5). (16) is a syllable length table that associates each syllable-unit character signal from the decoder (2) with the duration of that syllable.
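As an illustration only, the decoder (2), the syllable address table (3) and the syllable length table (16) can be pictured as one lookup from a key to a pair of addresses and a duration. The sketch below is not taken from the patent: the class name `SyllableEntry`, the function `decode_keys`, the addresses and the durations are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyllableEntry:
    standard_addr: int  # address in the standard syllable parameter memory (4)
    user_addr: int      # address in the user syllable parameter memory (5)
    duration_ms: int    # entry of the syllable length table (16)


# Hypothetical table contents; real addresses and durations are not given
# in the source document.
SYLLABLE_TABLE = {
    "ka": SyllableEntry(standard_addr=0x0000, user_addr=0x0000, duration_ms=150),
    "sa": SyllableEntry(standard_addr=0x0040, user_addr=0x0040, duration_ms=160),
    "a": SyllableEntry(standard_addr=0x0080, user_addr=0x0080, duration_ms=120),
}


def decode_keys(keys):
    """Map a sequence of syllable keys to their table entries.

    Raises KeyError for a key that has no entry in the table.
    """
    return [SYLLABLE_TABLE[key] for key in keys]
```

Keeping both addresses in one entry means the same key sequence can later be rendered from either memory without decoding it again.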

(7) is a synthesized speech selection unit that selects whether speech synthesis uses the syllable parameters of the standard syllable parameter memory (4) or those of the user syllable parameter memory (5). (8) is a syllable data length control unit that stretches or compresses the syllable data length so that it matches the syllable duration specified by the syllable length table (16). (9) is a speech data buffer memory consisting of a parameter area (9-a) and a pitch area (9-b). Syllable parameters from the standard syllable parameter memory (4) or the user syllable parameter memory (5) are stored in the parameter area (9-a) after their length has been adjusted by the syllable data length control unit (8), and new syllable parameters are stored in sequence as further keys are pressed on the keyboard (1).
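The source does not say how the syllable data length control unit (8) stretches or compresses the data. One simple possibility, shown here purely as an assumption, is to resample the sequence of parameter frames by linear interpolation. The function name `resize_frames` and the representation of a frame as a list of floats are hypothetical.

```python
def resize_frames(frames, target_len):
    """Stretch or compress a list of parameter frames to target_len frames.

    Each frame is a list of floats. Intermediate frames are obtained by
    linear interpolation between neighbouring frames (an assumption; the
    patent does not specify the method).
    """
    if not frames:
        raise ValueError("frames must not be empty")
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    if len(frames) == 1 or target_len == 1:
        # Nothing to interpolate between: repeat the first frame.
        return [list(frames[0]) for _ in range(target_len)]

    scale = (len(frames) - 1) / (target_len - 1)
    resized = []
    for i in range(target_len):
        pos = i * scale                      # position on the original time axis
        lo = int(pos)
        hi = min(lo + 1, len(frames) - 1)
        w = pos - lo                         # interpolation weight toward frames[hi]
        resized.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return resized
```

With this sketch the first and last frames are always preserved, so the boundaries between connected syllables are not shifted by the resizing.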

(13) is an accent designation unit for specifying the accent type of the synthesized speech. (14) is a pitch pattern designation circuit that generates a pitch pattern designation signal. This signal combines the accent specified by the accent designation unit (13) with the mora count, that is, the number of syllables in the synthesized speech obtained from the keyboard (1) input.

(15) is a pitch table holding the standard pitch parameters that determine the intonation and accent of the synthesized speech according to the pitch pattern designation signal from the pitch pattern designation circuit (14). The pitch parameters are stored as patterns, one for each combination of mora count and accent type.

That is, the patterns are set so that the pitch frequency at the accent position is relatively high. (17) is a pitch pattern matching circuit that, based on the duration of each syllable from the syllable length table (16), linearly compresses or linearly stretches, syllable by syllable, the standard pitch pattern of the synthesized speech obtained from the pitch table (15). The pitch pattern matched by this circuit is stored in the pitch area (9-b) of the speech data buffer memory (9), where it is associated with the syllable parameter sequence in the parameter area (9-a).
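The pitch table lookup and the per-syllable linear stretching described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the circuit of the patent: the table contents, the representation of a pattern as one short pitch contour (in Hz) per mora, and the names `PITCH_TABLE`, `stretch_contour` and `match_pitch_pattern` are all hypothetical.

```python
# (mora count, accent type) -> one standard pitch contour in Hz per mora.
# Hypothetical values; accent type 1 is assumed to mean "first mora high".
PITCH_TABLE = {
    (2, 1): [[140.0, 150.0], [120.0, 110.0]],
    (2, 0): [[110.0, 120.0], [125.0, 125.0]],
}


def stretch_contour(contour, n_frames):
    """Linearly stretch or compress one pitch contour to n_frames values."""
    if n_frames <= 0:
        raise ValueError("n_frames must be positive")
    if len(contour) == 1 or n_frames == 1:
        return [contour[0]] * n_frames
    scale = (len(contour) - 1) / (n_frames - 1)
    out = []
    for i in range(n_frames):
        pos = i * scale
        lo = int(pos)
        hi = min(lo + 1, len(contour) - 1)
        w = pos - lo
        out.append((1 - w) * contour[lo] + w * contour[hi])
    return out


def match_pitch_pattern(mora_count, accent_type, frames_per_syllable):
    """Fit the standard pattern to the given syllable lengths, syllable by syllable."""
    if len(frames_per_syllable) != mora_count:
        raise ValueError("one frame count is required per mora")
    pattern = PITCH_TABLE[(mora_count, accent_type)]
    matched = []
    for contour, n_frames in zip(pattern, frames_per_syllable):
        matched.extend(stretch_contour(contour, n_frames))
    return matched
```

Because the stretching is done per syllable rather than over the whole utterance, the accent peak stays on the syllable it belongs to even when neighbouring syllables have different lengths.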

(18) is a speech synthesis unit that takes as input the syllable parameter sequence stored in the speech data buffer memory (9) and the corresponding matched pitch pattern, and synthesizes and outputs a speech signal corresponding to the keyboard (1) input. (19) is an amplifier that amplifies the synthesized speech output from the speech synthesis unit (18), and the final synthesized speech is emitted from the speaker (20).

Next, the processing procedure in the syllable parameter generation unit (6) is explained with reference to the flowchart of FIG. 2.

First, a parameter is read from the standard syllable parameter memory (4), and it is determined whether the parameter belongs to the consonant part or the transition part leading from the consonant part to the vowel part, or to the vowel part. If it belongs to the consonant part or to the transition toward the vowel, it is written into the user syllable parameter memory; this continues up to the end of the data of the transition toward the steady-state vowel part. If the parameter from the standard syllable parameter memory (4) is a vowel, the corresponding steady-state vowel parameter is taken from the steady-state vowel buffer memory (12) in its place and, as shown in FIG. 3, written into the user syllable parameter memory immediately after the corresponding (consonant part to steady-state vowel transition) data. This operation is performed for every syllable parameter in the standard syllable parameter memory (4). As a result, syllable parameters that correspond to those in the standard syllable parameter memory (4) but carry new individuality information are generated in the user syllable parameter memory (5).
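The procedure above can be sketched in a few lines. The data layout is an assumption made for illustration: each syllable is a dictionary naming its vowel and holding a list of (segment label, frame) pairs in time order, with the steady-state vowel frames at the end. The function name `generate_user_syllables` and the labels `"consonant"`, `"transition"` and `"vowel"` are hypothetical.

```python
def generate_user_syllables(standard_memory, vowel_buffer):
    """Build user syllable parameters from standard syllables and user vowels.

    standard_memory: {syllable: {"vowel": str, "frames": [(label, frame), ...]}}
    vowel_buffer:    {vowel: [frame, ...]} steady-state vowel frames of the user
    Returns a new dictionary; the inputs are not modified.
    """
    user_memory = {}
    for name, syllable in standard_memory.items():
        new_frames = []
        for label, frame in syllable["frames"]:
            if label in ("consonant", "transition"):
                # Consonant and transition data are copied unchanged.
                new_frames.append((label, list(frame)))
            else:
                # Steady-state vowel reached: stop copying standard data.
                break
        # Append the user's own steady-state vowel in place of the standard one.
        user_vowel = vowel_buffer[syllable["vowel"]]
        new_frames.extend(("vowel", list(frame)) for frame in user_vowel)
        user_memory[name] = {"vowel": syllable["vowel"], "frames": new_frames}
    return user_memory
```

For example, a standard syllable "ka" made of one consonant frame, one transition frame and two vowel frames keeps its first two frames and receives whatever frames the user recorded for "a".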

(g) Effects of the invention
As is clear from the above description, the speech synthesis device of the present invention is provided with a feature parameter extraction unit that takes speech input from outside and extracts feature parameters indicating individuality, and a syllable parameter generation unit that uses the feature parameters obtained by the extraction unit to generate syllable parameters to which personal information has been added. Synthesis by rule therefore becomes possible using syllable parameters that contain the user's individuality information as the basic unit, and the device can output synthesized speech close to the user's own voice.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing the configuration of an embodiment of the speech synthesis device of the present invention. FIG. 2 and FIG. 3 are, respectively, a flowchart showing the processing procedure of the syllable parameter generation unit of the device and the corresponding memory map.

(1) ... keyboard, (2) ... decoder, (3) ... syllable address table, (4) ... standard syllable parameter memory, (5) ... user syllable parameter memory, (6) ... syllable parameter generation unit, (7) ... synthesized speech selection unit, (8) ... syllable data length control unit, (9) ... speech data buffer memory, (9-a) ... parameter area, (9-b) ... pitch area, (10) ... voice input microphone, (11) ... parameter extraction circuit, (12) ... steady-state vowel buffer memory, (13) ... accent designation unit, (14) ... pitch pattern designation circuit, (15) ... pitch table, (16) ... syllable length table, (17) ... pitch pattern matching circuit, (18) ... speech synthesis unit, (19) ... amplifier, (20) ... speaker.

Claims

[Claims]

A speech synthesis device that synthesizes the speech of arbitrary words or sentences based on a syllable parameter sequence formed by connecting syllable parameters, which are feature parameters of speech in syllable units, the device being characterized by comprising: a feature parameter extraction unit that takes speech input from outside and extracts feature parameters indicating individuality; and a syllable parameter generation unit that uses the feature parameters obtained by the extraction unit to generate syllable parameters to which new personal information is added.