JPS6286396A

JPS6286396A - Rule synthesization voice communication system

Info

Publication number: JPS6286396A
Application number: JP22597985A
Authority: JP
Inventors: 光宏由比藤; 勝憲下原; 徳永　幸生
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1985-10-12
Filing date: 1985-10-12
Publication date: 1987-04-20

Abstract

(57) [Summary] This publication contains application data filed before electronic filing, so no abstract data is recorded.

Description

[Detailed description of the invention]

[Field of industrial application]

The present invention relates to a rule-synthesized speech communication system in which text is transferred from a transmitting device and speech is synthesized by rule on the receiving device side.

[Conventional technology]

Fig. 6(a) shows a conventional, general rule-synthesized speech communication system in which the transmitting device side sends an accented kana sentence and the receiving device side performs rule synthesis of that sentence.

In Fig. 6(a), 1 is an accented kana sentence input, 2 is an accented kana sentence transmitting device, 3 is a line, 4 is an exchange, 5 is a line, 6 is an accented kana sentence receiving device with a built-in rule synthesizer whose voice pitch and speaking speed can be set in order to change the tone of the rule-synthesized speech, 7 is a speaker, and 8 is a control input for the voice pitch and speaking speed used to change the tone of the rule-synthesized speech.

In the example of Fig. 6(a), on the transmitting device side, the accented kana sentence input 1 created with a keyboard or the like is entered into the accented kana sentence transmitting device 2. The accented kana sentence transmitting device 2 sends the accented kana sentence input 1 onto line 3 in the frame format shown in Fig. 6(b).

In Fig. 6(b), A is an accented kana sentence identifier and B is the accented kana sentence. Fig. 6(b) shows only the parts related to rule synthesis; headers needed for frame synchronization and data transmission are omitted, and the same applies to the frame formats shown later. On the receiving device side, the voice pitch, speaking speed, and so on that change the tone of the rule synthesizer in the accented kana sentence receiving device 6 are set in advance through the control input 8. The accented kana sentence input 1 received through the exchange 4 and line 5 is sent to the accented kana sentence receiving device 6 and synthesized into speech.

Besides the system above, there is a rule-synthesized speech communication system that uses a single rule synthesizer capable of synthesizing several tones of voice. In this system, for example, formant information of phonemes extracted from the voices of one man and one woman is used, and the voice pitch and so on are varied, to synthesize the voices of three men (normal voice, low voice, elderly voice), two women (normal voice, low voice), and a boy. The tone of voice is specified within the input text given to the rule synthesizer. Therefore, if the transmitting device side sends text that carries a tone-of-voice specification and the receiving device side outputs rule-synthesized speech according to that specification, speech can be synthesized in the tone of voice specified by the transmitting device side.

[Problem that the invention seeks to solve]

In the conventional general rule-synthesized speech communication system described above, the transmitting device side has no means of directly specifying the tone of voice, tone quality, volume, and so on of the rule-synthesized speech at the receiving device side. Rule synthesis is therefore performed with the tone of voice, tone quality, and volume set at the receiving device side, and the speech cannot be output at the receiving device side with the characteristics the sender specified. In the system that uses a single rule synthesizer, speech can be synthesized in the tone of voice specified by the transmitting device side, but only tones of voice registered in advance can be synthesized. In addition, because the tone of voice is specified inside the text, it must be specified when the text is entered, and changing it later requires editing the tone-of-voice specification within the text.

The object of the present invention is to provide a rule-synthesized speech communication system in which the tone of voice, tone quality, volume, and speaking speed can be specified from the transmitting device side, and in which this designation information can be transmitted separately from the text being sent.

[Means for solving problems]

In the first invention of the rule-synthesized speech communication system according to the present invention, designation information that specifies at least one of the tone of voice, tone quality, volume, and speaking speed of the rule-synthesized speech at the receiving device side is added, on the transmitting device side, to the frame that carries the text and is transmitted. When performing rule synthesis of the received text, the receiving device side outputs rule-synthesized speech with the tone of voice, tone quality, volume, and speaking speed specified by the designation information.

In the second invention of the rule-synthesized speech communication system according to the present invention, speech parameters of speech units are transmitted from the transmitting device side before the text is transmitted and are stored in the storage unit of the rule synthesis means in the receiving device. When performing rule synthesis of the received text, the receiving device side then outputs rule-synthesized speech using the speech parameters of the speech units stored in the storage unit.

In the third invention of the rule-synthesized speech communication system according to the present invention, at least one set of speech parameters of speech units is transmitted from the transmitting device side before the text is transmitted and is stored in the storage unit of the rule synthesis means in the receiving device. Designation information that specifies at least one of the tone of voice, tone quality, volume, and speaking speed of the rule-synthesized speech at the receiving device side is then added to the frame that carries the text and transmitted. When performing rule synthesis of the received text, the receiving device side uses the speech parameters of the speech units stored in the storage unit and outputs rule-synthesized speech with the tone of voice, tone quality, volume, and speaking speed specified by the designation information.

[Effect]

In the first invention, the receiving device side outputs rule-synthesized speech based on the designation information sent from the transmitting device side.

In the second invention, the speech parameters of speech units sent from the transmitting device side are stored on the receiving device side, and the receiving device side outputs rule-synthesized speech based on these speech parameters.

In the third invention, the speech parameters of speech units sent from the transmitting device side are stored, and the receiving device side outputs rule-synthesized speech based on the designation information sent subsequently and on those speech parameters.

[Example]

Figs. 1(a) and 1(b) show the configuration of the first invention and the frame format, containing the text and the designation information, that is transmitted from the transmitting device side. The same reference numerals as in Figs. 6(a) and 6(b) denote the same parts. 11 is a transmitting device, 12 is an exchange, 13 is a receiving device, 13a is rule synthesis means for which at least one tone of voice can be set for the rule-synthesized speech and for which the tone quality, volume, speaking speed, and so on can be set arbitrarily, and 20 and 21 are lines. C is a tone-of-voice designation command, D is a tone-of-voice value, E is a tone quality designation command, F is a tone quality value, G is a volume designation command, H is a volume value, I is a speaking speed designation command, and J is a speaking speed value.

For the tone-of-voice value, tone quality value, volume value, and speaking speed value, the tone of voice, tone quality, volume, and speaking speed that correspond to each value are defined in advance.

Next, the operation is described.

First, on the transmitting device side, the text and the designation information that specifies the tone of voice, tone quality, volume, and speaking speed of the rule-synthesized speech are input to the transmitting device 11. The transmitting device 11 adds the designation information to the frame that carries the text, transmits it in the format shown in Fig. 1(b), and delivers it to the receiving device 13 through the exchange 12. On the receiving device side, the rule synthesis means 13a in the receiving device 13 performs rule synthesis of the text transmitted in the same frame, using the tone of voice, tone quality, volume, and speaking speed specified by the control (designation) information, and the rule-synthesized speech of the input text is output from the speaker 7.
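The frame of Fig. 1(b) can be sketched as follows. This is only an illustration under stated assumptions: the source names the fields (commands C, E, G, I, values D, F, H, J, identifier A, sentence B) but gives no byte values or field widths, so the one-byte codes, the UTF-8 sentence encoding, and the names `Designation`, `build_frame`, and `parse_frame` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumed one-byte codes: the source names fields C, E, G, I and A but gives no values.
CMD_VOICE = 0x10    # C: tone-of-voice designation command
CMD_QUALITY = 0x11  # E: tone quality designation command
CMD_VOLUME = 0x12   # G: volume designation command
CMD_SPEED = 0x13    # I: speaking speed designation command
ID_SENTENCE = 0x01  # A: accented kana sentence identifier

_FIELDS = {CMD_VOICE: "voice", CMD_QUALITY: "quality",
           CMD_VOLUME: "volume", CMD_SPEED: "speed"}


@dataclass
class Designation:
    voice: Optional[int] = None    # D: tone-of-voice value
    quality: Optional[int] = None  # F: tone quality value
    volume: Optional[int] = None   # H: volume value
    speed: Optional[int] = None    # J: speaking speed value


def build_frame(d: Designation, sentence: str) -> bytes:
    """Prefix the sentence with a (command, value) pair for each designated item."""
    out = bytearray()
    for cmd, name in _FIELDS.items():
        value = getattr(d, name)
        if value is not None:  # items left undesignated are simply not sent
            out += bytes([cmd, value])
    out.append(ID_SENTENCE)
    out += sentence.encode("utf-8")
    return bytes(out)


def parse_frame(frame: bytes) -> Tuple[Designation, str]:
    """Receiver side: read leading (command, value) pairs, then the sentence."""
    d = Designation()
    i = 0
    while i < len(frame) and frame[i] in _FIELDS:
        if i + 1 >= len(frame):
            raise ValueError("designation command without a value")
        setattr(d, _FIELDS[frame[i]], frame[i + 1])
        i += 2
    if i >= len(frame) or frame[i] != ID_SENTENCE:
        raise ValueError("sentence identifier missing")
    return d, frame[i + 1:].decode("utf-8")
```

Because the designation pairs sit in front of the sentence rather than inside it, the sentence itself can be reused unchanged with a different `Designation`, which is the separation the invention aims at.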

Figs. 2(a), 2(b), and 2(c) show the configuration of the second invention, the frame format of the speech parameters of speech units that the transmitting device side sends before the text, and the frame format of the transmitted text. The same reference numerals as in Figs. 1(a) and 1(b) denote the same parts. 11a is a speech parameter storage unit provided in the transmitting device 11, 14 is a receiving device, and 14a is rule synthesis means that includes a storage unit 14b for storing the speech parameters of the extracted speech units. K is a speech parameter identifier and L is the speech parameters.

Next, the operation is described.

First, before the text is transmitted, the transmitting device side sends the speech parameters of speech units, in the format shown in Fig. 2(b), that were either input to the transmitting device 11 or stored in the speech parameter storage unit 11a. They pass through the exchange 12 to the receiving device 14 and are stored in the storage unit 14b in the rule synthesis means 14a.

The transmitting device 11 then transmits the input text in the format shown in Fig. 2(c), and the text is delivered to the receiving device 14 through the exchange 12. On the receiving device side, the rule synthesis means 14a in the receiving device 14 performs rule synthesis using the speech parameters of the speech units stored in the storage unit 14b, and the rule-synthesized speech of the input text is obtained from the speaker 7.
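The two-phase exchange above (speech parameters first, text second) can be sketched as below. This is a simplified illustration, not the patent's implementation: frames are modelled as `(identifier, payload)` tuples instead of bytes, the identifier values are assumed, and "synthesis" is reduced to looking up the stored parameters for each speech unit. The names `Receiver`, `receive`, and `synthesize` are hypothetical.

```python
ID_SPEECH_PARAMS = 0x02  # K: speech parameter identifier (value assumed)
ID_SENTENCE = 0x01       # identifier of a text frame (value assumed)


class Receiver:
    """Stand-in for receiving device 14 with rule synthesis means 14a."""

    def __init__(self):
        # Stand-in for storage unit 14b: speech unit -> parameter list.
        self.storage = {}

    def receive(self, frame_id, payload):
        if frame_id == ID_SPEECH_PARAMS:
            # Phase 1: store the speech parameters sent ahead of the text.
            self.storage.update(payload)
            return None
        if frame_id == ID_SENTENCE:
            # Phase 2: synthesize the text from the stored parameters.
            return self.synthesize(payload)
        raise ValueError(f"unknown frame identifier: {frame_id!r}")

    def synthesize(self, units):
        # Placeholder for rule synthesis: return the parameters of each unit in order.
        missing = [u for u in units if u not in self.storage]
        if missing:
            raise KeyError(f"no speech parameters stored for units: {missing}")
        return [self.storage[u] for u in units]
```

A text frame that arrives before the matching parameters raises an error in this sketch; how a real receiver should behave in that case is not stated in the source.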

Figs. 3(a) to 3(c) show the configuration of the third invention, the frame format of the speech parameters that the transmitting device side sends before the text, and the frame format containing the text and the control information. The same reference numerals as in Figs. 1(a), 1(b) and Figs. 2(a), 2(b) denote the same parts. 15 is a receiving device, and 15a is rule synthesis means for which at least one tone of voice can be set for the rule-synthesized speech, for which the tone quality, volume, and speaking speed can be set arbitrarily, and which includes a storage unit 15b for storing the speech parameters of the extracted speech units. 15c is an identification unit that distinguishes the speech parameters of speech units, the designation information, and the accented kana sentence.

Next, the operation is described. Because the embodiment shown in Figs. 3(a) to 3(c) includes the embodiment shown in Figs. 1(a) and 1(b), it is described in detail.

The following case is described. Accented kana sentences are used as the transmitted text. Before transmitting the text, the transmitting device side sends the speech parameters of speech units extracted from the voices of two people. When transmitting the text, it adds to the frame that carries the text the designation information that specifies the tone of voice, tone quality, volume, speaking speed, and so on of the rule-synthesized speech at the receiving device side. The receiving device 15 is assumed to hold, in advance, the speech parameters of speech units extracted from the voices of three people; together with the parameters for the two voices sent before the text, speech parameters for five voices in total are available, and the tones of voice are synthesized from them. The receiving device side stores the received speech parameters for the two voices and then, when performing rule synthesis of the received text, uses these newly stored speech parameters together with the previously stored speech parameters for the three voices to output rule-synthesized speech with the tone of voice, tone quality, volume, speaking speed, and so on specified by the designation information.

First, the operation on the transmitting device side is described.

First, the speech parameters of speech units extracted from the voices of three people are transmitted from the speech parameter storage unit 11a in the format shown in Fig. 3(b) and stored in the storage unit 15b in the rule synthesis means 15a. Then, before the text is transmitted, the speech parameters for the two additional voices that are needed, either stored in the speech parameter storage unit 11a or input to the transmitting device 11 from outside, are stored in the storage unit 15b. Next, the designation information that specifies the tone of voice, tone quality, volume, speaking speed, and so on of the rule-synthesized speech at the receiving device side, and the accented kana sentence created with a keyboard or the like, are input to the transmitting device 11.

The transmitting device 11 transfers the accented kana sentence and the designation information either in the same frame, as shown in Fig. 3(c), or in separate frames, as shown in Figs. 4(a) and 4(b). To change the tone of voice, tone quality, volume, speaking speed, and so on partway through, the frame of Fig. 3(c) or the frames of Figs. 4(a) and 4(b) are transferred following the previously transferred frame.
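The two packaging options (one combined frame as in Fig. 3(c), or separate frames as in Figs. 4(a) and 4(b)) can be sketched as follows. The tuple layout and the function names `frames_for` and `apply_stream` are hypothetical. The sketch also assumes that items not mentioned in a later designation keep their previous values; the source only says that a change is made by sending further frames, so this persistence rule is an assumption.

```python
def frames_for(sentence, designation, combined=True):
    """Package one sentence and its designation as (kind, designation, sentence) frames."""
    if combined:
        # Fig. 3(c) style: designation and sentence travel in one frame.
        return [("designation+sentence", designation, sentence)]
    # Fig. 4(a), (b) style: designation frame first, sentence frame second.
    return [("designation", designation, None), ("sentence", None, sentence)]


def apply_stream(frames):
    """Return (sentence, settings in effect) for each sentence in a frame stream."""
    current = {}
    spoken = []
    for _kind, designation, sentence in frames:
        if designation is not None:
            # Assumed rule: new values override, unmentioned items persist.
            current = {**current, **designation}
        if sentence is not None:
            spoken.append((sentence, dict(current)))
    return spoken
```

For example, sending `frames_for("s1", {"speed": 1, "volume": 2})` followed by `frames_for("s2", {"speed": 3}, combined=False)` changes only the speaking speed for the second sentence under the persistence assumption.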

Next, the operation on the receiving device side is described.

Frames such as those shown in Figs. 3(b) and 3(c) or Figs. 4(a) and 4(b), received through the exchange 12, are input to the identification unit 15c in the receiving device 15. The identification unit 15c analyzes the received data and distinguishes the speech parameters of speech units, the designation information, and the accented kana sentence. First, the speech parameters of speech units sent before the text are stored in the storage unit 15b in the rule synthesis means 15a.

As a result, together with the previously stored speech parameters of speech units extracted from the voices of three people, speech parameters of speech units extracted from the voices of five people in total are held in the storage unit 15b.

Next, the designation information is input to the rule synthesis means 15a, and the tone of voice, tone quality, volume, speaking speed, and so on specified by the transmitting side are set. The identification unit 15c then sends the identified accented kana sentence to the rule synthesis means 15a, where it is synthesized into speech.
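The routing performed by the identification unit 15c can be sketched as below. Everything in it is a hypothetical stand-in: frames are `(kind, payload)` tuples, voices are keyed by made-up names, the default settings are arbitrary, and `synthesize` only reports which settings would be used instead of producing audio.

```python
class Receiver:
    """Stand-in for receiving device 15; all names and data shapes are assumptions."""

    def __init__(self, preset_voices):
        # Stand-in for storage unit 15b, preloaded with the voices held in advance.
        self.voices = dict(preset_voices)
        # Settings of rule synthesis means 15a (default values are arbitrary).
        self.settings = {"voice": None, "quality": 0, "volume": 0, "speed": 0}

    def identify(self, frame):
        """Identification unit 15c: route a frame by its kind."""
        kind, payload = frame
        if kind == "speech_params":
            name, units = payload
            self.voices[name] = units      # store newly received voice
            return None
        if kind == "designation":
            self.settings.update(payload)  # apply sender-specified settings
            return None
        if kind == "sentence":
            return self.synthesize(payload)
        raise ValueError(f"unknown frame kind: {kind!r}")

    def synthesize(self, sentence):
        voice = self.settings["voice"]
        if voice not in self.voices:
            raise KeyError(f"no speech parameters stored for voice {voice!r}")
        # Placeholder for rule synthesis: report the text and the settings in effect.
        return {"text": sentence, **self.settings}
```

Starting from three preset voices and feeding two `speech_params` frames leaves five voices in `voices`, matching the example in the text; a later `designation` frame can then select any of the five.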

Note the following. (1) This embodiment is described for the rule synthesis method, but it applies equally to a monosyllable editing method that uses monosyllables as speech units and connects them discretely. (2) The case in which the text is an accented kana sentence is shown, but the same applies to other kinds of text. (3) Possible speech units include phonemes, monosyllables, VCV, and CVC.

Rule-synthesized speech is often combined with other media and used as voice guidance. Even when the text to be synthesized by rule is combined with other media in this way, the output of the rule-synthesized speech can be controlled by configuring the part related to rule synthesis as in this embodiment.

The embodiment above shows an end-to-end form of communication, but an end-center-end form is also possible, in which one terminal serves as a center and ordinary telephones are connected to that center.

Fig. 5 shows an example of an end-center-end configuration.

In Fig. 5, 20 to 23 are lines, 24 to 26 are telephones, 27 is a center, and 28 is a terminal on the transmitting device side. The terminal 28 is a terminal of an information provider or the like, and it transfers phoneme file information, designation information, and text for rule synthesis onto line 20 in the frame formats of Figs. 3(b) and 3(c) and Figs. 4(a) and 4(b). The center 27 synthesizes speech from the phoneme file information, designation information, and text for rule synthesis received through line 20, and transfers the synthesized speech to the telephones 24 to 26 through lines 21 to 23.

[Effect of the invention]

In the first invention, designation information that specifies the tone of voice, tone quality, volume, speaking speed, and so on of the rule-synthesized speech at the receiving device side is added, on the transmitting device side, to the frame that carries the text and is transmitted, and the receiving device side, when performing rule synthesis of the received text, outputs rule-synthesized speech with the tone of voice, tone quality, volume, speaking speed, and so on specified by the designation information transmitted with that text. The receiving device side can therefore be made to output rule-synthesized speech with the tone of voice, tone quality, volume, speaking speed, and so on that the sender intended, and the text can be kept separate from the designation information for tone of voice, tone quality, volume, speaking speed, and so on.

In the second invention, the transmitting device side sends the speech parameters of speech units extracted from one person's voice before sending the text, and the receiving device side stores the newly received speech parameters of the speech units. This has the advantage that the received text can be synthesized by rule using these newly stored speech parameters.

In the third invention, the speech parameters of speech units extracted from the voices of one or more people are transmitted before the text is sent. When the text is sent, designation information that specifies the tone of voice, tone quality, volume, speaking speed, and so on of the rule-synthesized speech at the receiving device side is added to the frame that carries the text and transmitted. The receiving device side stores the newly received speech parameters of the speech units and, when performing rule synthesis of the received text, uses these newly stored speech parameters together with the speech parameters registered in advance to output rule-synthesized speech with the tone of voice, tone quality, volume, speaking speed, and so on specified by the designation information. Therefore, not only can the receiving device side output rule-synthesized speech with the tone of voice, tone quality, volume, speaking speed, and so on that the sender intended, but the text can also be kept separate from the designation information for tone of voice, tone quality, volume, speaking speed, and so on.

[Brief explanation of drawings]

Figs. 1(a) and 1(b) are diagrams showing the configuration of an embodiment of the first invention and the frame format containing the designation information. Figs. 2(a), 2(b), and 2(c) are diagrams showing the configuration of the second invention and its frame formats. Figs. 3(a), 3(b), and 3(c) are diagrams showing the configuration of the third invention and its frame formats, including the frame format containing the designation information. Figs. 4(a) and 4(b) are diagrams showing other frame formats. Fig. 5 is a diagram showing the configuration when the invention is used in an end-center-end form of communication. Figs. 6(a) and 6(b) are a configuration diagram and a frame format diagram of a conventional rule-synthesized speech communication system.

In the figures, 11 is a transmitting device, 11a is a speech parameter storage unit, 12 is an exchange, 13 is a receiving device, 13a is rule synthesis means, 14 is a receiving device, 14a is rule synthesis means, 14b is a storage unit, 15 is a receiving device, 15a is rule synthesis means, 15b is a storage unit, 15c is an identification unit, and 20 and 21 are lines.

Procedural amendment (formal)
1. Indication of the case: Japanese Patent Application No. 60-225979
2. Title of the invention: Rule-synthesized speech communication system
3. Person making the amendment: relationship to the case, patent applicant; address, 1-1-6 Uchisaiwai-cho, Chiyoda-ku, Tokyo; name, (422) Nippon Telegraph and Telephone Corporation
Kobayashi Patent Office, telephone 03 (496) 1256
5. Date of the amendment order: January 28, 1986 (Showa 61)
6. Subject of the amendment: drawings
7. Content of the amendment: Figs. 1 to 3 are amended as shown in the attached sheet.

Procedural amendment (voluntary), April 30, 1986 (Showa 61)

Claims

[Claims]

(1) A rule-synthesized speech communication system using a receiving device having rule synthesis means that synthesizes speech from an accented kana sentence, a sentence in the ordinary mixed kanji and kana orthography, or a foreign-language sentence input from the transmitting device side, in which at least one tone of voice can be set for the rule-synthesized speech and the tone quality, volume, and speaking speed can be set arbitrarily, characterized in that designation information specifying at least one of the tone of voice, tone quality, volume, and speaking speed of the rule-synthesized speech at the receiving device side is added, on the transmitting device side, to the frame that carries the text and is transmitted, and the receiving device side, when performing rule synthesis of the received text, outputs rule-synthesized speech with the tone of voice, tone quality, volume, and speaking speed specified by the designation information.

(2) A rule-synthesized speech communication system using a receiving device having rule synthesis means that synthesizes speech from an accented kana sentence, a sentence in the ordinary mixed kanji and kana orthography, or a foreign-language sentence input from the transmitting device side, in which at least one of the tone of voice, tone quality, volume, and speaking speed of the rule-synthesized speech can be set arbitrarily, and which includes a storage unit for storing speech parameters of speech units, characterized in that, before the text is transmitted from the transmitting device side, speech parameters of speech units are transmitted from the transmitting device side and stored in the storage unit of the rule synthesis means in the receiving device, and the receiving device side, when performing rule synthesis of the received text, outputs rule-synthesized speech using the speech parameters of the speech units stored in the storage unit.

(3) A rule-synthesized speech communication system using a receiving device having rule synthesis means that synthesizes speech from an accented kana sentence, a sentence in the ordinary mixed kanji and kana orthography, or a foreign-language sentence input from the transmitting device side, in which at least one tone of voice can be set for the rule-synthesized speech and the tone quality, volume, and speaking speed can be set arbitrarily, and which includes a storage unit for storing speech parameters of speech units, characterized in that, before the text is transmitted from the transmitting device side, at least one set of speech parameters of speech units is transmitted from the transmitting device side and stored in the storage unit of the rule synthesis means in the receiving device, designation information specifying at least one of the tone of voice, tone quality, volume, and speaking speed of the rule-synthesized speech at the receiving device side is then added to the frame that carries the text and transmitted, and the receiving device side, when performing rule synthesis of the received text, uses the speech parameters of the speech units stored in the storage unit and outputs rule-synthesized speech with the tone of voice, tone quality, volume, and speaking speed specified by the designation information.