JPS58154900A

JPS58154900A - Sentence voice converter

Info

Publication number: JPS58154900A
Application number: JP57037368A
Authority: JP
Inventors: 公一江尻
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1982-03-10
Filing date: 1982-03-10
Publication date: 1983-09-14
Also published as: JPH054676B2

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION The present invention relates to a text-to-speech conversion device that converts text given in the form of character information into speech and utters it.

Text-to-speech conversion devices that read input text aloud have been developed. Conventional devices of this kind, however, utter common words and special words such as proper nouns, technical terms, and new words with the same pronunciation characteristics, without distinguishing between them.

Common words and phrases are easy for ordinary listeners to catch, but special words such as those above are easily missed by people who are not used to them. This is clear if one thinks of a radio broadcast. Radio announcers help listeners understand by reading uncommon proper nouns, new words, technical terms, and even numerals and dates more slowly than other common words and phrases, by repeating them, or by rephrasing them.

Accordingly, an object of the present invention is to provide a text-to-speech conversion device that makes proper nouns, new words, technical terms, and the like easier to hear.

To achieve this object, the text-to-speech conversion device according to the present invention has means for identifying words of a specific type in the input text, and is configured so that, according to the identification result from that means, the pronunciation characteristics of words of the specific type differ from those of other types of words.

The present invention is described below through one embodiment, with reference to the drawings.

FIG. 1 is a block diagram of a text-to-speech conversion device according to the present invention.

In the figure, 1 is a text file in which input text containing a mixture of kana and kanji is stored in the form of character codes (for example, JIS codes). The input text read from the text file 1 is converted into speech and uttered by the voice generating section 12.

Specifically, the text file 1 is a storage device such as a magnetic tape device or a magnetic disk device.

3 is a word dictionary file, in which information on various kanji and kana words is filed in the format shown in FIG. 2. Here, W is a word code, C1 is a part-of-speech code indicating the part of speech of the word, and C2 is a reading code indicating the reading of the word. A word that can be read in two or more ways has two or more reading codes C2. Up to this point the format is the same as that of the word dictionary file used in conventional text-to-speech conversion devices, but in this embodiment a classification code C3 is added. The classification code C3 indicates whether the word is of a type whose pronunciation characteristics should differ from those of other ordinary words (referred to as a special word). Proper nouns such as uncommon personal and place names, technical terms, new words, numerals, and the like are selected as special words as needed.
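The dictionary record described above can be pictured with a small data structure. This is only an illustrative sketch: the patent defines the fields W, C1, C2, and C3 but not their encoding, so the Python types, the field names, and the sample entries below are all assumptions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DictionaryEntry:
    """One record of the word dictionary file 3 (field encodings are assumed)."""
    word: str                  # W: word code (held here as the surface string)
    pos: str                   # C1: part-of-speech code
    readings: Tuple[str, ...]  # C2: one or more reading codes
    is_special: bool           # C3: classification code (True = special word)


# Hypothetical sample entries, not taken from the patent.
ENTRIES = [
    # An ordinary word with two readings, hence two C2 codes.
    DictionaryEntry("今日", "noun", ("きょう", "こんにち"), False),
    # A proper noun flagged as a special word through C3.
    DictionaryEntry("リコー", "proper_noun", ("りこー",), True),
]
```

Only the last field goes beyond the conventional format, which is why the rest of the lookup path can stay unchanged.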

Returning to FIG. 1, 2 is a search section. Using a known method such as the two-bunsetsu longest-match method, the search section 2 detects case particles and phrase boundaries in the input text and, with reference to them, looks up each word of the input text in the word dictionary file 3.
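As a rough illustration of dictionary-driven word search, the following sketch uses a plain greedy longest match. It is a simplification and not the two-bunsetsu method named in the text; the `DICTIONARY` contents and the `segment` function are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical dictionary: surface form -> (C1 part of speech, C2 reading, C3 special flag)
DICTIONARY: Dict[str, Tuple[str, str, bool]] = {
    "東京": ("noun", "とうきょう", False),
    "東京都": ("noun", "とうきょうと", False),
    "に": ("particle", "に", False),
    "リコー": ("proper_noun", "りこー", True),
    "が": ("particle", "が", False),
}


def segment(text: str) -> List[Tuple[str, Tuple[str, str, bool]]]:
    """Split text into dictionary words, always taking the longest match."""
    max_len = max(len(word) for word in DICTIONARY)
    result = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then progressively shorter ones.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in DICTIONARY:
                result.append((candidate, DICTIONARY[candidate]))
                i += length
                break
        else:
            raise ValueError(f"no dictionary entry at position {i}: {text[i:]!r}")
    return result
```

Each match returns the C3 flag together with C1 and C2, which mirrors how the search section obtains the special-word classification as a by-product of the ordinary lookup.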

The part-of-speech code C1 of each retrieved word is sent to the accent determining section 5, the reading code C2 is sent to the monosyllable decomposition processing section 6, and the classification code C3 is sent to the pronunciation control section 9.

The search section 2 may be configured in the same way as the search section of a conventional text-to-speech conversion device. In this embodiment, however, the search section 2 also serves as the means for identifying special words, because when it looks up a word it also reads from the dictionary file 3 the classification code C3 indicating whether the word is a special word. In other words, by partially changing the code format of the word dictionary file 3 as shown in FIG. 2, special words can be identified without substantially changing the configuration of the search section 2.

The monosyllable decomposition processing section 6 takes the reading code C2 of each word input from the search section 2, decomposes the reading of the word into monosyllables according to phonological rules, retrieves the parameters for each monosyllable from the monosyllable parameter file 7, and sends them to the combination processing section 8. The monosyllable decomposition processing section 6 also notifies the combination processing section 8, in synchronization with the monosyllable parameters, of how the individual decomposed monosyllables are connected or separated. The combination processing section 8 applies combination processing (articulation processing) to the monosyllable parameters so that monosyllables to be pronounced as one continuous sound join naturally, and sends the result to the sound source parameter generation section 10.
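The decomposition step can be sketched for a reading written in hiragana. The patent does not state its phonological rules, so the single rule used here (a small kana attaches to the preceding character) and the names `SMALL_KANA` and `to_monosyllables` are assumptions made for illustration only.

```python
from typing import List

# Small kana that combine with the preceding character into one unit (assumed rule).
SMALL_KANA = set("ゃゅょぁぃぅぇぉ")


def to_monosyllables(reading: str) -> List[str]:
    """Split a hiragana reading into units, merging small kana backward."""
    units: List[str] = []
    for ch in reading:
        if ch in SMALL_KANA and units:
            units[-1] += ch  # e.g. "き" + "ょ" -> "きょ"
        else:
            units.append(ch)
    return units
```

In a device of this kind, each resulting unit would serve as the key for looking up synthesis parameters in the monosyllable parameter file 7.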

The monosyllable decomposition processing section 6, the monosyllable parameter file 7, and the combination processing section 8 described above may all be the same as those of conventional devices.

4 is an intonation determining section. As in conventional devices, the intonation determining section 4 judges, for example from the final word of each sentence in the input text, whether the sentence is declarative or interrogative, and determines the overall intonation of the sentence. Because the accent and pitch with which the words in a sentence (especially the final word) are pronounced must change depending on the intonation, the intonation determining section 4 sends intonation information to the accent determining section 5 and the pronunciation control section 9.
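A minimal sketch of sentence-type detection from the sentence ending follows. The patent only says the decision is made "for example" from the final word, so the specific heuristic (a trailing question mark or the particle か) and the function name `determine_intonation` are assumptions, and the heuristic is deliberately crude.

```python
def determine_intonation(sentence: str) -> str:
    """Classify a sentence as 'interrogative' or 'declarative' from its ending."""
    text = sentence.strip()
    # An explicit question mark (full-width or ASCII) marks a question.
    if text.endswith(("？", "?")):
        return "interrogative"
    # Otherwise drop the sentence-final period and check for the particle か.
    core = text.rstrip("。．.")
    if core.endswith("か"):
        return "interrogative"
    return "declarative"
```

For example, `determine_intonation("これは本ですか。")` yields `"interrogative"`, while `determine_intonation("これは本です。")` yields `"declarative"`.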

The accent determining section 5 determines the accent of the word to be uttered according to the part-of-speech code C1 supplied from the search section 2 and the intonation information from the intonation determining section 4, and sends accent information to the pronunciation control section 9. The pronunciation control section 9 determines the duration, pitch, and amplitude, which are the elements that define the pronunciation characteristics, according to the accent information and the intonation information, and outputs pronunciation characteristic information.

The sound source parameter generation section 10 generates sound source parameters according to the monosyllable parameters supplied from the combination processing section 8 and the pronunciation characteristic information that modifies them. The voice synthesis section 11 synthesizes a voice signal according to these sound source parameters and sends it to the voice generating section 12 to be uttered.

Because the sound source parameters are modified by the pronunciation characteristic information, the pronunciation characteristics of the voice uttered by the voice generating section 12, namely duration, pitch, and amplitude (volume), are controlled according to the pronunciation characteristic information.

Thus, for words other than special words, the operation and configuration of the sections 9 to 12 are the same as in conventional devices. However, when a special word is uttered, that is, when the classification code C3 input to the pronunciation control section 9 designates a special word, the pronunciation control section 9 deliberately alters the pronunciation characteristics determined by the accent information and the intonation information, so that the special word can be heard clearly distinguished from other ordinary words and phrases. The pronunciation control section 9 of this embodiment uniformly raises the pitch, among the pronunciation characteristics, for special words.
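The special-word behavior of the pronunciation control section can be sketched as a small adjustment applied after the normal prosody has been decided. The patent gives no numeric values, so the `Prosody` type, its units, and the 20% pitch increase in `SPECIAL_PITCH_FACTOR` are assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Prosody:
    """Pronunciation characteristics named in the text (units are assumed)."""
    duration_ms: float
    pitch_hz: float
    amplitude: float


# Assumed value: the patent says only that the pitch is raised uniformly.
SPECIAL_PITCH_FACTOR = 1.2


def control_pronunciation(base: Prosody, is_special: bool) -> Prosody:
    """Return the prosody to use, raising pitch when C3 marks a special word."""
    if not is_special:
        return base  # ordinary words keep the conventionally determined prosody
    return replace(base, pitch_hz=base.pitch_hz * SPECIAL_PITCH_FACTOR)
```

Keeping the adjustment as a final step on top of the accent- and intonation-derived values reflects the point that only a minor change to the conventional control section is needed.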

The amplitude and other characteristics may also be changed together with the pitch. The essential point is to change the pronunciation characteristics so that the listener recognizes the word as a special word and can hear it clearly.

To control the pronunciation characteristics of special words in this way, the configuration of the pronunciation control section 9 must be changed from that of conventional devices. The change required is very minor and easy to implement, however, so no specific example of the pronunciation control section 9 is given.

As described above, the present invention is configured so that uncommon proper nouns, new words, technical terms, and numerals that are hard to catch (special words) are uttered with pronunciation characteristics such as pitch deliberately changed, drawing the listener's attention to them.

Therefore, the present invention has the effect of providing an excellent text-to-speech conversion device that greatly improves on the drawbacks of conventional text-to-speech conversion devices.

[Brief explanation of drawings]

FIG. 1 is a block diagram showing an embodiment of the present invention, and FIG. 2 is a diagram showing the code format in the word dictionary file. 1: text file, 2: search section, 3: word dictionary file, 4: intonation determining section, 5: accent determining section, 6: monosyllable decomposition processing section, 7: monosyllable parameter file, 8: combination processing section, 9: pronunciation control section, 10: sound source parameter generation section, 11: voice synthesis section, 12: voice generating section.

Claims

[Claims]

(1) A text-to-speech conversion device that converts text input in the form of character information into speech and utters it, characterized in that it has means for identifying words of a specific type in the input text and, according to the identification result from that means, makes the pronunciation characteristics of words of the specific type different from those of other types of words.