JP3279261B2

JP3279261B2 - Apparatus, method, and recording medium for creating a fixed phrase corpus

Info

Publication number: JP3279261B2
Application number: JP22875098A
Authority: JP
Inventors: 敬子稲垣; 幸夫三留
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1998-08-13
Filing date: 1998-08-13
Publication date: 2002-04-30
Anticipated expiration: 2018-08-13
Also published as: JP2000056787A

Abstract

PROBLEM TO BE SOLVED: To efficiently create a fixed form sentence corpus to be accumulated in a speech database used for generating synthetic speech. SOLUTION: Fixed form words and the position of an arbitrary word to be inserted between the fixed form words are input from a fixed form word input circuit 1. The attribute of the arbitrary word to be inserted and its insertion position are input from an arbitrary word input circuit 2. An arbitrary word selection part 3 searches a word dictionary 4 and extracts, from the registered words, all words having the same attribute as the input arbitrary word. A fixed form sentence creation part 5 inserts each of the words extracted by the arbitrary word selection part 3 between the fixed form words to generate fixed form sentences. The created fixed form sentences are output from a fixed form sentence output part 6 and accumulated in a database 10 provided in a speech synthesis device connected through a LAN or the like.
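The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the names `Entry`, `WORD_DICTIONARY`, `select_words`, and `create_fixed_form_sentences`, the attribute labels, and the English example sentence are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    word: str
    attribute: str  # hypothetical attribute label, e.g. "place" or "person"


# Hypothetical stand-in for the word dictionary 4.
WORD_DICTIONARY = [
    Entry("Tokyo", "place"),
    Entry("Osaka", "place"),
    Entry("Suzuki", "person"),
]


def select_words(dictionary, attribute):
    """Arbitrary word selection: every registered word with the given attribute."""
    return [entry.word for entry in dictionary if entry.attribute == attribute]


def create_fixed_form_sentences(fixed_words, slot_index, attribute, dictionary):
    """Insert each selected word at slot_index among the fixed form words."""
    sentences = []
    for word in select_words(dictionary, attribute):
        tokens = fixed_words[:slot_index] + [word] + fixed_words[slot_index:]
        sentences.append(" ".join(tokens))
    return sentences


# Fixed form words with one slot between them (slot_index=1), filled by "place" words.
corpus = create_fixed_form_sentences(
    ["The train for", "is now arriving."], 1, "place", WORD_DICTIONARY
)
```

Each entry of `corpus` is one fixed form sentence; in the arrangement the abstract describes, these would then be sent to the speech database rather than kept in a list.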

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of the Invention] The present invention relates to a fixed phrase corpus creation apparatus and method, and to a recording medium on which a program for executing the method is recorded. In particular, it relates to an apparatus, method, and recording medium suitable for collecting a fixed phrase corpus used in a speech synthesis apparatus that generates synthesized speech from text.

[0002]

[Prior Art] A speech synthesis system that outputs arbitrary text as synthesized speech is disclosed in Japanese Patent Laid-Open Publication No. 8-87297 (hereinafter, this speech synthesis system is referred to as Conventional Example 1). FIG. 6 is a block diagram showing the main configuration of the speech synthesis system of Conventional Example 1, which comprises a speech information search unit 21, a speech information database 22, a synthesized speech generation unit 23, and synthesized speech generation rules 24.

[0003] In the speech synthesis system of Conventional Example 1, the speech information database 22 stores speech feature quantities extracted by analyzing natural speech, together with the corresponding utterance contents. When text or a phonetic symbol string is input, the speech information search unit 21 searches the speech information database 22 to determine whether it contains utterance content matching the input text or phonetic symbol string. If matching utterance content is found, the speech information search unit 21 passes it to the synthesized speech generation unit 23, which generates synthesized speech by applying processing appropriate to that speech information. If no matching utterance content is found, the speech information search unit 21 passes the input text or phonetic symbol string unchanged to the synthesized speech generation unit 23, which generates synthesized speech based on the synthesized speech generation rules 24.
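The lookup-then-fallback behaviour of Conventional Example 1 can be pictured with the short sketch below. It is only an illustration: the function names, the dictionary-based database, and the placeholder strings standing in for speech data are assumptions, not details taken from the cited publication.

```python
def synthesize(text, speech_db, synthesize_by_rule):
    """Return (source, speech): stored natural speech if the text is in the
    database, otherwise rule-generated speech."""
    stored = speech_db.get(text)
    if stored is not None:
        # Matching utterance content exists: use the stored speech information.
        return "database", stored
    # No match: fall back to rule-based generation.
    return "rule", synthesize_by_rule(text)


# Placeholder database and rule-based synthesizer (strings stand in for audio).
speech_db = {"hello": "<stored features for 'hello'>"}


def fake_rule_synthesis(text):
    return f"<rule-generated speech for '{text}'>"


hit = synthesize("hello", speech_db, fake_rule_synthesis)
miss = synthesize("goodbye", speech_db, fake_rule_synthesis)
```

The second call takes the fallback path, which is the case where rule-based generation is used because the database has no matching entry.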

[0004] Japanese Patent Laid-Open Publication No. 9-244680 discloses a prosody control device for synthesizing sentences in which only certain specific words of a fixed phrase change, such as the response sentences of a telephone guidance system (hereinafter, this prosody control device is referred to as Conventional Example 2). Conventionally, speech synthesizers used natural speech as-is for the fixed portion of the sentence, synthesized only the arbitrary words by rule, and simply concatenated the two. In contrast, the prosody control device of Conventional Example 2 controls the prosody pattern of the arbitrary word so that it matches the prosody pattern of the fixed phrase.

[0005]

[Problems to Be Solved by the Invention] However, the conventional examples described above have the following problems. First, in the speech synthesis apparatus of Conventional Example 1, when no appropriate speech waveform exists in the speech information database, the synthesized speech generation unit 23 or a waveform generation unit (not shown) generates the synthesized speech based on the generation rules, so the sound quality deteriorates. Consequently, unless the system has a well-balanced speech information database that matches as many texts as possible, it cannot in practice generate natural-sounding synthesized speech.

[0006] The prosody control device of Conventional Example 2 controls the prosody pattern of the arbitrary word so that it matches the prosody pattern of the fixed phrase, but control beyond the allowable range degrades the sound quality of the arbitrary word itself. Consequently, unless enough prosody patterns are prepared to keep the control of the arbitrary word within the allowable range, natural-sounding synthesized speech cannot be generated in practice.

[0007] The present invention has been made to solve the problems of the conventional examples described above. Its object is to provide a fixed phrase corpus creation apparatus and method that can efficiently create a fixed phrase corpus to be stored in a speech database used for generating synthesized speech, and a computer-readable recording medium on which a program for executing the method is recorded.

[0008] To achieve the above object, a fixed phrase corpus creation apparatus according to a first aspect of the present invention comprises: dictionary means for storing a plurality of words in association with an attribute of each word; fixed form word input means for inputting fixed form words, which are included in a fixed phrase to be created and are expressed by desired words, and the position of an arbitrary word to be inserted between the fixed form words in the fixed phrase; arbitrary word input means for inputting the position of the arbitrary word to be inserted in the fixed phrase in association with the attribute of the arbitrary word to be inserted; arbitrary word selection means for selecting and extracting, from the dictionary means, the words corresponding to the attribute of the arbitrary word input from the arbitrary word input means; and fixed phrase generation means for generating fixed phrases by inserting each of the words extracted by the arbitrary word selection means at the position of the arbitrary word input from the fixed form word input means. The dictionary means further stores, in association with each of the plurality of words, the number of morae or the number of syllables of that word. The arbitrary word selection means also extracts from the dictionary means the number of morae or syllables corresponding to each selected word. The apparatus further comprises word classification means for classifying the words extracted by the arbitrary word selection means according to the number of morae or syllables extracted along with each word, and the fixed phrase generation means generates the fixed phrases with information on the classification by the word classification means added.
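As a rough sketch of the first aspect, the example below selects words by attribute, groups them by mora count, and attaches the class to each generated phrase. The dictionary layout, the function names, the `"morae=N"` label format, and the English carrier phrase are assumptions for illustration; the mora counts follow the usual kana readings of the place names.

```python
from collections import defaultdict

# Hypothetical dictionary: word, attribute, and mora count per entry.
DICTIONARY = [
    {"word": "Tokyo", "attribute": "place", "morae": 4},    # to-o-kyo-o
    {"word": "Nara", "attribute": "place", "morae": 2},     # na-ra
    {"word": "Kyoto", "attribute": "place", "morae": 3},    # kyo-o-to
    {"word": "Osaka", "attribute": "place", "morae": 4},    # o-o-sa-ka
    {"word": "Suzuki", "attribute": "person", "morae": 3},  # su-zu-ki
]


def select_with_morae(dictionary, attribute):
    """Select words by attribute and extract each word's mora count with it."""
    return [(e["word"], e["morae"]) for e in dictionary if e["attribute"] == attribute]


def classify_by_morae(selected):
    """Group the selected words by mora count."""
    classes = defaultdict(list)
    for word, morae in selected:
        classes[morae].append(word)
    return dict(classes)


def generate(fixed_words, slot_index, classes):
    """Generate one phrase per word, tagged with its class information."""
    records = []
    for morae in sorted(classes):
        for word in classes[morae]:
            tokens = fixed_words[:slot_index] + [word] + fixed_words[slot_index:]
            records.append({"class": f"morae={morae}", "phrase": " ".join(tokens)})
    return records


classes = classify_by_morae(select_with_morae(DICTIONARY, "place"))
records = generate(["The next stop is", "station."], 1, classes)
```

Keeping the class label next to each phrase makes it possible to check later how many phrases the corpus holds for each mora count.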

[0009] To achieve the above object, a fixed phrase corpus creation apparatus according to a second aspect of the present invention comprises: dictionary means for storing a plurality of words in association with an attribute of each word; fixed form word input means for inputting fixed form words, which are included in a fixed phrase to be created and are expressed by desired words, and the position of an arbitrary word to be inserted between the fixed form words in the fixed phrase; arbitrary word input means for inputting the position of the arbitrary word to be inserted in the fixed phrase in association with the attribute of the arbitrary word to be inserted; arbitrary word selection means for selecting and extracting, from the dictionary means, the words corresponding to the attribute of the arbitrary word input from the arbitrary word input means; and fixed phrase generation means for generating fixed phrases by inserting each of the words extracted by the arbitrary word selection means at the position of the arbitrary word input from the fixed form word input means. The dictionary means further stores, in association with each of the plurality of words, the accent position of that word. The arbitrary word selection means also extracts from the dictionary means the accent position corresponding to each selected word. The apparatus further comprises word classification means for classifying the words extracted by the arbitrary word selection means according to the accent position extracted along with each word, and the fixed phrase generation means generates the fixed phrases with information on the classification by the word classification means added.
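Classification by accent position could look like the sketch below. The words and accent values are placeholders (the integer encoding of the accent position is an assumption made for this example, not something the text specifies), and `classify_by_accent` is a hypothetical name.

```python
from itertools import groupby
from operator import itemgetter

# Placeholder (word, accent_position) pairs as they might come out of the
# arbitrary word selection means; the numeric encoding is an assumption.
selected = [("word_a", 0), ("word_b", 1), ("word_c", 0), ("word_d", 2)]


def classify_by_accent(selected):
    """Group words by accent position, keeping input order within each group."""
    ordered = sorted(selected, key=itemgetter(1))  # stable sort by accent position
    return {
        accent: [word for word, _ in group]
        for accent, group in groupby(ordered, key=itemgetter(1))
    }


accent_classes = classify_by_accent(selected)
```

`groupby` only merges adjacent items, which is why the pairs are sorted by accent position first.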

[0010] To achieve the above object, a fixed phrase corpus creation apparatus according to a third aspect of the present invention comprises: dictionary means for storing a plurality of words in association with an attribute of each word; fixed form word input means for inputting fixed form words, which are included in a fixed phrase to be created and are expressed by desired words, and the position of an arbitrary word to be inserted between the fixed form words in the fixed phrase; arbitrary word input means for inputting the position of the arbitrary word to be inserted in the fixed phrase in association with the attribute of the arbitrary word to be inserted; arbitrary word selection means for selecting and extracting, from the dictionary means, the words corresponding to the attribute of the arbitrary word input from the arbitrary word input means; and fixed phrase generation means for generating fixed phrases by inserting each of the words extracted by the arbitrary word selection means at the position of the arbitrary word input from the fixed form word input means. The dictionary means further stores, in association with each of the plurality of words, information indicating the phonological environment of that word. The arbitrary word selection means also extracts from the dictionary means the information indicating the phonological environment corresponding to each selected word. The apparatus further comprises word classification means for classifying the words extracted by the arbitrary word selection means according to the information indicating the phonological environment extracted along with each word, and the fixed phrase generation means generates the fixed phrases with information on the classification by the word classification means added. The word classification means may additionally classify the words extracted by the arbitrary word selection means according to the positional relationship between the fixed form words input from the fixed form word input means and the arbitrary word corresponding to each extracted word.
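The sketch below shows one way the third aspect's two classification keys might be combined. It rests on assumptions the text does not confirm: the "phonological environment" is modelled as the word's first and last phoneme (approximated here by romanized letters), the positional relationship is reduced to three labels, and all names are hypothetical.

```python
def slot_position(fixed_words, slot_index):
    """Positional relationship between the arbitrary word and the fixed form words."""
    if slot_index == 0:
        return "before_fixed_words"
    if slot_index >= len(fixed_words):
        return "after_fixed_words"
    return "between_fixed_words"


def classify(selected, fixed_words, slot_index):
    """Group words by (slot position, first phoneme, last phoneme)."""
    position = slot_position(fixed_words, slot_index)
    classes = {}
    for word, first_phoneme, last_phoneme in selected:
        key = (position, first_phoneme, last_phoneme)
        classes.setdefault(key, []).append(word)
    return classes


# Placeholder selection result: (word, first phoneme, last phoneme).
selected = [("nara", "n", "a"), ("nagoya", "n", "a"), ("kobe", "k", "e")]
env_classes = classify(selected, ["the train for", "is arriving"], 1)
```

Words that share a key would meet the neighbouring fixed form words with the same boundary sounds under this simplified model.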

[0011] To achieve the above object, a fixed phrase corpus creation apparatus according to a fourth aspect of the present invention comprises: dictionary means for storing a plurality of words in association with an attribute of each word; fixed form word input means for inputting fixed form words, which are included in a fixed phrase to be created and are expressed by desired words, and the position of an arbitrary word to be inserted between the fixed form words in the fixed phrase; arbitrary word input means for inputting the position of the arbitrary word to be inserted in the fixed phrase in association with the attribute of the arbitrary word to be inserted; arbitrary word selection means for selecting and extracting, from the dictionary means, the words corresponding to the attribute of the arbitrary word input from the arbitrary word input means; and fixed phrase generation means for generating fixed phrases by inserting each of the words extracted by the arbitrary word selection means at the position of the arbitrary word input from the fixed form word input means. The dictionary means further stores, in association with each of the plurality of words, information relating to the sound of that word. The arbitrary word selection means also extracts from the dictionary means the sound-related information corresponding to each selected word. The apparatus further comprises word classification means for classifying the words extracted by the arbitrary word selection means according to the sound-related information extracted along with each word, and the fixed phrase generation means generates the fixed phrases with information on the classification by the word classification means added.
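Because the fourth aspect speaks only of "information relating to the sound" of each word, one way to picture it is a classifier that takes the classification key as a parameter. The sketch below is an assumption-laden illustration: the record fields, the placeholder words and values, and the function name `classify_words` are all hypothetical.

```python
def classify_words(records, key):
    """Group words by whatever sound-related key the caller supplies."""
    classes = {}
    for record in records:
        classes.setdefault(key(record), []).append(record["word"])
    return classes


# Placeholder records: each carries the word plus some sound-related fields.
records = [
    {"word": "word_a", "morae": 3, "accent": 0},
    {"word": "word_b", "morae": 3, "accent": 1},
    {"word": "word_c", "morae": 4, "accent": 0},
]

by_morae = classify_words(records, key=lambda r: r["morae"])
by_morae_and_accent = classify_words(records, key=lambda r: (r["morae"], r["accent"]))
```

Passing a tuple as the key combines several sound-related properties into a single class label.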

[0012] To achieve the above object, a fixed phrase corpus creation method according to a fifth aspect of the present invention includes: a fixed form word input step of inputting fixed form words, which are included in a fixed phrase to be generated and are expressed by desired words, and the position of an arbitrary word to be inserted between the fixed form words in the fixed phrase; an arbitrary word input step of inputting the position of the arbitrary word to be inserted in the fixed phrase in association with the attribute of the arbitrary word to be inserted; an arbitrary word extraction step of selecting and extracting the words corresponding to the attribute of the arbitrary word input in the arbitrary word input step from a dictionary that stores a plurality of words in association with their respective attributes; and a fixed phrase generation step of generating fixed phrases by inserting each of the words extracted in the arbitrary word extraction step at the position of the arbitrary word input in the fixed form word input step. The arbitrary word extraction step also extracts, from the sound-related information stored in the dictionary in association with each of the plurality of words, the sound-related information corresponding to each selected word. The method further includes a word classification step of classifying the words extracted in the arbitrary word extraction step according to the sound-related information extracted along with each word, and the fixed phrase generation step generates the fixed phrases with information on the classification in the word classification step added.

[0013] To achieve the above object, a fixed phrase corpus creation method according to a sixth aspect of the present invention comprises: a fixed form word input step of inputting fixed form words, which are included in a fixed phrase to be generated and are expressed by desired words, and the position of an arbitrary word to be inserted between the fixed form words in the fixed phrase; an arbitrary word input step of inputting the position of the arbitrary word to be inserted in the fixed phrase in association with the attribute of the arbitrary word to be inserted; an arbitrary word extraction step of selecting and extracting the words corresponding to the attribute of the arbitrary word input in the arbitrary word input step from a dictionary that stores a plurality of words in association with their respective attributes; and a fixed phrase generation step of generating fixed phrases by inserting each of the words extracted in the arbitrary word extraction step at the position of the arbitrary word input in the fixed form word input step. The dictionary further stores, in association with each of the plurality of words, the accent position of that word. The arbitrary word extraction step also extracts from the dictionary the accent position corresponding to each selected word. The method further comprises a word classification step of classifying the words extracted in the arbitrary word extraction step according to the accent position extracted along with each word, and the fixed phrase generation step generates the fixed phrases with information on the classification in the word classification step added.

[0014] To achieve the above object, a fixed phrase corpus creation method according to a seventh aspect of the present invention comprises: a fixed form word input step of inputting fixed form words, which are included in a fixed phrase to be created and are expressed by desired words, and the position of an arbitrary word to be inserted between the fixed form words in the fixed phrase; an arbitrary word input step of inputting the position of the arbitrary word to be inserted in the fixed phrase in association with the attribute of the arbitrary word to be inserted; an arbitrary word extraction step of selecting and extracting the words corresponding to the attribute of the arbitrary word input in the arbitrary word input step from a dictionary that stores a plurality of words in association with their respective attributes; and a fixed phrase generation step of generating fixed phrases by inserting each of the words extracted in the arbitrary word extraction step at the position of the arbitrary word input in the fixed form word input step. The dictionary further stores, in association with each of the plurality of words, information indicating the phonological environment of that word. The arbitrary word extraction step also extracts from the dictionary the information indicating the phonological environment corresponding to each selected word. The method further comprises a word classification step of classifying the words extracted in the arbitrary word extraction step according to the information indicating the phonological environment extracted along with each word, and the fixed phrase generation step generates the fixed phrases with information on the classification in the word classification step added. The word classification step may additionally classify the words extracted in the arbitrary word extraction step according to the positional relationship between the fixed form words input in the fixed form word input step and the arbitrary word corresponding to each extracted word.

[0015] To achieve the above object, a fixed phrase corpus creation method according to an eighth aspect of the present invention comprises: a fixed form word input step of inputting fixed form words, which are included in a fixed phrase to be created and are expressed by desired words, and the position of an arbitrary word to be inserted between the fixed form words in the fixed phrase; an arbitrary word input step of inputting the position of the arbitrary word to be inserted in the fixed phrase in association with the attribute of the arbitrary word to be inserted; an arbitrary word extraction step of selecting and extracting the words corresponding to the attribute of the arbitrary word input in the arbitrary word input step from a dictionary that stores a plurality of words in association with their respective attributes; and a fixed phrase generation step of generating fixed phrases by inserting each of the words extracted in the arbitrary word extraction step at the position of the arbitrary word input in the fixed form word input step. The dictionary further stores, in association with each of the plurality of words, information relating to the sound of that word. The arbitrary word extraction step also extracts from the dictionary the sound-related information corresponding to each selected word. The method further comprises a word classification step of classifying the words extracted in the arbitrary word extraction step according to the sound-related information extracted along with each word, and the fixed phrase generation step generates the fixed phrases with information on the classification in the word classification step added.

[0016] To achieve the above object, a computer-readable recording medium according to a ninth aspect of the present invention records a program that executes: a fixed form word input step of inputting fixed form words, which are included in a fixed phrase to be generated and are expressed by desired words, and the position of an arbitrary word to be inserted between the fixed form words in the fixed phrase; an arbitrary word input step of inputting the position of the arbitrary word to be inserted in the fixed phrase in association with the attribute of the arbitrary word to be inserted; an arbitrary word extraction step of selecting and extracting the words corresponding to the attribute of the arbitrary word input in the arbitrary word input step from a dictionary that stores a plurality of words in association with their respective attributes; and a fixed phrase generation step of generating fixed phrases by inserting each of the words extracted in the arbitrary word extraction step at the position of the arbitrary word input in the fixed form word input step. The arbitrary word extraction step also extracts, from the sound-related information stored in the dictionary in association with each of the plurality of words, the sound-related information corresponding to each selected word. The recording medium further records a program that executes a word classification step of classifying the words extracted in the arbitrary word extraction step according to the sound-related information extracted along with each word, and the recorded program executes the fixed phrase generation step so that the fixed phrases are generated with information on the classification in the word classification step added.

[0017] To achieve the above object, a computer-readable recording medium according to a tenth aspect of the present invention records a program that executes: a fixed form word input step of inputting fixed form words, which are included in a fixed phrase to be generated and are expressed by desired words, and the position of an arbitrary word to be inserted between the fixed form words in the fixed phrase; an arbitrary word input step of inputting the position of the arbitrary word to be inserted in the fixed phrase in association with the attribute of the arbitrary word to be inserted; an arbitrary word extraction step of selecting and extracting the words corresponding to the attribute of the arbitrary word input in the arbitrary word input step from a dictionary that stores a plurality of words in association with their respective attributes; and a fixed phrase generation step of generating fixed phrases by inserting each of the words extracted in the arbitrary word extraction step at the position of the arbitrary word input in the fixed form word input step. The dictionary further stores, in association with each of the plurality of words, the accent position of that word. The arbitrary word extraction step also extracts from the dictionary the accent position corresponding to each selected word. The program further executes a word classification step of classifying the words extracted in the arbitrary word extraction step according to the accent position extracted along with each word, and the fixed phrase generation step generates the fixed phrases with information on the classification in the word classification step added.

[0018] To achieve the above object, a computer-readable recording medium according to an eleventh aspect of the present invention records a program that executes: a fixed form word input step of inputting fixed form words, which are included in a fixed phrase to be generated and are expressed by desired words, and the position of an arbitrary word to be inserted between the fixed form words in the fixed phrase; an arbitrary word input step of inputting the position of the arbitrary word to be inserted in the fixed phrase in association with the attribute of the arbitrary word to be inserted; an arbitrary word extraction step of selecting and extracting the words corresponding to the attribute of the arbitrary word input in the arbitrary word input step from a dictionary that stores a plurality of words in association with their respective attributes; and a fixed phrase generation step of generating fixed phrases by inserting each of the words extracted in the arbitrary word extraction step at the position of the arbitrary word input in the fixed form word input step. The dictionary further stores, in association with each of the plurality of words, information indicating the phonological environment of that word. The arbitrary word extraction step also extracts from the dictionary the information indicating the phonological environment corresponding to each selected word. The program further executes a word classification step of classifying the words extracted in the arbitrary word extraction step according to the information indicating the phonological environment extracted along with each word, and the fixed phrase generation step generates the fixed phrases with information on the classification in the word classification step added. The word classification step may additionally classify the words extracted in the arbitrary word extraction step according to the positional relationship between the fixed form words input in the fixed form word input step and the arbitrary word corresponding to each extracted word.

【００１９】To achieve the above object, a computer-readable recording medium according to a twelfth aspect of the present invention records a program that executes: a fixed word input step of inputting fixed words, which are included in a fixed phrase to be generated and are expressed by desired words, together with the positions of arbitrary words to be inserted between the fixed words in the fixed phrase; an arbitrary word input step of inputting, in association with each other, the position of each arbitrary word to be inserted in the fixed phrase and the attribute of that arbitrary word; an arbitrary word extraction step of selecting and extracting the words corresponding to the attribute input in the arbitrary word input step from a dictionary that stores a plurality of words in association with their respective attributes; and a fixed phrase generation step of generating fixed phrases by inserting each of the words extracted in the arbitrary word extraction step at the arbitrary word position input in the fixed word input step. The dictionary further stores, in association with each of the plurality of words, information about the sound of the word. The arbitrary word extraction step also extracts from the dictionary the sound information corresponding to each selected word. The program further comprises a word classification step of classifying the words extracted in the arbitrary word extraction step according to the sound information extracted along with them, and the fixed phrase generation step generates the fixed phrases with information on the classification made in the word classification step added.

【００２０】The above recording medium may further record a dictionary that stores, in association with each other, the plurality of words used in the arbitrary word extraction step and the attribute of each word.

【００２１】[0021]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the accompanying drawings.

【００２２】[First Embodiment] FIG. 1 is a block diagram showing the configuration of a fixed phrase corpus creation apparatus according to this embodiment. As shown, the apparatus comprises a fixed word input unit 1, an arbitrary word input unit 2, an arbitrary word selection unit 3, a word dictionary 4, a fixed phrase generation unit 5, and a fixed phrase output unit 6. The apparatus is connected, via a LAN (Local Area Network) or the like, to the speech database of a speech synthesis system that has a database 10.

【００２３】By using this fixed phrase corpus creation apparatus to design in advance the corpus needed to generate or model the prosodic patterns of synthesized speech, the prosodic patterns needed to generate sentences in which only specific words of a fixed phrase change, such as the response sentences of a telephone guidance system, can be accumulated in the speech database. The apparatus also makes it possible to prepare in the speech database enough prosodic patterns that control of the prosodic pattern of an arbitrary word stays within an allowable range.

【００２４】The fixed word input unit 1 includes an input device such as a keyboard and is used to input the fixed words and the positions of the arbitrary words to be inserted between them. These are input in a form such as "ただ今[A]は、[B]中です。" ("[A] is currently in the middle of [B]."). In this case, ただ今, は, and 中です are the fixed words, and [A] and [B] are the insertion positions of the arbitrary words.

【００２５】The arbitrary word input unit 2 includes an input device such as a keyboard and is used to input the attribute of each arbitrary word to be inserted together with its insertion position. For the example above, these are input in a form such as "personal name [A], action [B]".

【００２６】The arbitrary word selection unit 3 searches the word dictionary 4 and selects and extracts every word whose attribute matches the attribute input from the arbitrary word input unit 2. When several arbitrary words are to be inserted between the fixed words, the arbitrary word selection unit 3 selects and extracts all matching words for each of them. It also extracts the other information stored in association with each word.

【００２７】The word dictionary 4 is the dictionary used by the arbitrary word selection unit 3 to extract words. For each word it stores, in association with one another, items such as the headword, reading, mora count, accent position, part of speech, attribute, and frequency information. A configuration example of the word dictionary 4 is described in detail later.

【００２８】The fixed phrase generation unit 5 generates fixed phrases by inserting each word extracted by the arbitrary word selection unit 3 between the fixed words input from the fixed word input unit 1. When several arbitrary words are to be inserted, it generates fixed phrases in which the words extracted for each insertion position are inserted between the fixed words in all of their combinations.
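
The all-combinations behavior described above can be sketched as follows. This is a minimal illustration, not the apparatus itself: the function name `generate_phrases`, the ASCII `[A]`/`[B]` placeholder syntax, and the two action words are assumptions made for the example, and the sketch assumes that no inserted word itself contains a placeholder.

```python
from itertools import product


def generate_phrases(template, candidates):
    """Fill every insertion position with every combination of words.

    `template` marks insertion positions as "[A]", "[B]", ...;
    `candidates` maps each position label to the words extracted for it.
    """
    labels = sorted(candidates)  # fixed position order, e.g. ["A", "B"]
    phrases = []
    for combo in product(*(candidates[label] for label in labels)):
        phrase = template
        for label, word in zip(labels, combo):
            phrase = phrase.replace(f"[{label}]", word)
        phrases.append(phrase)
    return phrases


# Template taken from the description; the action words are hypothetical.
template = "ただ今[A]は、[B]中です。"
candidates = {"A": ["明美", "良子"], "B": ["会議", "外出"]}
phrases = generate_phrases(template, candidates)
```

With two candidates per position, the sketch yields 2 × 2 = 4 fixed phrases, so the corpus size grows as the product of the candidate counts.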

【００２９】The fixed phrase output unit 6 outputs the fixed phrases generated by the fixed phrase generation unit 5 onto the LAN and has them accumulated, as a fixed phrase corpus, in the database of the speech synthesis system connected to the LAN.

【００３０】FIG. 2 shows a specific example of the word dictionary 4 of FIG. 1. In this example, the word dictionary 4 registers a headword (word), reading, mora count, accent, part of speech, and attribute in association with one another. The registered headwords are 明美 (Akemi), 良子 (Ryoko), 愛知 (Aichi), タクシー (taxi), バス (bus), 山形 (Yamagata), 電車 (train), 地下鉄 (subway), 横浜 (Yokohama), 唯 (Yui), 愛 (Ai), and 飛行機 (airplane).

【００３１】For example, the words whose attribute is personal name are 明美 (Akemi), 良子 (Ryoko), 唯 (Yui), and 愛 (Ai); of these, 明美 and 良子 have a mora count of 3, and 唯 and 愛 have a mora count of 2. The words whose attribute is vehicle are タクシー (taxi), バス (bus), 電車 (train), 地下鉄 (subway), and 飛行機 (airplane); of these, タクシー and バス have the accent at the beginning of the word (indicated by the number 1), 電車 and 地下鉄 have no accent (indicated by the number 0), and 飛行機 has the accent in the middle of the word (indicated by the number 2).
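
As a rough illustration of the dictionary and of attribute-based selection, the entries can be modeled as records and filtered by attribute. The class name `WordEntry`, the field names, and the function `select_by_attribute` are assumptions for this sketch. Only values stated in the text are filled in; readings, parts of speech, and the attributes of 愛知, 山形, and 横浜 are left unset because FIG. 2 itself is not reproduced here. The mora count of 唯 and 愛 is taken as 2, as in the worked example of the second embodiment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WordEntry:
    headword: str
    attribute: Optional[str] = None  # e.g. "personal name", "vehicle"
    mora: Optional[int] = None       # mora count
    accent: Optional[int] = None     # 0 = none, 1 = word-initial, 2 = word-medial
    reading: Optional[str] = None
    pos: Optional[str] = None        # part of speech


# Headwords in the order listed for FIG. 2; unstated fields stay None.
WORD_DICTIONARY = [
    WordEntry("明美", "personal name", mora=3),
    WordEntry("良子", "personal name", mora=3),
    WordEntry("愛知"),
    WordEntry("タクシー", "vehicle", accent=1),
    WordEntry("バス", "vehicle", accent=1),
    WordEntry("山形"),
    WordEntry("電車", "vehicle", accent=0),
    WordEntry("地下鉄", "vehicle", accent=0),
    WordEntry("横浜"),
    WordEntry("唯", "personal name", mora=2),
    WordEntry("愛", "personal name", mora=2),
    WordEntry("飛行機", "vehicle", accent=2),
]


def select_by_attribute(dictionary, attribute):
    """Return every entry whose attribute matches, in dictionary order."""
    return [entry for entry in dictionary if entry.attribute == attribute]


names = [e.headword for e in select_by_attribute(WORD_DICTIONARY, "personal name")]
vehicles = [e.headword for e in select_by_attribute(WORD_DICTIONARY, "vehicle")]
```

Returning whole entries rather than bare headwords mirrors the statement that the selection unit also extracts the other information stored with each word.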

【００３２】The operation of the fixed phrase corpus creation apparatus according to this embodiment is described below using a specific example. In the following description, the word dictionary 4 is assumed to follow the example shown in FIG. 2.

【００３３】The operator first inputs "[A]ちゃん、お元気ですか。" ("[A]-chan, how are you?") from the fixed word input unit 1 as the fixed words and the position of the arbitrary word to be inserted between them, and inputs "personal name [A]" from the arbitrary word input unit 2 as the attribute of the arbitrary word. As a result, personal name is supplied to the arbitrary word selection unit 3 as the attribute corresponding to [A].

【００３４】When the arbitrary word selection unit 3 searches the word dictionary 4, four words whose attribute is personal name are found: 明美 (Akemi), 良子 (Ryoko), 唯 (Yui), and 愛 (Ai). The unit selects and extracts all of them and supplies these four words to the fixed phrase generation unit 5.

【００３５】The fixed phrase generation unit 5 has been supplied with "[A]ちゃん、お元気ですか。" input from the fixed word input unit 1. It inserts each of the four words supplied from the arbitrary word selection unit 3 at [A] in this sentence and generates four fixed phrases: "明美ちゃん、お元気ですか。", "良子ちゃん、お元気ですか。", "唯ちゃん、お元気ですか。", and "愛ちゃん、お元気ですか。".

【００３６】The four fixed phrases generated in this way are output from the fixed phrase output unit 6 onto the LAN and accumulated as a fixed phrase corpus in the database 10 of the speech synthesis system connected to the LAN.

【００３７】As described above, with the fixed phrase corpus creation apparatus according to this embodiment, a variety of fixed phrases can be created merely by inputting the fixed words with the positions of the arbitrary words to be inserted between them, and the attributes and positions of the arbitrary words. The sentence patterns needed for use as a speech database in a speech synthesis system can thus be created efficiently.

【００３８】[Second Embodiment] FIG. 3 is a block diagram showing the configuration of a fixed phrase corpus creation apparatus according to this embodiment. This apparatus differs from that of the first embodiment (FIG. 1) in that it further comprises a mora comparison unit 7.

【００３９】The mora comparison unit 7 classifies the words extracted from the word dictionary 4 by the arbitrary word selection unit 3 into mora groups according to their mora counts, and supplies them to the fixed phrase generation unit 5 one mora group at a time. The fixed phrase generation unit 5 generates fixed phrases to which information on the mora group is added. The fixed phrase output unit 6 outputs the fixed phrases in association with the mora group information and has them accumulated in the database 10.

【００４０】The operation of the fixed phrase corpus creation apparatus according to this embodiment is described below using a specific example. In the following description, the word dictionary 4 is assumed to follow the example shown in FIG. 2, as in the first embodiment.

【００４１】In this embodiment, processing up to the extraction of words by the arbitrary word selection unit 3 is assumed to be the same as in the first embodiment. However, the arbitrary word selection unit 3 supplies the four extracted words, together with the mora count of each word extracted along with them, to the mora comparison unit 7 rather than to the fixed phrase generation unit 5.

【００４２】Next, the mora comparison unit 7 classifies the four extracted words into mora groups according to their mora counts. Here, 明美 (Akemi) and 良子 (Ryoko), which have three morae, are classified into mora group "M3", and 唯 (Yui) and 愛 (Ai), which have two morae, into mora group "M2". The words are then supplied to the fixed phrase generation unit 5 one mora group at a time.

【００４３】The fixed phrase generation unit 5 has been supplied with "[A]ちゃん、お元気ですか。" input from the fixed word input unit 1. First, it inserts 明美 and 良子, the words extracted by the arbitrary word selection unit 3 and classified into mora group "M3", at [A] in this sentence and generates the fixed phrases "明美ちゃん、お元気ですか。" and "良子ちゃん、お元気ですか。". The fixed phrase generation unit 5 then adds the information "M3" to these two fixed phrases.

【００４４】Next, the fixed phrase generation unit 5 inserts 唯 and 愛, the words extracted by the arbitrary word selection unit 3 and classified into mora group "M2", at [A] in the supplied sentence "[A]ちゃん、お元気ですか。" and generates the two fixed phrases "唯ちゃん、お元気ですか。" and "愛ちゃん、お元気ですか。". The fixed phrase generation unit 5 then adds the information "M2" to these two fixed phrases.

【００４５】The four fixed phrases generated in this way are output from the fixed phrase output unit 6 onto the LAN, together with the information "M3" or "M2" added to each of them to indicate its mora group, and are accumulated as a fixed phrase corpus in the database 10 of the speech synthesis system connected to the LAN.
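
The mora-group flow of this embodiment can be sketched as below, under stated assumptions: the function names `group_by_mora` and `tagged_phrases`, the tuple representation of a tagged phrase, and the ASCII `[A]` placeholder are all illustrative choices, not something the description specifies. The mora counts (3, 3, 2, 2) are the ones used in this worked example.

```python
def group_by_mora(words):
    """Map a group label such as "M3" to the words with that mora count.

    Groups keep the order in which their first word appears.
    """
    groups = {}
    for word, mora in words:
        groups.setdefault(f"M{mora}", []).append(word)
    return groups


def tagged_phrases(template, slot, groups):
    """Generate (label, phrase) records, one mora group at a time."""
    return [
        (label, template.replace(slot, word))
        for label, members in groups.items()
        for word in members
    ]


# Words and mora counts as handed over by the selection step.
extracted = [("明美", 3), ("良子", 3), ("唯", 2), ("愛", 2)]
groups = group_by_mora(extracted)
corpus = tagged_phrases("[A]ちゃん、お元気ですか。", "[A]", groups)
```

Each record carries its group label alongside the phrase, which corresponds to storing the classification information together with the fixed phrase.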

【００４６】As described above, in the fixed phrase corpus creation apparatus according to this embodiment, the words extracted from the word dictionary 4 as arbitrary words to be inserted are classified according to mora count, and the generated fixed phrases are classified by mora count as well. Sentence patterns containing words with a desired mora count, as needed for a speech database in a speech synthesis system, can therefore be created efficiently.

【００４７】Incidentally, when a sentence is synthesized from a fixed phrase stored in the speech information database by inserting a word whose mora count differs from that of the word contained in the stored phrase, the prosodic pattern must also be controlled. Because the apparatus according to this embodiment can easily create multiple fixed phrases classified by mora count, it is also useful for accumulating enough fixed phrases in the speech information database that such prosodic pattern control becomes unnecessary.

【００４８】[Third Embodiment] FIG. 4 is a block diagram showing the configuration of a fixed phrase corpus creation apparatus according to this embodiment. This apparatus differs from that of the first embodiment (FIG. 1) in that it further comprises an accent comparison unit 8.

【００４９】The accent comparison unit 8 classifies the words extracted from the word dictionary 4 by the arbitrary word selection unit 3 into accent groups according to their accent positions, and supplies them to the fixed phrase generation unit 5 one accent group at a time. The fixed phrase generation unit 5 generates fixed phrases while retaining the information on the accent group. The fixed phrase output unit 6 outputs the fixed phrases in association with the accent group information and has them accumulated in the database 10.

【００５０】The operation of the fixed phrase corpus creation apparatus according to this embodiment is described below using a specific example. In the following description, the word dictionary 4 is assumed to follow the example shown in FIG. 2, as in the first embodiment.

【００５１】The operator first inputs "目的地まで[A]で３０分かかります。" ("It takes 30 minutes to reach the destination by [A].") from the fixed word input unit 1 as the fixed words and the position of the arbitrary word to be inserted between them, and inputs "vehicle [A]" from the arbitrary word input unit 2 as the attribute of the arbitrary word. As a result, vehicle is supplied to the arbitrary word selection unit 3 as the attribute corresponding to [A].

【００５２】When the arbitrary word selection unit 3 searches the word dictionary 4, five words whose attribute is vehicle are found: タクシー (taxi), バス (bus), 電車 (train), 地下鉄 (subway), and 飛行機 (airplane). The unit selects and extracts all of them. It then supplies the five extracted words, together with the accent position of each word extracted along with them, to the accent comparison unit 8.

【００５３】Next, the accent comparison unit 8 classifies the five extracted words into accent groups according to their accent positions. Here, タクシー and バス, whose accent is at the beginning of the word, are classified into accent group "A1"; 電車 and 地下鉄, which have no accent, into accent group "A0"; and 飛行機, whose accent is in the middle of the word, into accent group "A2". The words are then supplied to the fixed phrase generation unit 5 one accent group at a time.

【００５４】The fixed phrase generation unit 5 has been supplied with "目的地まで[A]で３０分かかります。" input from the fixed word input unit 1. First, it inserts タクシー and バス, the words extracted by the arbitrary word selection unit 3 and classified into accent group "A1", at [A] in this sentence and generates the fixed phrases "目的地までタクシーで３０分かかります。" and "目的地までバスで３０分かかります。". The fixed phrase generation unit 5 then adds the information "A1" to these two fixed phrases.

【００５５】Next, the fixed phrase generation unit 5 inserts 電車 and 地下鉄, the words extracted by the arbitrary word selection unit 3 and classified into accent group "A0", at [A] in the supplied sentence "目的地まで[A]で３０分かかります。" and generates the fixed phrases "目的地まで電車で３０分かかります。" and "目的地まで地下鉄で３０分かかります。". The fixed phrase generation unit 5 then adds the information "A0" to these two fixed phrases.

【００５６】Furthermore, the fixed phrase generation unit 5 inserts 飛行機, the word extracted by the arbitrary word selection unit 3 and classified into accent group "A2", at [A] in the supplied sentence "目的地まで[A]で３０分かかります。" and generates the fixed phrase "目的地まで飛行機で３０分かかります。". The fixed phrase generation unit 5 then adds the information "A2" to this fixed phrase.

【００５７】The five fixed phrases generated in this way are output from the fixed phrase output unit 6 onto the LAN, together with the information "A1", "A0", or "A2" added to each of them to indicate its accent group, and are accumulated as a fixed phrase corpus in the database 10 of the speech synthesis system connected to the LAN.
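
One way to picture how the accent labels might be used once the phrases are stored is an index from label to phrases, so that phrases whose inserted word has a given accent position can be looked up directly. This is only a sketch: the description does not specify the storage layout of the database 10, and the names `build_accent_index` and `phrases_for_accent` are hypothetical.

```python
from collections import defaultdict

TEMPLATE = "目的地まで[A]で３０分かかります。"
# Words and accent positions as given in the worked example.
VEHICLES = [("タクシー", 1), ("バス", 1), ("電車", 0), ("地下鉄", 0), ("飛行機", 2)]


def build_accent_index(template, words):
    """Index generated phrases by accent-group label ("A0", "A1", ...)."""
    index = defaultdict(list)
    for word, accent in words:
        index[f"A{accent}"].append(template.replace("[A]", word))
    return dict(index)


def phrases_for_accent(index, accent):
    """Return the stored phrases whose inserted word has this accent position."""
    return index.get(f"A{accent}", [])


index = build_accent_index(TEMPLATE, VEHICLES)
unaccented = phrases_for_accent(index, 0)
```

A lookup for an accent position that has no stored phrase returns an empty list, which makes gaps in the corpus easy to spot.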

【００５８】As described above, in the fixed phrase corpus creation apparatus according to this embodiment, the words extracted from the word dictionary 4 as arbitrary words to be inserted are classified according to accent position, and the generated fixed phrases are classified by accent position as well. Sentence patterns containing words with a desired accent position, as needed for a speech database in a speech synthesis system, can therefore be created efficiently.

【００５９】Incidentally, words with different accent positions have considerably different pitch patterns, so one is difficult to substitute for another. Because the apparatus according to this embodiment can easily create multiple fixed phrases classified by accent position, it is also useful for avoiding situations in which speech is synthesized from a fixed phrase containing a word that is a poor substitute.

【００６０】[Fourth Embodiment] FIG. 5 is a block diagram showing the configuration of a fixed phrase corpus creation apparatus according to this embodiment. This apparatus differs from that of the first embodiment (FIG. 1) in that it further comprises a phoneme environment comparison unit 9.

【００６１】The phoneme environment comparison unit 9 classifies the words extracted from the word dictionary 4 by the arbitrary word selection unit 3 into phoneme groups according to the combination of their word-initial and/or word-final phonemes, based on each word's reading, and supplies them to the fixed phrase generation unit 5 one phoneme group at a time. The fixed phrase generation unit 5 generates fixed phrases while retaining the information on the phoneme group. The fixed phrase output unit 6 outputs the fixed phrases in association with the phoneme group information and has them accumulated in the database 10.

【００６２】The grouping method used by the phoneme environment comparison unit 9 depends on whether the arbitrary word is inserted at the beginning, in the middle, or at the end of the sentence, because what matters is the point at which the arbitrary word connects to a fixed word. When the arbitrary word is at the beginning of the sentence, it connects to a fixed word only at its end, so the words are grouped by their word-final reading. When it is in the middle of the sentence, the words are grouped by both their word-initial and word-final readings. When it is at the end of the sentence, it connects to a fixed word only at its beginning, so the words are grouped by their word-initial reading.

【００６３】The operation of the fixed phrase corpus creation apparatus according to this embodiment is described below using a specific example. In the following description, the word dictionary 4 is assumed to register the headwords 山岡, 増岡, 山田, 三田, and 山本. Their readings are やまおか (Yamaoka), ますおか (Masuoka), やまだ (Yamada), みた (Mita), and やまもと (Yamamoto), respectively. For all of them, the part of speech is proper noun and the attribute is surname.

【００６４】Consider the cases in which the operator inputs each of the following from the fixed word input unit 1 as the fixed words and the position of the arbitrary word to be inserted between them:
(1) "[A]さん、お元気ですか。" ("[A]-san, how are you?")
(2) "新任の[A]先生を紹介します。" ("Let me introduce our new teacher, [A].")
(3) "私の旧姓は、[A]" ("My maiden name is [A]")
In every case, the operator inputs "surname [A]" from the arbitrary word input unit 2 as the attribute of the arbitrary word.

【００６５】In each of cases (1) to (3), when the arbitrary word selection unit 3 searches the word dictionary 4, five words whose attribute is surname are found: 山岡, 増岡, 山田, 三田, and 山本. The unit selects and extracts all of them. Rather than supplying these five words directly to the fixed phrase generation unit 5, the arbitrary word selection unit 3 supplies the five extracted words, together with the reading of each word extracted along with them, to the phoneme environment comparison unit 9.

【００６６】Next, the phoneme environment comparison unit 9 classifies the five extracted words into groups according to the phoneme environment given by their readings. The grouping differs among cases (1) to (3).

【００６７】In case (1), the arbitrary word is at the beginning of the sentence, so its only connection point with a fixed word is the end of the word. The phoneme environment comparison unit 9 therefore classifies 山岡 (やまおか) and 増岡 (ますおか), which have the same word-final reading among the five extracted words, into the same group. None of the remaining three words share a word-final reading, so the phoneme environment comparison unit 9 classifies each of those three words into a group of its own.

【００６８】In case (2), the arbitrary word is in the middle of the sentence, so its connection points with the fixed words are both the beginning and the end of the word. None of the five extracted words share both the word-initial and the word-final reading, so the phoneme environment comparison unit 9 classifies each of the five words into a group of its own.

[0069] In case (3) above, the arbitrary word is at the end of the sentence, so its only connection point with a fixed word is its beginning. The phonological environment comparison unit 9 therefore places "Yamaoka", "Yamada", and "Yamamoto", whose beginnings have the same reading, in the same group. The remaining two words do not share a beginning reading, so the unit places each of them in a group of its own.
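The position-dependent grouping in cases (1) to (3) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: readings are assumed to be stored as romanized mora lists, and "the same reading at the beginning or ending" is assumed to mean the same first or last mora, since the text does not state the unit of comparison. The names `READINGS`, `environment_key`, and `group_by_environment` are hypothetical, and "Suzuki" is a placeholder for the fifth word, which this passage does not name.

```python
from collections import defaultdict

# Readings as romanized mora lists (an assumed representation).
# The first four names come from the text; "Suzuki" is a placeholder
# for the fifth word, which this passage does not name.
READINGS = {
    "Yamaoka": ["ya", "ma", "o", "ka"],
    "Masuoka": ["ma", "su", "o", "ka"],
    "Yamada": ["ya", "ma", "da"],
    "Yamamoto": ["ya", "ma", "mo", "to"],
    "Suzuki": ["su", "zu", "ki"],
}


def environment_key(reading, position):
    """Return the part of the reading that touches a fixed word."""
    if position == "head":    # case (1): only the ending meets a fixed word
        return (reading[-1],)
    if position == "middle":  # case (2): beginning and ending both meet fixed words
        return (reading[0], reading[-1])
    if position == "tail":    # case (3): only the beginning meets a fixed word
        return (reading[0],)
    raise ValueError(f"unknown position: {position!r}")


def group_by_environment(readings, position):
    """Group words whose connection points have the same reading."""
    groups = defaultdict(list)
    for word, reading in readings.items():
        groups[environment_key(reading, position)].append(word)
    return list(groups.values())


if __name__ == "__main__":
    for position in ("head", "middle", "tail"):
        print(position, group_by_environment(READINGS, position))
```

Keeping the position logic in a single key function means the grouping loop itself is identical for all three cases; only the key changes with the insertion position.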

[0070] The subsequent processing is substantially the same as in the second and third embodiments. In each of cases (1) to (3) above, the generated fixed sentences are stored in the database 10 as a fixed sentence corpus, together with information indicating the group classification.

[0071] As described above, in the fixed sentence corpus creation apparatus according to this embodiment, the words extracted from the word dictionary 4 as arbitrary words to be inserted are classified according to the phonological environment at their endings and beginnings, and the generated fixed sentences are classified in the same way. Sentence patterns that need reinforcing in the speech database can therefore be created efficiently. Moreover, because the grouping method changes depending on whether fixed words precede or follow the arbitrary word, fine-grained classification according to the insertion position of the arbitrary word is possible.

[0072] [Modifications of the Embodiments] The present invention is not limited to the first to fourth embodiments above; various modifications and applications are possible. Modifications of the above embodiments that are applicable to the present invention are described below.

[0073] In the second embodiment above, the words extracted from the word dictionary are classified into groups according to their number of morae, but they may instead be grouped according to their number of syllables. The grouping may also combine the number of morae (or syllables), the accent position, and the phonological environment described in the second to fourth embodiments, respectively. Furthermore, the extracted words may be classified into groups according to other information about the sound of each word when it is uttered.
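One way to picture the combined classification described above is a composite grouping key built from whichever sound features are selected. The sketch below is illustrative only: the `Entry` record, the feature names, and the `classify` function are hypothetical, and the accent positions are made-up placeholder values, not data from the word dictionary 4.

```python
from collections import defaultdict
from typing import NamedTuple


class Entry(NamedTuple):
    word: str
    morae: tuple   # reading as romanized morae (assumed representation)
    accent: int    # accent position; placeholder values, not real data


def classification_key(entry, features):
    """Build a composite key from the selected sound features."""
    parts = []
    for feature in features:
        if feature == "mora_count":
            parts.append(len(entry.morae))
        elif feature == "accent":
            parts.append(entry.accent)
        elif feature == "initial":
            parts.append(entry.morae[0])
        elif feature == "final":
            parts.append(entry.morae[-1])
        else:
            raise ValueError(f"unknown feature: {feature!r}")
    return tuple(parts)


def classify(entries, features):
    """Group words that agree on every selected feature."""
    groups = defaultdict(list)
    for entry in entries:
        groups[classification_key(entry, features)].append(entry.word)
    return dict(groups)


ENTRIES = [
    Entry("Yamada", ("ya", "ma", "da"), 0),
    Entry("Yamaoka", ("ya", "ma", "o", "ka"), 0),
    Entry("Masuoka", ("ma", "su", "o", "ka"), 0),
    Entry("Yamamoto", ("ya", "ma", "mo", "to"), 0),
]

if __name__ == "__main__":
    print(classify(ENTRIES, ("mora_count",)))
    print(classify(ENTRIES, ("mora_count", "final")))
```

Adding a feature to the tuple can only split existing groups, never merge them, so combining criteria always yields a classification at least as fine as any single criterion.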

[0074] In the first to fourth embodiments above, the fixed sentence corpus is created by the arbitrary word selection unit 3 and the fixed sentence generation unit 5, together with the mora comparison unit 7, the accent comparison unit 8, or the phonological environment comparison unit 9, operating under program control. Alternatively, a program that implements these functions may be stored on a computer-readable recording medium such as a CD-ROM, distributed, and installed and executed on a general-purpose computer. In that case, the word dictionary 4 may also be stored on the recording medium and provided with it.

【００７５】[0075]

[Effects of the Invention] As described above, according to the present invention, a variety of fixed sentences can be created simply by inputting the fixed words and the position of the arbitrary word to be inserted between them, together with the attribute and position of the arbitrary word. Sentence patterns needed, for example, for a speech database in a speech synthesis system can therefore be created efficiently.
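The basic generation step summarized above can be sketched as a template with one slot that is filled by every dictionary word carrying the requested attribute. This is a simplified sketch under assumptions: the dictionary is modeled as a list of (word, attribute) pairs, the template text and attribute names are invented for illustration, and `SLOT` and `create_fixed_sentences` are hypothetical names.

```python
# Sentinel marking where the arbitrary word goes in the template.
SLOT = object()


def create_fixed_sentences(template, slot_attribute, dictionary):
    """Insert every dictionary word with the given attribute at the slot."""
    # Select the words whose attribute matches the one requested for the slot.
    candidates = [word for word, attribute in dictionary if attribute == slot_attribute]
    sentences = []
    for word in candidates:
        parts = [word if part is SLOT else part for part in template]
        sentences.append(" ".join(parts))
    return sentences


# Hypothetical dictionary entries and template, for illustration only.
DICTIONARY = [
    ("Yamada", "surname"),
    ("Tokyo", "place"),
    ("Yamamoto", "surname"),
]
TEMPLATE = ["A call from", SLOT, "has arrived"]

if __name__ == "__main__":
    for sentence in create_fixed_sentences(TEMPLATE, "surname", DICTIONARY):
        print(sentence)
```

The operator supplies only the template and the attribute; the number of sentences produced then scales with the dictionary rather than with manual effort.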

[0076] Furthermore, by classifying words according to information about the sound of each word, such as the number of morae or syllables, the accent position, or the phonological environment, and creating fixed sentences accordingly, sentence patterns that need reinforcing in a speech database, for example, can be created efficiently.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of a fixed sentence corpus creation apparatus according to a first embodiment of the present invention.

FIG. 2 is a diagram showing an example of the word dictionary of FIG. 1.

FIG. 3 is a block diagram showing the configuration of a fixed sentence corpus creation apparatus according to a second embodiment of the present invention.

FIG. 4 is a block diagram showing the configuration of a fixed sentence corpus creation apparatus according to a third embodiment of the present invention.

FIG. 5 is a block diagram showing the configuration of a fixed sentence corpus creation apparatus according to a fourth embodiment of the present invention.

FIG. 6 is a block diagram showing the configuration of the speech synthesis system of Conventional Example 1.

[Explanation of symbols]

1 fixed word input unit
2 arbitrary word input unit
3 arbitrary word selection unit
4 word dictionary
5 fixed sentence generation unit
6 fixed sentence output unit
7 mora comparison unit
8 accent comparison unit
9 phonological environment comparison unit
10 database

Claims

(57) [Claims]

1. A fixed sentence corpus creation apparatus comprising: dictionary means for storing a plurality of words and an attribute of each word in association with each other; fixed word input means for inputting fixed words that are included in a fixed sentence to be created and are each expressed by a desired word, and the position of an arbitrary word to be inserted between the fixed words in the fixed sentence; arbitrary word input means for inputting the position of the arbitrary word to be inserted in the fixed sentence in association with the attribute of the arbitrary word to be inserted; arbitrary word selection means for selecting and extracting, from the dictionary means, words corresponding to the attribute of the arbitrary word input from the arbitrary word input means; and fixed sentence generation means for generating fixed sentences by inserting each of the words extracted by the arbitrary word selection means at the position of the arbitrary word input from the fixed word input means, wherein the dictionary means further stores the number of morae or syllables of each word in association with each of the plurality of words, the arbitrary word selection means also extracts from the dictionary means the number of morae or syllables corresponding to each selected word, the apparatus further comprises word classification means for classifying the words extracted by the arbitrary word selection means according to the number of morae or syllables extracted together with the words, and the fixed sentence generation means generates the fixed sentences with information on the classification by the word classification means added.

2. A fixed sentence corpus creation apparatus comprising: dictionary means for storing a plurality of words and an attribute of each word in association with each other; fixed word input means for inputting fixed words that are included in a fixed sentence to be created and are each expressed by a desired word, and the position of an arbitrary word to be inserted between the fixed words in the fixed sentence; arbitrary word input means for inputting the position of the arbitrary word to be inserted in the fixed sentence in association with the attribute of the arbitrary word to be inserted; arbitrary word selection means for selecting and extracting, from the dictionary means, words corresponding to the attribute of the arbitrary word input from the arbitrary word input means; and fixed sentence generation means for generating fixed sentences by inserting each of the words extracted by the arbitrary word selection means at the position of the arbitrary word input from the fixed word input means, wherein the dictionary means further stores the accent position of each word in association with each of the plurality of words, the arbitrary word selection means also extracts from the dictionary means the accent position corresponding to each selected word, the apparatus further comprises word classification means for classifying the words extracted by the arbitrary word selection means according to the accent positions extracted together with the words, and the fixed sentence generation means generates the fixed sentences with information on the classification by the word classification means added.

3. A fixed sentence corpus creation apparatus comprising: dictionary means for storing a plurality of words and an attribute of each word in association with each other; fixed word input means for inputting fixed words that are included in a fixed sentence to be created and are each expressed by a desired word, and the position of an arbitrary word to be inserted between the fixed words in the fixed sentence; arbitrary word input means for inputting the position of the arbitrary word to be inserted in the fixed sentence in association with the attribute of the arbitrary word to be inserted; arbitrary word selection means for selecting and extracting, from the dictionary means, words corresponding to the attribute of the arbitrary word input from the arbitrary word input means; and fixed sentence generation means for generating fixed sentences by inserting each of the words extracted by the arbitrary word selection means at the position of the arbitrary word input from the fixed word input means, wherein the dictionary means further stores information indicating the phonological environment of each word in association with each of the plurality of words, the arbitrary word selection means also extracts from the dictionary means the information indicating the phonological environment corresponding to each selected word, the apparatus further comprises word classification means for classifying the words extracted by the arbitrary word selection means according to the information indicating the phonological environment extracted together with the words, and the fixed sentence generation means generates the fixed sentences with information on the classification by the word classification means added.

4. The fixed sentence corpus creation apparatus according to claim 3, wherein the word classification means further classifies the words extracted by the arbitrary word selection means according to the positional relationship between the fixed words input from the fixed word input means and the arbitrary word corresponding to the extracted words.

5. A fixed sentence corpus creation apparatus comprising: dictionary means for storing a plurality of words and an attribute of each word in association with each other; fixed word input means for inputting fixed words that are included in a fixed sentence to be created and are each expressed by a desired word, and the position of an arbitrary word to be inserted between the fixed words in the fixed sentence; arbitrary word input means for inputting the position of the arbitrary word to be inserted in the fixed sentence in association with the attribute of the arbitrary word to be inserted; arbitrary word selection means for selecting and extracting, from the dictionary means, words corresponding to the attribute of the arbitrary word input from the arbitrary word input means; and fixed sentence generation means for generating fixed sentences by inserting each of the words extracted by the arbitrary word selection means at the position of the arbitrary word input from the fixed word input means, wherein the dictionary means further stores information about the sound of each word in association with each of the plurality of words, the arbitrary word selection means also extracts from the dictionary means the information about sound corresponding to each selected word, the apparatus further comprises word classification means for classifying the words extracted by the arbitrary word selection means according to the information about sound extracted together with the words, and the fixed sentence generation means generates the fixed sentences with information on the classification by the word classification means added.

6. A fixed sentence corpus creation method comprising: a fixed word input step of inputting fixed words that are included in a fixed sentence to be created and are each expressed by a desired word, and the position of an arbitrary word to be inserted between the fixed words in the fixed sentence; an arbitrary word input step of inputting the position of the arbitrary word to be inserted in the fixed sentence in association with the attribute of the arbitrary word to be inserted; an arbitrary word extraction step of selecting and extracting words corresponding to the attribute of the arbitrary word input in the arbitrary word input step from a dictionary in which a plurality of words and an attribute of each word are stored in association with each other; and a fixed sentence generation step of generating fixed sentences by inserting each of the words extracted in the arbitrary word extraction step at the position of the arbitrary word input in the fixed word input step, wherein, in the arbitrary word extraction step, from the information about sound stored in the dictionary in association with each of the plurality of words, the information about sound corresponding to each selected word is also extracted, the method further comprises a word classification step of classifying the words extracted in the arbitrary word extraction step according to the information about sound extracted together with the words, and, in the fixed sentence generation step, the fixed sentences are generated with information on the classification in the word classification step added.

7. A fixed sentence corpus creation method comprising: a fixed word input step of inputting fixed words that are included in a fixed sentence to be created and are each expressed by a desired word, and the position of an arbitrary word to be inserted between the fixed words in the fixed sentence; an arbitrary word input step of inputting the position of the arbitrary word to be inserted in the fixed sentence in association with the attribute of the arbitrary word to be inserted; an arbitrary word extraction step of selecting and extracting words corresponding to the attribute of the arbitrary word input in the arbitrary word input step from a dictionary in which a plurality of words and an attribute of each word are stored in association with each other; and a fixed sentence generation step of generating fixed sentences by inserting each of the words extracted in the arbitrary word extraction step at the position of the arbitrary word input in the fixed word input step, wherein the dictionary further stores the accent position of each word in association with each of the plurality of words, in the arbitrary word extraction step the accent position corresponding to each selected word is also extracted from the dictionary, the method further comprises a word classification step of classifying the words extracted in the arbitrary word extraction step according to the accent positions extracted together with the words, and, in the fixed sentence generation step, the fixed sentences are generated with information on the classification in the word classification step added.

8. A fixed sentence corpus creation method comprising: a fixed word input step of inputting fixed words that are included in a fixed sentence to be created and are each expressed by a desired word, and the position of an arbitrary word to be inserted between the fixed words in the fixed sentence; an arbitrary word input step of inputting the position of the arbitrary word to be inserted in the fixed sentence in association with the attribute of the arbitrary word to be inserted; an arbitrary word extraction step of selecting and extracting words corresponding to the attribute of the arbitrary word input in the arbitrary word input step from a dictionary in which a plurality of words and an attribute of each word are stored in association with each other; and a fixed sentence generation step of generating fixed sentences by inserting each of the words extracted in the arbitrary word extraction step at the position of the arbitrary word input in the fixed word input step, wherein the dictionary further stores information indicating the phonological environment of each word in association with each of the plurality of words, in the arbitrary word extraction step the information indicating the phonological environment corresponding to each selected word is also extracted from the dictionary, the method further comprises a word classification step of classifying the words extracted in the arbitrary word extraction step according to the information indicating the phonological environment extracted together with the words, and, in the fixed sentence generation step, the fixed sentences are generated with information on the classification in the word classification step added.

9. The fixed sentence corpus creation method according to claim 8, wherein, in the word classification step, the words extracted in the arbitrary word extraction step are further classified according to the positional relationship between the fixed words input in the fixed word input step and the arbitrary word corresponding to the extracted words.

10. A fixed sentence corpus creation method comprising: a fixed word input step of inputting fixed words that are included in a fixed sentence to be created and are each expressed by a desired word, and the position of an arbitrary word to be inserted between the fixed words in the fixed sentence; an arbitrary word input step of inputting the position of the arbitrary word to be inserted in the fixed sentence in association with the attribute of the arbitrary word to be inserted; an arbitrary word extraction step of selecting and extracting words corresponding to the attribute of the arbitrary word input in the arbitrary word input step from a dictionary in which a plurality of words and an attribute of each word are stored in association with each other; and a fixed sentence generation step of generating fixed sentences by inserting each of the words extracted in the arbitrary word extraction step at the position of the arbitrary word input in the fixed word input step, wherein the dictionary further stores information about the sound of each word in association with each of the plurality of words, in the arbitrary word extraction step the information about sound corresponding to each selected word is also extracted from the dictionary, the method further comprises a word classification step of classifying the words extracted in the arbitrary word extraction step according to the information about sound extracted together with the words, and, in the fixed sentence generation step, the fixed sentences are generated with information on the classification in the word classification step added.

11. A computer-readable recording medium on which is recorded a program for causing a computer to execute: a fixed word input step of inputting fixed words that are included in a fixed sentence to be created and are each expressed by a desired word, and the position of an arbitrary word to be inserted between the fixed words in the fixed sentence; an arbitrary word input step of inputting the position of the arbitrary word to be inserted in the fixed sentence in association with the attribute of the arbitrary word to be inserted; an arbitrary word extraction step of selecting and extracting words corresponding to the attribute of the arbitrary word input in the arbitrary word input step from a dictionary in which a plurality of words and an attribute of each word are stored in association with each other; and a fixed sentence generation step of generating fixed sentences by inserting each of the words extracted in the arbitrary word extraction step at the position of the arbitrary word input in the fixed word input step, wherein, in the arbitrary word extraction step, from the information about sound stored in the dictionary in association with each of the plurality of words, the information about sound corresponding to each selected word is also extracted, the program further causes the computer to execute a word classification step of classifying the words extracted in the arbitrary word extraction step according to the information about sound extracted together with the words, and, in the fixed sentence generation step, the fixed sentences are generated with information on the classification in the word classification step added.

12. A computer-readable recording medium on which is recorded a program for causing a computer to execute: a fixed word input step of inputting fixed words that are included in a fixed sentence to be created and are each expressed by a desired word, and the position of an arbitrary word to be inserted between the fixed words in the fixed sentence; an arbitrary word input step of inputting the position of the arbitrary word to be inserted in the fixed sentence in association with the attribute of the arbitrary word to be inserted; an arbitrary word extraction step of selecting and extracting words corresponding to the attribute of the arbitrary word input in the arbitrary word input step from a dictionary in which a plurality of words and an attribute of each word are stored in association with each other; and a fixed sentence generation step of generating fixed sentences by inserting each of the words extracted in the arbitrary word extraction step at the position of the arbitrary word input in the fixed word input step, wherein the dictionary further stores the accent position of each word in association with each of the plurality of words, in the arbitrary word extraction step the accent position corresponding to each selected word is also extracted from the dictionary, the program further causes the computer to execute a word classification step of classifying the words extracted in the arbitrary word extraction step according to the accent positions extracted together with the words, and, in the fixed sentence generation step, the fixed sentences are generated with information on the classification in the word classification step added.

13. A computer-readable recording medium on which is recorded a program for causing a computer to execute: a fixed word input step of inputting fixed words that are included in a fixed sentence to be created and are each expressed by a desired word, and the position of an arbitrary word to be inserted between the fixed words in the fixed sentence; an arbitrary word input step of inputting the position of the arbitrary word to be inserted in the fixed sentence in association with the attribute of the arbitrary word to be inserted; an arbitrary word extraction step of selecting and extracting words corresponding to the attribute of the arbitrary word input in the arbitrary word input step from a dictionary in which a plurality of words and an attribute of each word are stored in association with each other; and a fixed sentence generation step of generating fixed sentences by inserting each of the words extracted in the arbitrary word extraction step at the position of the arbitrary word input in the fixed word input step, wherein the dictionary further stores information indicating the phonological environment of each word in association with each of the plurality of words, in the arbitrary word extraction step the information indicating the phonological environment corresponding to each selected word is also extracted from the dictionary, the program further causes the computer to execute a word classification step of classifying the words extracted in the arbitrary word extraction step according to the information indicating the phonological environment extracted together with the words, and, in the fixed sentence generation step, the fixed sentences are generated with information on the classification in the word classification step added.

14. The computer-readable recording medium according to claim 13, wherein, in the word classification step, the words extracted in the arbitrary word extraction step are further classified according to the positional relationship between the fixed words input in the fixed word input step and the arbitrary word corresponding to the extracted words.

15. A computer-readable recording medium on which is recorded a program for causing a computer to execute: a fixed word input step of inputting fixed words that are included in a fixed sentence to be created and are each expressed by a desired word, and the position of an arbitrary word to be inserted between the fixed words in the fixed sentence; an arbitrary word input step of inputting the position of the arbitrary word to be inserted in the fixed sentence in association with the attribute of the arbitrary word to be inserted; an arbitrary word extraction step of selecting and extracting words corresponding to the attribute of the arbitrary word input in the arbitrary word input step from a dictionary in which a plurality of words and an attribute of each word are stored in association with each other; and a fixed sentence generation step of generating fixed sentences by inserting each of the words extracted in the arbitrary word extraction step at the position of the arbitrary word input in the fixed word input step, wherein the dictionary further stores information about the sound of each word in association with each of the plurality of words, in the arbitrary word extraction step the information about sound corresponding to each selected word is also extracted from the dictionary, the program further causes the computer to execute a word classification step of classifying the words extracted in the arbitrary word extraction step according to the information about sound extracted together with the words, and, in the fixed sentence generation step, the fixed sentences are generated with information on the classification in the word classification step added.

16. The computer-readable recording medium according to any one of claims 11 to 15, further storing the dictionary used in the arbitrary word extraction step, in which a plurality of words and the attribute of each word are stored in association with each other.