JP2746880B2 - Compound word division method - Google Patents

Compound word division method

Info

Publication number
JP2746880B2
Authority
JP
Japan
Prior art keywords
compound word
suffix
prefix
accent
morphological analysis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
JP62178434A
Other languages
Japanese (ja)
Other versions
JPS6421496A (en)
Inventor
哲也 酒寄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ricoh Co Ltd
Original Assignee
Ricoh Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ricoh Co Ltd
Priority to JP62178434A
Publication of JPS6421496A
Application granted
Publication of JP2746880B2
Anticipated expiration
Expired - Lifetime

Description

Detailed Description of the Invention

Technical Field

The present invention relates to a method for dividing compound words, and more particularly to a method for extracting accent units in text-to-speech synthesis.

Prior Art

To give synthesized speech natural prosody, the position and level of each accent nucleus must be set. To do this, the input text must first be divided into accent units, where an accent unit is a morpheme sequence containing exactly one accent nucleus. An accent unit normally consists of one or more phrases (bunsetsu). When a long compound word forms a single phrase, however, that compound word (phrase) must itself be divided into two or more accent units.

Two methods have been proposed for this: performing dependency analysis inside the compound word, and dividing the compound word at the position corresponding to half the number of its constituent words. The former must perform dependency analysis without any information from particles or auxiliary verbs, which makes the processing complex. The latter often yields inappropriate divisions, for example with very long compound words of five or more words, and with halved word strings or compounds that consist of an odd number of words.

Object

The present invention was made in view of the above circumstances. Its object, particularly in text-to-speech synthesis, is to divide a long compound word that would sound unnatural if spoken as a single accent unit into a plurality of accent units at appropriate positions.

Configuration

To achieve the above object, the invention applies to a text-to-speech synthesizer that extracts morphological analysis information from input text by morphological analysis processing; obtains, from that information, a phoneme symbol string by phonological rules and a prosodic symbol string by prosodic symbol derivation rules; reads out parameter sequences of phoneme segments prepared in advance in accordance with the phoneme symbol string; and adds prosody by concatenation rules. The invention is characterized in that, when the input text contains a long compound word with embedded affixes:

- if the compound word contains a suffix that is present in a suffix dictionary, the position immediately after that suffix is taken as a division position;
- if the compound word contains a prefix that is present in a prefix dictionary, the position immediately before that prefix is taken as a division position;

and the compound word is divided into accent units at these positions, so that a long compound word that would sound unnatural if spoken as a single accent unit is uttered more naturally. The invention is described below on the basis of an embodiment.

Fig. 1 is a flowchart explaining one embodiment of the invention. As described below, the invention divides a long compound word into a plurality of accent units by a method that focuses on the affixes within the compound word. In Fig. 1, because an affix inside a compound word marks a major semantic break, the method first searches for a suffix starting from the end of the compound word and for a prefix starting from its head. If a suffix is found, the position immediately after it is taken as the division position; if a prefix is found, the position immediately before it is taken as the division position. The compound word is then divided into two compound words. Fig. 2 shows an example in which a suffix B is found, and Fig. 3 shows an example in which a prefix C is found (A denotes an accent unit). If a resulting compound word is still long, the process is repeated on it.

Considering the average mora length of the words that make up compound words, it is reasonable for one accent unit to consist of two or three words (one of which, in the cases above, is an affix). Therefore, when the compound word contains neither a suffix nor a prefix, division positions are set every two words starting from the head of the compound word, as shown in Fig. 4.

Effect

As is clear from the above description, the invention makes it possible to divide a phrase consisting of a long compound word into a plurality of accent units by a method that focuses on the affixes embedded in the compound word.
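The division procedure described above can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation. The function name `split_compound`, the `max_words` threshold of 3 (the text says only that a unit of two or three words is appropriate, without defining "long"), the choice to try suffixes before prefixes, and the small example dictionaries are all assumptions made for the illustration. An affix at the outer edge of the compound (a final suffix or an initial prefix) is skipped, since dividing there would produce an empty part.

```python
from typing import List, Set


def split_compound(words: List[str], prefixes: Set[str], suffixes: Set[str],
                   max_words: int = 3) -> List[List[str]]:
    """Divide a compound word (a list of constituent words) into accent units.

    max_words is an assumed threshold: anything longer counts as "long".
    """
    if len(words) <= max_words:
        return [words]

    # Search for a suffix from the end of the compound; divide right after it.
    # The final word is skipped because dividing after it gives an empty part.
    for i in range(len(words) - 2, -1, -1):
        if words[i] in suffixes:
            cut = i + 1
            return (split_compound(words[:cut], prefixes, suffixes, max_words)
                    + split_compound(words[cut:], prefixes, suffixes, max_words))

    # Search for a prefix from the head of the compound; divide right before it.
    # The first word is skipped because dividing before it gives an empty part.
    for i in range(1, len(words)):
        if words[i] in prefixes:
            return (split_compound(words[:i], prefixes, suffixes, max_words)
                    + split_compound(words[i:], prefixes, suffixes, max_words))

    # No usable affix: set a division position every two words from the head.
    return [words[k:k + 2] for k in range(0, len(words), 2)]


if __name__ == "__main__":
    # Hypothetical dictionaries and inputs, chosen only for illustration.
    prefixes = {"再"}
    suffixes = {"場"}
    print(split_compound(["国際", "会議", "場", "建設", "計画"], prefixes, suffixes))
    print(split_compound(["経済", "政策", "再", "検討", "委員会"], prefixes, suffixes))
    print(split_compound(["a", "b", "c", "d", "e"], prefixes, suffixes))
```

With these example dictionaries, the first call divides after the suffix 場, the second divides before the prefix 再, and the third falls back to two-word groups. Recursing on each part mirrors the statement that the process is repeated while a divided compound word is still long; applying the two-word fallback to such parts is a further assumption of this sketch.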

Brief Description of the Drawings

Fig. 1 is a flowchart explaining one embodiment of the present invention. Figs. 2 to 4 are diagrams showing examples of compound word division.

A: accent unit, B: suffix, C: prefix.

Claims (1)

(57) [Claims]

1. A compound word division method for a text-to-speech synthesizer that extracts morphological analysis information from input text by morphological analysis processing, obtains, on the basis of the morphological analysis information, a phoneme symbol string by phonological rules and a prosodic symbol string by prosodic symbol derivation rules, reads out parameter sequences of phoneme segments prepared in advance in accordance with the phoneme symbol string, and adds prosody by concatenation rules, the method being characterized in that, when a long compound word with embedded affixes is present in the input text: if the compound word contains a suffix that is present in a suffix dictionary, the position immediately after that suffix is taken as a division position; if the compound word contains a prefix that is present in a prefix dictionary, the position immediately before that prefix is taken as a division position; and the compound word is divided into accent units, whereby a long compound word that would sound unnatural if spoken as a single accent unit is uttered more naturally.
JP62178434A 1987-07-16 1987-07-16 Compound word division method Expired - Lifetime JP2746880B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP62178434A JP2746880B2 (en) 1987-07-16 1987-07-16 Compound word division method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP62178434A JP2746880B2 (en) 1987-07-16 1987-07-16 Compound word division method

Publications (2)

Publication Number Publication Date
JPS6421496A (en) 1989-01-24
JP2746880B2 (en) 1998-05-06

Family

ID=16048445

Family Applications (1)

Application Number Title Priority Date Filing Date
JP62178434A Expired - Lifetime JP2746880B2 (en) 1987-07-16 1987-07-16 Compound word division method

Country Status (1)

Country Link
JP (1) JP2746880B2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6277727B2 (en) * 2014-01-14 2018-02-14 カシオ計算機株式会社 Speech synthesis apparatus, speech synthesis method, and program

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0650435B2 (en) * 1985-03-29 1994-06-29 株式会社東芝 Speech synthesizer

Also Published As

Publication number Publication date
JPS6421496A (en) 1989-01-24

Similar Documents

Publication Publication Date Title
JP3587048B2 (en) Prosody control method and speech synthesizer
JPH0833744B2 (en) Speech synthesizer
KR970037209A (en) Voice output device (SPEECH SYNTHSIZER)
JPH1039895A (en) Speech synthesising method and apparatus therefor
JP2746880B2 (en) Compound word division method
van Rijnsoever A multilingual text-to-speech system
JP3626398B2 (en) Text-to-speech synthesizer, text-to-speech synthesis method, and recording medium recording the method
JPS5972494A (en) Rule snthesization system
JPH01204100A (en) Text speech synthesis system
JP2996978B2 (en) Text-to-speech synthesizer
JP2002123281A (en) Speech synthesizer
JP3397406B2 (en) Voice synthesis device and voice synthesis method
El Kadhi et al. Building diphone database for Arabic text to speech synthesis system
Roux et al. Data-driven approach to rapid prototyping Xhosa speech synthesis
JPH0229797A (en) Text voice converting device
JP2578876B2 (en) Text-to-speech device
KR100269215B1 (en) Method for producing fundamental frequency contour of prosodic phrase for tts
Singh et al. Punjabi text-to-speech synthesis system
JPH0756599B2 (en) How to create audio files
JPH04367000A (en) Voice synthesizing device
JPH02234198A (en) Text voice synthesizing system
JPH06138894A (en) Device and method for voice synthesis
Marshall Speech synthesis in interactive spoken dialogue systems
JPH06168265A (en) Language processor and speech synthesizer
KR970060042A (en) Speech synthesis method

Legal Events

Date Code Title Description
EXPY Cancellation because of completion of term
FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20080213

Year of fee payment: 10