JPH11231898A

JPH11231898A - Speech synthesizing device and its control method

Info

Publication number: JPH11231898A
Application number: JP10037638A
Authority: JP
Inventors: Yasuo Okuya; 泰夫奥谷
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1998-02-19
Filing date: 1998-02-19
Publication date: 1999-08-27
Anticipated expiration: 2018-02-19
Also published as: JP3962474B2

Abstract

PROBLEM TO BE SOLVED: To generate a more natural accent by using, as necessary, information on words that precede the word at the concatenation position when determining the accent position for concatenated words. SOLUTION: An accent combination table 110 holds the accent position at the time of concatenation, keyed by a kanji numeral search key consisting of at least one kanji numeral and a classifier search key consisting of a classifier. When speech-synthesizing a notation formed by concatenating a numeral and a classifier, a kanji numeral search key generation processing unit 104 selects, from the kanji numeral search keys registered in the accent combination table 110, the one that is the longest match for the numeral part of the notation, and uses it as the kanji numeral search key. A classifier search key generation processing unit 108 generates the classifier search key from the classifier in the notation. An accent combination table search processing unit 106 then searches the accent combination table 110 using the generated kanji numeral search key and classifier search key, and thereby obtains the accent position at the time of concatenation.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a speech synthesizer and a control method thereof, and in particular to a speech synthesizer that can perform accent combination appropriately when two words are concatenated, and a control method thereof.

[0002]

[Prior Art] In general, a speech synthesizer takes accent combination between words into account in order to generate natural synthesized speech. Accent combination between words is the accent change that occurs between a noun and a case particle, a verb and an auxiliary verb, a numeral and a classifier, and so on. Realizing accent combination between a numeral and a classifier is particularly important for generating natural synthesized speech.

[0003] In a conventional device of this kind, accent combination between a numeral and a classifier is performed by the following procedure.
(1) The numeral is converted into standard kanji numeral notation (e.g., 12345 → 一万二千三百四十五).
(2) The accent combination table is looked up using the classifier and the kanji numeral immediately preceding it (「五」 in the example of (1)) as keys, and the accent position for the concatenated numeral and classifier is determined.

[0004]

[Problems to Be Solved by the Invention] However, the conventional method, which determines the accent position from only the classifier and the kanji numeral immediately preceding it, cannot always represent the accent correctly, as the following example shows.

[0005] FIG. 5 shows an accent combination table for the classifier 「円」 (en, "yen") and numerals. The table in FIG. 5 holds, for each kanji numeral that can appear immediately before the classifier, the accent position that results when the numeral and the classifier are concatenated. Taking the boundary between the numeral and the classifier as the reference position (0), each entry is a number indicating on which mora of the classifier the accent nucleus falls when accent combination occurs. A negative value means that the accent nucleus moves to the left of the reference position.

[0006] An accent position value of PL means that the result is the flat type, which has no accent nucleus. The readings are shown only for ease of understanding and are not part of the information in the accent combination table.

[0007] Using this accent combination table, the accent position for a concatenated numeral and classifier can be determined. For example:
四円 (4 yen): ヨ↓エン, yo↓en (accent position "0")
十五円 (15 yen): ジュウ(↑)ゴ↓エン, juu(↑)go↓en (accent position "0")
百円 (100 yen): ヒャ(↑)クエン, hya(↑)kuen (accent position "PL")
千円 (1,000 yen): セ↓ンエン, se↓n'en (accent position "-1")
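The relationship between an accent position value and the marked reading can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `mark_accent`, the mora lists, and the decision to mark only the fall (↓) and omit the rise marks (↑) are assumptions. It reads the value as an offset from the numeral/classifier boundary, which is consistent with the 四円, 十五円, 百円 and 千円 examples.

```python
from typing import List, Union

PL = "PL"  # flat type: no accent nucleus


def mark_accent(numeral_moras: List[str], classifier_moras: List[str],
                position: Union[int, str]) -> str:
    """Insert a fall marker (↓) according to an accent position value."""
    moras = numeral_moras + classifier_moras
    if position == PL:
        return "".join(moras)  # flat type: no fall anywhere
    # 0 is the numeral/classifier boundary; negative values move the fall
    # to the left, positive values move it right (into the classifier).
    fall_after = len(numeral_moras) + position
    if not 0 < fall_after <= len(moras):
        raise ValueError("accent position falls outside the word")
    return "".join(moras[:fall_after]) + "↓" + "".join(moras[fall_after:])


print(mark_accent(["ヨ"], ["エ", "ン"], 0))        # 四円
print(mark_accent(["セ", "ン"], ["エ", "ン"], -1))  # 千円
print(mark_accent(["ヒャ", "ク"], ["エ", "ン"], PL))  # 百円
```

Splitting a reading into moras is left to the caller here; a real system would take it from its pronunciation dictionary.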

[0008] Similarly, when the numeral is 「二百」 (200) or 「四千」 (4,000), the kanji numerals adjoining the classifier are 「百」 and 「千」 respectively, so the results are:
二百円 (200 yen): ニ(↑)ヒャクエン, ni(↑)hyakuen
四千円 (4,000 yen): ヨ(↑)ンセ↓ンエン, yo(↑)nse↓n'en
These differ from the correct accents, ニ(↑)ヒャク↓エン (ni(↑)hyaku↓en) and ヨ(↑)ンセンエン (yo(↑)nsen'en), so the conventional method sometimes fails to generate the correct accent position.
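The failure of the conventional lookup can be reproduced with a few lines of Python. This is only a sketch: the table contains just the four entries for 円 that the text quotes from FIG. 5 (the rest of the figure is not reproduced here), and the names `CONVENTIONAL_TABLE` and `conventional_accent_position` are hypothetical. The correct values used for comparison, 0 for 二百円 and PL for 四千円, are the ones this description gives for the embodiment.

```python
# Entries for the classifier 円 quoted in the text from FIG. 5.
# Other rows of FIG. 5 are not given in the text and are omitted.
CONVENTIONAL_TABLE = {
    ("四", "円"): 0,
    ("五", "円"): 0,
    ("百", "円"): "PL",
    ("千", "円"): -1,
}


def conventional_accent_position(kanji_numeral: str, classifier: str):
    """Look up the accent position using only the last kanji numeral."""
    last = kanji_numeral[-1]  # the kanji numeral adjoining the classifier
    return CONVENTIONAL_TABLE[(last, classifier)]


# Correct values stated in the description for the two problem cases.
CORRECT = {"二百": 0, "四千": "PL"}

for numeral, correct in CORRECT.items():
    got = conventional_accent_position(numeral, "円")
    print(f"{numeral}円: conventional={got!r}, correct={correct!r}")
```

Because only 「百」 or 「千」 reaches the lookup, 二百円 and 百円 cannot be told apart, and neither can 四千円 and 千円.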

[0009] The present invention has been made in view of the above problem. Its object is to provide a speech synthesis apparatus and method that, in determining the accent position when words are concatenated, use information on words preceding the word at the concatenation position as necessary and can thereby generate a more natural accent.

[0010] Another object of the present invention is to make it possible, when a numeral and a classifier are concatenated, to also use information on kanji numerals located before the kanji numeral immediately preceding the concatenation position, and thereby to generate a more natural accent for the concatenated numeral and classifier.

[0011]

[Means for Solving the Problems] A speech synthesizer according to one aspect of the present invention that achieves the above object has, for example, the following configuration. That is, it comprises: an accent combination table that holds the accent position at the time of concatenation, keyed by a notation that belongs to a first attribute and consists of at least one word and a notation that belongs to a second attribute and consists of at least one word; extracting means for, when speech-synthesizing a notation formed by concatenating a first notation part belonging to the first attribute and a second notation part belonging to the second attribute, extracting as first and second keys, for the first notation part and the second notation part respectively, the longest matching notation among the notations registered in the accent combination table; acquiring means for searching the accent combination table using the first and second keys extracted by the extracting means and acquiring the accent position at the time of concatenation; and synthesizing means for performing speech synthesis of the notation on the basis of the notation to be synthesized and the accent position acquired by the acquiring means.

[0012] A method for controlling a speech synthesizer according to another aspect of the present invention that achieves the above object has, for example, the following steps. That is, it is a method for controlling a speech synthesizer provided with an accent combination table that holds the accent position at the time of concatenation, keyed by a notation that belongs to a first attribute and consists of at least one word and a notation that belongs to a second attribute and consists of at least one word, the method comprising: an extracting step of, when speech-synthesizing a notation formed by concatenating a first notation part belonging to the first attribute and a second notation part belonging to the second attribute, extracting as first and second keys, for the first notation part and the second notation part respectively, the longest matching notation among the notations registered in the accent combination table; an acquiring step of searching the accent combination table using the first and second keys extracted in the extracting step and acquiring the accent position at the time of concatenation; and a synthesizing step of performing speech synthesis of the notation on the basis of the notation to be synthesized and the accent position acquired in the acquiring step.

[0013]

[Embodiments of the Invention] A preferred embodiment of the present invention is described in detail below with reference to the accompanying drawings.

[0014] FIG. 1 is a block diagram showing the configuration of the speech synthesizer according to this embodiment. In FIG. 1, reference numeral 21 denotes a control memory, which stores a control program that follows the control procedure shown in the flowchart of FIG. 3. Reference numeral 22 denotes a central processing unit that performs decisions, calculations, and the like in accordance with the control procedure held in the control memory 21. Reference numeral 23 denotes a memory (RAM) that provides a work area for the central processing unit 22 when it performs various kinds of control. The numeral notation data holding unit 101, the standard kanji numeral notation data holding unit 103, the kanji numeral search key holding unit 105, the classifier search key holding unit 107, and the classifier notation data holding unit 109, all described with reference to FIG. 2, are allocated in the memory 23.

[0015] Reference numeral 24 denotes a disk device; a hard disk is used in this embodiment. The disk device 24 stores the accent combination table 110 described with reference to FIG. 2, which is loaded into the memory 23 when used. Reference numeral 25 denotes a bus that connects the above components.

[0016] FIG. 2 is a block diagram showing the functional configuration of the speech synthesizer according to this embodiment. In FIG. 2, 101 is a numeral notation data holding unit that holds the notation of a numeral; 102 is a standard kanji numeral notation conversion processing unit that converts the notation of the numeral into standard kanji numeral notation; 103 is a standard kanji numeral notation data holding unit that holds the result of that conversion; 104 is a kanji numeral search key generation processing unit that generates a search key for the accent combination table from the standard kanji numeral notation data; and 105 is a kanji numeral search key holding unit that holds the processing result of the kanji numeral search key generation processing unit 104.

[0017] Reference numeral 106 denotes an accent combination table search processing unit that searches the accent combination table 110 using the kanji numeral search key and the classifier search key. Reference numeral 107 denotes a classifier search key holding unit that holds the classifier search key generated from the classifier notation data; 108 is a classifier search key generation processing unit that generates the classifier search key from the classifier notation data; 109 is a classifier notation data holding unit that holds the classifier notation data; and 110 is the accent combination table, which holds the accent position value corresponding to each pair of kanji numeral search key and classifier search key.

[0018] Next, the procedure of the speech synthesis processing performed by the above functional components is described with reference to the flowchart in FIG. 3. First, in step S301, the standard kanji numeral notation conversion processing unit 102 converts the notation of the numeral held in the numeral notation data holding unit 101 (e.g., 123, 1万400, 千五百) into standard kanji numeral notation (e.g., 百二十三, 一万四百, 千五百) in order to eliminate variation in notation, and stores the result in the standard kanji numeral notation data holding unit 103. The process then proceeds to step S302.
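The conversion in step S301 can be sketched for the simplest input form, an integer written in Arabic digits. The sketch below is an assumption-laden illustration rather than the patent's converter: it does not parse mixed notations such as 1万400 or input that is already in kanji, the function names are hypothetical, and it always omits 一 before 十, 百 and 千 (which matches 千五百 and 百二十三 in the examples but is only one possible convention for values such as 一千万).

```python
DIGITS = "〇一二三四五六七八九"
SMALL_UNITS = ["", "十", "百", "千"]   # units inside one 4-digit group
LARGE_UNITS = ["", "万", "億", "兆"]   # units between 4-digit groups


def _group_to_kanji(n: int) -> str:
    """Convert 0 <= n <= 9999 to kanji, omitting 一 before 十, 百 and 千."""
    out = []
    for power in (3, 2, 1, 0):
        digit = (n // 10 ** power) % 10
        if digit == 0:
            continue
        if digit == 1 and power > 0:
            out.append(SMALL_UNITS[power])  # 千, not 一千
        else:
            out.append(DIGITS[digit] + SMALL_UNITS[power])
    return "".join(out)


def to_standard_kanji(n: int) -> str:
    """Convert a positive integer below 10**16 to kanji numeral notation."""
    if not 0 < n < 10 ** 16:
        raise ValueError("only positive integers below 10**16 are handled")
    parts = []
    group_index = 0
    while n > 0:
        n, group = divmod(n, 10_000)  # peel off the lowest 4-digit group
        if group:
            parts.append(_group_to_kanji(group) + LARGE_UNITS[group_index])
        group_index += 1
    return "".join(reversed(parts))


for value in (123, 10400, 1500, 12345):
    print(value, to_standard_kanji(value))
```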

[0019] In step S302, the kanji numeral search key generation processing unit 104 generates, from the standard kanji numeral notation held in the standard kanji numeral notation data holding unit 103, the kanji numeral search key needed to search the accent combination table. The generated kanji numeral search key is held in the kanji numeral search key holding unit 105.

[0020] Because the kanji numeral search key depends on the data structure of the accent combination table 110, the table is described here. The accent combination table 110 has a matrix structure with kanji numeral notations as rows and classifier notations as columns, and holds an accent position value at each intersection of a row and a column. FIG. 4 shows an example of the data structure of the accent combination table according to this embodiment. The numbers representing accent positions in FIG. 4 take the boundary between the numeral and the classifier as 0, with the rightward direction (toward the classifier) positive and the leftward direction (toward the numeral) negative, and indicate how many moras from the concatenation position the accent nucleus lies. PL is a symbol indicating the flat type (no accent nucleus).
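One way to picture the row-and-column structure is a nested mapping. This is a sketch under stated assumptions: FIG. 4 itself is not reproduced in the text, so only the two values the description attributes to it (二百/円 → 0 and 四千/円 → PL) are certain; the 百 and 千 rows reuse the values quoted for FIG. 5 and are assumed to carry over. The names `accent_table` and `lookup` are hypothetical.

```python
from typing import Dict, Union

AccentPosition = Union[int, str]  # an integer offset, or "PL" for the flat type

# accent_table[kanji_numeral_key][classifier_key] -> accent position
accent_table: Dict[str, Dict[str, AccentPosition]] = {
    "百":   {"円": "PL"},  # assumed to match FIG. 5
    "千":   {"円": -1},    # assumed to match FIG. 5
    "二百": {"円": 0},     # stated for FIG. 4
    "四千": {"円": "PL"},  # stated for FIG. 4
}


def lookup(kanji_key: str, classifier_key: str) -> AccentPosition:
    """Return the value stored at the intersection of a row and a column."""
    return accent_table[kanji_key][classifier_key]


print(lookup("二百", "円"), lookup("四千", "円"))
```

The point of the structure is that row keys may span more than one kanji numeral, which is what lets 二百 and 百 receive different accent positions.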

[0021] Working from the ones place of the standard kanji numeral notation, the kanji numeral search key generation processing unit 104 takes the longest substring that matches a kanji numeral notation in the accent combination table 110 as the kanji numeral search key, and stores it in the kanji numeral search key holding unit 105. Once the kanji numeral search key has been generated, the process proceeds to step S303.
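Because matching starts at the ones place, the key is in effect the longest suffix of the standard notation that appears among the table's row keys. A minimal sketch follows; the function name `kanji_search_key` and the sample key set are hypothetical, and returning `None` when nothing matches is a design choice of this sketch, not something the description specifies.

```python
from typing import Iterable, Optional


def kanji_search_key(standard_notation: str,
                     registered_keys: Iterable[str]) -> Optional[str]:
    """Return the longest suffix of standard_notation that is a registered key."""
    keys = set(registered_keys)
    # Start positions move rightward, so the first hit is the longest suffix.
    for start in range(len(standard_notation)):
        candidate = standard_notation[start:]
        if candidate in keys:
            return candidate
    return None


sample_keys = {"四", "五", "百", "千", "二百", "四千"}
for notation in ("二百", "千二百", "三百", "十五"):
    print(notation, kanji_search_key(notation, sample_keys))
```

With these sample keys, 千二百 resolves to 二百 while 三百 falls back to 百, since 三百 is not registered.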

[0022] Next, in step S303, the classifier search key generation processing unit 108 generates the classifier search key from the classifier notation held in the classifier notation data holding unit 109. With an accent combination table 110 such as the one shown in FIG. 4, the classifier notation is used as the classifier search key as it is. The generated classifier search key is stored in the classifier search key holding unit 107, and the process proceeds to step S304.

[0023] In step S304, the accent combination table search processing unit 106 searches the accent combination table 110 using the kanji numeral search key held in the kanji numeral search key holding unit 105 and the classifier search key held in the classifier search key holding unit 107, outputs the corresponding value (the value indicating the accent position), and ends the processing.

[0024] As a result of the above processing, for 「二百円」 (200 yen), for example, the kanji numeral search key is 「二百」 and the classifier search key is 「円」, and the accent position "0" is output from the accent combination table 110 in FIG. 4. Consequently, the accent 「ニ(↑)ヒャク↓エン」 (ni(↑)hyaku↓en) is obtained at synthesis time; that is, the correct accent position is obtained.

[0025] Similarly, for 「四千円」 (4,000 yen), the kanji numeral search key is 「四千」 and the classifier search key is 「円」, and the accent position "PL" is output from the accent combination table 110 in FIG. 4. Consequently, the correct accent 「ヨ(↑)ンセンエン」 (yo(↑)nsen'en) is obtained at synthesis time.
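Steps S302 to S304 can be strung together in a short sketch that reproduces the two worked results. It assumes the input is already in standard kanji numeral notation (step S301 is skipped), that the classifier notation is used directly as its key, and that the table holds only the entries quoted in this description; all names are hypothetical.

```python
# Toy accent combination table: rows are kanji numeral keys, columns are
# classifier keys. Only values quoted in the description are included.
TABLE = {
    "四": {"円": 0}, "五": {"円": 0},
    "百": {"円": "PL"}, "千": {"円": -1},
    "二百": {"円": 0}, "四千": {"円": "PL"},
}


def kanji_key(standard_notation: str) -> str:
    """S302: longest suffix of the notation that is a row key of TABLE."""
    for start in range(len(standard_notation)):
        candidate = standard_notation[start:]
        if candidate in TABLE:
            return candidate
    raise LookupError(f"no registered key matches {standard_notation!r}")


def classifier_key(classifier: str) -> str:
    """S303: with this table layout the notation itself is the key."""
    return classifier


def accent_position(standard_notation: str, classifier: str):
    """S304: search the table with both keys."""
    return TABLE[kanji_key(standard_notation)][classifier_key(classifier)]


print(accent_position("二百", "円"))  # accent position for 二百円
print(accent_position("四千", "円"))  # accent position for 四千円
```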

[0026] In the above embodiment, the notation of the classifier is used as the classifier search key for the accent combination table 110, but the invention is not limited to this. For example, to reduce the table size, classifiers for which the number of moras the accent nucleus moves is the same for every kanji numeral search key may be grouped into a single category. In this case, a dictionary or correspondence table that holds the category information for classifiers is prepared. The classifier search key generation processing unit 108 then retrieves the classifier category information from the category information dictionary or correspondence table and generates the classifier search key.
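The category variant can be sketched by grouping classifiers whose columns are identical and keeping one column per group. Everything here is illustrative: the description does not say how categories are built, 「助数詞A」 and 「助数詞B」 are placeholder classifiers with made-up values (only the 円 values come from this description), and the names `build_categories`, `CAT0`, and so on are hypothetical.

```python
from collections import defaultdict

# Full table keyed by classifier notation. The 円 values are quoted in the
# description; 助数詞A and 助数詞B are placeholders with made-up values.
full_table = {
    "百":   {"円": "PL", "助数詞A": "PL", "助数詞B": 1},
    "二百": {"円": 0,    "助数詞A": 0,    "助数詞B": 1},
}


def build_categories(table):
    """Group classifiers with identical columns; return (mapping, compact table)."""
    rows = sorted(table)
    by_column = defaultdict(list)
    for classifier in table[rows[0]]:
        column = tuple(table[row][classifier] for row in rows)
        by_column[column].append(classifier)

    classifier_to_category = {}
    compact = {row: {} for row in rows}
    for index, (column, members) in enumerate(by_column.items()):
        category = f"CAT{index}"
        for classifier in members:
            classifier_to_category[classifier] = category
        for row, value in zip(rows, column):
            compact[row][category] = value  # one column per category
    return classifier_to_category, compact


classifier_to_category, compact_table = build_categories(full_table)
# The classifier search key is now the category, not the notation.
key = classifier_to_category["円"]
print(key, compact_table["二百"][key])
```

In this toy data the three classifier columns collapse into two category columns; the saving grows with the number of classifiers that share an accent pattern.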

[0027] In the above embodiment, the accent combination table 110 is realized in the memory 23 (RAM), but the invention is not limited to this; the table may be realized using any storage medium.

[0028] In the above embodiment, all of the units are configured on a single computer, but the invention is not limited to this; the units may be divided among computers, processing devices, and the like distributed over a network.

[0029] As described above, according to this embodiment, the accent position for a concatenated numeral and classifier is obtained as follows. The numeral is first converted into standard kanji numeral notation, and the substring needed to search the accent combination table is then cut out working from the ones place of that notation and used as the kanji numeral search key. The classifier or the classifier category is used as the classifier search key. The accent combination table is searched with these two keys to determine the accent position for the concatenated numeral and classifier. This makes it possible to determine the accent position while also taking into account numerals located before the numeral immediately preceding the concatenation position, so that a more accurate accent, and hence more natural synthesized speech, can be generated.

[0030] The present invention may be applied either to a system composed of a plurality of devices (for example, a host computer, an interface device, a reader, and a printer) or to an apparatus consisting of a single device (for example, a copier or a facsimile machine).

[0031] It goes without saying that the object of the present invention is also achieved by supplying a system or apparatus with a storage medium on which is recorded the program code of software that realizes the functions of the embodiment described above, and having a computer (or CPU or MPU) of that system or apparatus read out and execute the program code stored in the storage medium.

[0032] In this case, the program code read from the storage medium itself realizes the functions of the embodiment described above, and the storage medium storing the program code constitutes the present invention.

[0033] As the storage medium for supplying the program code, for example, a floppy disk, hard disk, optical disk, magneto-optical disk, CD-ROM, CD-R, magnetic tape, nonvolatile memory card, or ROM can be used.

[0034] It also goes without saying that the invention covers not only the case where the functions of the embodiment described above are realized by the computer executing the program code it has read, but also the case where an OS (operating system) or the like running on the computer performs part or all of the actual processing on the basis of the instructions of the program code, and the functions of the embodiment are realized by that processing.

[0035] Furthermore, it goes without saying that the invention also covers the case where the program code read from the storage medium is written into a memory provided on a function expansion board inserted into the computer or in a function expansion unit connected to the computer, after which a CPU or the like provided on the function expansion board or in the function expansion unit performs part or all of the actual processing on the basis of the instructions of the program code, and the functions of the embodiment described above are realized by that processing.

[0036]

[Effects of the Invention] As described above, according to the present invention, information on words preceding the word at the concatenation position is used as necessary when determining the accent position for concatenated words, so that a more natural accent can be generated.

[0037] In particular, according to the present invention, when a numeral and a classifier are concatenated, information on kanji numerals located before the kanji numeral immediately preceding the concatenation position can also be used, so that a more natural accent can be generated for the concatenated numeral and classifier.

[0038]

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of the speech synthesizer according to the embodiment.

FIG. 2 is a block diagram showing the functional configuration of the speech synthesizer according to the embodiment.

FIG. 3 is a flowchart showing the procedure of the accent position determination processing in the speech synthesizer of the embodiment.

FIG. 4 is a diagram showing an example of the data structure of the accent combination table according to the embodiment.

FIG. 5 is a diagram showing a typical accent combination table for the classifier 「円」 (en) and numerals.

Claims


1. A speech synthesizer comprising: an accent combination table that holds the accent position at the time of concatenation, keyed by a notation that belongs to a first attribute and consists of at least one word and a notation that belongs to a second attribute and consists of at least one word; extracting means for, when speech-synthesizing a notation formed by concatenating a first notation part belonging to the first attribute and a second notation part belonging to the second attribute, extracting as first and second keys, for the first notation part and the second notation part respectively, the longest matching notation among the notations registered in the accent combination table; acquiring means for searching the accent combination table using the first and second keys extracted by the extracting means and acquiring the accent position at the time of concatenation; and synthesizing means for performing speech synthesis of the notation on the basis of the notation to be synthesized and the accent position acquired by the acquiring means.

2. The speech synthesizer according to claim 1, wherein the first attribute is a numeral and the second attribute is a classifier.

3. The speech synthesizer according to claim 2, wherein the extracting means extracts from the accent combination table, as the first key, the notation that is the longest match starting from the low-order digit side of the numeral in the first notation part, and extracts from the accent combination table, as the second key, the notation that matches the classifier in the second notation part.

4. The speech synthesizer according to claim 2, further comprising converting means for converting the notation of the first notation part into standard kanji numeral notation and providing the converted notation to the extracting means.

5. The speech synthesizer according to claim 3, wherein the accent combination table classifies into the same category those classifiers whose accent positions are identical in combination with every registered numeral notation, and holds accent positions for the concatenation of a numeral and a classifier keyed by the category or the classifier, and wherein the extracting means extracts from the accent combination table, as the second key, the notation or category that matches the classifier in the second notation part.

6. A method for controlling a speech synthesizer provided with an accent combination table that holds the accent position at the time of concatenation, keyed by a notation that belongs to a first attribute and consists of at least one word and a notation that belongs to a second attribute and consists of at least one word, the method comprising: an extracting step of, when speech-synthesizing a notation formed by concatenating a first notation part belonging to the first attribute and a second notation part belonging to the second attribute, extracting as first and second keys, for the first notation part and the second notation part respectively, the longest matching notation among the notations registered in the accent combination table; an acquiring step of searching the accent combination table using the first and second keys extracted in the extracting step and acquiring the accent position at the time of concatenation; and a synthesizing step of performing speech synthesis of the notation on the basis of the notation to be synthesized and the accent position acquired in the acquiring step.

7. The method for controlling a speech synthesizer according to claim 6, wherein the first attribute is a numeral and the second attribute is a classifier.

8. The method for controlling a speech synthesizer according to claim 7, wherein the extracting step extracts from the accent combination table, as the first key, the notation that is the longest match starting from the low-order digit side of the numeral in the first notation part, and extracts from the accent combination table, as the second key, the notation that matches the classifier in the second notation part.

9. The method for controlling a speech synthesizer according to claim 7, further comprising a converting step of converting the notation of the first notation part into standard kanji numeral notation and providing the converted notation to the extracting step.

10. The method for controlling a speech synthesizer according to claim 8, wherein the accent combination table classifies into the same category those classifiers whose accent positions are identical in combination with every registered numeral notation, and holds accent positions for the concatenation of a numeral and a classifier keyed by the category or the classifier, and wherein the extracting step extracts from the accent combination table, as the second key, the notation or category that matches the classifier in the second notation part.

11. A computer-readable memory storing a control program for a speech synthesizer that uses an accent combination table holding the accent position at the time of concatenation, keyed by a notation that belongs to a first attribute and consists of at least one word and a notation that belongs to a second attribute and consists of at least one word, the control program comprising: code for an extracting step of, when speech-synthesizing a notation formed by concatenating a first notation part belonging to the first attribute and a second notation part belonging to the second attribute, extracting as first and second keys, for the first notation part and the second notation part respectively, the longest matching notation among the notations registered in the accent combination table; code for an acquiring step of searching the accent combination table using the first and second keys extracted in the extracting step and acquiring the accent position at the time of concatenation; and code for a synthesizing step of performing speech synthesis of the notation on the basis of the notation to be synthesized and the accent position acquired in the acquiring step.

12. The computer-readable memory according to claim 11, further storing the accent combination table.