JP2002207494A

JP2002207494A - Speech synthesizer, method of synthesizing speech, and computer-readable storage medium with speech synthesizing program recorded thereon

Info

Publication number: JP2002207494A
Application number: JP2001003395A
Authority: JP
Inventors: Keiko Fukita; 慶子吹田; Hiroyuki Kanza; 浩幸勘座
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2001-01-11
Filing date: 2001-01-11
Publication date: 2002-07-26
Anticipated expiration: 2021-01-11
Also published as: JP3648456B2

Abstract

PROBLEM TO BE SOLVED: To enable reading aloud that suits the content of a text without relying on keyword detection, which has not been possible heretofore. SOLUTION: This speech synthesizer, which outputs a synthesized sound corresponding to an input text, is provided with a means to read in character information corresponding to the input text, a means to detect a feature of the format of the input text according to a character code or position information in the read-in character information, and a means to generate the synthesized sound according to the detected format feature.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a speech synthesizer, a speech synthesis method, and a computer-readable recording medium on which a speech synthesis program is recorded. In particular, it relates to a speech synthesizer, a speech synthesis method, and a computer-readable recording medium storing a speech synthesis program that detect features of the format of an input sentence and read the sentence aloud in a manner appropriate to those features.

[0002]

[Description of the Related Art] With a conventional speech synthesizer, voice quality and prosodic features are uniform no matter what kind of text is read aloud. Japanese Patent Application Laid-Open No. 9-62286 discloses a speech synthesizer capable of providing synthesized sound with prosody appropriate to the input sentence. That speech synthesizer detects keywords in the input sentence, determines the field to which the input sentence belongs, and generates prosody information based on the determined field, thereby generating synthesized sound whose prosody differs from field to field.

[0003]

[Problems to Be Solved by the Invention] However, the above speech synthesizer requires keywords to be prepared in advance for each field, and it cannot be applied to text that contains no word matching a keyword.

[0004] A main object of the present invention is to provide a speech synthesizer that can read text aloud appropriately for its content by detecting and analyzing features of the text format, without relying on keyword detection.

[0005]

[Means for Solving the Problems] The present invention provides a speech synthesizer that outputs a synthesized sound corresponding to an input sentence, comprising: means for detecting a feature of the format of the input sentence based on a character code or position information in the read character information; and means for generating the synthesized sound based on the detected format feature.

[0006] According to the present invention, a feature of the format of the input sentence is detected and the synthesized sound is generated based on that feature. It is therefore unnecessary to prepare keywords for each field in advance, and even text that contains no word matching a keyword can be read aloud appropriately for its format.

[0007] The present invention also provides a speech synthesizer further comprising means for determining the type of sentence meaning of the input sentence based on the detected format feature, wherein the generating means generates the synthesized sound according to the type of sentence meaning.

[0008] According to the present invention, the type of sentence meaning is determined from the features of the format of the input sentence, and the synthesized sound is generated based on the result, so the input sentence can be read aloud appropriately for its meaning.

[0009] The present invention also provides a speech synthesizer in which the generating means selects a prosody generation rule corresponding to the type of sentence meaning and generates prosody information for the synthesized sound based on that rule.

[0010] According to the present invention, prosody information is generated according to the type of sentence meaning, so the input sentence can be read aloud with prosody suited to its meaning.

[0011] The present invention also provides a speech synthesizer in which the generating means selects a dictionary corresponding to the type of sentence meaning and performs language analysis of the input sentence based on that dictionary.

[0012] According to the present invention, the dictionary corresponding to the type of sentence meaning is used preferentially, so the input sentence can be read aloud more naturally in keeping with its meaning.

[0013] The present invention also provides a speech synthesizer further comprising means for determining break positions in the input sentence based on the detected feature, wherein the generating means generates the synthesized sound based on the break positions.

[0014] According to the present invention, appropriate break positions are determined from the features of the format of the input sentence before the synthesized sound is generated, so the input sentence can be read aloud smoothly.

[0015] The present invention also provides a speech synthesizer in which, when the input sentence is the text of an e-mail and the difference in the number of characters between lines is equal to or less than a predetermined value, the determining means decides not to break the input sentence at the positions of the line feed codes of its lines.

[0016] When language analysis is performed on text, the text is usually broken at the positions of line feed codes. According to the present invention, however, when the lines of an e-mail vary little in length, that is, when each line has approximately the same number of characters, it is judged that the user inserted forced line feeds to even out the line lengths, and the text is not broken at the line feed code of each line. This prevents unnatural interruptions in the text and allows the input sentence to be read aloud smoothly.
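The line-break decision described in this paragraph can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation of the invention: the function name `join_forced_line_breaks`, the parameter `max_diff` (standing in for the "predetermined value"), and the choice to ignore the final, typically shorter, line when measuring the difference are all hypothetical.

```python
def join_forced_line_breaks(lines, max_diff=2):
    """Join the lines into one string when they look hard-wrapped.

    Assumption: the "difference in the number of characters" is taken as
    max - min over all lines except the last, which is usually shorter.
    """
    body = [len(line) for line in lines[:-1]]
    if len(body) >= 2 and max(body) - min(body) <= max_diff:
        # Line lengths are nearly equal: treat the line feeds as forced wraps
        # and do not break the text there.
        return ["".join(lines)]
    # Otherwise keep every line feed as a break position.
    return list(lines)
```

Joining without a separator suits Japanese text; text in a language that separates words with spaces would need a space inserted at each joined line feed.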

[0017] The present invention also provides a speech synthesis method for outputting a synthesized sound corresponding to an input sentence, comprising the steps of: reading a character code string corresponding to the input sentence; detecting a feature of the format of the input sentence based on a character code or position information in the read character information; and generating the synthesized sound based on the detected format feature.

[0018] According to the present invention, a feature of the format of the input sentence is detected and the synthesized sound is generated based on that feature. It is therefore unnecessary to prepare keywords for each field in advance, and even text that contains no word matching a keyword can be read aloud appropriately for its format.

[0019] The present invention also provides a computer-readable recording medium on which is recorded a speech synthesis program for causing a computer to execute each step of the above speech synthesis method.

[0020] According to the present invention, a feature of the format of the input sentence is detected and the synthesized sound is generated based on that feature. It is therefore unnecessary to prepare keywords for each field in advance, and even text that contains no word matching a keyword can be read aloud appropriately for its format.

[0021]

[Embodiments of the Invention] (Embodiment 1) FIG. 1 is a functional block diagram showing one embodiment of a speech synthesizer according to the present invention. Reference numeral 17 denotes the speech synthesizer according to the present embodiment.

[0022] The text in FIG. 1 is the character information input to the speech synthesizer 17: for example, a data string expressed in a character code such as ASCII or JIS code, and position information indicating text alignment and the like. The present embodiment describes a case in which the speech synthesizer 17 is built into a general-purpose personal computer and synthesizes and outputs speech for text such as e-mail, Web pages, or documents in electronic file form. The present invention is not limited to this, however, and is applicable to any device that performs speech synthesis on input text.

[0023] The flow of processing in the speech synthesizer 17, from the input of text to the output of synthesized sound, is described with reference to FIG. 1.

[0024] The text input to the speech synthesizer 17 undergoes language analysis in the language analysis unit 2 by well-known morphological analysis, dictionary lookup, and the like, and information required for speech synthesis, such as readings and prosody, is attached to the text. Next, the prosody information generation unit 3 generates prosody information from the readings, prosody, and other information output by the language analysis unit 2; the speech synthesis unit 4 synthesizes speech; and the output unit 5 outputs the synthesized sound.

[0025] Meanwhile, the text input to the speech synthesizer 17 is input not only to the language analysis unit 2 as described above but also to the format analysis unit 1, which reads the character information corresponding to the text. It is then determined whether the read character information matches any predetermined rule that identifies a feature of the format of the input sentence, and if it does, the format feature corresponding to the matched rule is detected. The type of sentence meaning of the text is then determined based on the detected format features, and information indicating that type is output. Here, the features of the format of the input text are properties such as whether the text contains a bulleted list, whether it contains text alignment, whether it contains paragraphs, the degree of variation in the number of characters per line, the proportion of kanji, the proportion of symbols, whether it contains a quoted part, the kind of sentence-end symbol, and whether it contains emphasized characters. The type of sentence meaning indicates whether the content of the text is, for example, a news article, a report, a notice, casual text, or formal text.

[0026] The information indicating the type of sentence meaning of the text analyzed by the format analysis unit 1 is output to the language analysis unit 2, the prosody information generation unit 3, and the speech synthesis unit 4, each of which performs control according to the type of sentence meaning, and the synthesized sound is generated. Specific examples of generating synthesized sound according to the type of sentence meaning are described in Embodiments 2 to 4.

[0027] The format analysis unit 1 also determines break positions in the input text based on the detected features of its format, and information indicating the break positions is output to the language analysis unit 2 together with information indicating the features of the text. A specific example of generating synthesized sound based on break positions is described in Embodiment 5.

[0028] In FIG. 1, the dotted lines extending from the format analysis unit 1 indicate that the information generated from the features of the format of the input text is output to the language analysis unit 2, the prosody information generation unit 3, the speech synthesis unit 4, or a combination of them.

[0029] FIG. 2 is a detailed functional block diagram of the format analysis unit 1. The format analysis unit 1 reads the character information corresponding to the input text (not shown). The feature detection unit 15 determines whether that character information matches any predetermined rule that identifies a feature of the format of the input sentence, and if it does, detects the feature corresponding to the matched rule. The format classification unit 16 then determines the type of sentence meaning of the text based on the features and outputs information indicating that type.

[0030] The feature detection unit 15 comprises a bullet detection unit 6, a text alignment detection unit 7, a paragraph detection unit 8, a character count variation detection unit 9, a kanji appearance rate calculation unit 10, a symbol appearance rate calculation unit 11, a quoted part detection unit 12, a sentence-end symbol detection unit 13, and an emphasized character detection unit 14. Each unit is preset with rules for detecting a format feature from the character information of the input text. The feature detection unit 15 need not comprise all of these units and may comprise only some of them.

[0031] Next, the rules that each unit of the feature detection unit 15 uses to detect format features are described.

[0032] The bullet detection unit 6 detects whether the text contains a bulleted list. A bulleted list is detected, for example, in the following cases: the first characters of multiple lines are the same symbol (the symbol "・" in FIG. 3); the first characters of multiple lines are numerals and the numerals form consecutive numbers (FIG. 4); the first characters of multiple lines are the same symbol, the character after the first character is a numeral, and the numerals form consecutive numbers (FIG. 5); or the leading character strings of multiple lines are justified to equal width and are immediately followed by the same number of blanks (FIG. 6). In the examples of FIGS. 3 to 6, each item of the list occupies a single line, but each item may span multiple lines. In FIG. 6, the justified character string may be immediately followed by a symbol such as ":" instead of blanks.

[0033] The text alignment detection unit 7 detects whether the character string on each line is aligned (right-aligned, left-aligned, centered, or the like) and whether the aligned character string is justified to equal width. Alignment is detected, for example, based on position information that indicates alignment within the character information of the input text.

[0034] The paragraph detection unit 8 detects whether the text contains paragraphs. Paragraphs are judged to be present when a blank line is inserted between passages of text.
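A minimal sketch of the blank-line rule follows. The function name `has_paragraphs` is hypothetical, and treating whitespace-only lines as blank and ignoring leading or trailing blank lines are assumptions not stated in the text.

```python
def has_paragraphs(lines):
    """True when at least one blank line sits between two non-blank lines."""
    seen_text = False
    pending_blank = False
    for line in lines:
        if line.strip():
            if seen_text and pending_blank:
                return True      # text, blank line(s), then text again
            seen_text = True
            pending_blank = False
        elif seen_text:
            pending_blank = True  # blank line after some text
    return False
```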

[0035] The character count variation detection unit 9 counts the number of characters on each line and detects the variation in the number of characters per line. When each line has approximately the same number of characters, the degree of variation is judged to be low; conversely, when the numbers of characters differ greatly between lines, the degree of variation is judged to be high. Specifically, the difference in the number of characters between lines is taken; if the difference is equal to or less than a predetermined value the degree of variation is judged to be low, and if it is equal to or greater than the predetermined value the degree of variation is judged to be high.
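The variation check could look roughly like this. The text does not say how the difference is taken, so the sketch assumes the gap between the longest and shortest non-empty line; the name `line_length_variation` and the default `threshold` are hypothetical. Because the text assigns a difference exactly equal to the predetermined value to both outcomes, the sketch arbitrarily treats that case as low.

```python
def line_length_variation(lines, threshold=3):
    """Return 'low' or 'high' for the variation in characters per line."""
    lengths = [len(line) for line in lines if line]
    if len(lengths) < 2:
        return "low"  # nothing to compare
    diff = max(lengths) - min(lengths)  # assumed measure of the difference
    return "low" if diff <= threshold else "high"
```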

[0036] The kanji appearance rate calculation unit 10 extracts the kanji from the whole text and calculates the kanji appearance rate, expressed as the number of kanji as a proportion of the total number of characters. Kanji are extracted by treating every character other than hiragana, katakana, numerals, and symbols as kanji.

[0037] The symbol appearance rate calculation unit 11 extracts the symbols from the whole text and calculates the symbol appearance rate, expressed as the number of symbols as a proportion of the total number of characters.
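Both appearance rates can be computed in one pass, as in the following sketch. It assumes Unicode input rather than JIS codes, uses Unicode general categories to recognise symbols and numerals, and excludes whitespace from the total; these choices and the function names are assumptions. Following the exclusion rule described for kanji, any character that is not hiragana, katakana, a numeral, or a symbol (including Latin letters) is counted as kanji.

```python
import unicodedata


def classify(ch):
    cat = unicodedata.category(ch)
    if cat[0] in ("P", "S"):
        return "symbol"
    if cat[0] == "N":
        return "digit"
    if "\u3040" <= ch <= "\u309f":
        return "hiragana"
    if "\u30a0" <= ch <= "\u30ff":
        return "katakana"
    return "kanji"  # everything else is treated as kanji


def appearance_rates(text):
    """Return (kanji_rate, symbol_rate) as fractions of non-space characters."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0, 0.0
    kinds = [classify(c) for c in chars]
    return kinds.count("kanji") / len(chars), kinds.count("symbol") / len(chars)
```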

[0038] The quoted part detection unit 12 detects quoted parts of an e-mail. A quoted part is detected by checking whether the first character of a line is a preset quotation mark such as ">".

[0039] The sentence-end symbol detection unit 13 detects whether the last character of each line is a period or some other symbol. The detection checks whether the last character of a line is a preset symbol other than a period, such as "!", "?", or "…".
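The quoted-part and sentence-end checks are both simple per-line tests, sketched below. The symbol sets `QUOTE_MARKS` and `NON_PERIOD_ENDINGS` are illustrative stand-ins for the preset symbols (half-width and full-width forms are both included as an assumption), and the function names are hypothetical.

```python
QUOTE_MARKS = (">", "＞")                       # assumed preset quotation marks
NON_PERIOD_ENDINGS = ("!", "?", "！", "？", "…")  # assumed preset non-period symbols


def has_quoted_part(lines):
    # A line whose first character is a quotation mark indicates a quoted part.
    return any(line.startswith(QUOTE_MARKS) for line in lines)


def count_non_period_endings(lines):
    # Number of lines whose last character is a preset symbol other than a period.
    return sum(1 for line in lines if line.rstrip().endswith(NON_PERIOD_ENDINGS))
```

Returning a count rather than a flag lets the caller require either a single matching line or a minimum number of matching lines.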

[0040] The emphasized character detection unit 14 detects whether the text contains emphasized characters. Emphasis is detected by checking whether characters in the text are underlined, set in bold, or the like.

[0041] As described above, the feature detection unit 15 detects features of the format of the input text based on the character codes in the read character information. For example, one rule predetermined in the bullet detection unit 6 is that the first characters of multiple lines are the same symbol. When the read character information contains a character code pattern that satisfies this rule, the bullet detection unit 6 identifies the format of the input text as having the feature that a bulleted list is present and detects that information.

[0042] The feature detection unit 15 also detects features of the format of the input text based on the position information in the read character information. For example, when the read character information contains position information indicating right alignment, the text alignment detection unit 7 identifies the format of the input text as having the feature that right alignment is present and detects that information.

[0043] Detecting individual characters of the input text in the units of the feature detection unit 15 is straightforward: to detect the character ">", for example, it suffices to detect the corresponding character code. Such processing can be performed with well-known software techniques.

[0044] Based on the results detected by the units of the feature detection unit 15, the format classification unit 16 determines the type of sentence meaning of the input text, classifies the text accordingly, and outputs information indicating that type. The types include casual text such as letters (e-mail) exchanged between friends; formal text such as notices that use emphasized characters and text alignment to announce event schedules and the like, and reports that make heavy use of bulleted lists; and news articles. The conditions for classification can be set from empirical rules. For example, if the input text contains a bulleted list, text alignment, and paragraphs, and the number of characters within each paragraph does not vary, the format is formal and the text is judged to be a notice. As an example, FIG. 7 shows a flowchart of the processing of the format classification unit 16 when the input text is an e-mail.

[0045] The processing of the format classification unit 16 is described with reference to FIG. 7. If the quoted part detection unit 12 has detected a quoted part, it is judged highly probable that the e-mail is an exchange between friends, and the text is classified as casual text (S201). If no quoted part is found in S201 and the paragraph detection unit 8 has detected no paragraphs, it is likewise judged highly probable that the e-mail is an exchange between friends, and the text is classified as casual text (S202). If paragraphs are detected in S202 and the value calculated by the kanji appearance rate calculation unit 10 is equal to or less than a predetermined value (K%), kanji are used infrequently, so it is again judged highly probable that the e-mail is an exchange between friends, and the text is classified as casual text (S203). If the value is equal to or greater than the predetermined value (K%) in S203 and the symbol appearance rate is greater than a predetermined value (Y%), the text is classified as a news article (S204). If the rate is equal to or less than the predetermined value (Y%) in S204 and the sentence-end symbol detection unit 13 finds that the last character of a line is a symbol other than a period, for example "!" or "?", it is judged highly probable that the e-mail is an exchange between friends, and the text is classified as casual text (S205). In this case, the text may be judged casual if even one of its lines satisfies the condition, or only if at least a predetermined number of its lines satisfy it. If the last character of the lines is a period in S205 and either the emphasized character detection unit 14 has detected emphasized characters or the text alignment detection unit 7 has detected text alignment, the text is classified as a notice, a relatively layout-oriented text that announces event schedules and the like (S206). If neither emphasized characters nor text alignment is detected in S206 and the bullet detection unit 6 has detected a bulleted list, the text is classified as a report, which is formal text (S207). If no bulleted list is detected in S207, the text is classified as casual text.
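The decision sequence of FIG. 7 amounts to a cascade of early returns, which might be sketched as follows. The `Features` container, the label strings, and the default thresholds `k` and `y` (standing in for K% and Y%, whose values are not given) are hypothetical. The flag `non_period_line_end` is assumed to have been computed upstream under either the one-line or the minimum-count policy.

```python
from dataclasses import dataclass


@dataclass
class Features:
    has_quote: bool
    has_paragraph: bool
    kanji_rate: float
    symbol_rate: float
    non_period_line_end: bool
    has_emphasis: bool
    has_alignment: bool
    has_bullets: bool


def classify_email(f, k=0.2, y=0.1):
    """Classify an e-mail following the order of steps S201 to S207."""
    if f.has_quote:                       # S201
        return "casual"
    if not f.has_paragraph:               # S202
        return "casual"
    if f.kanji_rate <= k:                 # S203
        return "casual"
    if f.symbol_rate > y:                 # S204
        return "news"
    if f.non_period_line_end:             # S205
        return "casual"
    if f.has_emphasis or f.has_alignment:  # S206
        return "notice"
    if f.has_bullets:                     # S207
        return "report"
    return "casual"
```

Because each test returns immediately, the order of the checks matters: a quoted e-mail is classified as casual even if it also contains a bulleted list.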

[0046] In the example of FIG. 7 the sentence meaning of the input text is classified into four types, but the number and kinds of types are not limited to these. For example, the text may be classified into two types, casual text such as e-mail and formal text such as reports, or into three types by adding somewhat formal text such as news articles. The example of FIG. 7 is only one way for the format classification unit 16 to determine the type of sentence meaning of the input text; it is not necessary to use all of the conditions shown in the flowchart of FIG. 7, and the determination may be made using only some of them.

[0047] (Embodiment 2) FIG. 8 is a functional block diagram showing another embodiment of the speech synthesizer according to the present invention. Components identical to those in FIG. 1 are given the same reference numerals, and their description is not repeated.

[0048] The prosody generation rule selection unit 18 in the prosody information generation unit 3 selects the prosody generation rule corresponding to the information indicating the type of sentence meaning output from the format analysis unit 1. A prosody generation rule is a set of parameters for prosody generation, such as pitch length, pitch variation range, speaking rate, and pause length, preset according to the type of text. For example, when the information indicates casual text, the pitch variation range is set large so that intonation comes out clearly. Casual text also contains many relatively simple words and is therefore easy to follow by ear, so the speaking rate is set faster and the pause length shorter, so that the text is read at the speed of ordinary human conversation. When the information indicates formal text such as a report, the pitch length is set so that the text is read in a calm tone. When the information indicates text with many bulleted lists, such as a notice, the speaking rate is set slower and the pause length longer so that the text is easy to follow. These rules are only examples. Instead of being set in advance, the rules may be made freely configurable by the user.
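Rule selection of this kind is essentially a table lookup, as in the sketch below. All numeric values are invented relative multipliers chosen only to respect the directions given in the text (larger pitch range, faster rate, and shorter pauses for casual text; slower rate and longer pauses for notices). The narrower pitch range for reports is merely an assumed stand-in for a "calm tone", and every name is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProsodyRule:
    pitch_range: float   # relative pitch variation range
    speech_rate: float   # relative speaking rate
    pause_length: float  # relative pause length


DEFAULT_RULE = ProsodyRule(1.0, 1.0, 1.0)

# Illustrative presets; the actual parameter values are not given in the text.
PROSODY_RULES = {
    "casual": ProsodyRule(pitch_range=1.3, speech_rate=1.2, pause_length=0.8),
    "report": ProsodyRule(pitch_range=0.8, speech_rate=1.0, pause_length=1.0),
    "notice": ProsodyRule(pitch_range=1.0, speech_rate=0.85, pause_length=1.3),
}


def select_prosody_rule(sentence_type, user_rules=None):
    """Pick the rule for a type; user-supplied rules override the presets."""
    rules = {**PROSODY_RULES, **(user_rules or {})}
    return rules.get(sentence_type, DEFAULT_RULE)
```

The optional `user_rules` argument reflects the remark that the rules may be configurable by the user rather than fixed in advance.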

[0049] In this way, the type of sentence meaning is classified from the format of the text, and the prosody information of the synthesized sound is generated according to that type, so the text can be read aloud with prosody suited to its meaning. In particular, when the input text is e-mail and several e-mails are read aloud in succession, each e-mail is read with prosody matching its content, so the listener can grasp the nature of each message from the prosody.

[0050] (Embodiment 3) FIG. 9 is a functional block diagram showing another embodiment of the speech synthesizer according to the present invention. Components identical to those in FIG. 1 are denoted by the same reference numerals, and their description is not repeated.

[0051] The dictionary selection unit 19 in the language analysis unit 2 selects the dictionary corresponding to the information indicating the type of sentence meaning output from the format analysis unit 1. The selected dictionary is used by the language analysis unit 2. For example, when the information indicates casual text, a spoken-language dictionary is selected, and the language analysis unit 2 is controlled so that this dictionary takes priority over the normally used dictionary. E-mail exchanged between friends is often written in spoken language, so selecting the spoken-language dictionary allows more accurate morphological analysis.
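
A minimal sketch of the priority lookup described above, assuming the dictionaries can be modelled as plain mappings from surface form to entry. The names DEFAULT_DICTIONARY, SPOKEN_DICTIONARY, and select_dictionary, the key "casual", and the entry values are hypothetical placeholders, not part of the specification.

```python
from collections import ChainMap

# Placeholder entries; a real dictionary would hold readings, parts of speech, etc.
DEFAULT_DICTIONARY = {"すごい": "default entry", "本日": "default entry"}
SPOKEN_DICTIONARY = {"すごい": "spoken-language entry"}


def select_dictionary(text_type):
    """Return a lookup in which the spoken-language dictionary is consulted
    first for casual text and the default dictionary serves as the fallback."""
    if text_type == "casual":
        return ChainMap(SPOKEN_DICTIONARY, DEFAULT_DICTIONARY)
    return ChainMap(DEFAULT_DICTIONARY)
```

ChainMap searches its mappings in order, which mirrors "used in preference to the normally used dictionary" without discarding the default entries.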

[0052] When a sentence ending carries a symbol that characterizes a colloquial way of reading, as in 「○○ですー」 or 「××なんだ〜」 (here, the symbols 「ー」 and 「〜」), prosody rules suited to that symbol can be set in the spoken-language dictionary in advance, making it possible to generate prosody closer to a spoken tone.

[0053] When the text is judged to contain bulleted items, the spoken-language dictionary may be set in advance so that the leading character marking a bulleted item (for example, the symbol 「・」 shown in FIG. 3) is not read aloud.

[0054] In spoken-language text such as e-mail exchanged between friends, part of a word is sometimes written in katakana, as in 「○○を書いてマス。」. By setting the spoken-language dictionary in advance to convert such katakana to hiragana, the language analysis unit 2 can analyze the text correctly, and more natural reading becomes possible.
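
The katakana-to-hiragana conversion can be sketched as follows, under stated assumptions. The conversion itself relies on the fact that the Unicode katakana letters U+30A1 to U+30F6 sit exactly 0x60 above their hiragana counterparts. The entry list COLLOQUIAL_KATAKANA and the function names are hypothetical, and the plain substring replacement is a simplification: it would also rewrite unrelated words that happen to contain an entry, which a real system would avoid by applying the entries during morphological analysis.

```python
KATAKANA_FIRST = 0x30A1  # ァ
KATAKANA_LAST = 0x30F6   # ヶ
KANA_OFFSET = 0x60       # distance from a katakana letter to its hiragana


def katakana_to_hiragana(text):
    """Convert katakana letters to hiragana; all other characters pass through."""
    return "".join(
        chr(ord(ch) - KANA_OFFSET)
        if KATAKANA_FIRST <= ord(ch) <= KATAKANA_LAST else ch
        for ch in text
    )


# Hypothetical spoken-language dictionary entries to be normalised.
COLLOQUIAL_KATAKANA = ("マス", "デス")


def normalize_colloquial(text, entries=COLLOQUIAL_KATAKANA):
    """Rewrite only the registered katakana spellings as hiragana."""
    for entry in entries:
        text = text.replace(entry, katakana_to_hiragana(entry))
    return text
```

Restricting the conversion to registered entries keeps ordinary katakana loanwords intact, which a blanket conversion would not.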

[0055] (Embodiment 4) FIG. 10 is a functional block diagram showing another embodiment of the speech synthesizer according to the present invention. Components identical to those in FIG. 1 are denoted by the same reference numerals, and their description is not repeated.

[0056] The voice quality selection unit 20 in the speech synthesis unit 4 selects the voice quality corresponding to the information indicating the type of sentence meaning output from the format analysis unit 1. The selected voice quality is used by the speech synthesis unit 4. A plurality of voice qualities are prepared in advance in the voice quality selection unit 20, and an appropriate voice quality is preset for each type of sentence meaning; for example, a low voice is used when the information indicates formal text such as a notice or a report.

[0057] In this way, by analyzing and classifying the format of the text, a voice quality suited to the content of the text can be selected, and different texts can be read aloud in different voice qualities.

[0058] (Embodiment 5) FIG. 11 is a functional block diagram showing another embodiment of the speech synthesizer according to the present invention. Components identical to those in FIG. 1 are denoted by the same reference numerals, and their description is not repeated.

[0059] The format analysis unit 1 outputs, to the break position determination unit 21, information indicating the features of the text detected by each part of the feature detection unit 15, together with information indicating their positions in the text. That is, when the bullet detection unit 6 of the feature detection unit 15 detects a bulleted item, information indicating that a bulleted item was detected and its position information are output; likewise, when the character alignment detection unit 7 detects character alignment, information indicating that character alignment was detected and its position information are output. When the parts of the feature detection unit 15 detect a plurality of features, all of that information is output.

[0060] The break position determination unit 21 determines, based on the information indicating the features of the text and their position information, the sentence and word break positions of the text to be analyzed by the language analysis unit 2.

[0061] First, examples of sentence break rules for determining where sentences are divided are described.

[0062] For example, when the input text contains bulleted items, the content is divided item by item, so if the sentence is divided at the line feed of each bulleted line, each bulleted line can be read aloud smoothly. That is, when the break position determination unit 21 detects information indicating that the text has the bulleted-item feature, it instructs the language analysis unit 2 to divide the sentence at the line feed code of each bulleted line. The same applies when the text contains character alignment: the content of an aligned line differs from that of the next line, so when the break position determination unit 21 detects information indicating the character alignment feature, it instructs the language analysis unit 2 to divide the sentence at the line feed code of the aligned line. This makes it possible to read the bulleted line and the line that follows it smoothly.
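
The rule above can be sketched as follows, assuming the format analysis has already attached a set of feature labels to each line. The labels "bullet" and "aligned", the constant BREAK_FEATURES, and the function name are hypothetical; the sketch divides the text only at the line feed of a flagged line and joins all other lines with what follows.

```python
# Hypothetical feature labels reported per line by the format analysis.
BREAK_FEATURES = {"bullet", "aligned"}


def split_at_format_breaks(lines, features):
    """Group lines into sentences, ending a sentence at the line feed of
    every line flagged as a bulleted or aligned line.

    lines    -- list of strings without trailing line feeds
    features -- list of sets of feature labels, one set per line
    """
    sentences, current = [], ""
    for line, line_features in zip(lines, features):
        current += line
        if line_features & BREAK_FEATURES:
            sentences.append(current)  # break at this line's line feed
            current = ""
    if current:
        sentences.append(current)      # flush any trailing unflagged text
    return sentences
```

For instance, two bulleted lines followed by one sentence wrapped over two plain lines yield three units for language analysis: each bulleted line on its own, and the two plain lines joined.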

[0063] In e-mail and similar text, when the number of characters per line varies little, the user has in many cases entered the line feeds manually to even out the line lengths and make the lines easier to read as a whole. If sentences are not divided at those line feeds, each line and the next can be read aloud smoothly. That is, when the degree of variation is low, in other words when the break position determination unit 21 detects information indicating the feature that the difference in the number of characters between lines is equal to or less than a predetermined value, it instructs the language analysis unit 2 to ignore the line feed code in each of those lines and not to divide the sentence there.
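
A minimal sketch of this rule, assuming that "the difference in the number of characters" is measured as the gap between the longest and the shortest non-empty line. That definition, the function name join_if_uniform, and the default threshold max_diff=2 are assumptions for illustration; the specification says only that the difference is compared with a predetermined value.

```python
def join_if_uniform(lines, max_diff=2):
    """Ignore line feeds when line lengths vary by at most max_diff characters.

    Returns a one-element list holding the joined text when the lines are
    uniform, otherwise a copy of the lines unchanged.
    """
    lengths = [len(line) for line in lines if line]
    if lengths and max(lengths) - min(lengths) <= max_diff:
        return ["".join(lines)]
    return list(lines)
```

Counting with len() treats each full-width character as one character, which matches counting characters per line rather than bytes.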

[0064] In e-mail and similar text, when the input text contains paragraphs, the content often differs from paragraph to paragraph, so if sentences are not divided at line feeds within a paragraph, the text can be read aloud smoothly. That is, when the break position determination unit 21 detects information indicating the feature that the text has paragraphs, it instructs the language analysis unit 2 to ignore the line feed codes within each paragraph and not to divide the sentence there. If a paragraph contains a line with an extremely small number of characters, the content is likely to change at that point, so the sentence may be divided at the line feed code at the end of that line.

[0065] In e-mail and similar text, when the input text contains a quoted portion, line feed codes have most likely been inserted forcibly by a mail server or the like when the other party replied, so if sentences are not divided at line feeds within the quoted portion, the text can be read aloud smoothly. That is, when the break position determination unit 21 detects information indicating the feature that the text has a quoted portion, it instructs the language analysis unit 2 to ignore the line feed codes within the quoted portion and not to divide the sentence there.

[0066] As in the sentence break rule examples above, determining sentence breaks from the format features of the input text reduces false detection of sentence breaks compared with detecting a sentence-break symbol such as a full stop and dividing the sentence there, for example when a heading line has no full stop or when a line feed has been entered by mistake in the middle of a sentence. The sentence break rules above may also be combined with the symbol-based method to determine the sentence breaks of the input text.

[0067] Next, a word break rule for the case where blanks (spaces) appear between characters is described. For a character string with blanks inserted between its characters, this rule determines whether to delete the blanks and join the characters before and after them.

[0068] For example, when the input text contains bulleted items and the characters of a string are laid out by uniform allocation (spread evenly across the line), the blanks are deleted from that string and the characters before and after each blank are joined. That is, when the information includes both the feature that the text has bulleted items and the feature that it has uniform allocation, the break position determination unit 21 instructs the language analysis unit 2 to delete the blanks and join the characters before and after them. As a result, the character string is not split at the blanks, and the language analysis unit 2 can analyze it as a word.
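
The word break rule can be sketched as below, assuming the two features arrive as boolean flags and that the blanks are ASCII spaces or full-width spaces (U+3000). The function name and both parameter names are hypothetical.

```python
import re

# ASCII space and full-width (ideographic) space.
BLANKS = re.compile(r"[ \u3000]+")


def join_uniformly_allocated(line, has_bullets, is_uniformly_allocated):
    """Delete blanks and join the surrounding characters, but only when the
    text has both the bulleted-item and the uniform-allocation features."""
    if has_bullets and is_uniformly_allocated:
        return BLANKS.sub("", line)
    return line
```

With both flags set, a label such as 「日　　時」 becomes 「日時」 and can be looked up as a single word; with either flag missing, the line is passed through untouched.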

[0069] As in the word break rule example above, determining word breaks from the format features of the input text keeps the amount of computation lower than a method that joins the text across line feeds or blanks and then performs morphological analysis.

[0070] A speech synthesizer that combines any of the configurations of Embodiments 1 to 5 described above can also be provided. Even when the configurations are combined arbitrarily, it is clear that the same effects as described for each embodiment are obtained.

[0071] Part or all of the processing in each of the above embodiments can also be provided as an ordered sequence of instructions suited to processing by a computer (a program). The program can further be provided as a computer-readable recording medium on which it is recorded, for installation, execution, and distribution of the program.

[0072]

[Effects of the Invention] According to the present invention, features of the format of the input text are detected and speech synthesis is performed based on those features, so the input text can be read aloud appropriately according to its meaning.

[0073] Further, according to the present invention, appropriate break positions are determined from the format features of the input text before the synthesized sound is generated, so the input text can be read aloud smoothly.

[Brief description of the drawings]

FIG. 1 is a functional block diagram of the speech synthesizer according to Embodiment 1 of the present invention.

FIG. 2 is a functional block diagram of the format analysis unit according to Embodiment 1 of the present invention.

FIG. 3 is a first example of a schematic diagram showing the format of text containing bulleted items according to Embodiment 1 of the present invention.

FIG. 4 is a second example of a schematic diagram showing the format of text containing bulleted items according to Embodiment 1 of the present invention.

FIG. 5 is a third example of a schematic diagram showing the format of text containing bulleted items according to Embodiment 1 of the present invention.

FIG. 6 is a fourth example of a schematic diagram showing the format of text containing bulleted items according to Embodiment 1 of the present invention.

FIG. 7 is a flowchart of the format analysis unit according to Embodiment 1 of the present invention.

FIG. 8 is a functional block diagram of the speech synthesizer according to Embodiment 2 of the present invention.

FIG. 9 is a functional block diagram of the speech synthesizer according to Embodiment 3 of the present invention.

FIG. 10 is a functional block diagram of the speech synthesizer according to Embodiment 4 of the present invention.

FIG. 11 is a functional block diagram of the speech synthesizer according to Embodiment 5 of the present invention.

[Explanation of symbols]

1 Format analysis unit
2 Language analysis unit
3 Prosody information generation unit
4 Speech synthesis unit
17 Speech synthesizer

Claims

[Claims]

1. A speech synthesizer for outputting a synthesized sound corresponding to an input sentence, comprising: means for reading character information corresponding to the input sentence; means for detecting a feature of the format of the input sentence based on a character code or position information in the read character information; and means for generating the synthesized sound based on the detected feature of the format.

2. The speech synthesizer according to claim 1, further comprising means for determining a type of sentence meaning of the input sentence based on the detected feature of the format, wherein the generating means generates the synthesized sound according to the type of sentence meaning.

3. The speech synthesizer according to claim 2, wherein the generating means selects a prosody generation rule corresponding to the type of sentence meaning and generates the prosody information of the synthesized sound based on the rule.

4. The speech synthesizer according to claim 2, wherein the generating means selects a dictionary corresponding to the type of sentence meaning and performs language analysis of the input sentence based on the dictionary.

5. The speech synthesizer according to claim 1, further comprising means for determining a break position of the input sentence based on the detected feature of the format, wherein the generating means generates the synthesized sound based on the break position.

6. The speech synthesizer according to claim 5, wherein, when the input sentence is e-mail text and the difference in the number of characters between lines is equal to or less than a predetermined value, the determining means determines the break position so that the input sentence is not divided at the line feed code in each line of the input sentence.

7. A speech synthesis method for outputting synthesized speech corresponding to an input sentence, comprising: reading a character code string corresponding to the input sentence; detecting a format feature of the input sentence based on a character code or position information in the read character information; and generating the synthesized speech based on the detected format feature.

8. A computer-readable recording medium on which a speech synthesis program for causing a computer to execute each step of the speech synthesis method according to claim 7 is recorded.