JP3122097B2

JP3122097B2 - Document reading control device and document reading method

Info

Publication number: JP3122097B2
Application number: JP63325052A
Authority: JP
Inventors: 隆之大山; 良明寺本; 光子加世田
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1988-12-20
Filing date: 1988-12-20
Publication date: 2001-01-09
Anticipated expiration: 2016-01-09
Also published as: JPH02166554A

Description

DETAILED DESCRIPTION OF THE INVENTION [Summary] The invention relates to a document reading control device and a document reading method that synthesize speech from coded Japanese character data. Its object is to synthesize accurate speech by treating text that is not delimited by a punctuation mark at a line break as continuing up to the next punctuation mark in the following line. When a line break is not delimited by a punctuation mark, the trailing character string data of that line is temporarily stored and then joined to the character string data up to the first punctuation data that appears next, forming character string data between punctuation marks. Speech is synthesized from this punctuation-delimited character string data, which eliminates inappropriate breaks in the character string data and yields accurate reading aloud.

[Field of Industrial Application] The present invention relates to a document reading control device and a document reading method that synthesize speech from coded Japanese character data.

A text-to-speech device converts the digital data of text created on equipment such as a Japanese word processor or another computer system with a character input function into a speech signal, functioning as if it were reading the text aloud. It is extremely useful for visually impaired people, for quickly grasping large volumes of documents, and for conveying information through audio equipment without using a live voice, and further enhancement of its functions is desired.

[Prior Art] FIG. 4 shows the configuration of a conventional text-to-speech device.

In the figure, character string data is input to a character string input unit 2 and then transferred to an analysis unit dividing unit 4, where it is divided at delimiter data such as the punctuation marks 「。」 and 「、」 in the text. Each piece of character string data divided between punctuation marks is transferred to a reading conversion unit 6. The reading conversion unit 6 identifies the words used by matching each divided piece of character string data against the Japanese written forms stored in a dictionary unit 8, and outputs the pronunciation data corresponding to each word. A technique such as the longest-match method is applied for this word identification. The utterance data output here is transferred to a speech synthesis unit 10, which synthesizes speech according to the utterance data. A technique such as the VCV method is applied for the speech synthesis.
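The longest-match word identification mentioned above can be illustrated with a minimal Python sketch. The function name `longest_match_readings` and the toy dictionary `TOY_DICTIONARY` are hypothetical and introduced only for illustration; the patent does not disclose the contents of the dictionary unit or the exact matching procedure.

```python
# Hypothetical toy dictionary: written form -> reading (pronunciation data).
# The real dictionary unit's contents are not given in the source.
TOY_DICTIONARY = {
    "明瞭": "めいりょう",
    "な": "な",
    "音声": "おんせい",
    "音": "おと",
    "声": "こえ",
    "を": "を",
}


def longest_match_readings(text: str, dictionary: dict[str, str]) -> list[str]:
    """Greedy longest-match lookup: at each position take the longest dictionary word."""
    readings = []
    max_len = max(len(word) for word in dictionary)
    i = 0
    while i < len(text):
        # Try the longest candidate first, then progressively shorter ones.
        for length in range(min(max_len, len(text) - i), 0, -1):
            word = text[i:i + length]
            if word in dictionary:
                readings.append(dictionary[word])
                i += length
                break
        else:
            # Assumption: characters not in the dictionary are passed through unchanged.
            readings.append(text[i])
            i += 1
    return readings


print(longest_match_readings("明瞭な音声を", TOY_DICTIONARY))
```

Because the two-character entry 音声 is tried before the one-character entries 音 and 声, the lookup can only find the intended word when both characters are present in the same piece of character string data.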

[Problem to Be Solved by the Invention] However, such a conventional text-to-speech device divides the character string data at delimiter data such as the punctuation marks 「。」 and 「、」 and performs speech synthesis on each divided piece. As shown in FIG. 5, for example, when a sentence continues over several lines and a line break falls where there is no delimiter such as 「。」 or 「、」, the device still treats the line break as a break in the text. The text is therefore cut where it should not be, and speech is not synthesized from the correct character string data. In the case of FIG. 5, the text should be divided into 「文章読み上げ装置は」 ("the text-to-speech device"), 「日本語文字列から」 ("from a Japanese character string"), 「明瞭な音声を」 ("clear speech"), and 「合成します」 ("synthesizes"), with word identification and speech synthesis performed on each. Because the line breaks partway through, however, the text is divided into 「文章読み上げ装置は」, 「日本語文字列から」, 「明瞭な音」, 「声を」, and 「合成します」. The line break splits the word 音声 ("speech") into 音 and 声, so the words in 「明瞭な音声を」 are identified incorrectly and the wrong speech is synthesized.
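The failure described above can be reproduced with a short Python sketch of the conventional division. The function `split_line_conventional` is a hypothetical name, and the two-line layout in `lines` is an assumed stand-in for FIG. 5, which is not reproduced in this text; only the position of the line break inside 音声 is taken from the description.

```python
import re

PUNCTUATION = "、。"


def split_line_conventional(line: str) -> list[str]:
    """Split one line at punctuation marks; the end of the line also ends a segment."""
    return [seg for seg in re.split(f"[{PUNCTUATION}]", line) if seg]


# Assumed line layout: the break falls between 音 and 声.
lines = ["文章読み上げ装置は、日本語文字列から、明瞭な音", "声を、合成します。"]

segments = [seg for line in lines for seg in split_line_conventional(line)]
print(segments)
```

Each line is processed in isolation, so the fragments 「明瞭な音」 and 「声を」 reach word identification separately and the word 音声 is never seen whole.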

The present invention has been made in view of this problem, and its object is to provide a document reading control device and a document reading method that always divide the character string data between punctuation marks and can therefore synthesize speech accurately.

[Means for Solving the Problem] FIG. 1 is a diagram explaining the principle of the present invention.

First, the present invention is directed to a document reading control device for outputting utterance data as speech, comprising an analysis unit dividing unit that divides, at each punctuation data, coded character string data that is delimited into lines each having a predetermined maximum number of characters and that continues over a plurality of lines, and a conversion unit that converts each divided piece of character string data into utterance data.

Such a device is characterized by comprising a correction unit that holds the final character string data of a line when that data is not divided by punctuation data, joins it to the character string data up to the first punctuation data appearing in the character string data of the next line, and thereby forms character string data between punctuation marks to be converted into utterance data.

The present invention is also directed to a document reading method in which coded character string data that is delimited into lines each having a predetermined maximum number of characters and that continues over a plurality of lines is divided at each punctuation data, and each divided piece of character string data is converted into utterance data and thereby output as speech.

Such a method is characterized by holding the final character string data of a line when that data is not divided by punctuation data, joining it to the character string data up to the first punctuation data appearing in the character string data of the next line, and thereby forming character string data between punctuation marks to be converted into utterance data.

[Operation] The text-to-speech device of the present invention configured in this way always divides the character string data into character string data between punctuation marks, so it can synthesize speech correctly, exactly as the text is written.

[Embodiment] FIG. 2 is a configuration diagram showing one embodiment of the present invention.

In FIG. 2, reference numeral 12 denotes a character input unit, which inputs coded character string data. Reference numeral 14 denotes an analysis unit dividing unit, which divides the input character string data at delimiter data such as the punctuation marks 「。」 and 「、」. Reference numeral 16 denotes a reading conversion unit, which identifies the words used by matching each divided piece of character string data against the Japanese notation data stored in a dictionary unit 18, and outputs the pronunciation data corresponding to each word. A technique such as the longest-match method is applied for this word identification. Reference numeral 20 denotes a speech synthesis unit, which synthesizes speech according to the utterance data. A technique such as the VCV method is applied for the speech synthesis.

Reference numeral 22 denotes a correction unit, which comprises a memory 24 and a joining unit 26. When character string data between punctuation marks spans several unit blocks because of a line break or the like, the final character data string that is not divided by punctuation data is temporarily stored in the memory 24. When the analysis unit dividing unit 14 then separates the character string data up to the first punctuation data appearing in the next character data string, the joining unit 26 joins the stored final character data string to that character string data to form a character data string between punctuation marks, and transfers it to the reading conversion unit 16.

Next, the operation of the embodiment configured in this way is described with reference to FIG. 3.

The following describes speech synthesis for text in which, as shown in FIG. 5, the maximum number of characters per line (per unit block) is fixed and a sentence continues over several lines without ending at a punctuation mark. First, following the input character string data, the analysis unit dividing unit 14 separates the character string data up to 「文章読み上げ装置」 and transfers it to the reading conversion unit 16, and the speech synthesis unit 20 synthesizes speech.

Next, the analysis unit dividing unit 14 separates the character string data up to 「日本語文字列から」 and transfers it to the conversion unit 16, and the speech synthesis unit 20 synthesizes speech.

Next, the analysis unit dividing unit 14 separates the character string data up to 「明瞭な音」. Because this character string data does not end at the punctuation mark 「。」 or 「、」, it is transferred to the memory 24, and processing moves on to the next line (unit block).

In processing the next unit block, the analysis unit dividing unit 14 separates the character string data up to 「声を」. The 「明瞭な音」 stored in the memory 24 is then joined with 「声を」 to form the punctuation-delimited character string data 「明瞭な音声を」, which is transferred to the conversion unit 16, and the speech synthesis unit 20 synthesizes speech.

Finally, the analysis unit dividing unit 14 separates the character string data 「合成します」, and the speech synthesis unit 20 synthesizes speech.
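The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `split_with_carry` is a hypothetical name, the variable `held` stands in for the memory 24, prepending it to the next line stands in for the joining unit 26, and the two-line input is an assumed layout for FIG. 5. Flushing leftover text at the end of the input is also an assumption, since the description does not cover that case.

```python
from typing import Iterable, Iterator

PUNCTUATION = "、。"


def split_with_carry(lines: Iterable[str]) -> Iterator[str]:
    """Yield punctuation-delimited segments, carrying an unterminated line tail forward."""
    held = ""  # plays the role of memory 24
    for line in lines:
        current = held  # joining step: the held tail is prepended to the new line
        held = ""
        for ch in line:
            if ch in PUNCTUATION:
                if current:
                    yield current  # a complete segment between punctuation marks
                current = ""
            else:
                current += ch
        held = current  # tail not closed by punctuation is kept for the next line
    if held:
        yield held  # assumption: flush any remainder when the input ends


# Assumed line layout: the break falls between 音 and 声.
lines = ["文章読み上げ装置は、日本語文字列から、明瞭な音", "声を、合成します。"]
print(list(split_with_carry(lines)))
```

With this input the fragment 「明瞭な音」 is held at the end of the first line and joined with 「声を」 from the second, so the segment 「明瞭な音声を」 is passed on whole.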

Thus, according to this embodiment, even when the text to be read aloud is broken by a line break or the like, it is always divided into character strings between punctuation marks before speech synthesis, so the text can be read aloud correctly.

[Effects of the Invention] As described above, according to the present invention, when character string data between punctuation marks spans several unit blocks because of a line break or the like, the final character data string that is not divided by punctuation data is held in the correction unit. It is then joined to the character string data up to the first punctuation data appearing in the character string data that follows, forming character string data between punctuation marks on which speech synthesis is performed. The text can therefore always be read aloud correctly.

[Brief description of the drawings]

FIG. 1 is a diagram explaining the principle of the present invention; FIG. 2 is a configuration diagram of an embodiment of the present invention; FIG. 3 is a diagram explaining the operation of the embodiment; FIG. 4 is a configuration diagram of a conventional example; FIG. 5 is an explanatory diagram for explaining the operation of the conventional example.

In the figures:
12: character string input unit
14: analysis unit dividing unit
16: reading conversion unit
18: dictionary unit
20: speech synthesis unit
22: correction unit
24: memory
26: joining unit

Continuation of the front page
(72) Inventor: 加世田光子, c/o Fujitsu Limited, 1015 Kamikodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa
(56) References: JP-A-63-106040 (JP, A); JP-A-60-246436 (JP, A)

Claims

(57) [Claims]

1. A document reading control device for outputting utterance data as speech, comprising: an analysis unit dividing unit that divides, at each punctuation data, coded character string data that is delimited into lines each having a predetermined maximum number of characters and that continues over a plurality of lines; and a conversion unit that converts each divided piece of character string data into the utterance data, the device further comprising a correction unit that holds final character string data that is not divided by the punctuation data in a given line, joins the final character string data to the character string data up to the first punctuation data appearing in the character string data of the next line following the final character string data to form character string data between punctuation marks, and converts the character string data between punctuation marks into the utterance data.

2. A document reading method in which coded character string data that is delimited into lines each having a predetermined maximum number of characters and that continues over a plurality of lines is divided at each punctuation data, and each divided piece of character string data is converted into utterance data and thereby output as speech, the method comprising: holding final character string data that is not divided by the punctuation data in a given line; joining the final character string data to the character string data up to the first punctuation data appearing in the character string data of the next line following the final character string data, to form character string data between punctuation marks; and converting the character string data between punctuation marks into the utterance data.