JPH03293398A

JPH03293398A - Device and method for reading out japanese language sentence

Info

Publication number: JPH03293398A
Application number: JP2093895A
Authority: JP
Inventors: Yoshiaki Teramoto; 寺本　良明
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1990-04-11
Filing date: 1990-04-11
Publication date: 1991-12-25
Anticipated expiration: 2015-04-17
Also published as: JP3034554B2

Abstract

PURPOSE: To read out Japanese sentences quickly and conveniently, without routing the data through the host device, by providing a phoneme data holding and sending unit that sends the relevant phoneme data to the feature parameter synthesis unit when identification data is designated. CONSTITUTION: When a command indicating the correspondence between a character data string and an identification number is sent from a host device 10, a word identification unit 11 collates the string with a word dictionary 11a, selects the optimal word string, and outputs the corresponding phonological information. A prosody assignment unit 12 adds exhalation paragraph information, accent type information, and the like, and generates phoneme data. The phoneme data is written to an analyzed-text temporary storage buffer 15c together with the identification number. Thereafter, a command to utter that identification number is all that is needed: the data is read from the buffer 15c and sent to a feature parameter synthesis unit 13, speech is synthesized and converted to analog, and the synthesized speech is emitted from a loudspeaker 14c.

Description

[Detailed description of the invention]

[Table of contents]

Overview
Industrial application field
Conventional technology (Figs. 5 and 6)
Problem to be solved by the invention
Means for solving the problem (Figs. 1 and 2)
Operation (Figs. 1 and 2)
Example (Figs. 3 and 4)
Effect of the invention

[Overview]

The invention relates to a Japanese text reading device and method having a word identification unit that analyzes a character data string representing a Japanese sentence and outputs the corresponding phonological information; a prosody assignment unit that, based on that phonological information, adds prosodic information such as exhalation paragraph information; a feature parameter synthesis unit that analyzes phoneme data containing the added prosodic information and the phonological information and synthesizes the corresponding feature parameters; and a synthesized speech generator that converts the synthesized feature parameters into synthesized speech and outputs it. The object is to shorten the time spent on word identification and the like, and thereby to provide a fast Japanese text reading device and method that does not burden the host device, needs no complicated interface, is easy to use, and processes quickly. To that end, a phoneme data holding and sending unit is provided that holds, for each predetermined word string, the corresponding phoneme data together with identification data identifying that word string and, when identification data is designated, retrieves the phonological information of the corresponding word string and sends it to the feature parameter synthesis unit.

[Industrial application field]

The present invention relates to a Japanese text reading device and method, and in particular to a Japanese text reading device and method having a word identification unit that analyzes a character data string representing an input Japanese sentence and outputs phonological information such as the corresponding phonemes and accent type; a prosody assignment unit that, based on the output phonological information, adds prosodic information such as exhalation paragraph information, phrase boundary information, accent boundary information, and the accent type of each accent phrase; a feature parameter synthesis unit that analyzes phoneme data containing the added prosodic information and the phonological information and synthesizes the corresponding feature parameters; and a synthesized speech generator that converts the synthesized feature parameters into synthesized speech and outputs it.

The Japanese text reading device according to the present invention can be used in place of the display device attached to a CPU or the like, or in fields that apply its function as one form of multimedia, for example as voice guidance in CAI (computer-assisted instruction: a computer-based education system in which individualized teaching by programmed learning is conducted through dialogue with the computer).

[Conventional technology]

Conventionally, there has been a Japanese text reading device according to a first conventional example, as shown in Fig. 5.

As shown in the figure, this device has a word identification unit 41 that extracts candidate words by collating the character sequence data corresponding to the input Japanese sentence with a word dictionary, selects the optimal word string from among the combinations of these words, and outputs phonological information such as the phonemes, accent type, and grammar of the identified words; a prosody assignment unit 42 that, based on the phonological information contained in the identified words, adds prosodic information such as exhalation paragraph information, phrase boundary information, accent boundary information, and the accent type of each accent phrase; a feature parameter synthesis unit 43 that analyzes phoneme data containing the added prosodic information and the phonological information, retrieves the corresponding vocal tract characteristic parameters from a syllable file and concatenates them, and calculates sound source parameters such as pitch and attaches them to the vocal tract characteristic parameters; and a synthesized speech generator 44 that converts the synthesized feature parameters into synthesized speech.

A Japanese text reading device according to a second conventional example is shown in Fig. 6.

As shown in the figure, this device has a word identification unit 51 that extracts candidate words by collating the character sequence data corresponding to the input Japanese sentence with a word dictionary, selects the optimal word string from among the combinations of these words, and outputs phonological information such as the phonemes, accent type, and grammar of the identified words; and a prosody assignment unit 52 that, based on the phonological information contained in the identified words, adds prosodic information such as exhalation paragraph information, phrase boundary information, accent boundary information, and the accent type of each accent phrase. A phoneme data holding and sending unit, which receives the phoneme data containing the added prosodic information and the phonological information and sends the received phoneme data to a feature parameter synthesis unit 53, is provided inside the host device 50. The device further has the feature parameter synthesis unit 53, which analyzes the phoneme data, retrieves the corresponding vocal tract characteristic parameters from a syllable file and concatenates them, and calculates sound source parameters such as pitch and attaches them to the vocal tract characteristic parameters; and a synthesized speech generator 54 that converts the synthesized feature parameters into synthesized speech.

This arrangement has the following advantages over the first conventional example.

Because feature parameter synthesis and speech synthesis take considerably less processing time than word identification and prosody assignment, the overhead between sending data from the host device 50 and the start of speech synthesis can be made sufficiently small. In addition, because the host device 50 can manipulate the phoneme data containing the prosodic information, somewhat more detailed information about the utterance can be added.
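The timing argument can be put in numbers. The sketch below is illustrative only: the millisecond figures and the function name `startup_overhead_ms` are assumptions chosen to show the shape of the argument, not values taken from the specification.

```python
def startup_overhead_ms(analysis_ms: float, synthesis_start_ms: float,
                        analyzed_in_advance: bool) -> float:
    """Delay between sending data and the start of speech output."""
    if analyzed_in_advance:
        # Word identification and prosody assignment were done earlier,
        # so only the synthesis start-up cost remains.
        return synthesis_start_ms
    return analysis_ms + synthesis_start_ms


ANALYSIS_MS = 400.0        # assumed cost of word identification + prosody assignment
SYNTHESIS_START_MS = 50.0  # assumed cost of starting parameter and speech synthesis

first_example = startup_overhead_ms(ANALYSIS_MS, SYNTHESIS_START_MS, False)
second_example = startup_overhead_ms(ANALYSIS_MS, SYNTHESIS_START_MS, True)
print(first_example, second_example)
```

With these assumed figures the pre-analyzed path starts speaking after the synthesis start-up cost alone, which is the advantage the passage describes.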

[Problem to be solved by the invention]

Although the second conventional example has advantages over the first, such as shorter processing time, it also has the following problem.

The interface (the exchange of data) between the host device 50 and the Japanese text reading device becomes very complicated, the burden on the host device increases, and the processing on the host device side ends up being the more complicated of the two.

The present invention was therefore made with the object of providing a Japanese text reading device and method that shortens processing time without increasing the burden on the host device, and that allows somewhat more detailed information about the utterance to be added.

[Means for solving the problem]

To solve the above technical problem, the first invention, as shown in Fig. 1, is a Japanese text reading device having a word identification unit 1 that analyzes a character data string representing an input Japanese sentence and outputs phonological information such as the corresponding phonemes and accent type; a prosody assignment unit 2 that, based on the output phonological information, adds prosodic information such as exhalation paragraph information, phrase boundary information, accent boundary information, and the accent type of each accent phrase; a feature parameter synthesis unit 3 that analyzes phoneme data containing the added prosodic information and the phonological information and synthesizes the corresponding feature parameters; and a synthesized speech generator 4 that converts the synthesized feature parameters into synthesized speech and outputs it. The device is provided with a phoneme data holding and sending unit 5 that holds, for each predetermined word string, the corresponding phoneme data together with identification data identifying that word string and, when identification data is designated, retrieves the phonological information of the corresponding word string and sends it to the feature parameter synthesis unit 3.
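The role of the phoneme data holding and sending unit can be sketched as a keyed store. This is a minimal illustration, not the patented implementation: the class name `PhonemeDataStore`, its method names, and the use of a string for phoneme data are all assumptions.

```python
from typing import Callable, Dict, List


class PhonemeDataStore:
    """Hypothetical stand-in for the phoneme data holding and sending unit (5)."""

    def __init__(self, send_to_synthesizer: Callable[[str], None]) -> None:
        self._entries: Dict[int, str] = {}  # identification data -> phoneme data
        self._send = send_to_synthesizer    # stands in for the link to unit (3)

    def hold(self, ident: int, phoneme_data: str) -> None:
        """Keep the phoneme data for one word string under its identifier."""
        self._entries[ident] = phoneme_data

    def send(self, ident: int) -> None:
        """On designation of an identifier, forward the held phoneme data."""
        self._send(self._entries[ident])


received: List[str] = []
store = PhonemeDataStore(received.append)
store.hold(1, "phoneme data for sentence 1")
store.send(1)
print(received)
```

Passing the synthesizer in as a callable keeps the store independent of how feature parameter synthesis is actually done.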

The second invention, as shown in Fig. 2, is a Japanese text reading method in which a character data string representing a Japanese sentence is input (S1); the input character data string is analyzed and phonological information such as the corresponding phonemes and accent type is output (S2); based on the output phonological information, prosodic information such as exhalation paragraph information, phrase boundary information, accent boundary information, and the accent type of each accent phrase is added (S3); phoneme data containing the added prosodic information and the phonological information is analyzed and the corresponding feature parameters are synthesized (S6); and the synthesized feature parameters are converted into synthesized speech, which is output (S7). In this method, the corresponding phoneme data is held for each predetermined word string together with identification data identifying that word string (S4), and when identification data is designated (S5), the phonological information of the corresponding word string is retrieved and the feature parameters are then synthesized (S6).
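The ordering of steps S1 to S7 can be sketched as two entry points, one that analyzes and holds (S1 to S4) and one that retrieves and speaks (S5 to S7). Every function body here is a placeholder assumed for illustration; none of them performs real linguistic analysis or speech synthesis.

```python
from typing import Dict, List

held: Dict[int, List[str]] = {}  # S4 storage: identification data -> phoneme data


def identify_words(text: str) -> List[str]:
    """S2 placeholder: a real unit would consult a word dictionary."""
    return text.split()


def assign_prosody(phonological_info: List[str]) -> List[str]:
    """S3 placeholder: mark the end of the exhalation paragraph with '/'."""
    return phonological_info + ["/"]


def register(ident: int, text: str) -> None:
    """S1 to S4: input, analyze, add prosody, hold under the identifier."""
    held[ident] = assign_prosody(identify_words(text))


def synthesize_parameters(phoneme_data: List[str]) -> List[str]:
    """S6 placeholder: one pseudo-parameter per phoneme-data token."""
    return [f"param({token})" for token in phoneme_data]


def read_aloud(ident: int) -> str:
    """S5 to S7: designate an identifier, synthesize, and 'output' the result."""
    return " ".join(synthesize_parameters(held[ident]))


register(1, "on see")
spoken = read_aloud(1)
print(spoken)
```

The point of the split is that `register` can run long before `read_aloud`, so the slow analysis steps are off the critical path when speech is requested.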

[Operation]

Next, the operation of the reading device and method according to the first and second inventions is described.

To read a Japanese sentence aloud, as shown in Figs. 1 and 2, the character data string representing the sentence is input in step S1 in association with identification data that identifies the predetermined word string. In step S2, the word identification unit 1 obtains the phonological information corresponding to that character data string, for example by collating the string with a word dictionary to extract candidate words and outputting the phonological information for the optimal word string selected. The output phonological information is sent to the prosody assignment unit 2 in association with the identification data.
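Dictionary collation and selection of an "optimal" word string can be sketched as follows. The specification does not say how the optimal string is chosen, so the criterion used here (fewest words) is an assumption, as are the toy dictionary `WORD_DICTIONARY`, its readings, and the function name `segment`.

```python
from typing import Dict, List, Optional

# Toy dictionary: surface form -> reading (illustrative entries only).
WORD_DICTIONARY: Dict[str, str] = {
    "音声": "オンセー", "音": "オト", "声": "コエ",
    "処理": "ショリ", "し": "シ", "ます": "マス", "か": "カ",
}


def segment(text: str, dictionary: Dict[str, str]) -> Optional[List[str]]:
    """Return the segmentation of text with the fewest dictionary words."""
    n = len(text)
    best: List[Optional[List[str]]] = [None] * (n + 1)
    best[0] = []  # best[i] holds the best word list covering text[:i]
    for i in range(n):
        if best[i] is None:
            continue  # no candidate word string reaches position i
        for j in range(i + 1, n + 1):
            word = text[i:j]
            if word in dictionary:
                candidate = best[i] + [word]
                if best[j] is None or len(candidate) < len(best[j]):
                    best[j] = candidate
    return best[n]


words = segment("音声処理しますか", WORD_DICTIONARY)
reading = " ".join(WORD_DICTIONARY[w] for w in words)
print(words, reading)
```

Both 音声 and the pair 音 + 声 are candidates for the first two characters; the fewest-words criterion selects the single word 音声.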

In step S3, the prosody assignment unit 2 uses the information contained in the identified words (phonemes (readings), accent type, grammar, and so on) to add exhalation paragraph information, phrase boundary information, the accent type of each accent phrase, and similar information to the phonological information.

Here, an "exhalation paragraph" (breath group) is a stretch of text uttered in a single breath, delimited by the points at which the speaker breathes.

A "phrase boundary" is a boundary between utterance segments that roughly correspond to phrases, each carrying a gradual rise and fall in pitch; in terms of intonation, it is the point at which the pitch is reset.

An "accent phrase boundary" is a boundary that divides the utterance into units, roughly one per word, each carrying a local rise and fall in pitch.
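The three kinds of prosodic unit defined above can be pictured as a nested structure. The nesting (exhalation paragraph, then phrase, then accent phrase), the integer accent type, the boundary markers, and the placeholder readings are all assumptions made for this sketch; the specification does not define a data format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AccentPhrase:
    reading: str
    accent_type: int  # assumed convention: position of the accent, 0 = none


@dataclass
class Phrase:
    accent_phrases: List[AccentPhrase] = field(default_factory=list)


@dataclass
class ExhalationParagraph:
    phrases: List[Phrase] = field(default_factory=list)


def render(paragraphs: List[ExhalationParagraph]) -> str:
    """Flatten the hierarchy, marking each kind of boundary differently."""
    return " // ".join(                      # exhalation paragraph boundary
        " | ".join(                          # phrase boundary
            " ".join(                        # accent phrase boundary
                f"{ap.reading}[{ap.accent_type}]" for ap in phrase.accent_phrases
            )
            for phrase in paragraph.phrases
        )
        for paragraph in paragraphs
    )


paragraph = ExhalationParagraph([
    Phrase([AccentPhrase("aa", 1), AccentPhrase("bb", 0)]),
    Phrase([AccentPhrase("cc", 2)]),
])
marked_up = render([paragraph])
print(marked_up)
```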

In step S4, each word to which prosodic information has been added in this way is temporarily held, together with the identification data, in the phoneme data holding and sending unit 5.

When a read instruction designating the identification data is given in step S5, the held phonological information is sent to the feature parameter synthesis unit 3 in step S6, and the feature parameters are synthesized.

Here, "feature parameters" are the parameters specified to express the characteristics of speech. There are two kinds: sound source parameters, which represent voice quality, and vocal tract characteristic parameters, which represent linguistic content. Sound source parameters include the fundamental frequency (pitch) and amplitude; vocal tract characteristic parameters include LPC coefficients, PARCOR coefficients, LSP coefficients, and formant frequencies.
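The two kinds of feature parameter can be grouped into one record per analysis frame. This is only a sketch of the taxonomy described above: the class and field names, and the idea of a per-frame record, are assumptions, and the numeric values are placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SourceParams:
    """Sound source parameters (voice quality)."""
    fundamental_frequency_hz: float  # pitch
    amplitude: float


@dataclass
class VocalTractParams:
    """Vocal tract characteristic parameters (linguistic content)."""
    kind: str                 # e.g. "LPC", "PARCOR", "LSP", or "formant"
    coefficients: List[float]


@dataclass
class FeatureFrame:
    source: SourceParams
    vocal_tract: VocalTractParams


frame = FeatureFrame(
    source=SourceParams(fundamental_frequency_hz=120.0, amplitude=0.8),
    vocal_tract=VocalTractParams(kind="PARCOR", coefficients=[0.5, -0.2]),
)
print(frame)
```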

"Feature parameter synthesis" means decoding the input phoneme data and, for example, retrieving the vocal tract characteristic parameters of the corresponding feature parameters from the matching (CV & V) syllable file and concatenating them.

It also includes calculating sound source parameters such as pitch and attaching them to the vocal tract parameters.
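The two halves of feature parameter synthesis, concatenating vocal tract parameters from a syllable file and attaching a computed pitch, can be sketched as below. The contents of `SYLLABLE_FILE`, the syllable labels, and the linearly falling pitch contour are assumptions for illustration; the specification does not state how pitch is calculated.

```python
from typing import Dict, List, Tuple

Frame = Tuple[float, List[float]]  # (pitch in Hz, vocal tract coefficients)

# Hypothetical syllable file: syllable -> vocal tract coefficients per frame.
SYLLABLE_FILE: Dict[str, List[List[float]]] = {
    "o": [[0.1, 0.2]],
    "N": [[0.3, 0.1]],
    "se": [[0.2, 0.4], [0.2, 0.5]],
}


def synthesize_feature_parameters(syllables: List[str],
                                  start_hz: float,
                                  end_hz: float) -> List[Frame]:
    # Retrieve and concatenate the vocal tract parameters, in order.
    tract = [coeffs for s in syllables for coeffs in SYLLABLE_FILE[s]]
    count = len(tract)
    frames: List[Frame] = []
    for i, coeffs in enumerate(tract):
        # Assumed pitch model: straight line from start_hz to end_hz.
        t = i / (count - 1) if count > 1 else 0.0
        frames.append((start_hz + (end_hz - start_hz) * t, coeffs))
    return frames


frames = synthesize_feature_parameters(["o", "N", "se"], 140.0, 110.0)
for pitch, coeffs in frames:
    print(round(pitch, 1), coeffs)
```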

The synthesized parameters are sent to the synthesized speech generator 4, and in step S7 the input Japanese sentence is read aloud in synthesized speech.

[Example]

Next, a Japanese text reading device and method according to an embodiment of the present invention is described.

Fig. 3 shows the overall equipment configuration of this embodiment.

As shown in the figure, the system consists of a host device 10, which outputs reading instructions together with the data to be read aloud, and a Japanese text reading device 20 according to this embodiment.

The host device 10 has a CPU 10a and an interface 10b, along with input/output devices such as a display unit 10c, a file 10d, and a printer 10e.

As shown in the figure, the Japanese text reading device 20 has a CPU 21 that performs word identification and other processing, a memory 22 that stores the dictionary and other data used for word identification, and the synthesized speech generator 14.

As shown in the figure, the synthesized speech generator 14 has a DSP 14a serving as the speech synthesis unit, which synthesizes speech from the feature parameters described later, a speaker control unit 14b, and a speaker 14c.

Fig. 4 shows the Japanese text reading device of this embodiment in functional terms. It comprises the host device 10, which issues reading instructions and inputs the Japanese sentence to be read aloud in association with an identification number serving as identification data; a command/data analysis processing unit 16, which decodes the commands input from the host device 10 and outputs the corresponding signals and data; a word identification unit 11, which extracts candidate words by collating the character data string corresponding to the input Japanese sentence (mixed kanji and kana) with a word dictionary 11a, selects the optimal word string from among the combinations of these words, and outputs phonological information such as the phonemes, accent type, and grammar of the identified words; a prosody assignment unit 12, which, based on the phonological information contained in the identified words, adds prosodic information such as exhalation paragraph information, phrase boundary information, accent boundary information, and the accent type of each accent phrase; a feature parameter synthesis unit 13, which analyzes phoneme data containing the added prosodic information and the phonological information, retrieves the corresponding vocal tract parameters from a syllable file and concatenates them, and calculates sound source parameters such as pitch and attaches them to the vocal tract parameters; a synthesized speech generator 14, which converts the synthesized feature parameters into synthesized speech; and a phoneme data holding and sending unit 15, which holds, for each predetermined word string, the corresponding phoneme data together with an identification number (data) identifying that word string and, when identification data is designated, retrieves the phonological information of the corresponding word string and sends it to the feature parameter synthesis unit 13.

Here, the command/data analysis processing unit 16, the word identification unit 11, the prosody assignment unit 12, and the feature parameter synthesis unit 13 correspond to the CPU 21 and the memory 22. The phoneme data holding and sending unit 15 corresponds to the memory 22 and the like and, as shown in Fig. 4, has a write unit 15a, a read unit 15b, and an analyzed-text temporary storage buffer 15c.

Furthermore, as shown in Fig. 3, the speaker control unit 14b has a D/A converter 141b that converts digital data to analog data, an LPF 142b, and an amplifier 143b.

Next, the operation of the Japanese text reading device according to this embodiment is described.

Unlike the first conventional example, this embodiment adds the following two commands to the ordinary command that simply inputs a Japanese sentence and reads it aloud.

(1) The first command inputs, from the host device, a Japanese sentence and an identification number that identifies it; analyzes the sentence (word identification plus prosody assignment) to generate phoneme data containing prosodic information; and stores that data in the buffer 15c together with the identification number.

(2) The second command supplies an identification number from the host device; retrieves the corresponding phoneme data containing prosodic information from the analyzed-text temporary storage buffer 15c; and has the feature parameter synthesis unit 13 perform feature parameter synthesis on that data.

The host device 10 sends a command indicating the correspondence between identification number "1" and the character data string representing the Japanese sentence 「音声処理しますか？」 ("Do you want to perform voice processing?").

The word identification unit 11 then collates the character data string representing this sentence with the word dictionary 11a to extract candidate words, selects the optimal word string from among the combinations of these candidates, and outputs the corresponding phonological information.

The prosody assignment unit 12 then uses the phonemes (readings), accent type, grammar, and other information contained in the identified words to add exhalation paragraph information, phrase boundary information, accent boundary information, the accent type of each accent phrase, and similar information to the output phonological information.

In this way, the phoneme data 「オンセーショ　リーシマ＊スカ？」, consisting of the phonological information with the prosodic information added, is generated.

The generated phoneme data 「オンセーショ　リーシマ＊スカ？」 is written by the write unit 15a to the analyzed-text temporary storage buffer 15c together with identification number "1".

Thereafter, when a command to utter the sentence with identification number "1" is sent to the phoneme data holding and sending unit 15, the read unit 15b reads the phoneme data stored under identification number "1", namely the phoneme data containing prosodic information shown above, 「オンセーショ　リーシマ＊スカ？」, from the analyzed-text temporary storage buffer 15c and sends it to the feature parameter synthesis unit 13.

The resulting feature parameters are sent to the DSP serving as the speech synthesis unit 14a of the synthesized speech generator 14, where speech is synthesized; the speaker control unit 14b then converts it to analog, and the synthesized speech is emitted from the speaker 14c.

When speech is used in applications such as database search, what is usually synthesized is a message that prompts the operator to perform an operation.

In that case the host device must send the identification numbers and the corresponding Japanese sentences to this device before starting the application. Once that is done, whenever the application wants speech synthesized it need only specify an identification number, and the corresponding speech can be synthesized immediately.
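Seen from the host side, the usage pattern is "send every prompt once at start-up, then refer to prompts by number". The sketch below assumes a hypothetical `FakeDevice` that merely records what it is sent, hypothetical command tuples, and a prompt table whose second entry is invented for illustration (only the first sentence appears in the specification).

```python
from typing import Dict, List, Tuple

Command = Tuple  # ("STORE", ident, text) or ("SPEAK_ID", ident); assumed format


class FakeDevice:
    """Stand-in for the link to the reading device; records commands only."""

    def __init__(self) -> None:
        self.log: List[Command] = []

    def send(self, command: Command) -> None:
        self.log.append(command)


PROMPTS: Dict[int, str] = {
    1: "音声処理しますか？",      # sentence used in the specification
    2: "検索語を入力してください。",  # hypothetical second prompt
}


def preload(device: FakeDevice, prompts: Dict[int, str]) -> None:
    """Before the application starts: send every sentence with its number."""
    for ident, text in prompts.items():
        device.send(("STORE", ident, text))


def prompt(device: FakeDevice, ident: int) -> None:
    """During the application: only the identification number is sent."""
    device.send(("SPEAK_ID", ident))


device = FakeDevice()
preload(device, PROMPTS)
prompt(device, 2)
print(device.log)
```

After `preload`, the application never handles text or phoneme data again, which is the simple interface the embodiment aims for.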

In other words, when speech synthesis is treated as a multimedia voice guidance function, this embodiment has two advantages: (1) because the time-consuming word identification and prosody assignment are performed in advance, the time between the host device sending a command and the start of speech synthesis is shortened; and (2) no complicated interface with the application on the host device is required.

[Effect of the invention]

As described above, in the present invention the phoneme data to which prosody has been assigned is first held in the phoneme data holding and sending unit, and when the corresponding identification data is designated, that phoneme data is output and input to the feature parameter synthesis unit.

Consequently, unlike the conventional arrangement, the host device is not burdened: inputting the identification data alone causes the corresponding phoneme data to be sent.

Accordingly, the time spent on word identification and the like is shortened, and the host device is not burdened. Moreover, it becomes possible to provide a fast Japanese text reading device and method that is easy to use, processes quickly without a complicated interface, and allows somewhat more detailed information about the utterance to be added.

[Brief explanation of drawings]

Fig. 1 is a block diagram showing the principle of the first invention; Fig. 2 is a flowchart showing the principle of the second invention; Fig. 3 is an overall equipment configuration diagram of the embodiment; Fig. 4 is a block diagram showing the Japanese text reading device of the embodiment; Fig. 5 is a block diagram of the first conventional example; and Fig. 6 is a diagram showing the Japanese text reading device of the second conventional example.

1: Word identification unit
2: Prosody assignment unit
3: Feature parameter synthesis unit
4: Synthesized speech generator
5: Phoneme data holding and sending unit

Applicant: Fujitsu Ltd

Claims

[Claims]

(1) A Japanese text reading device having a word identification unit (1) that analyzes a character data string representing an input Japanese sentence and outputs phonological information such as the corresponding phonemes and accent type; a prosody assignment unit (2) that, based on the output phonological information, adds prosodic information such as exhalation paragraph information, phrase boundary information, accent boundary information, and the accent type of each accent phrase; a feature parameter synthesis unit (3) that analyzes phoneme data containing the added prosodic information and the phonological information and synthesizes the corresponding feature parameters; and a synthesized speech generator (4) that converts the synthesized feature parameters into synthesized speech and outputs it, the device being characterized in that it is provided with a phoneme data holding and sending unit (5) that holds, for each predetermined word string, the corresponding phoneme data together with identification data identifying that word string and, when identification data is designated, retrieves the phonological information of the corresponding word string and sends it to the feature parameter synthesis unit (3).

(2) A Japanese text reading method in which a character data string representing a Japanese sentence is input (S1); the input character data string is analyzed and phonological information such as the corresponding phonemes and accent type is output (S2); based on the output phonological information, prosodic information such as exhalation paragraph information, phrase boundary information, accent boundary information, and the accent type of each accent phrase is added (S3); phoneme data containing the added prosodic information and the phonological information is analyzed and the corresponding feature parameters are synthesized (S6); and the synthesized feature parameters are converted into synthesized speech, which is output (S7), the method being characterized in that, for each predetermined word string, the corresponding phoneme data is held together with identification data identifying that word string (S4), and when identification data is designated (S5), the phonological information of the corresponding word string is retrieved and the feature parameters are then synthesized (S6).