JPH07152740A

JPH07152740A - Sentence reading device

Info

Publication number: JPH07152740A
Application number: JP5298687A
Authority: JP
Inventors: Fumio Oyama; 史生大山
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1993-11-29
Filing date: 1993-11-29
Publication date: 1995-06-16

Abstract

PURPOSE: To read a sentence aloud with the reading intended by the user, by registering in advance information on the reading of an arbitrarily specified phrase in the text data to be read and referring to the registered information during language analysis. CONSTITUTION: A CPU 1 controls the whole text-to-speech device and causes speech to be produced through a voice synthesizer 2. A user registration dictionary SRAM 7 holds a user registration dictionary that records the correspondence between specific phrases newly registered by the user and their readings; for each registered phrase, the dictionary stores the reading, the accent type, and whether connection occurs. A work DRAM 8 temporarily stores various data while the CPU 1 is processing. A ROM 9 stores a program and data 9-1 for operating the CPU 1, a language analysis dictionary 9-2, and a speech unit file 9-3 for speech synthesis.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a text-to-speech device that outputs speech based on text data (e-mail, reports, novels, newspaper articles, etc.) created by a document creation device or the like.

[0002]

[Prior Art] In recent years, text-to-speech devices have been developed and put to use that synthesize a speech signal from the text data of a document created with a document creation device (including word processors, text editors, and the like) and output speech reading the document aloud. Because such a device lets the user listen directly to the content of any text, it has a variety of uses.

[0003] Specifically, a text-to-speech device is realized as an expansion board used in combination with various information processing equipment such as a general personal computer (hereinafter, PC) or workstation (hereinafter, WS); as an IC card used in combination with a general PC or WS or a notebook-type small electronic device (some have a built-in CPU and can be used stand-alone); or as a PC or WS alone that executes text-to-speech software.

[0004] Here, the operation of a conventional text-to-speech device is outlined. First, language analysis is performed on the text data of a document created by a document creation device or the like. In language analysis, the text data is divided into words, and the information on each word in the language analysis dictionary is consulted to generate reading information, accent information, and breath points (pauses), which are compiled into a phonetic symbol string containing this information.

[0005] Next, based on the phonetic symbol string obtained by language analysis, the speech unit file is consulted to generate the speech synthesis parameters from which the speech signal is synthesized. The speech synthesis function synthesizes the speech signal by rule according to these parameters and outputs it. The speech signal is amplified and converted into audible speech through an external earphone or a speaker.

[0006]

[Problem to Be Solved by the Invention] As described above, when a conventional text-to-speech device converts text data into a phonetic symbol string containing the information needed for actual reading (reading, accent, presence or absence of pauses), it refers to information in a language analysis dictionary prepared in advance.

[0007] Therefore, when the text data contains a word that is not stored in the dictionary, or a phrase that is uncommon even though it is a combination of registered words, the reading the user expects may not be obtained, which can hinder natural reading and, in turn, understanding of the text.

[0008] For example, when the phrase "SRAM" appears in a sentence, instead of the expected reading "えすらむ" (esuramu), it is read out letter by letter under the alphabet reading rule as "えすあーるえーえむ" (esu-aaru-ee-emu) (the actual spoken output differs slightly from this notation).
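The fallback behavior described above can be illustrated with a minimal sketch. The table `LETTER_READINGS` and the functions `spell_out` and `read_word` are hypothetical names introduced here for illustration; the patent does not specify how the alphabet reading rule is implemented.

```python
# Hypothetical letter-name table (hiragana); only the letters needed here.
LETTER_READINGS = {
    "A": "えー",
    "M": "えむ",
    "O": "おー",
    "R": "あーる",
    "S": "えす",
}


def spell_out(word: str) -> str:
    """Read an alphabetic word letter by letter (the fallback rule)."""
    return "".join(LETTER_READINGS[ch] for ch in word.upper())


def read_word(word: str, registered: dict[str, str]) -> str:
    """Prefer a registered reading; otherwise fall back to spelling out."""
    return registered.get(word, spell_out(word))


print(spell_out("SRAM"))                       # えすあーるえーえむ
print(read_word("SRAM", {"SRAM": "えすらむ"}))  # えすらむ
```

With no registered reading, "SRAM" is spelled out; once a reading is registered, the registered form is used instead.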

[0009] Also, for example, when the personal name "大山史生" (read "おおやまふみお", Ooyama Fumio) appears in a sentence, the language analysis dictionary lists "だいせん" (Daisen) as the first-candidate reading for "大山". Moreover, since "史生" is not registered in the language analysis dictionary, the reading "しじょう" (shijou) is generated according to the reading rule for phrases not in the dictionary.

[0010] As a result, instead of the expected "おおやまふみお", the name is read out as "だいせんしじょう" (Daisen shijou). Faced with such a reading, the user is momentarily distracted, unable to tell what was read, which greatly hinders understanding not only of this phrase but also of the surrounding text.

[0011] The present invention has been made in view of the above circumstances, and its object is to provide a text-to-speech device capable of natural reading of text that contains words not registered in the language analysis dictionary or phrases formed from uncommon combinations of words.

[0012]

[Means for Solving the Problem] The present invention is a text-to-speech device that generates speech synthesis data by language analysis of text data and produces speech, characterized by comprising: registration means for registering a correspondence between a specific phrase in the text data and information on the reading of that phrase; and text-to-speech means for reading text aloud by language analysis that includes the correspondence registered by the registration means.

[0013]

[Operation] With this configuration, information on the reading of an arbitrarily specified phrase in the text data to be read is registered in advance and consulted during language analysis, so that the text can be read aloud with the reading the user intends.

[0014]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram showing the hardware configuration of a text-to-speech device according to an embodiment of the present invention. As shown in FIG. 1, the text-to-speech device comprises a CPU 1, a voice synthesizer 2, a D/A converter 3, a low-pass filter 4, an amplifier 5, an earphone jack 6, a user registration dictionary SRAM 7, a work DRAM 8, a ROM 9, an external I/F 10, a bus 11, and a battery 12. The CPU 1, the voice synthesizer 2, and the D/A converter 3 are assumed to be implemented as a single-chip LSI.

[0015] The CPU 1 controls the entire text-to-speech device. During speech synthesis, it performs language analysis on the input text data, generates speech synthesis parameters, and causes speech to be produced via the voice synthesizer 2.

[0016] Under the control of the CPU 1, the voice synthesizer 2 applies digital signal processing to the speech synthesis parameters and converts them into digital speech data. The D/A converter 3 converts the digital speech data from the voice synthesizer 2 into an analog speech signal.

[0017] The low-pass filter 4 removes unnecessary high-frequency components from the analog speech signal obtained by the D/A converter 3. The amplifier 5 voltage/power-amplifies the analog speech signal from which the high-frequency components have been removed by the low-pass filter 4 and outputs it.

[0018] The earphone jack 6 outputs the signal amplified by the amplifier 5 to the outside. Either an earphone is connected directly to the earphone jack 6, or a speaker with a built-in power amplifier is connected.

[0019] The user registration dictionary SRAM 7 holds the user registration dictionary, which records the correspondence between specific phrases (text data) newly registered by the user and the speech to be read out (reading information). The user registration dictionary SRAM 7 is backed up by the battery 12 so that the contents of the user registration dictionary are retained even when the main power is turned off.

[0020] The work DRAM 8 is used by the CPU 1 to store various data temporarily during processing. The ROM 9 stores the program and data 9-1 for operating the CPU 1, the language analysis dictionary 9-2, and the speech unit file 9-3 for speech synthesis.

[0021] The external I/F 10 is, for example when the text-to-speech device is configured as an IC card, a standard interface conforming to the JEIDA (Japan Electronic Industry Development Association) Ver. 4 standard for connecting an IC card to an information processing device such as a PC or WS.

[0022] The bus 11 connects the units described above. The battery 12 is, for example, a button-type battery that supplies power to each unit of the device when the text-to-speech device is used stand-alone and backs up the contents of the user registration dictionary SRAM 7 when the device is powered off.

[0023] FIG. 2 shows the contents of the user registration dictionary stored in the user registration dictionary SRAM 7. For each registered phrase, the user registration dictionary stores the "reading", the accent type, and information indicating whether connection occurs.

[0024] A registered phrase is a character string that appears in the text data to be read and for which the user specifies a "reading". The "reading" is the reading specified for the registered phrase. A reading can also be divided into several parts (separated by "/" (slash) in the figure) so that an accent type can be specified for each part separately. The accent type indicates, for example, up to which mora (beat, the unit of reading corresponding to a sound) of the specified word's reading the pitch stays high. When the "reading" is divided, an accent type is set for each part. The connection setting indicates whether the registered phrase connects to the word before or after it in the sentence. When words connect, the accent position is moved according to a predetermined rule.

[0025] For example, the phrase "RAM" would be read out as "あーるえーえむ" (aaru-ee-emu) according to the language analysis dictionary prepared in advance, but if it is registered in the user registration dictionary as shown in FIG. 2, it is read out as "らむ" (ramu) with the pitch held constant. The phrase "ROM" is handled in much the same way, except that it can connect to the preceding word: in "マスクROM" (mask ROM), for example, the accent of "マスク" moves so that "ますくろ" is read at a constant pitch and the pitch drops on "む". Contents such as these are registered in the user registration dictionary.
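One way to picture a user registration dictionary entry is the following minimal sketch. The class `UserEntry`, the helper `make_entry`, and all accent-type values are assumptions made for illustration; the actual storage format and the values in FIG. 2 are not given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserEntry:
    phrase: str                    # registered phrase as it appears in the text
    readings: tuple[str, ...]      # reading, split at "/" into segments
    accent_types: tuple[int, ...]  # one accent type per reading segment
    connects: bool                 # may join with the preceding/following word


def make_entry(phrase: str, reading: str, accents: list[int], connects: bool) -> UserEntry:
    """Build an entry, checking that each reading segment has an accent type."""
    segments = tuple(reading.split("/"))
    if len(segments) != len(accents):
        raise ValueError("one accent type is required per reading segment")
    return UserEntry(phrase, segments, tuple(accents), connects)


# Illustrative values only; the actual contents of FIG. 2 are not reproduced here.
user_dictionary = {
    e.phrase: e
    for e in (
        make_entry("RAM", "らむ", [0], connects=False),
        make_entry("ROM", "ろむ", [0], connects=True),
        make_entry("大山史生", "おおやま/ふみお", [2, 1], connects=False),
    )
}
```

Splitting the reading at "/" mirrors the description that each part of a divided reading carries its own accent type.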

[0026] FIG. 3 shows a functional block diagram of the text-to-speech device in this embodiment. As shown in FIG. 3, the device comprises language analysis processing means 20, synthesis parameter generation means 21, speech synthesis means 22, a language analysis dictionary 23, a user registration dictionary 24, a speech unit file 25, and dictionary registration means 26.

[0027] The means 20, 21, 22, and 26 are realized mainly by the CPU 1 and the ROM 9 (the program and data 9-1 stored there for operating the CPU 1); the speech synthesis means 22 additionally includes the voice synthesizer 2 and the D/A converter 3.

[0028] The language analysis dictionary 23 and the speech unit file 25 are realized by recording their contents in the ROM 9, and the user registration dictionary 24 is realized by registering its contents in the user registration dictionary SRAM 7.

[0029] Next, the operation of this embodiment is described. Here, the text-to-speech device is assumed to be configured as an IC card and to perform reading while inserted in an IC card slot, provided on a PC, that conforms to the JEIDA Ver. 4 standard.

[0030] First, the operation of reading text aloud from text data is outlined. Language analysis is performed on the text data of a document created by a document creation device or the like. The text data is, for example, input from the PC to the text-to-speech device via the external I/F 10 and stored in the work DRAM 8.

[0031] The language analysis processing means 20 divides the text data into words and consults the information on each word in the language analysis dictionary 23 and the user registration dictionary 24 to generate reading information, accent information, and breath points (pauses), compiling them into a phonetic symbol string containing this information. The user registration dictionary 24 takes priority over the language analysis dictionary 23: even if a phrase is registered in the language analysis dictionary 23, it is processed according to the user registration dictionary 24 if it is registered there.
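The priority rule can be sketched as a dictionary lookup in which the user registration dictionary is always tried first. The greedy longest-match segmentation below is an assumption chosen for brevity, not the analysis method of the patent, and the names `longest_match` and `analyze` are hypothetical.

```python
def longest_match(text, start, dictionary):
    """Return the longest dictionary key beginning at `start`, or None."""
    for end in range(len(text), start, -1):
        chunk = text[start:end]
        if chunk in dictionary:
            return chunk
    return None


def analyze(text, user_dict, system_dict):
    """Segment `text`, always trying the user dictionary before the system one."""
    result = []
    i = 0
    while i < len(text):
        chunk = longest_match(text, i, user_dict)
        if chunk is not None:
            result.append((chunk, user_dict[chunk], "user"))
        else:
            chunk = longest_match(text, i, system_dict)
            if chunk is not None:
                result.append((chunk, system_dict[chunk], "system"))
            else:
                # Unknown character: passed through with no reading.
                chunk = text[i]
                result.append((chunk, None, "unknown"))
        i += len(chunk)
    return result


system_dict = {"大山": "だいせん"}
user_dict = {"大山史生": "おおやまふみお"}
print(analyze("大山史生", user_dict, system_dict))
```

With the user entry present, the whole name is matched and read as registered; without it, only "大山" is found in the system dictionary and the remaining characters fall through as unknown.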

[0032] Next, based on the phonetic symbol string obtained by language analysis, the synthesis parameter generation means 21 consults the speech unit file 25 to generate the speech synthesis parameters from which the speech signal is synthesized.

[0033] The speech synthesis means 22 synthesizes the speech signal by rule according to the speech synthesis parameters and outputs it. The speech signal passes through the low-pass filter 4, is amplified by the amplifier 5, and is sent to the earphone jack 6, from which it is converted into audible speech through an earphone or a speaker.

[0034] Next, the operation for registering dictionary contents such as those shown in FIG. 2 in the user registration dictionary 24 (registration processing) is described with reference to the flowchart in FIG. 4. While the text-to-speech device is reading as described above, a screen such as that shown in FIG. 5 is displayed. That is, the content of the text being read is displayed, with a cursor at the point currently being read (the cursor moves as reading progresses). The screen also has a message line for messages reporting the execution status, such as "reading" or "suspended", and a function line showing the input operations for executing functions.

[0035] Registration processing is started by pressing the "ESC" key on the PC's keyboard when, during reading, the user notices an unnatural reading and realizes that it arises because the actual reading of some phrase in the text differs from the expected one.

[0036] When the "ESC" key is pressed to instruct that reading be suspended, the CPU 1 of the text-to-speech device performs the series of operations needed to suspend reading and then suspends the reading process (step S1). Meanwhile, the PC displays a screen such as that shown in FIG. 6. On this screen, the cursor is displayed at the character position of the suspension point.

[0037] When the "F2" key is then pressed to select the registration mode for registering in the user registration dictionary, the CPU 1 moves to the registration module (step S2). The PC displays a screen (phrase registration screen), as shown in FIG. 7, for entering the phrase to be registered in the user registration dictionary. The phrase registration screen has areas for displaying the "registered phrase", "registered reading", "accent type", and "connection". The PC displays the operation to perform on the message line in response to requests from the text-to-speech device. The user proceeds with phrase registration by following the messages displayed on the message line.

[0038] On moving to the registration module, the CPU 1 outputs a request to the PC to specify the range of the phrase to be registered (step S3). On the PC, the range is specified by marking the start and end points of the phrase within the text read so far on the screen, using the phrase registration cursor. When the registered phrase is something like a compound word consisting of several words, the words are separated with "/" (slash).

[0039] When the phrase range has been specified, the CPU 1 outputs a request to the PC to set the "reading" of the registered phrase (step S4). On the PC, the "reading" is entered from the keyboard and displayed in the "registered reading" field.

[0040] When the "reading" has been entered, the CPU 1 outputs a request to the PC to set the accent type of the registered phrase (step S5). On the PC, the accent type is entered from the keyboard (selection from a menu may also be offered) and displayed in the "accent type" field. The accent type is a number indicating up to which mora of the registered "reading" the pitch of the voice is kept high. When the registered phrase is a compound word, an "accent type" is specified for each word.
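The accent type as described here can be turned into a per-mora pitch contour with a small sketch. It follows only the simplified description in this document (high up to the given mora, low afterwards) and assumes that an accent type of 0 means a flat contour; `split_morae` and `pitch_pattern` are hypothetical names, and real Japanese prosody rules are more involved.

```python
SMALL_KANA = set("ゃゅょぁぃぅぇぉ")  # combine with the preceding kana into one mora


def split_morae(reading: str) -> list[str]:
    """Split a hiragana reading into morae (small kana join the previous one)."""
    morae: list[str] = []
    for ch in reading:
        if ch in SMALL_KANA and morae:
            morae[-1] += ch
        else:
            morae.append(ch)
    return morae


def pitch_pattern(reading: str, accent_type: int) -> str:
    """Return 'H'/'L' per mora; accent_type 0 is assumed to mean flat."""
    n = len(split_morae(reading))
    if accent_type <= 0 or accent_type >= n:
        return "H" * n
    return "H" * accent_type + "L" * (n - accent_type)


print(pitch_pattern("ますくろむ", 4))  # HHHHL
print(pitch_pattern("らむ", 0))        # HH
```

The first call reproduces the "マスクROM" case from the description, where "ますくろ" stays at one pitch and the pitch drops on "む".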

[0041] When the "accent type" has been entered, the CPU 1 finally outputs a request to the PC to set whether the registered phrase connects to the words before and after it (step S6). On the PC, the user sets whether the registered phrase connects to the preceding and following words. If connection is specified here, then during actual reading the language analysis processing means 20 treats the phrase and its neighbor as a compound word when predetermined conditions are met, and the accent is moved accordingly.

[0042] When all items have been set and registration is executed by pressing the "F2" key, the text-to-speech device reads out the registered phrase based on the information that was set (step S7). This lets the user confirm whether the registered content ("registered reading" and "accent type") is correct.

[0043] Here, the CPU 1 outputs a request to select whether to continue or end the registration mode (step S8). If the registered content is judged to need resetting, or if another phrase is to be registered next, end is not selected and the registration mode continues.

[0044] When the end of the registration mode is instructed, the CPU 1 stores the registered content in the user registration dictionary SRAM 7 in a predetermined format and ends the registration module (step S9). Then, in response to a read-aloud instruction, reading resumes from the interruption position or from a designated position (step S10).
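The registration sequence of steps S3 to S9 can be summarized in a short sketch. The callbacks `prompt` and `preview`, the `Draft` class, and the request keys are hypothetical stand-ins for the exchange between the CPU 1 and the PC. The sketch stores only the draft that is current when end is selected, which is one possible reading of the flow; the text does not spell out what happens to earlier drafts.

```python
from dataclasses import dataclass, field


@dataclass
class Draft:
    phrase: str = ""
    reading: str = ""
    accent_types: list[int] = field(default_factory=list)
    connects: bool = False


def run_registration(prompt, preview, user_dict):
    """Walk through steps S3-S9; `prompt` and `preview` are hypothetical callbacks."""
    while True:
        draft = Draft()
        draft.phrase = prompt("range")          # S3: phrase range
        draft.reading = prompt("reading")       # S4: reading
        draft.accent_types = prompt("accent")   # S5: accent type(s)
        draft.connects = prompt("connection")   # S6: connection yes/no
        preview(draft)                          # S7: read the phrase aloud
        if prompt("finish"):                    # S8: continue or end
            user_dict[draft.phrase] = draft     # S9: store and leave
            return draft
```

Driving it with scripted answers, for example an iterator behind `prompt`, shows the loop repeating until end is selected, after which the confirmed entry is available in `user_dict`.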

[0045] In the embodiment above, the text-to-speech device was described as an IC card with the CPU 1 on board, inserted in an IC card slot provided on a PC, but other hardware configurations may be used. That is, it may be an expansion board used in combination with various information processing equipment such as a PC or WS; an IC card (without a built-in CPU) used in combination with a general PC or WS or a notebook-type small electronic device; or a PC or WS alone that executes text-to-speech software.

[0046] The user registration dictionary 24 was described as being stored in the battery-backed user registration dictionary SRAM 7 inside the IC card, but it may instead be stored in flash memory mounted on the expansion board or IC card, or in battery-backed SRAM inside the PC or WS, so that content registered once can be consulted without re-registration when the device is used again after the power has been turned off.

[0047] In this way, the user can freely set how a specific phrase in the text is to be read, and the specific phrase ("registered phrase") is held in the user registration dictionary in association with its "reading", "accent type", and "connection" setting. As a result, text containing words not registered in the language analysis dictionary, or phrases formed from uncommon combinations of words, can also be read naturally.

[0048] Furthermore, because the user registration dictionary is stored in a non-volatile storage device (the user registration dictionary SRAM 7), its contents are not lost when the main power of the device is turned off, so it can be used effectively without re-registration.

[0049]

[Effects of the Invention] As described above, according to the present invention, information on the reading of any phrase in the text to be read can be stored in association with that phrase and used when the text is read aloud, so that text containing words not registered in the language analysis dictionary, or phrases formed from uncommon combinations of words, can be read naturally.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of a text-to-speech device according to an embodiment of the present invention.

FIG. 2 is a diagram showing the contents of the user registration dictionary stored in the user registration dictionary SRAM 7 in this embodiment.

FIG. 3 is a functional block diagram showing the configuration of the text-to-speech device in this embodiment.

FIG. 4 is a flowchart showing the procedure for registering in the user registration dictionary in this embodiment.

FIG. 5 is a diagram showing an example of the screen displayed during reading in this embodiment.

FIG. 6 is a diagram showing an example of the screen displayed when reading is suspended in this embodiment.

FIG. 7 is a diagram showing an example of the screen displayed when registering in the user registration dictionary in this embodiment.

[Explanation of symbols]

1 ... CPU, 2 ... voice synthesizer, 3 ... D/A converter, 4 ... low-pass filter, 5 ... amplifier, 6 ... earphone jack, 7 ... user registration dictionary SRAM, 8 ... work DRAM, 9 ... ROM, 10 ... external I/F, 11 ... bus, 12 ... battery, 20 ... language analysis processing means, 21 ... synthesis parameter generation means, 22 ... speech synthesis means, 23 ... language analysis dictionary, 24 ... user registration dictionary, 25 ... speech unit file, 26 ... dictionary registration means.

Claims

[Claims]

1. A text-to-speech device that generates speech synthesis data by language analysis of text data and produces speech, comprising: registration means for registering a correspondence between a specific phrase in the text data and information on the reading of that phrase; and text-to-speech means for reading text aloud by language analysis that includes the correspondence registered by the registration means.

2. The text-to-speech device according to claim 1, wherein the content registered by the registration means is stored in a non-volatile storage means.

3. The text-to-speech device according to claim 1, which is configured as an IC card type.

4. The text-to-speech device according to claim 1, wherein the reading information registered by the registration means includes at least one of the reading of the specific phrase, its accent type, and information on the presence or absence of connection with the preceding and following words.