JP2006030326A - Speech synthesizer

Info

Publication number: JP2006030326A
Application number: JP2004205362A
Authority: JP
Inventors: Kenji Nagamatsu; 健司永松
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2004-07-13
Filing date: 2004-07-13
Publication date: 2006-02-02

Abstract

PROBLEM TO BE SOLVED: To make the readings of words and phrases fit the document more closely. The correct reading of a word can depend on whether another word or phrase is nearby, or on the field or category of the document that contains it. By storing information on this relationship together with the correction, corrected readings, accents, and intonation are reflected more appropriately.

SOLUTION: A speech synthesizer or speech data generating device in which an editing means, which edits the intermediate language of a document to be read aloud, accepts a correction instruction entered against the intermediate language shown by a display means and stores information on that instruction in a storage means. The correction instruction includes at least a designation of the word or phrase to be corrected and a designation of the condition under which the correction is applied.

COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to a speech data creation apparatus, and in particular to an intermediate language editing system for editing the speech data, called an intermediate language, that serves as input to a speech synthesizer.

Speech synthesizers were once used only for simply reading out information such as bank balance inquiries and stock prices. Recently they have also come to be installed in in-vehicle information terminals, typified by car navigation systems, and in mobile phones. In these information-providing devices, the speech synthesizer reads out information such as current news and announcements from service companies. In such applications it is important that the information provider correct, in advance, the intermediate language data fed to the speech synthesizer, so that misreadings and unnatural intonation are eliminated as far as possible.

For this purpose, there are techniques that use devices and tools for editing intermediate language data so that various reading styles can be specified. For example, when the text to be read contains a word with more than one reading, one technique lets the editor specify the correct reading even while the speech synthesizer is reading the text aloud, and stores the specified reading in association with the original document data (see, for example, Patent Document 1).

Patent Document 1: Japanese Unexamined Patent Publication No. 7-134597

The technique of Patent Document 1 uses data called a co-occurrence dictionary when converting text to reading symbols, so that the reading of a word with multiple readings changes according to which words appear near it. In addition, when the output reading is wrong, the technique stores the correct reading for that word and converts the word to the corrected reading when it appears in another document.

However, because this technique cannot store the conditions under which a correction holds, a word with a stored correction is always converted to the same reading. Moreover, unlike the reading of a word, intonation is not determined by the word alone; it is known to vary with the presence and characteristics of neighboring words and phrases. The technique of Patent Document 1 includes no mechanism that handles this context dependence of intonation properly. In the present invention, when the correct reading of a word depends on whether some other word or phrase is nearby, or on the field or category of the document containing the word, information on that relationship is stored together with the correction, so that corrections to reading, accent, and intonation are reflected more appropriately.

A speech data generation apparatus, or speech synthesizer, has a generation means that generates an intermediate language from a document to be read aloud that contains kanji, and an editing means that edits the intermediate language. In particular, the editing means accepts a correction instruction entered against the intermediate language displayed on a display means and stores information on the instruction in a storage means. The correction instruction includes at least a designation of the word or phrase to be corrected and a designation of the condition under which the correction is applied. A user interface for the editing means is also disclosed.

According to the present invention, intermediate language data that has been adjusted and edited so that a word or phrase has the correct reading and accent and an easy-to-follow intonation is recorded as registered phrase data, together with the conditions under which that data can be reused. When the data can be reused in the document currently being edited, it is inserted automatically. This makes it possible to handle context-dependent differences that conventional speech data creation apparatuses could not reflect correctly.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.
FIG. 1 illustrates the basic configuration of the speech data creation apparatus of the present invention.
As shown in the figure, the apparatus comprises: a document data recording device 0010 in which the mixed kanji-kana text to be read aloud is recorded; a document data input means 0020 that inputs from it the document data to be edited; an intermediate language generation means 0030 that assigns reading and accent information to each word of the input document data; an intermediate language editing means 0060 that provides functions for editing the generated intermediate language data; an intermediate language display device 0050 that displays the generated intermediate language data and the state of the editing work; a speech synthesizer 0070 that converts the generated intermediate language data into synthesized speech for audition; a speech reproduction device 0080 that plays back the synthesized speech; an editing instruction input device 0090 through which the user enters editing instructions during editing; and an intermediate language output device 0100 that outputs the edited intermediate language.

The apparatus further comprises: a registered phrase designation means 0110 that, as designated by the user, creates registered phrase information associating the correct reading, accent, and intonation of a word or phrase with another word or phrase or with document content category information; a registered phrase storage means 0130 that stores the registered phrase in a recording device; a registered phrase recording device 0150 that records and holds registered phrases; a registered phrase search means 0140 that searches the registered phrase recording device for registered phrase information on a given word or phrase; a registered phrase insertion means 0120 that, based on the retrieved registered phrase information, inserts the correct reading, accent, and intonation of the original word or phrase into the intermediate language data; and a reading/accent candidate generation means 0040 that, based on the linguistic information output from the intermediate language generation means and the registered phrase insertion means, generates patterns that are candidate readings, accents, and intonations for individual words and phrases in the intermediate language data.

In FIG. 1, the document data recording device 0010 may instead be placed on a server, with the document data input means 0020 acquiring document data over a network. Alternatively, the document data recording device 0010, the document data input means 0020, and the intermediate language generation means 0030 may be placed on a server, with the intermediate language editing means 0060 acquiring intermediate language data over the network. The registered phrase recording device 0150 may also be placed on a server, with the registered phrase storage means 0130 and the registered phrase search means 0140 exchanging registered phrase information with it over the network.

Next, the specific devices used to implement each element of the basic configuration in FIG. 1 are described.
The document data recording device 0010 holds the mixed kanji-kana text that is to be converted into the intermediate language and edited for reading, accent, intonation, and so on; it is a recording medium such as a hard disk or CD-ROM. The document data input means 0020 inputs document data from the document data recording device 0010 and is a program such as a hard disk device driver or CD-ROM device driver. When the document data recording device 0010 is on a server, the document data input means 0020 is a program, such as a network device driver, that receives data over the network. The intermediate language generation means 0030 performs linguistic analysis on document data input as mixed kanji-kana text. At a minimum it performs morphological analysis, which divides the input text into words and attaches reading and accent information; preferably it also performs syntactic and semantic analysis. Morphological analysis divides a mixed kanji-kana sentence into individual words. It is implemented with techniques such as the minimum-cost method or the minimum-bunsetsu-count (fewest phrases) method, using connection data that defines which parts of speech can follow one another and cost data that defines the cost of each part of speech.
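
The minimum-cost method mentioned above can be sketched as a dynamic-programming search over all dictionary segmentations. The following is a minimal illustration only, not the implementation used by the apparatus: the lexicon entries, part-of-speech labels, cost values, and the function name segment_min_cost are all assumptions made for the example.

```python
from math import inf

# Hypothetical toy lexicon: surface form -> list of (part_of_speech, word_cost).
LEXICON = {
    "本日": [("noun", 2)],
    "本": [("noun", 3)],
    "日": [("noun", 3)],
    "の": [("particle", 1)],
    "株式": [("noun", 2)],
    "株": [("noun", 3)],
    "式": [("noun", 4)],
}

# Hypothetical connection costs between adjacent parts of speech.
# Pairs that are missing are treated as not connectable.
CONNECTION_COST = {
    ("BOS", "noun"): 0,
    ("BOS", "particle"): 5,
    ("noun", "noun"): 2,
    ("noun", "particle"): 0,
    ("particle", "noun"): 0,
    ("particle", "particle"): 4,
}


def segment_min_cost(text):
    """Return (total_cost, [(surface, pos), ...]) for the cheapest segmentation."""
    # best[(end, pos)] = (cost, path) of the cheapest path that ends at
    # character offset `end` with a word tagged `pos`.
    best = {(0, "BOS"): (0, [])}
    for start in range(len(text)):
        # Snapshot the states ending at `start`; later updates only touch larger offsets.
        states = [(key, value) for key, value in best.items() if key[0] == start]
        for (_, prev_pos), (cost, path) in states:
            for end in range(start + 1, len(text) + 1):
                surface = text[start:end]
                for pos, word_cost in LEXICON.get(surface, []):
                    step = CONNECTION_COST.get((prev_pos, pos), inf) + word_cost
                    key = (end, pos)
                    if cost + step < best.get(key, (inf, None))[0]:
                        best[key] = (cost + step, path + [(surface, pos)])
    finals = [value for key, value in best.items() if key[0] == len(text)]
    if not finals:
        raise ValueError("no segmentation found")
    return min(finals, key=lambda value: value[0])


cost, words = segment_min_cost("本日の株式")
# With the toy costs above, the cheapest path is 本日 / の / 株式 with total cost 5.
```

A real analyzer would attach reading and accent information to each lexicon entry as well; that is omitted here to keep the search itself visible.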

Syntactic analysis uses grammar data that defines the structural relations between parts of speech, together with a parsing technique such as an LR parser or CYK parsing, to output the dependency relations of the input sentence. The reading/accent candidate generation means 0040 takes the linguistic analysis data output from the intermediate language generation means 0030 and the registered phrase information output from the registered phrase search means 0120, generates change candidate list data for each word or phrase that has multiple candidate readings, accents, or intonations, and outputs the list to the intermediate language editing means. How the reading/accent candidate generation means 0040 is implemented is described in the detailed embodiments below. The intermediate language display device 0050 displays to the user the intermediate language data output from the intermediate language editing means 0060; a character and image display such as a CRT can be used.

The intermediate language editing means 0060 uses the change candidate list data output from the reading/accent candidate generation means 0040 to display the intermediate language data output from the intermediate language generation means 0030 on the intermediate language display device 0050. It also receives the user's editing instructions from the editing instruction input device 0090 and provides intermediate language editing functions such as changing the reading or accent of words and phrases in the intermediate language data and adjusting intonation. How the intermediate language editing means 0060 is implemented is described in the detailed embodiments below.

The speech synthesizer 0070 takes as input the intermediate language data output from the intermediate language generation means 0030 or edited by the intermediate language editing means 0060 and converts it into speech waveform data; it can be implemented with conventional speech synthesis techniques.
The speech reproduction device 0080 is a speaker that plays the speech waveform data produced by the speech synthesizer 0070 as actual sound. The editing instruction input device 0090 is the device through which the user passes editing instructions to the intermediate language editing means 0060, and can be implemented with an information input device such as a mouse, keyboard, or tablet. The intermediate language output device 0100 outputs the intermediate language data that the intermediate language editing means 0060 has adjusted to the correct reading and accent and an appropriate intonation, either to an external recording device or over a network to another system. Output to an external recording device can be implemented as a program such as a hard disk device driver, and transmission over a network as a network transfer program.

The registered phrase designation means 0110 lets the user designate intermediate language data that has been adjusted with the intermediate language editing means 0060 to the correct reading and accent and an appropriate intonation, or some of the words and phrases in that data, in association with information such as neighboring words and phrases or the content or category of the document. The associated data is stored in the registered phrase recording device 0150 via the registered phrase storage means 0130. When the intermediate language display device 0050 displays intermediate language data, the intermediate language editing means 0060 shows, for each word or phrase in the intermediate language data output by the intermediate language generation means 0030, the change candidate list data output by the reading/accent candidate generation means 0040 and the registered phrase information retrieved from the registered phrase recording device 0150 via the registered phrase search means 0140. The registered phrase insertion means 0120 determines where in the intermediate language data the retrieved registered phrase data is inserted and displayed. The registered phrase designation means 0110, the registered phrase storage means 0130, the registered phrase search means 0140, and the registered phrase insertion means 0120 are each implemented by having the processing unit of the apparatus load a computer program; the detailed implementation is described in the examples below.
Based on the basic configuration described above, more specific examples are described below.

As a first embodiment, the case is described in which the present invention adopts the basic configuration of FIG. 1 and is implemented as an intermediate language editing system with basic GUI editing functions.
FIG. 2 shows an example of the editing screen layout in Example 1. The list 1010 of document data to be edited appears on the left of the screen, the content 1020 of the document currently being edited at the upper right, and the intermediate language data 1030 of that document at the lower right.

FIG. 3 is a flowchart of the processing performed by each device and means of the intermediate language editing system of Example 1. When the system starts (step 2010), it reads the document data to be edited from the document data recording device 0010 and lists the document titles in the document data list area 1010 of the intermediate language display device 0050 (step 2020). The system then enters a loop waiting for the user to designate a document to edit (step 2030). When the user selects document data from the document data list via the editing instruction input device 0090, the system displays the content of that document (step 2040): it reads the content of the designated document from the document data recording device 0010 and shows it in the document content area 1020. Next, the content of the designated document is input to the intermediate language generation means 0030 and converted into intermediate language data (step 2050). As described above, the intermediate language generation means 0030 performs this conversion using linguistic analysis techniques such as morphological analysis and syntactic analysis, for which conventional methods can be used.

FIG. 4 shows the result of the morphological analysis performed by the intermediate language generation means 0030. The analysis divides the input text into words and retrieves candidate readings and accents for each word from dictionary data. In the example of FIG. 4, two candidates were found for the 18th morpheme 最高値 ("highest value"): the first candidate サイコ’ーチ (saikōchi) and the second candidate サイタカ’ネ (saitakane).

FIG. 5 shows an example result when the intermediate language generation means 0030 also performs syntactic analysis. In this example the dependency relations between bunsetsu (phrases) have been analyzed; for instance, the phrases 本日の ("today's") and 株式 ("stock") both modify the phrase 市況は ("market conditions"). The intermediate language generation means 0030 may also look up, in a co-occurrence data dictionary, data called co-occurrence data, which indicates for each reading of a word in the input text which nearby words make that reading likely. FIG. 6 shows an example of co-occurrence data. There, the morpheme 最高値 tends to take the reading and accent サイコ’ーチ when words such as 気温 ("temperature") or 実験 ("experiment") are nearby, and サイタカ’ネ when words such as 株価 ("stock price"), 株式 ("stock"), or 終値 ("closing price") are nearby.
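
As an illustration of how co-occurrence data like that of FIG. 6 could be used to rank readings, the sketch below counts how many cue words for each reading occur in the surrounding text. The scoring rule (a simple count, with ties keeping dictionary order), the dictionary layout, and the name choose_reading are assumptions made for this example; the source does not specify how the cues are weighed.

```python
# Co-occurrence data modeled on the FIG. 6 example: for each reading of a word,
# the nearby words that make that reading likely.
COOCCURRENCE = {
    "最高値": {
        "サイコ’ーチ": {"気温", "実験"},
        "サイタカ’ネ": {"株価", "株式", "終値"},
    }
}


def choose_reading(word, candidates, context_words, cooccurrence=COOCCURRENCE):
    """Return the candidate readings, best first.

    A reading scores one point for every cue word found in the context.
    Readings with equal scores keep their original (dictionary) order.
    """
    cues = cooccurrence.get(word, {})
    context = set(context_words)
    scored = [
        (-len(cues.get(reading, set()) & context), position, reading)
        for position, reading in enumerate(candidates)
    ]
    return [reading for _, _, reading in sorted(scored)]


ranked = choose_reading(
    "最高値",
    ["サイコ’ーチ", "サイタカ’ネ"],
    ["本日", "の", "株式", "市況", "は"],
)
# 株式 is a cue for サイタカ’ネ, so that reading moves to the front.
```

Without any cue word in the context, the function returns the candidates unchanged, which corresponds to falling back to the dictionary's first candidate.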

The intermediate language generation means 0030 finally combines analysis results such as those in FIGS. 4 to 6 and generates intermediate language data using the most plausible reading and accent. FIGS. 7 and 8 show examples of intermediate language data: FIG. 7 is the case where co-occurrence data was not used, and FIG. 8 is the case where suitable co-occurrence data existed and the correct reading and accent could be assigned to 最高値.
The intermediate language data and linguistic analysis data generated in this way by the intermediate language generation means 0030 are passed to the reading/accent candidate generation means 0040, which performs reading/accent candidate generation (step 2060), and to the registered phrase search means 0140, which performs the registered phrase search (step 2070). The candidate generation process (step 2060) uses the intermediate language data (FIGS. 7 and 8) and the linguistic analysis data (FIGS. 4, 5, and 6) output from the intermediate language generation means 0030 to extract the words that have multiple readings or accents and to build, for each such word, a list of those readings and accents.

In the document data example here, only the word 最高値 has multiple reading/accent candidates, as shown in FIG. 9. FIG. 9 is the candidate list obtained when the intermediate language generation means 0030 analyzes the text without the co-occurrence data of FIG. 6: サイコ’ーチ is the first candidate and サイタカ’ネ is the second. If the intermediate language generation means 0030 analyzes the text using the co-occurrence data of FIG. 6, the order of the first and second candidates in FIG. 9 is reversed.
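
The candidate generation step (step 2060) can be pictured as a filter over the morphological analysis result that keeps only ambiguous words. The sketch below is an assumption-laden illustration: the tuple layout of the morphemes, the neighboring morphemes at positions 17 and 19, the preferred argument, and the name build_candidate_list are hypothetical and not taken from the figures.

```python
def build_candidate_list(morphemes, preferred=None):
    """Build change-candidate lists for words with more than one reading.

    morphemes: iterable of (index, surface, readings), readings in dictionary order.
    preferred: optional mapping surface -> reading chosen by context analysis;
               that reading is moved to the front of the list.
    """
    preferred = preferred or {}
    result = []
    for index, surface, readings in morphemes:
        if len(readings) < 2:
            continue  # unambiguous words need no candidate list
        ordered = list(readings)
        top = preferred.get(surface)
        if top in ordered:
            ordered.remove(top)
            ordered.insert(0, top)
        result.append({"index": index, "surface": surface, "candidates": ordered})
    return result


# Illustrative fragment of an analysis result; only the 18th morpheme is ambiguous.
morphemes = [
    (17, "で", ["デ"]),
    (18, "最高値", ["サイコ’ーチ", "サイタカ’ネ"]),
    (19, "を", ["オ"]),
]

without_context = build_candidate_list(morphemes)
with_context = build_candidate_list(morphemes, preferred={"最高値": "サイタカ’ネ"})
```

Here without_context keeps サイコ’ーチ first, while with_context lists サイタカ’ネ first, mirroring the swap of the first and second candidates described for FIG. 9.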

Meanwhile, the intermediate language data and linguistic analysis data output from the intermediate language generation means 0030 are input to the registered phrase search means 0140, which searches the registered phrase recording device 0150 for registered phrase data corresponding to the words and phrases in the text (step 2070). In the present invention, registered phrase data is data that stores, together with a word or phrase, the result of adjusting it to the correct reading and accent and an easy-to-follow intonation during earlier intermediate language editing work.

FIG. 10 shows examples of registered phrase data stored in the registered phrase recording device 0150. The first example means that the adjusted reading/accent/intonation data 9040 for the word or phrase 9010, 終値で ("at the closing price"), is ＃８０オワリ’ネデ｜１, and that this adjusted data can be reused when the related word or phrase 今年 ("this year") appears immediately after (+1) 終値で. Items such as ＃８０ and ｜１ are parameters that adjust ease of listening, such as intonation. Because the prosody and intonation adjustment parameters are written directly into the intermediate language text, saving the adjusted intermediate language data as it is also preserves the adjusted prosody for use in later processing. The second example in FIG. 10 means that the adjusted data サイタカ’ネ for 最高値 can be reused if the word or phrase 株価 or 終値 appears within five words before or after the original word.
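
One way to picture a registered phrase record and its reuse condition is the small data class below. It is a sketch under stated assumptions: the field names (phrase, adjusted, related, offset, window), the applies method, and the sample word sequence are hypothetical; only the two entries themselves follow the FIG. 10 description.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RegisteredPhrase:
    phrase: str                    # word or phrase the entry applies to
    adjusted: str                  # adjusted reading/accent/intonation data
    related: frozenset             # related words or phrases that permit reuse
    offset: Optional[int] = None   # exact relative position, e.g. +1 = immediately after
    window: Optional[int] = None   # or: within this many words before or after

    def applies(self, words: Sequence[str], i: int) -> bool:
        """True if the entry may be reused for the word at position i."""
        if words[i] != self.phrase:
            return False
        if self.offset is not None:
            j = i + self.offset
            return 0 <= j < len(words) and words[j] in self.related
        if self.window is not None:
            lo = max(0, i - self.window)
            hi = min(len(words), i + self.window + 1)
            return any(words[j] in self.related for j in range(lo, hi) if j != i)
        return False


# The two entries described for FIG. 10.
OWARINE_DE = RegisteredPhrase("終値で", "＃８０オワリ’ネデ｜１", frozenset({"今年"}), offset=1)
SAITAKANE = RegisteredPhrase("最高値", "サイタカ’ネ", frozenset({"株価", "終値"}), window=5)

# Illustrative word sequence (not taken from the figures).
words = ["株価", "は", "終値で", "今年", "の", "最高値", "を", "更新"]
first_applies = OWARINE_DE.applies(words, 2)   # 今年 directly follows 終値で
second_applies = SAITAKANE.applies(words, 5)   # 株価 is five words before 最高値
```

A condition on the document's field or category, which the description also allows, could be added as a further optional field checked in the same method.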

The following description assumes that the registered phrase search (step 2070) retrieved the two registered phrase data items of FIG. 10 from the registered phrase recording device 0150. The retrieved registered phrase data (FIG. 10) is passed to the registered phrase insertion means 0120, which performs registered phrase insertion (step 2090). This process replaces each part of the intermediate language data output by intermediate language generation (step 2050) that matches retrieved registered phrase data with that data's adjusted reading/accent/intonation data. FIG. 11 shows the result of applying registered phrase insertion to the intermediate language data of FIG. 8 with the registered phrase data of FIG. 10.
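
The replacement performed in step 2090 can be sketched as a single pass over the intermediate language units. This is an illustration only: the representation of the intermediate language as (surface, reading) pairs, the dictionary keys of the entries, the placeholder readings of the surrounding words, and the name insert_registered_phrases are assumptions, not the format shown in FIGS. 8 and 11.

```python
def insert_registered_phrases(units, entries):
    """Replace readings with stored adjusted data where an entry's condition holds.

    units:   list of (surface, reading) pairs in text order.
    entries: list of dicts with 'phrase', 'adjusted', 'related' (set of words)
             and 'offsets' (relative positions where a related word may appear).
    """
    surfaces = [surface for surface, _ in units]
    result = []
    for i, (surface, reading) in enumerate(units):
        for entry in entries:
            if surface != entry["phrase"]:
                continue
            condition_met = any(
                0 <= i + d < len(surfaces) and surfaces[i + d] in entry["related"]
                for d in entry["offsets"]
            )
            if condition_met:
                reading = entry["adjusted"]  # reuse the previously adjusted data
                break
        result.append((surface, reading))
    return result


entries = [
    {"phrase": "終値で", "adjusted": "＃８０オワリ’ネデ｜１",
     "related": {"今年"}, "offsets": [1]},
    {"phrase": "最高値", "adjusted": "サイタカ’ネ",
     "related": {"株価", "終値"}, "offsets": [d for d in range(-5, 6) if d != 0]},
]

# Placeholder readings without accent marks, for illustration only.
units = [("株価", "カブカ"), ("は", "ワ"), ("終値で", "オワリネデ"),
         ("今年", "コトシ"), ("の", "ノ"), ("最高値", "サイコ’ーチ"), ("を", "オ")]

updated = insert_registered_phrases(units, entries)
```

In updated, the readings of 終値で and 最高値 are replaced by the stored adjusted data, and every other unit is passed through unchanged.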

Next, the intermediate language data output from the registered phrase insertion means 0120 (FIG. 11) and the reading/accent candidate list output from the reading/accent candidate generation means 0040 (FIG. 9) are passed to the intermediate language editing means 0060, which performs the intermediate language display process for the edited document (step 2080). This process displays the intermediate language data (FIG. 11) and the reading/accent candidate list (FIG. 9) on the intermediate language display device 0050 in a layout that is easy for the user to read and edit. FIG. 12 shows an example of intermediate language data displayed by this process. In this layout, change candidates are shown in menu form for each unit called an accent phrase; in this example, two change candidates are shown for each of the two phrases 終値で and 最高値を. As a result of registered phrase insertion (step 2090), the recorded adjusted reading/accent/intonation data ＃８０オワリ’ネデ｜１ has been inserted as the first candidate for 終値で. For 最高値を, the second candidate in the reading/accent candidate list output from candidate generation (step 2060), which corresponds to the サイタカ’ネ matching the retrieved registered phrase data, is shown as already selected.

When the intermediate language data has thus been displayed on the intermediate language display device 0050 by the intermediate language display process for the edited document (step 2080), the intermediate language editing means 0060 enters a loop that accepts editing instructions from the user (step 2100). When the user issues an intermediate language editing instruction via the editing instruction input device 0090, the intermediate language editing means 0060 executes the instructed editing process (step 2110).

When the editing process has been executed, the intermediate language editing means 0060 returns to the loop that accepts editing instructions from the user (step 2120, step 2100). If the editing instruction input by the user is a request to finish editing the document data being edited (step 2120), the system performs the intermediate language output process for the editing result (step 2130) and then returns to the loop that waits for the next document data to be specified for editing (step 2020). In the intermediate language output process (step 2130), the intermediate language data finally produced by executing the user's editing instructions (FIG. 11, for example) is saved to an external recording medium, or transmitted to a server device or the like over a network, by the intermediate language output device 0100.
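The control flow of steps 2100 through 2130 can be pictured with the following Python sketch. It is only an illustration under assumed data shapes: the readings are held in a dictionary keyed by phrase, instructions are dictionaries with a hypothetical `op` field, and the function name `run_edit_loop` is not from the specification. The example values are the reading change for 「最高値を」 used elsewhere in this description.

```python
def run_edit_loop(readings, instructions):
    """Apply editing instructions until an end-of-editing request arrives."""
    edited = dict(readings)  # leave the caller's data untouched
    for instruction in instructions:               # step 2100: instruction received
        if instruction["op"] == "finish":          # step 2120: end of editing?
            break
        if instruction["op"] == "change_reading":  # step 2110: execute the edit
            edited[instruction["target"]] = instruction["reading"]
    return edited  # step 2130 would output this result


readings = {"最高値を": "サイタカ’ネオ"}
instructions = [
    {"op": "change_reading", "target": "最高値を", "reading": "サイコ’ーチオ"},
    {"op": "finish"},
    {"op": "change_reading", "target": "最高値を", "reading": "IGNORED"},
]
final = run_edit_loop(readings, instructions)
```

Instructions that arrive after the end-of-editing request are not applied, which mirrors the return to the document selection loop (step 2020).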

Next, the case where the user gives an individual editing instruction via the editing instruction input device 0090 is described. Broadly, there are two ways to input an editing instruction: the user may first specify the word or phrase to be edited and then specify the editing operation to apply to it, or conversely may first specify the editing operation and then specify the word or phrase it applies to. The editing instruction input device 0090 of the present invention can support either method, because it passes the target word or phrase and the editing operation together to the intermediate language editing means 0060 after the instruction has been fully specified. Besides specifying the target and the operation separately, the user may also specify both at the same time using different input devices, such as a mouse and a keyboard or a mouse and voice; this case is handled in the same way. The intermediate language editing means 0060 therefore need not assume any particular one of these methods.

As an example of an editing operation, changing a reading and accent is described first.
Suppose that, in the intermediate language layout displayed as in FIG. 12, the user specifies 「サイタカ’ネオ」 and gives an editing instruction to change its reading and accent to 「サイコ’ーチオ」.

Assuming, for example, that the editing instruction input device 0090 is a mouse, this editing instruction can be input to the system by clicking the currently active 「サイタカ’ネオ」 to specify the target word or phrase, and then clicking 「サイコ’ーチオ」 to specify the editing instruction of changing to another reading/accent candidate. On receiving this editing instruction, the intermediate language editing means 0060 changes the reading/accent data for 「最高値を」 from the current 「サイタカ’ネオ」 to 「サイコ’ーチオ」 in the step that executes the instructed editing process (step 2110). The intermediate language editing means 0060 also updates the intermediate language layout on the intermediate language display device 0050 to reflect this result.

Next, the editing instruction called registered phrase designation is described.
The registered phrase designation process is executed when the user gives the editing instruction of registered phrase designation to the intermediate language editing means 0060 via the editing instruction input device 0090. This process newly records and stores registered phrase data such as that shown in FIG. 10; the stored registered phrase data is later retrieved by the registered phrase search means 0140 and reused when other document data is edited.

FIG. 13 shows a flowchart of the registered phrase designation process.
FIG. 13 is the flowchart for the case where the editing instruction is specified on the editing instruction input device 0090 by specifying the editing operation first and then the word or phrase to be edited; as noted above, the order may be reversed, or both may be specified at the same time. The following description assumes that neither the co-occurrence data shown in FIG. 6 nor the registered phrase data shown in FIG. 10 has been stored in advance. In that case, the intermediate language data output for the example text displayed in the document content display area 1020 of FIG. 2 is the data containing errors shown in FIG. 8, and the intermediate language display device 0050 displays the intermediate language data layout shown in FIG. 14. Suppose that the user then performs intermediate language editing on this data, selecting the correct reading and accent for 「最高値を」 and specifying an easy-to-hear intonation for 「終値で」, so that the data is finally edited into the intermediate language data shown in FIG. 13.

In this state, when the user first instructs a registered phrase designation operation via the editing instruction input device 0090, the intermediate language editing means 0060 activates the registered phrase designation means 0110 (step 12010). The registered phrase designation means 0110 then executes the registered word/phrase designation process (step 12020) and accepts input specifying the word or phrase to be reused, which forms the body of the registered phrase data. This input can be made, for example, by clicking with the mouse on the position of the word or phrase displayed on the intermediate language display device 0050, or by moving the cursor displayed within the intermediate language data on the intermediate language display device 0050 to that position with the keyboard or the like and then pressing a special key such as the Return key. Here it is assumed that the adjusted word 「＃８０オワリネ’デ｜１」 in FIG. 12 has been selected by this registered word/phrase designation process (step 12020).

Next, the registered phrase designation means 0110 executes the related word/phrase designation process (step 12030) and accepts input specifying the related word or phrase that serves as context data for the adjustment of the previously input registered word/phrase 「＃８０オワリネ’デ｜１」. This input can likewise be made with the mouse or keyboard. However, because more than one word or phrase is then shown as selected on the intermediate language display device 0050, it is effective to change the display, for example by changing the color, character size, or typeface of the selected words or phrases, so that the registered word/phrase and the related word/phrase can be clearly distinguished.

Next, the registered phrase designation means 0110 executes the related word position adjustment input process (step 12040), which obtains information on where the related word/phrase is positioned relative to the registered word/phrase. For this input, the system may use its own automatic count of the number of morphemes between the two words or phrases, or it may present that value to the user and let the user enter an adjusted value. The registered phrase designation means 0110 then executes the registered phrase data generation process (step 12050) based on the registered word/phrase data, related word/phrase data, and related word position data input in steps 12020 through 12040. This process converts the three kinds of information input in the preceding steps into a data format that can be stored in the registered phrase recording device 0150; specifically, a structure such as a database record having the four field values shown in FIG. 10 can be used.
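The automatic morpheme count of step 12040 and the four-field record of step 12050 might look like the following Python sketch. The field names, the unsigned encoding of the related word position, and the example segmentation are assumptions made for illustration; FIG. 10 defines only that there are four fields (word/phrase, related word/phrase, position of the related word, and adjusted reading/accent/intonation).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisteredPhraseRecord:
    phrase: str            # registered word/phrase
    related_phrase: str    # related word/phrase used as context
    related_position: int  # morphemes between the two (assumed encoding)
    adjusted: str          # adjusted reading/accent/intonation


def morphemes_between(morphemes, phrase, related):
    """Count the morphemes lying strictly between two morphemes.

    Raises ValueError if either morpheme is not in the list.
    """
    i = morphemes.index(phrase)
    j = morphemes.index(related)
    return max(abs(i - j) - 1, 0)


morphemes = ["株価", "は", "終値", "で"]  # hypothetical segmentation
distance = morphemes_between(morphemes, "終値", "株価")
record = RegisteredPhraseRecord("終値で", "株価", distance, "＃８０オワリネ’デ｜１")
```

In this example one morpheme (「は」) lies between the two words, so the automatically counted value offered to the user for adjustment would be 1.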

The registered phrase data generated in this way is then passed to the registered phrase storage means 0130 and recorded and stored in the registered phrase recording device 0150. As described in the first half of this Embodiment 1, the newly stored registered phrase information is automatically reused when other document data is edited, so that correct readings and accents and easy-to-hear intonation can be assigned more appropriately.

As a second embodiment, an example of an intermediate language editing tool is described in which the registered phrase designation means 0110 of the present invention uses a GUI so that designation can be made more intuitively. The processing flow of Embodiment 2 is identical to that of Embodiment 1 except for the processing inside the registered phrase designation means 0110 (the processing shown in the flowchart of FIG. 13), so its description is omitted here.
The following describes the processing that takes place, in the processing flow of the intermediate language editing tool of FIG. 3, after registered phrase designation has been selected as the editing instruction, the registered phrase designation means 0110 has been activated, and the registered phrase designation process shown in FIG. 13 has started.

Embodiment 1 described designation by keyboard, mouse, voice input, or the like in the registered word/phrase designation (step 12020) and the related word/phrase designation (step 12030). However, deciding what to designate as the related word or phrase is often difficult without some knowledge of speech, language, and grammar. Embodiment 2 therefore provides a menu-based designation method.

FIG. 15 shows an example screen layout of this embodiment. FIG. 15 shows the layout at the point where the registered phrase designation means 0110 has been activated, the registered phrase designation process has started according to the flowchart of FIG. 13, the registered word/phrase designation process (step 12020) has been completed by clicking the mouse cursor on the word or phrase to be registered, and the system has responded by displaying candidates for the related word/phrase and its position in menu form.

FIG. 16 shows a flowchart of the registered phrase designation process in this embodiment. Compared with the flowchart of FIG. 13, the related word/phrase designation process (step 12030) is replaced by the related word/phrase candidate list generation process (step 15030), and the related word position adjustment input process (step 12040) is replaced by the process of waiting for the user's candidate selection (step 15040). FIG. 17 shows a flowchart of the detailed internal processing of the related word/phrase candidate list generation process (step 15030) of FIG. 16.

The operation of the registered phrase designation means 0110 in this embodiment is described below with reference to the flowcharts of FIG. 16 and FIG. 17. First, the editing instruction of registered phrase designation is sent from the user to the intermediate language editing means 0060 via the editing instruction input device 0090. The intermediate language editing means 0060 then activates the registered phrase designation means 0110 and moves to the registered phrase designation process. Control passes to the flowchart of FIG. 16, and the registered word/phrase designation process (step 15020) is executed first.

In this embodiment, the registered word/phrase designation process (step 15020) is completed by clicking the word to be registered with the mouse cursor, or by selecting multiple words with an operation such as dragging to determine the phrase to be registered and then clicking that phrase. Information indicating which word or phrase was selected is passed to the subsequent related word/phrase candidate list generation process (step 15030). Based on the information about the word or phrase to be registered that is passed from the registered word/phrase designation process (step 15020), this process (step 15030) estimates words and phrases closely related to it and creates a list sorted by strength of relation.

Next, control moves to the flowchart of FIG. 17 and the related word/phrase candidate list generation process starts. First, three processes are executed: the part-of-speech estimation process for the registered word/phrase passed from the registered word/phrase designation process (step 15020) (step 16030); the process of determining the words, parts of speech, and punctuation at the immediately preceding and following positions (step 16040), which extracts the words immediately before and after the registered word/phrase, estimates their parts of speech, and determines, among other things, whether punctuation is present; and the process of extracting feature words from the document (step 16020), which extracts words and phrases characteristic of the document containing the registered word/phrase. These three processes may be executed in parallel or one after another.

In the process of extracting feature words from the document (step 16020), for the example text shown in FIG. 2, for instance, the document is determined to concern economics and stocks, and feature words such as 「株式」 ("stock"), 「経済」 ("economy"), and 「最高値」 ("highest price") are extracted from it. This feature word extraction can be implemented using, for example, word occurrence frequency (occurrence probability) information calculated from a large text corpus, or information such as the co-occurrence frequency (co-occurrence probability) of sets of multiple words.

FIG. 18 shows an example of a word occurrence probability dictionary. By referring to dictionary data that records, for each word, its occurrence probability in some text corpus as in FIG. 18, and obtaining the occurrence probability of each word in the document, the word with the lowest occurrence probability, for example, can be regarded as highly important, that is, as a feature word. Next, the registered word/phrase part-of-speech estimation process (step 16030) may look the word up in the word dictionary used for morphological analysis, or may use linguistic information such as parts of speech and dependency relations passed from the intermediate language generation means 0030 together with the intermediate language data. Finally, the process of determining the words, parts of speech, and punctuation at the immediately preceding and following positions (step 16040) only needs to take the words immediately before and after the registered word/phrase, estimate their parts of speech by the method above, and check whether punctuation is present immediately before or after, so its details are omitted.
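The lowest-probability heuristic for step 16020 can be sketched in Python as follows. The probability values are invented for illustration and are not those of FIG. 18; the function name `extract_feature_words`, the `top_n` parameter, and the decision to skip words missing from the dictionary are likewise assumptions of this sketch rather than part of the described system.

```python
def extract_feature_words(words, occurrence_prob, top_n=3):
    """Return up to `top_n` distinct words with the lowest corpus probability.

    Words missing from the dictionary are skipped (an assumption of this sketch).
    """
    known = [w for w in dict.fromkeys(words) if w in occurrence_prob]
    return sorted(known, key=lambda w: occurrence_prob[w])[:top_n]


# Hypothetical probabilities; the values of FIG. 18 are not reproduced here.
occurrence_prob = {
    "株式": 0.002,
    "の": 0.05,
    "終値": 0.0005,
    "は": 0.04,
    "最高値": 0.0003,
}
words = ["株式", "の", "終値", "は", "最高値", "の", "未知語"]
features = extract_feature_words(words, occurrence_prob, top_n=2)
```

With these values the two rarest known words, 「最高値」 and 「終値」, are selected, while frequent function words such as 「の」 and 「は」 rank last.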

As a result of these three processes, namely the part-of-speech estimation process for the registered word/phrase passed from the registered word/phrase designation process (step 15020) (step 16030), the process of determining the words, parts of speech, and punctuation at the immediately preceding and following positions (step 16040), and the process of extracting feature words from the document (step 16020), data such as that shown in FIG. 19 is obtained.

These three processes each extract conditions from a different viewpoint in order to enumerate the related condition types used to create the related word/phrase candidate list. The more related condition types there are, the more accurate the subsequent determination is expected to be, and the conditions extracted here should preferably be ones closely tied to differences in intonation, reading, and the like.
The three processes given in this Embodiment 2, namely the part-of-speech estimation process for the registered word/phrase (step 16030), the process of determining the words, parts of speech, and punctuation at the immediately preceding and following positions (step 16040), and the process of extracting feature words from the document (step 16020), are only examples; the processes need not be these three, nor must all three always be executed together.

Next, the determination result data shown in FIG. 19 is passed to the adjustment case corpus search process (step 16050). This process refers to an adjustment case corpus, in which are registered the results of reading, accent, and intonation adjustments made to words and phrases so far together with the words and phrases judged to be related to each adjustment. It assigns to each related word/phrase candidate in FIG. 19 an adjustment quality prediction value, a numerical value representing how strongly that candidate is related to the adjustment result about to be registered.

When the adjustment case corpus is structured as in FIG. 20, for example, this prediction value can be assigned as follows: the words and parts of speech of the word/phrase currently being registered are compared with the registered words/phrases in the adjustment case corpus, the corresponding related condition types are sorted by the resulting degree of match, and the degree of match is output as the adjustment quality prediction value. The degree of match can be defined as, for example, +10 points for each matching word and +5 points for each matching part of speech, and calculated by summing these values or by some other operation. The adjustment case corpus data may be obtained by converting and accumulating registered phrase information previously registered with the intermediate language editing tool of the present invention, or may be adjustment rules extracted and registered manually in advance.
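A minimal Python sketch of this degree-of-match calculation is given below. Only the +10 and +5 point values come from the description; the corpus layout (a list of cases, each with a `phrase` of (word, part of speech) pairs and a `condition_type`), the part-of-speech labels, and the choice to keep the highest score per condition type are assumptions, since the structure of FIG. 20 is not reproduced here.

```python
WORD_MATCH_POINTS = 10  # per matching word (value given in the description)
POS_MATCH_POINTS = 5    # per matching part of speech


def match_score(target, case_phrase):
    """Sum the match points between two lists of (word, pos) pairs."""
    case_words = {word for word, _ in case_phrase}
    case_tags = {pos for _, pos in case_phrase}
    score = 0
    for word, pos in target:
        if word in case_words:
            score += WORD_MATCH_POINTS
        if pos in case_tags:
            score += POS_MATCH_POINTS
    return score


def predict_quality(target, corpus):
    """Map each related condition type to its best score over the corpus."""
    scores = {}
    for case in corpus:
        score = match_score(target, case["phrase"])
        kind = case["condition_type"]
        scores[kind] = max(scores.get(kind, 0), score)
    return scores


target = [("終値", "noun"), ("で", "particle")]
corpus = [  # hypothetical adjustment cases
    {"phrase": [("始値", "noun"), ("で", "particle")], "condition_type": "preceding word"},
    {"phrase": [("高値", "noun"), ("を", "particle")], "condition_type": "document feature word"},
]
predicted = predict_quality(target, corpus)
```

Here the first case matches one word and two parts of speech (10 + 5 + 5 = 20), while the second matches only the two parts of speech (5 + 5 = 10).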

Next, the candidate list (FIG. 19), to which adjustment quality prediction values were assigned in the adjustment case corpus search process (step 16050), is sorted in descending order of prediction value by the process of sorting by adjustment quality prediction value (step 16060). FIG. 21 shows the result.
This result is then passed to the process of waiting for the user's candidate selection (step 15040) in the flowchart of FIG. 16, converted into the form of a related word/position candidate menu for the registered word/phrase 「＃８０オワリ’ネデ｜１」, and displayed on the intermediate language display device 0050. When the user selects one related word/position candidate from the displayed menu, the registered phrase data generation process (step 15050) and then the registered phrase storage process (step 15060) are executed, and the designated registered word/phrase is stored together with appropriate related word and position information. The registered phrase data generation process (step 15050) and the registered phrase storage process (step 15060) are as described in Embodiment 1, so their description is omitted.
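The sorting of step 16060 and the conversion to a menu for step 15040 could be sketched as follows in Python. The candidate fields, the example values, and the menu text format are hypothetical; the contents of FIG. 19 and FIG. 21 are not reproduced here.

```python
def rank_candidates(candidates):
    """Sort candidates by predicted quality, highest first (stable for ties)."""
    return sorted(candidates, key=lambda c: c["predicted_quality"], reverse=True)


def to_menu(ranked):
    """Format ranked candidates as numbered menu lines."""
    return [
        f"{n}. {c['condition_type']}: {c['related_phrase']}"
        for n, c in enumerate(ranked, start=1)
    ]


candidates = [  # hypothetical candidate list with prediction values
    {"condition_type": "following word", "related_phrase": "、", "predicted_quality": 5},
    {"condition_type": "document feature word", "related_phrase": "株式", "predicted_quality": 20},
    {"condition_type": "preceding word", "related_phrase": "は", "predicted_quality": 10},
]
menu = to_menu(rank_candidates(candidates))
```

Placing the highest prediction value first means the candidate most likely to be appropriate is the first item the user sees in the menu.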

According to this Embodiment 2, even a user with little expert knowledge of speech, language, or grammar only needs to choose from the related word candidates presented by the system, so the result of adjusting the reading, accent, and intonation of a word or phrase can easily be stored as registered phrase information together with appropriate related words and reused. As another screen layout example for this Embodiment 2, the candidate menus may be displayed at positions suggested by terms such as "immediately before", "immediately after", and "document": as in FIG. 22, conditions on the immediately preceding word on the left, conditions on the immediately following word on the right, conditions on document feature words at the top, and so on.

FIG. 1 is a diagram showing the basic configuration of the present invention and the system configuration of Embodiment 1.
FIG. 2 is a diagram showing an example screen layout of the intermediate language editing system of the present invention.
FIG. 3 is a diagram showing a flowchart of the processing in Embodiment 1 of the present invention.
FIG. 4 is a diagram showing an example of the morphological analysis data analyzed internally by the intermediate language generation means in Embodiment 1 of the present invention.
FIG. 5 is a diagram showing an example of the syntax analysis data analyzed internally by the intermediate language generation means in Embodiment 1 of the present invention.
FIG. 6 is a diagram showing an example of the co-occurrence analysis data analyzed internally by the intermediate language generation means in Embodiment 1 of the present invention.
FIG. 7 is a diagram showing an example of the intermediate language data output from the intermediate language generation means in Embodiment 1 of the present invention.
FIG. 8 is a diagram showing an example of the intermediate language data output from the intermediate language generation means in Embodiment 1 of the present invention.
FIG. 9 is a diagram showing an example of the reading/accent candidate list data output from the reading/accent candidate generation means in Embodiment 1 of the present invention.
FIG. 10 is a diagram showing an example of the registered phrase data retrieved from the registered phrase recording device by the registered phrase search means in Embodiment 1 of the present invention.
FIG. 11 is a diagram showing an example of the intermediate language data output from the registered phrase insertion means after the registered phrase data has been applied in Embodiment 1 of the present invention.
FIG. 12 is a diagram showing an example of the intermediate language data layout displayed on the intermediate language display device by the intermediate language display process for the edited document in Embodiment 1 of the present invention.
FIG. 13 is a diagram showing a flowchart of the registered phrase designation process in Embodiment 1 of the present invention.
FIG. 14 is a diagram showing an example of the intermediate language data layout displayed on the intermediate language display device by the intermediate language display process for the edited document when no registered phrase data exists in Embodiment 1 of the present invention.
FIG. 15 is a diagram showing an example of the screen layout during the registered phrase designation process in Embodiment 2 of the present invention.
FIG. 16 is a diagram showing a flowchart of the registered phrase designation process in Embodiment 2 of the present invention.
FIG. 17 is a diagram showing a flowchart of the related word/phrase candidate list generation process in Embodiment 2 of the present invention.
FIG. 18 is a diagram showing the structure of the word occurrence probability dictionary used in the process of extracting feature words from the document in Embodiment 2 of the present invention.
FIG. 19 is a diagram showing an example of the related word/phrase candidate list generated partway through the related word/phrase candidate list generation process in Embodiment 2 of the present invention.
FIG. 20 is a diagram showing the structure of the adjustment case corpus in Embodiment 2 of the present invention.
FIG. 21 is a diagram showing an example of the related word/phrase candidate list after sorting by adjustment quality prediction value in Embodiment 2 of the present invention.
FIG. 22 is a diagram showing an example of another screen layout according to Embodiment 2 of the present invention.

Explanation of symbols

0010 ... document data recording device, 0020 ... document data input means, 0030 ... intermediate language generation means, 0040 ... reading/accent candidate generation means, 0050 ... intermediate language display device, 0060 ... intermediate language editing means, 0070 ... speech synthesizer, 0080 ... audio playback device, 0090 ... editing instruction input device, 0100 ... intermediate language output device, 0110 ... registered phrase designation means, 0120 ... registered phrase insertion means, 0130 ... registered phrase storage means, 0140 ... registered phrase search means, 0150 ... registered phrase recording device,
1010 ... document data list display area, 1020 ... document content display area, 1030 ... intermediate language data display area,
2010 ... startup processing step, 2020 ... document data reading/display processing step, 2030 ... step of determining whether an edited document has been designated, 2040 ... edited document content display processing step, 2050 ... intermediate language generation processing step, 2060 ... reading/accent candidate generation processing step, 2070 ... registered phrase search processing step, 2080 ... edited document intermediate language display processing step, 2090 ... registered phrase insertion processing step, 2100 ... step of determining whether an editing instruction has been input, 2110 ... step of executing the instructed editing process, 2120 ... editing end determination step, 2130 ... step of outputting the edited intermediate language,
3010 ... morpheme number, 3020 ... morpheme notation, 3030 ... part of speech, 3040 ... reading/accent (first candidate), 3050 ... reading/accent (second candidate),
5010 ... morpheme, 5020 ... reading/accent, 5030 ... co-occurrence word list,
8010 ... morpheme, 8020 ... reading/accent (first candidate), 8030 ... reading/accent (second candidate),
9010 ... word/phrase, 9020 ... related word/phrase, 9030 ... position of related word, 9040 ... adjusted reading/accent/intonation,
12010 ... registered phrase designation start step, 12020 ... registered word/phrase designation processing step, 12030 ... related word/phrase designation processing step, 12040 ... related word position adjustment input processing step, 12050 ... registered phrase data generation processing step, 12060 ... registered phrase storage processing step, 12070 ... registered phrase designation end step,
15010 ... registered phrase designation start step, 15020 ... registered word/phrase designation processing step, 15030 ... related word/phrase candidate list generation processing step, 15040 ... step of waiting for candidate selection by the user, 15050 ... registered phrase data generation processing step, 15060 ... registered phrase storage processing step, 15070 ... registered phrase designation end step,
16010 ... related word/phrase candidate list generation process start step, 16020 ... step of extracting feature words from the document, 16030 ... registered word/phrase part-of-speech estimation processing step, 16040 ... step of determining the words, parts of speech, and punctuation immediately before and after, 16050 ... adjustment case corpus search processing step, 16060 ... step of sorting by predicted adjustment quality, 16070 ... related word/phrase candidate list generation process end step.

Claims

1. A speech synthesizer comprising:
means for inputting a document to be read aloud that includes kanji;
generating means for generating an intermediate language from the document to be read aloud;
editing means for editing the intermediate language;
speech synthesis means for performing speech synthesis based on the edited intermediate language; and
output means for outputting the synthesized speech,
wherein the editing means receives a correction instruction input for the intermediate language displayed on display means and stores information on the correction instruction in storage means, and
the correction instruction includes at least a specification of a word or phrase to be corrected and a specification of conditions for reflecting the correction.

2. The speech synthesizer according to claim 1, wherein the specification of conditions for reflecting the correction includes information on a word or phrase that co-occurs with the word or phrase to be corrected.

3. The speech synthesizer according to claim 1 or 2, wherein the specification of conditions for reflecting the correction further includes information on a position condition of the co-occurring word or phrase.

4. The speech synthesizer according to claim 1, wherein the editing means presents a plurality of candidate conditions together with the intermediate language on the display means and accepts a selection instruction from input means.

5. The speech synthesizer according to claim 1, wherein the editing means estimates the part of speech of the input word or phrase to be corrected, estimates the parts of speech of the words immediately before and after the word or phrase to be corrected, extracts characteristic words in the target document, and displays on the display means a correction condition determined by comparing these results with past correction records.
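
For illustration only, the conditional correction described in claims 1 to 3 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `RegisteredPhrase`, `condition_holds`, and `apply_corrections`, the encoding of the position condition as "before", "after", or "anywhere", and the sample words and katakana readings are all hypothetical and do not appear in the specification. The four fields loosely mirror the items listed under symbols 9010 to 9040.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RegisteredPhrase:
    target: str    # word or phrase to be corrected (cf. symbol 9010)
    related: str   # co-occurring word or phrase (cf. symbol 9020)
    position: str  # assumed encoding: "before", "after", or "anywhere" (cf. 9030)
    reading: str   # adjusted reading; accent marks omitted here (cf. 9040)


def condition_holds(entry: RegisteredPhrase, tokens: List[str], index: int) -> bool:
    """Return True if tokens[index] is the target and the co-occurrence condition is met."""
    if tokens[index] != entry.target:
        return False
    if entry.position == "before":
        return index > 0 and tokens[index - 1] == entry.related
    if entry.position == "after":
        return index + 1 < len(tokens) and tokens[index + 1] == entry.related
    # "anywhere": the related word may appear at any other position in the document
    return any(tok == entry.related for i, tok in enumerate(tokens) if i != index)


def apply_corrections(tokens: List[str], default_readings: List[str],
                      entries: List[RegisteredPhrase]) -> List[str]:
    """Replace the default reading only where a stored correction's condition holds."""
    readings = list(default_readings)
    for i in range(len(tokens)):
        for entry in entries:
            if condition_holds(entry, tokens, i):
                readings[i] = entry.reading
                break  # the first matching entry wins (an assumption of this sketch)
    return readings


# Hypothetical stored correction: read 市場 as シジョウ when 株式 comes immediately before it.
entries = [RegisteredPhrase(target="市場", related="株式", position="before", reading="シジョウ")]

print(apply_corrections(["株式", "市場"], ["カブシキ", "イチバ"], entries))
print(apply_corrections(["魚", "市場"], ["ウオ", "イチバ"], entries))
```

In this sketch the first call changes the reading of 市場 to シジョウ because the co-occurrence and position conditions are both satisfied, while the second call leaves the default reading untouched. Storing the condition together with the correction is what lets a single stored entry be reused across documents without overriding readings where the condition is not met.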