JPH01273171A

JPH01273171A - Document rewriting system and automatic translating system

Info

Publication number: JPH01273171A
Application number: JP63101916A
Authority: JP
Inventors: Masanobu Higashida; 正信東田; Masahiro Oku; 雅博奥; Koji Matsuoka; 浩司松岡; Atsuo Kawai; 河合　敦夫
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1988-04-25
Filing date: 1988-04-25
Publication date: 1989-11-01

Abstract

PURPOSE: To shorten translation time by adding a preprocessing stage before translation processing, so that each sentence is converted into a form that is easy to translate before it is translated. CONSTITUTION: A preprocessing part 1 applies processing such as notation standardization, long-sentence division, connective-word completion, and obligatory-case completion to a source-language sentence, removing or reducing the ambiguity and incompleteness of that sentence and converting it into a sentence that is easy to translate. A translation processing part 6 then takes the converted sentence as input and translates it into the target language. As a result, manual pre-editing is scarcely required, and the processing time of translation as a whole is shortened.

Description

DETAILED DESCRIPTION OF THE INVENTION

(1) Technical field to which the invention pertains

The present invention relates to natural language processing. More specifically, it relates to a document rewriting method that rewrites input sentence data into plain sentence data, and to an automatic translation method that automatically translates input sentence data into sentence data of a language of a different system.

(2) Conventional technology

Text to be translated is generally difficult to translate automatically as it stands. Conventional translation methods therefore required the user to pre-edit the text into a form that is easy to translate before translation processing began.

However, such conventional translation methods had the following problems:

① Variation in pre-editing results (the result of pre-editing differs from user to user).
② Incompleteness of pre-editing (because pre-editing is incomplete, untranslatable sentences remain).
③ Pre-editing effort (pre-editing takes manpower and time).

As a translation method, this led to drawbacks (1) to (4) below.

Because of ① and ②:

(1) The linguistic phenomena that appear in the text input to translation processing cannot be narrowed down, so every linguistic phenomenon must be handled.
(2) Translation processing becomes complex and large, which makes the processing hard to oversee.
(3) As a result, the quality of the translation deteriorates.

Because of ③:

(4) Taken as a whole, translation takes longer.

(3) Purpose of the invention

An object of the present invention is to provide a document rewriting method and an automatic translation method that solve the above problems and can output high-quality translations.

(4) Structure of the invention

(4-1) Features of the invention and differences from the conventional technology

The principal feature of the invention is that it comprises a first means (preprocessing) and a second means.

For the language of the input sentence data (the source language), the first means is provided with at least one of a notation standardization knowledge dictionary, a long-sentence division knowledge dictionary, a connective-word completion knowledge dictionary, and a case-element completion knowledge dictionary, and executes at least one of the following processes:

(a) by referring to the notation standardization knowledge dictionary, obtaining the words that require notation standardization and their standard forms, and rewriting those words into their standard forms;
(b) by referring to the long-sentence division knowledge dictionary, obtaining the number of characters that defines a long sentence, the cut point at which to divide, and the connective word to be added after division, and dividing the long sentence;
(c) by referring to the connective-word completion knowledge dictionary, supplying a connective word appropriate to the logical relationship between two sentences;
(d) by referring to the case-element completion knowledge dictionary, supplying an omitted case element from the preceding or following sentence.

The second means takes the sentence data output by the first means as input, translates it into sentence data of a language of a different system (the target language), and outputs it.

Because most of the processing that conventionally had to be done by manual pre-editing is realized as preprocessing, the invention differs from the conventional technology in the following respects:

① The source-language rewriting functions that were scattered across the individual functions of translation processing are handled together in preprocessing, so the translation function can be cleanly separated into a preprocessing function and a translation function.
② The quality and nature of the sentences input to translation processing become uniform, so the language processing functions needed in translation processing can be greatly reduced compared with conventional ones.
③ For this reason, translation processing can be implemented compactly.
④ As a result, the quality of the translation can be improved.
⑤ Manual pre-editing becomes almost unnecessary, so the processing time of translation as a whole is shortened.

(4-2) Embodiment

FIG. 1 is a basic configuration diagram of the present invention. In the figure:

1 is a preprocessing unit that applies processing such as notation standardization, long-sentence division, connective-word completion, and obligatory-case completion to the input source-language text, removing or reducing its ambiguity and incompleteness and converting it into text that is easy to translate;
2 is a knowledge dictionary search unit that searches the knowledge dictionaries to obtain the knowledge needed by preprocessing unit 1;
3 is a knowledge dictionary group that systematizes the knowledge used by preprocessing unit 1;
4 is a user dictionary registration unit that registers in the user dictionary any word that preprocessing unit 1 has judged to be an unknown word or a word that is difficult to translate;
5 is a user dictionary in which unknown words and difficult-to-translate words are registered;
6 is a translation processing unit that takes as input the text whose ambiguity and incompleteness have been removed or reduced by preprocessing unit 1, and translates it into the target language;
7 is a translation dictionary search unit that searches the translation dictionary group and the user dictionary, which hold the information needed by translation processing unit 6;
8 is a translation dictionary group holding the information needed by translation processing unit 6, such as a source-language analysis dictionary, a source-to-target-language conversion dictionary, and a target-language generation dictionary;
9 is an output unit that outputs the translation result of translation processing unit 6 to a recording medium or a terminal; and
10 is an automatic translation device with natural-language preprocessing.

FIG. 2 is a schematic flow of the operation of automatic translation device 10 with natural-language preprocessing.

Next, the operation is described with reference to FIGS. 1 and 2.

Preprocessing unit 1 executes a number of preprocessing items on the source-language text input to automatic translation device 10 with natural-language preprocessing. The variable n in FIG. 2 counts these preprocessing items. For each preprocessing item, preprocessing unit 1 asks knowledge dictionary search unit 2 to transfer the knowledge it needs. Knowledge dictionary search unit 2 selects, from knowledge dictionary group 3, the knowledge dictionary that stores the knowledge needed for the preprocessing item currently being processed, searches it (step 2), and transfers the resulting information to preprocessing unit 1. Using the transferred information, preprocessing unit 1 completes the processing for the current preprocessing item (step 3) and proceeds to the next preprocessing item (that is, it enters the processing of step 6 shown in FIG. 2).

Whenever a word that needs to be registered in the user dictionary arises while a preprocessing item is being processed (step 4), a request is issued to user dictionary registration unit 4 and the word is registered in user dictionary 5 (step 5).

When preprocessing unit 1 has finished all preprocessing items, the preprocessed source-language text is sent to translation processing unit 6.

Translation processing unit 6 translates the preprocessed source-language text into target-language text (step 7). In doing so, it uses, as needed, information that translation dictionary search unit 7 obtains from translation dictionary group 8 and user dictionary 5. Approaches to this translation processing include the pivot method and the transfer method, but no particular approach is required here.

The target-language text obtained by translation processing unit 6 is sent to output unit 9. Output unit 9 outputs the target-language text to the output destination specified in advance by the user (a recording medium such as a hard disk, a display, a printer, and so on) (step 8).
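The flow of steps 2 to 8 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the implementation described in the patent: the class name `TranslationDevice`, the convention that each preprocessing item is a function returning the rewritten text plus a list of words to register, and the two toy items `standardize` and `flag_unknown` are all hypothetical. The translation step is passed in as a plain function because the text leaves the translation approach open.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical convention: a preprocessing item receives the text and the
# knowledge it asked for, and returns (rewritten text, words to register).
PreprocessItem = Callable[[str, dict], Tuple[str, List[str]]]


@dataclass
class TranslationDevice:
    knowledge_dictionaries: Dict[str, dict]           # knowledge dictionary group 3
    items: Dict[str, PreprocessItem]                  # items run by preprocessing unit 1
    translate: Callable[[str, Dict[str, str]], str]   # translation processing unit 6
    user_dictionary: Dict[str, str] = field(default_factory=dict)  # user dictionary 5

    def search_knowledge(self, item_name: str) -> dict:
        """Knowledge dictionary search unit 2: pick the dictionary for this item."""
        return self.knowledge_dictionaries.get(item_name, {})

    def preprocess(self, text: str) -> str:
        """Preprocessing unit 1: run every preprocessing item in turn."""
        for name, item in self.items.items():
            knowledge = self.search_knowledge(name)       # step 2
            text, hard_words = item(text, knowledge)      # step 3
            for word in hard_words:                       # steps 4 and 5
                self.user_dictionary.setdefault(word, "")
        return text

    def run(self, text: str, output: Callable[[str], None]) -> str:
        preprocessed = self.preprocess(text)
        target = self.translate(preprocessed, self.user_dictionary)  # step 7
        output(target)                                    # step 8 (output unit 9)
        return target


# Toy preprocessing items, invented for illustration only.
def standardize(text: str, knowledge: dict) -> Tuple[str, List[str]]:
    for variant, standard in knowledge.items():
        text = text.replace(variant, standard)
    return text, []


def flag_unknown(text: str, knowledge: dict) -> Tuple[str, List[str]]:
    hard = [w for w in text.split() if w in knowledge.get("words", [])]
    return text, hard
```

Keeping the knowledge dictionaries outside the items mirrors the separation between preprocessing unit 1 and knowledge dictionary search unit 2: an item can be given different knowledge without changing its code.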

Next, the case where the source language is Japanese is described concretely using an example sentence.

FIG. 3 shows the input Japanese text used as the example. Preprocessing unit 1 preprocesses this input Japanese text, and the result, the preprocessed input Japanese text shown in FIG. 4, is sent to translation processing unit 6.

In preprocessing unit 1, the input Japanese text of FIG. 3 is processed by the four preprocessing items marked in FIG. 4, namely ① notation standardization, ② long-sentence division, ③ connective-word completion, and ④ case-element completion, and is thereby converted into the preprocessed input Japanese text of FIG. 4. It is not processed by any other preprocessing item.

Next, the operation of each of preprocessing items ① to ④ is described in detail.

① Notation standardization (replacing synonymous words, and differently written forms of the same word such as "color" and "colour", with the single notation defined as standard)

When the preprocessing item in preprocessing unit 1 reaches "notation standardization", preprocessing unit 1 asks knowledge dictionary search unit 2 to transfer the information on "notation standardization". Knowledge dictionary search unit 2 obtains this information by searching the knowledge dictionary used for "notation standardization" in knowledge dictionary group 3, and transfers it to preprocessing unit 1.

FIG. 5 shows, in table form, an example of the contents of the knowledge dictionary used for notation standardization. In FIG. 5, 5-1 is the notation of a word subject to notation standardization, 5-2 is the part of speech of the word with that notation, 5-3 is the standard notation for that notation, and 5-4 is the part of speech of the word in the standard notation.

On receiving the transfer from knowledge dictionary search unit 2, preprocessing unit 1 compares each word in the input sentence with the notation in column 5-1 and the part of speech in column 5-2, and rewrites every word that matches both into the corresponding standard notation in column 5-3. In the example of FIG. 3, the record marked with an arrow in FIG. 5 applies, and 羽田空港 (Haneda Airport) is rewritten as 東京国際空港 (Tokyo International Airport) (① in FIG. 4).
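The lookup described for FIG. 5 can be sketched as a dictionary keyed on the pair (notation, part of speech). This is a minimal sketch under assumptions: the input is taken to be already segmented into (surface form, part of speech) tokens, the part-of-speech labels are invented, and only the 羽田空港 record is attested in the text. The `colour` record is a hypothetical addition based on the spelling example given above, and its choice of standard form is arbitrary.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (surface form, part of speech)

# Hypothetical records mirroring columns 5-1/5-2 (key) and 5-3/5-4 (value).
NOTATION_DICTIONARY: Dict[Token, Token] = {
    ("羽田空港", "noun"): ("東京国際空港", "noun"),
    ("colour", "noun"): ("color", "noun"),  # assumed record, not from FIG. 5
}


def standardize_notation(
    tokens: List[Token],
    dictionary: Dict[Token, Token] = NOTATION_DICTIONARY,
) -> List[Token]:
    """Rewrite every token whose notation AND part of speech both match a record."""
    return [dictionary.get(token, token) for token in tokens]
```

Because the key includes the part of speech, a word with the same spelling but a different part of speech is left untouched, which is the "both match" condition in the text.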

② Long-sentence division (dividing a long sentence that is difficult to translate into several short sentences that are easy to translate, and giving them a connection relationship)

When the preprocessing item in preprocessing unit 1 reaches "long-sentence division", preprocessing unit 1 asks knowledge dictionary search unit 2 to transfer the information on "long-sentence division". Knowledge dictionary search unit 2 obtains this information by searching the knowledge dictionary used for "long-sentence division" in knowledge dictionary group 3, and transfers it to preprocessing unit 1. FIG. 6 shows, in table form, an example of the contents of the knowledge dictionary used for long-sentence division. In FIG. 6, 6-1 is "sentence length", the minimum number of characters of a sentence subject to long-sentence division; 6-2 is "cut point", which indicates where to cut the long sentence in terms of the connection relationship between the original simple sentences; and 6-3 is "conjunction after cutting", the conjunction to be inserted after the cut.

On receiving the transfer from knowledge dictionary search unit 2, preprocessing unit 1 checks whether the input sentence is longer than the number of characters given in "sentence length" 6-1 and consists of two or more simple sentences. If so, it determines from the transferred information (see FIG. 6) where to cut and which conjunction to use after the cut, and carries out the division. The example of FIG. 3 is 49 characters long and consists of two or more simple sentences, so it is subject to long-sentence division. The record marked with an arrow in FIG. 6 applies, and the sentence is cut at the continuative-form suspension (the part 「〜が故障し、」 in FIG. 3) (② in FIG. 4). The "conjunction after cutting" column 6-3 of that record is blank, which indicates that the conjunction after the cut cannot be uniquely determined at this point.
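The division rule of FIG. 6 can be sketched as below. The sketch rests on several assumptions: the sentence is assumed to be already analysed into simple sentences (clauses), each tagged with the type of link to the next one; the names `DivisionRule`, `Clause`, and the tag `"renyo_chushi"` are hypothetical; and the English clauses in the usage note are invented, since the sentence of FIG. 3 is not reproduced in the text. A real system would also have to turn the cut clause into a sentence-final form, which is not attempted here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DivisionRule:                # one record of FIG. 6
    min_length: int                # 6-1: sentence length
    cut_point: str                 # 6-2: link type at which to cut
    conjunction: Optional[str]     # 6-3: None when not uniquely determined


@dataclass
class Clause:
    text: str
    link: Optional[str]            # link to the next clause; None for the last


def divide_long_sentence(
    clauses: List[Clause], rules: List[DivisionRule]
) -> List[Tuple[Optional[str], str]]:
    """Return (conjunction, sentence text) pairs after cutting at matching links."""
    total = sum(len(c.text) for c in clauses)
    sentences: List[Tuple[Optional[str], str]] = []
    current, pending = "", None    # pending: conjunction for the sentence being built
    for clause in clauses:
        current += clause.text
        # Cut only if the whole sentence is longer than the rule's length
        # and this clause's link type is the rule's cut point.
        rule = next(
            (r for r in rules
             if clause.link == r.cut_point and total > r.min_length),
            None,
        )
        if rule is not None:
            sentences.append((pending, current))
            current, pending = "", rule.conjunction
    if current:
        sentences.append((pending, current))
    return sentences
```

For instance, with `DivisionRule(10, "renyo_chushi", None)` and the clauses `Clause("The engine failed, ", "renyo_chushi")` and `Clause("the flight was cancelled.", None)`, the function returns two entries, both with conjunction `None`. The `None` plays the role of the blank 6-3 column: the connective is left for the connective-word completion item to decide.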

③ Connective-word completion (supplying the most appropriate connective word between two short sentences on the basis of the logical relationship between them)

When the preprocessing item in preprocessing unit 1 reaches "connective-word completion", preprocessing unit 1 asks knowledge dictionary search unit 2 to transfer the information on "connective-word completion". Knowledge dictionary search unit 2 obtains this information by searching the knowledge dictionary used for "connective-word completion" in knowledge dictionary group 3, and transfers it to preprocessing unit 1. FIG. 7 shows, in table form, an example of the contents of the knowledge dictionary used for connective-word completion. In FIG. 7, 7-1 is "relationship between the preceding and following sentences", the logical relationship between the sentence immediately before and the sentence immediately after the position where the connective word is to be inserted; 7-2 is "connective word to insert", the connective word to be inserted when the sentences stand in that relationship.

On receiving the transfer from knowledge dictionary search unit 2, preprocessing unit 1 determines the logical relationship between the sentence immediately before and the sentence immediately after the insertion position (the method of doing so is not particularly limited), compares it with column 7-1, and inserts the connective word of the matching record. In the example of FIG. 3, the logical relationship between the two sentences is "cause-result", so the record marked with an arrow in FIG. 7 applies and このため ("for this reason") is inserted as the connective word (③ in FIG. 4).
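The lookup against FIG. 7 can be sketched as below. This is a sketch under assumptions: the logical relationship is passed in as an already determined label, because the text leaves the method of determining it open; the label string `"cause-result"` and the separator placed after the connective are hypothetical choices; and only the このため record is attested in the text.

```python
from typing import Dict

# Hypothetical records mirroring columns 7-1 (key) and 7-2 (value).
CONNECTIVE_DICTIONARY: Dict[str, str] = {
    "cause-result": "このため",
}


def complete_connective(
    following: str,
    relation: str,
    dictionary: Dict[str, str] = CONNECTIVE_DICTIONARY,
    separator: str = "、",  # assumed punctuation after the connective
) -> str:
    """Prefix the following sentence with the connective for the given relation."""
    connective = dictionary.get(relation)
    if connective is None:
        return following  # no matching record: leave the sentence as it is
    return f"{connective}{separator}{following}"
```

Returning the sentence unchanged when no record matches is a design choice of this sketch; the text does not say what happens in that case.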

④ Case-element completion (when an obligatory case is absent from the sentence to be translated, supplying the word corresponding to that obligatory case from the sentence preceding or following it)

When the preprocessing item in preprocessing unit 1 reaches "case-element completion", preprocessing unit 1 asks knowledge dictionary search unit 2 to transfer the information on "case-element completion". Knowledge dictionary search unit 2 obtains this information by searching the knowledge dictionary used for "case-element completion" in knowledge dictionary group 3, and transfers it to preprocessing unit 1. FIG. 8 shows, in table form, an example of the contents of the knowledge dictionary used for case-element completion. In FIG. 8, 8-1 is "case element to be completed", the obligatory case to be supplied; 8-2 is "completion from the preceding sentence", which indicates, when the obligatory case is supplied from a preceding sentence, which case of that sentence holds the word used for completion; 8-3 is "completion from the following sentence", which indicates the same for a following sentence; and 8-4 is "category match with the predicate pattern", which indicates whether the semantic category of the supplied word must be included in the semantic category of the case being filled, as given in the case structure of the predicate of the simple sentence being completed.

On receiving the transfer from knowledge dictionary search unit 2, preprocessing unit 1 selects the record whose "case element to be completed" in column 8-1 matches the obligatory case to be supplied, decides whether to complete from the preceding or the following sentence (the method of deciding is not particularly limited), and then performs case-element completion according to that record. In the example of FIG. 3, the obligatory case to be supplied is the object case, so the record marked with an arrow in FIG. 8 is selected. Preprocessing unit 1 decides to complete from the preceding sentence and, following the record, attempts to supply the word 東京国際空港 (Tokyo International Airport), which fills the locative case of the preceding sentence. Because column 8-4 of the record is "required", it checks the semantic category of 東京国際空港 against the semantic category of the object case of 使用する ("to use"), the predicate of the simple sentence being completed. The check shows that 東京国際空港 can be the object case of the predicate 使用する, so 東京国際空港 is supplied as the object case (④ in FIG. 4).
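The procedure around FIG. 8 can be sketched as below. This is only a sketch under assumptions: sentences are assumed to be already analysed into a predicate and a mapping from case names to words; the direction ("previous" or "following") is passed in, since the text leaves that decision open; and the names `CompletionRule` and `Sentence`, the case labels, and the semantic category `"facility"` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class CompletionRule:                 # one record of FIG. 8
    missing_case: str                 # 8-1: case element to be completed
    from_previous: Optional[str]      # 8-2: source case in the preceding sentence
    from_following: Optional[str]     # 8-3: source case in the following sentence
    category_match: bool              # 8-4: is a category match required?


@dataclass
class Sentence:
    predicate: str
    cases: Dict[str, str]             # case name -> word filling it


def complete_case_element(
    target: Sentence,
    neighbor: Sentence,
    missing_case: str,
    direction: str,                   # "previous" or "following"
    rules: List[CompletionRule],
    word_category: Dict[str, str],
    predicate_frames: Dict[str, Dict[str, Set[str]]],
) -> Sentence:
    """Fill missing_case of target from neighbor if a rule and the check allow it."""
    rule = next((r for r in rules if r.missing_case == missing_case), None)
    if rule is None:
        return target
    source_case = rule.from_previous if direction == "previous" else rule.from_following
    word = neighbor.cases.get(source_case) if source_case else None
    if word is None:
        return target
    if rule.category_match:
        # The word's category must be allowed for this case of the predicate.
        allowed = predicate_frames.get(target.predicate, {}).get(missing_case, set())
        if word_category.get(word) not in allowed:
            return target
    completed = dict(target.cases)
    completed[missing_case] = word
    return Sentence(target.predicate, completed)
```

With a rule `CompletionRule("object", "locative", None, True)`, a preceding sentence whose locative case is 東京国際空港, and a frame that allows the assumed category of that word as the object of 使用する, the function returns a copy of the target sentence with 東京国際空港 as its object case. If the category check fails, the sentence is returned unchanged.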

The above has described in detail, using an example, four of the preprocessing items handled by preprocessing unit 1.

The preprocessed input Japanese text of FIG. 4 is sent to translation processing unit 6 and converted into the target language, then sent to output unit 9 and output there.

This embodiment has described four kinds of preprocessing: notation standardization, long-sentence division, connective-word completion, and case-element completion. Other processing can also be realized, such as referring to a compound-word paraphrase knowledge dictionary to identify the relationships between the word bases that make up a complex compound word and paraphrasing it (for example, 文書書き換え方式, "document-rewriting method", becomes 文書を書き換えるための方式, "method for rewriting documents").

With this structure and operation, the following improvements are obtained over conventional methods:

① The source-language rewriting functions that were conventionally scattered across the individual functions of translation processing are handled together in preprocessing, so the translation function can be cleanly separated into a preprocessing function and a translation function.
② Because of ①, translation processing can be implemented compactly.
③ As a result, the quality of the translation can be improved.
④ Manual pre-editing becomes almost unnecessary, so the processing time of translation as a whole is shortened.

(5) As described above, according to the present invention, providing preprocessing before translation processing keeps the pre-editing items that must be done manually to a minimum. Moreover, translation processing is performed after the ambiguity and incompleteness of the input source-language text have been removed or reduced and the text has been converted into a form that is easy to translate, so translation time is short and high-quality translations can be output.

In addition, a document rewriting method consisting only of the first means can rewrite sentence data into sentence data whose incompleteness and ambiguity have been removed or reduced. If this method is used as preprocessing for a system that needs to analyze sentence data (a database retrieval system, for example), the linguistic phenomena that the system has to analyze can be narrowed down, which makes the analysis method easier to design.

[Brief explanation of the drawing]

FIG. 1 is a basic configuration diagram showing the configuration of an embodiment of the present invention. FIG. 2 is a schematic flow of the operation of the present invention. FIG. 3 shows the input Japanese text used as the example. FIG. 4 shows the preprocessed input Japanese text, that is, the text of FIG. 3 after it has passed through the preprocessing unit. FIG. 5 shows an example of the contents of the knowledge dictionary used for "notation standardization", one of the preprocessing items. FIG. 6 shows an example of the contents of the knowledge dictionary used for "long-sentence division", one of the preprocessing items. FIG. 7 shows an example of the contents of the knowledge dictionary used for "connective-word completion", one of the preprocessing items. FIG. 8 shows an example of the contents of the knowledge dictionary used for "case-element completion", one of the preprocessing items.

1 ... preprocessing unit, 2 ... knowledge dictionary search unit, 3 ... knowledge dictionary group, 4 ... user dictionary registration unit, 5 ... user dictionary, 6 ... translation processing unit, 7 ... translation dictionary search unit, 8 ... translation dictionary group, 9 ... output unit.

Patent applicant: Nippon Telegraph and Telephone Corporation

Claims

(1) A document rewriting method characterized in that, for the source language of input sentence data, at least one of a notation standardization knowledge dictionary, a long-sentence division knowledge dictionary, a connective-word completion knowledge dictionary, and a case-element completion knowledge dictionary is provided, and at least one of the following processes is executed: a process of obtaining, by referring to the notation standardization knowledge dictionary, a word that requires notation standardization and the standard form of that word, and rewriting the word into the standard form; a process of obtaining, by referring to the long-sentence division knowledge dictionary, the number of characters that defines a long sentence, the cut point at which division is performed, and the connective word to be added after division, and dividing the long sentence; a process of supplying, by referring to the connective-word completion knowledge dictionary, a connective word according to the logical relationship between two sentences; and a process of supplying, by referring to the case-element completion knowledge dictionary, an omitted case element from the preceding or following sentence.

(2) An automatic translation method characterized by comprising: a first means in which, for the source language of input sentence data, at least one of a notation standardization knowledge dictionary, a long-sentence division knowledge dictionary, a connective-word completion knowledge dictionary, and a case-element completion knowledge dictionary is provided, and which executes at least one of the following processes: a process of obtaining, by referring to the notation standardization knowledge dictionary, a word that requires notation standardization and the standard form of that word, and rewriting the word into the standard form; a process of obtaining, by referring to the long-sentence division knowledge dictionary, the number of characters that defines a long sentence, the cut point at which division is performed, and the connective word to be added after division, and dividing the long sentence; a process of supplying, by referring to the connective-word completion knowledge dictionary, a connective word according to the logical relationship between two sentences; and a process of supplying, by referring to the case-element completion knowledge dictionary, an omitted case element from the preceding or following sentence; and a second means that takes the sentence data output by the first means as input, translates it into sentence data of a target language of a different system, and outputs it.