JPH09325960A

JPH09325960A - Document processing system

Info

Publication number: JPH09325960A
Application number: JP8141644A
Authority: JP
Inventors: Toshiyuki Sugio; 俊之杉尾
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1996-06-04
Filing date: 1996-06-04
Publication date: 1997-12-16

Abstract

PROBLEM TO BE SOLVED: To make it possible to extract non-language documents and link destination documents as well, by extracting the portions of an input document that match specific patterns of format designation information and generating an output document by arranging the extracted portions, or the extracted portions after post-processing. SOLUTION: A machine translation system is connected to a network 1 through an input/output means 2. An input document 3 is a tagged document input from the network 1 through the input/output means 2, or input by a user through the input/output means 2. A tag identifying means 4 identifies tag information contained in the input document 3 and extracts translation target expressions that contain the tag information. A translation target tag information storage means 5 stores the translation target tag information that the tag identifying means 4 refers to when identifying tag information. A translating means 6 translates the tagged document into a document in the target language, and an output document generating means 7 generates a tagged document in the target language from the output of the translating means 6.

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a document processing system. It is suitable, for example, for a machine translation system that uses a computer system to translate documents containing tag information, that is, information designating the format in which the document is displayed or printed.

[0002]

[Prior Art] Computer network systems have recently matured, and electronic documents now circulate widely. Because these networks span the world, the documents in circulation include not only documents in the user's own language (Japanese) but also many in foreign languages such as English. For users who are not proficient in foreign languages, the language difference is a major barrier to receiving and sending information. Users of a computer network system in which different languages are mixed therefore need to translate documents when receiving information from, or sending information to, other countries, but translation is by no means cheap.

[0003] When receiving information from, or sending information to, other countries, it is therefore important to reduce translation cost by translating only the minimum necessary information. Limiting the information in a document to the necessary minimum requires a technique for summarizing the document.

[0004] A conventional device of this kind is disclosed in Japanese Patent Laid-Open No. 6-149876. By diagramming the structure of a document, this device aims to support the creation of easy-to-understand documents and to let the contents of an existing document be grasped at a glance. To this end it comprises: (1) document segmentation means that divides an input document into sentences; (2) keyword extraction means that receives the input sentences one at a time from the segmentation means and divides each into words while extracting keywords by referring to a predetermined keyword dictionary; (3) relation extraction means that receives each input sentence word by word from the keyword extraction means, extracts the structure of the sentence by referring to a predetermined relation dictionary that stores sentence structures in standardized form, and creates an intermediate result consisting of a predetermined relation symbol representing the extracted structure and those words of the sentence that are to be displayed in the diagram; (4) keyword arrangement means that associates a predetermined figure with the relation symbol contained in the intermediate result, determines the size and placement of the figure and of the words to be displayed, and creates graphic information in a predetermined format representing the result; and (5) a display unit that receives the graphic information and displays the input document as a diagram according to its content.

[0005]

[Problem to Be Solved by the Invention] Some electronic documents contain tag information that designates the format in which the document is printed (such documents are hereinafter called tagged documents), and they are distinguished from conventional documents consisting only of character set codes. Tag information may define connections between documents by indicating a relationship with another document (hereinafter called a link), emphasize an arbitrary part of a document, or indicate that an arbitrary part is a figure or a table.

[0006] When such a tagged document is summarized and translated, the conventional device described above has the following problems: (1) the components of a document include not only text but also non-language information such as figures and tables, and that information is lost; (2) the device does not assume that multiple documents are related by links, so the information in linked documents is lost. As a result, the summary does not accurately convey the intent of the document as a whole. In other words, the conventional device cannot translate a document in a way that allows the content of the original to be understood accurately.

[0007] The same problem, namely the loss of information from non-language documents and linked documents, also arises when a summary document is simply created without translation, and when processing other than translation is applied to the summary document.

[0008]

[Means for Solving the Problem] To solve this problem, the present invention provides a document processing system that processes documents accompanied by format designation information designating the format for display or printed output, characterized by comprising: (1) extraction target specifying information storage means that stores specific patterns of format designation information used to extract character strings of predetermined types from an input document; (2) format designation information identifying and extracting means that extracts the portions of the input document that match the specific patterns of format designation information stored in the extraction target specifying information storage means; and (3) output document generating means that generates an output document by arranging the extracted portions, or by arranging the extracted portions after they have been further processed.

[0009] With this configuration, depending on what is stored in the extraction target specifying information storage means, the system can be arranged to retrieve not only predetermined portions of the input document but also non-language documents and linked documents, so that an output document reflecting such special document information can be produced.

[0010]

[Best Mode for Carrying Out the Invention]

(A) First Embodiment

A first embodiment, in which the document processing system according to the present invention is applied to a machine translation system for tagged documents, is described in detail below with reference to the drawings.

[0011] The machine translation system of the first embodiment assumes that the portions of a tagged document to which tag information is attached reflect some intention of the document's author. By restricting the summary to those portions, it extracts and translates only the minimum necessary information of the input document.

[0012] (A-1) Configuration of the First Embodiment

The machine translation system of the first embodiment is built on an information processing apparatus such as a workstation or personal computer equipped with, for example, an input device, a processing device, a storage device (including auxiliary storage), an output device, and a communication device. Functionally, it has the configuration shown in FIG. 1.

[0013] In FIG. 1, the machine translation system of the first embodiment comprises an input/output means 2, an input document (buffer) 3, a tag identifying means 4, a translation target tag information storage means 5, a translation means 6, an output document generating means 7, and an output document (buffer) 8. The system is connected to a network 1 via the input/output means 2.

[0014] The network 1 is, for example, a network connected throughout the world, over which tagged documents can be exchanged between devices (including the machine translation system of the embodiment).

[0015] The input/output means 2 provides the interface between the machine translation system of the embodiment and the network 1, and the interface with the user; that is, it handles the input and output of tagged documents and the like. The input document (buffer) 3 is a tagged document input from the network 1 via the communication part of the input/output means 2, or input by the user through the input part of the input/output means 2. The tag identifying means 4 identifies the tag information contained in the input document 3 and extracts translation target expressions that contain tag information. The translation target tag information storage means 5 stores the translation target tag information that the tag identifying means 4 refers to when identifying tag information. The translation means 6 translates a tagged document into a document in the target language. The output document generating means 7 generates a tagged document in the target language from the output of the translation means 6. The output document (buffer) 8 is the tagged document in the target language generated by the output document generating means 7.

[0016] FIG. 2 is an explanatory diagram showing the structure of the translation target tag information stored in the translation target tag information storage means 5.

[0017] In a tagged document, a character string to which tag information is attached is, for example, a title, or a string for which underlining or emphasized type has been specified because it is important; in other words, it is a characteristic string of the document. Extracting only such strings from the input document therefore yields a summary of the tagged document. The first embodiment takes note of the fact that users do not always need the information in the entire input document, and produces a translated version of such a summary.

[0018] The translation target tag information stored in the translation target tag information storage means 5 is information with which the character strings that can serve as elements of the summary document can be extracted.

[0019] In FIG. 2, the translation target tag information 200 consists of a plurality of tag patterns 201 to 209, ... (nine are shown concretely in FIG. 2). Each tag pattern defines a pattern of an expression containing tag information that may appear in the input document 3. Where appropriate, the tag patterns include regular-expression elements such as ".", which stands for any single character, and "+", which denotes one or more repetitions; each pattern thus defines the character strings to be matched against strings appearing in the input document 3.

[0020] For example, in tag pattern 201, "<I>.+</I>", the ".+" between "<I>" and "</I>" means one or more repetitions of any single character, so it matches a string such as "parsing". In tag pattern 202, "<H[0-9]>.+</H[0-9]>", the "[0-9]" part matches any single digit, so the pattern matches, for example, the string "<H1>Recent Trend of Research And Development</H1>". Likewise, there are character strings that match each of the other tag patterns 203, ..., 209.
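The matching behavior described for tag patterns 201 and 202 can be tried out with an ordinary regular-expression library. The following is a minimal sketch in Python, assuming the two patterns can be used as Python regular expressions as written; the helper name `matches` and the variable names are illustrative and do not come from the patent.

```python
import re

# Tag patterns 201 and 202 as quoted in the text, used here as Python regexes.
PATTERN_201 = re.compile(r"<I>.+</I>")
PATTERN_202 = re.compile(r"<H[0-9]>.+</H[0-9]>")


def matches(pattern, text):
    """Return every substring of `text` matched by `pattern` (hypothetical helper)."""
    return [m.group(0) for m in pattern.finditer(text)]


italic = "<I>parsing</I>"
heading = "<H1>Recent Trend of Research And Development</H1>"

print(matches(PATTERN_201, italic))    # the italic element matches pattern 201
print(matches(PATTERN_202, heading))   # the heading matches pattern 202
print(matches(PATTERN_201, heading))   # no <I> element here, so nothing matches
```

One caveat when experimenting: in Python, ".+" is greedy, so on a line containing several "<I>" elements it would match from the first opening tag to the last closing tag. Writing ".+?" instead is one way to match each element separately; whether the patented system does this is not stated in the text.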

[0021] Although FIG. 2 shows nine tag patterns 201 to 209 as the translation target tag information 200, it is preferable that the user be able to add and modify tag patterns as needed via the input/output means 2. In other words, it is preferable that the contents can be changed freely according to the type of tagged document, such as HTML (HyperText Markup Language), that appears as the input document 3. In this embodiment, the tag patterns described in the translation target tag information 200 are assumed to be editable, so tag patterns can be added or removed according to the type of tagged document to be covered.
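An editable set of tag patterns could be held in a small container that validates each pattern when it is added. The sketch below is only an illustration: the class `TagPatternStore`, its methods, and the "<TITLE>" pattern are hypothetical, and the non-greedy ".+?" is an assumption rather than something the text specifies.

```python
import re


class TagPatternStore:
    """Hypothetical stand-in for the editable translation target tag information 200."""

    def __init__(self, patterns=()):
        self._patterns = list(patterns)

    def add(self, pattern):
        # Compile first so that an invalid pattern raises re.error and is not stored.
        re.compile(pattern)
        if pattern not in self._patterns:
            self._patterns.append(pattern)

    def remove(self, pattern):
        # Removing a pattern that is not stored is silently ignored.
        if pattern in self._patterns:
            self._patterns.remove(pattern)

    def patterns(self):
        # Return a copy so callers cannot modify the stored list directly.
        return list(self._patterns)


store = TagPatternStore([r"<I>.+?</I>", r"<H[0-9]>.+?</H[0-9]>"])
store.add(r"<TITLE>.+?</TITLE>")   # illustrative extra pattern for HTML titles
store.remove(r"<I>.+?</I>")        # drop emphasis if it is not wanted in the summary
print(store.patterns())
```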

[0022] FIG. 3 shows an example of the input document 3 (301) used in the description of operation below. The documents handled in this embodiment are, for example, tagged documents 301 written in HTML. The tagged document 301 contains link tags such as a tag that links to figure (image) information 302 (<IMG SRC="...">) and a tag that indicates a link to another HTML document 303 (<A HREF="...">...</A>).

[0023] FIG. 4 shows a display example of the input document 3 (301). When the tagged document 301 described above is displayed via the input/output means 2, it appears as the display screen 400 shown in FIG. 4. Display area 401 corresponds to the portion "<TITLE>Recent Trend ... Development</TITLE>" of the input document 301. Display area 402 corresponds to the description between "<BODY>" and "</BODY>" in the input document 301. The figure inserted in FIG. 4 is the same figure (image) information as in FIG. 3 and is displayed in response to the tag <IMG SRC="TRANSFER.gif"> in the input document 301. As for the document 303 of FIG. 3, what appears in FIG. 4 is the underlined "ambiguity problem" in display area 402, which corresponds to <A HREF="ambiguity.html">ambiguity problem</A> in the input document 301; the body of the linked document 303 itself is not displayed. Thus, in a tagged document, a link tag may embed a figure or table in the displayed document or, conversely, display only the link information and leave the linked document hidden.

[0024] FIG. 9 shows the output document 8 (900) obtained by processing the input document 301 (3) of this example with the machine translation system of the first embodiment, and FIG. 10 shows the display screen of that output document 8 (900).

[0025] (A-2) Operation of the First Embodiment

Next, the operation of the machine translation system of the first embodiment, which translates tagged documents such as the one described above, is explained.

[0026] FIG. 5 is a flowchart showing the overall operation of the machine translation system of the first embodiment.

[0027] As described above, the machine translation system of the first embodiment assumes that the portions of a tagged document to which tag information is attached reflect some intention of the document's author, and extracts and translates the minimum necessary information of the input document 3 by restricting the summary to those portions. To do so, it focuses on the tag information contained in the input document 3, treats the expressions to which tag information is attached as translation targets, translates them, and generates and displays the output document 8.

[0028] First, the tag identifying means 4 receives the tagged input document 3 from the network 1 via the input/output means 2 (step 501). Because the input document 3 contains tag information, the tag identifying means 4 obtains the translation target tag information 200 from the translation target tag information storage means 5, identifies the translation targets in the input document 3, and passes them to the translation means 6 (step 502). The translation means 6 translates the translation targets received from the tag identifying means 4 and passes the results to the output document generating means 7 (step 503). The output document generating means 7 formats the translation results received from the translation means 6, referring to the input document 3, and stores them as the output document 8 (step 504).

[0029] Steps 501 to 504 are repeated until the tag identifying means 4 has processed the entire input document 3, the output document generating means 7 has received all the translation results from the translation means 6, and the formatting of the output document 8 is complete (step 505).

[0030] When the entire input document 3 has been processed and the formatting of the output document 8 is complete, the output document generating means 7 displays the output document 8 via the input/output means 2 (step 506), and the machine translation system ends its operation (step 507).
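The overall flow of FIG. 5 can be sketched as a small driver that wires three components together. This is a simplified illustration under stated assumptions: the function `run_machine_translation` and the three stand-in callables are hypothetical, the stand-in "translation" merely upper-cases text, and the sketch processes all targets in one batch, whereas the text describes repeating the steps until the whole document has been processed (step 505).

```python
import re


def run_machine_translation(input_document, identify, translate, generate):
    """Hypothetical driver for steps 502 to 506 of FIG. 5."""
    targets = identify(input_document)                    # step 502: find translation targets
    results = [(t, translate(t)) for t in targets]        # step 503: translate each target
    output_document = generate(input_document, results)   # step 504: build the output document
    return output_document                                # step 506: hand back for display


# Toy stand-ins for the tag identifying means 4, translation means 6,
# and output document generating means 7 (all assumptions for illustration).
def toy_identify(document):
    return re.findall(r"<I>.+?</I>", document)


def toy_translate(fragment):
    return fragment.upper()  # placeholder only; not a real translation


def toy_generate(document, results):
    return "\n".join(translated for _, translated in results)


doc = "The <I>transfer method</I> uses <I>parsing</I>."
print(run_machine_translation(doc, toy_identify, toy_translate, toy_generate))
```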

[0031] Next, the operation of the tag identifying means 4 is described with reference to the drawings. FIG. 6 is a flowchart showing the operation of the tag identifying means 4.

[0032] The tag identifying means 4 first obtains the input document 3 from the network 1 via the input/output means 2 (step 501).

[0033] Next, to extract the expressions associated with tag information in the obtained input document 3, the tag identifying means 4 obtains one tag pattern from the translation target tag information 200 stored in the translation target tag information storage means 5 (step 601). It then searches the input document 3 for character strings that match that pattern (step 602). If a matching string exists in the input document 3 (step 603), the tag identifying means 4 transfers it to the translation means 6 as a translation target (step 604). If no matching string exists at step 603, step 604 is skipped.

[0034] The tag identifying means 4 then checks whether the input document 3 has been searched with every tag pattern of the translation target tag information 200 stored in the translation target tag information storage means 5. If any tag pattern has not yet been used (step 605), steps 601 to 604 are repeated. Once the search with all tag patterns is complete (step 605), the tag identifying means 4 ends its operation (step 606).
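The loop of FIG. 6 amounts to iterating over the tag patterns and collecting every match. A minimal sketch follows; the function name, the sample document, and the use of non-greedy ".+?" are assumptions made for illustration, and the returned list stands in for "transferring to the translation means 6".

```python
import re


def identify_translation_targets(input_document, tag_patterns):
    """Collect the strings matching each tag pattern, in pattern order (hypothetical helper)."""
    targets = []
    for pattern in tag_patterns:                 # steps 601 and 605: next pattern, until none remain
        found = [m.group(0) for m in re.finditer(pattern, input_document)]  # step 602: search
        if found:                                # step 603: does any string match?
            targets.extend(found)                # step 604: hand over as translation targets
        # When nothing matches, step 604 is simply skipped.
    return targets


document = (
    "<H1>Recent Trend</H1>\n"
    "<P>Stages: <I>parsing</I>, <I>generating</I>.</P>"
)
patterns = [r"<I>.+?</I>", r"<H[0-9]>.+?</H[0-9]>", r"<B>.+?</B>"]
print(identify_translation_targets(document, patterns))
```

Note that the targets come back grouped by pattern rather than in document order, which mirrors the pattern-by-pattern processing in the text.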

[0035] Next, the operation of the translation means 6 is described with reference to the drawings. FIG. 7 is a flowchart showing the operation of the translation means 6. The translation means 6 is, naturally, one that can translate tagged documents; for example, the system described in the following reference can be used.

[0036] Reference: Naoto Ishikawa and Masayuki Hiyama (石川直太、檜山正幸), "English-Japanese Machine Translation Support System for Tagged Documents", CALS Japan '94, S2-1.

First, the translation means 6 obtains the translation target transferred by the tag identifying means 4 in step 604 (step 701) and performs tag-aware translation (step 702). Tag-aware translation is achieved by hiding the tag portions, translating the remaining portions with an ordinary machine translation method, and then restoring the hidden tags in the result. Finally, the translation means 6 transfers the translation result, including the tags, to the output document generating means 7 (step 703) and ends its operation (step 704).
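The hide, translate, and restore idea of step 702 can be illustrated by splitting a fragment into tags and text runs and translating only the text runs. This is a simplified sketch, not the method of the cited reference: the function `translate_tagged` is hypothetical, the glossary lookup stands in for a real machine translation engine, and a real system would more likely translate the whole sentence with placeholders standing in for the tags.

```python
import re

# A tag is anything between "<" and ">"; the capture group keeps tags in the split result.
TAG = re.compile(r"(<[^>]+>)")

# Toy glossary built from the translation examples given in the text.
GLOSSARY = {
    "parsing": "解析",
    "transfer method": "トランスファ方式",
    "The Latest Technological Trend": "最新技術動向",
}


def toy_translate(text):
    """Stand-in for an ordinary MT engine: dictionary lookup, unchanged if unknown."""
    return GLOSSARY.get(text, text)


def translate_tagged(fragment, translate=toy_translate):
    """Translate the text between tags while leaving the tags untouched."""
    pieces = TAG.split(fragment)          # alternating text runs and tags
    output = []
    for piece in pieces:
        if TAG.fullmatch(piece) or not piece.strip():
            output.append(piece)          # tags and whitespace pass through unchanged
        else:
            output.append(translate(piece))
    return "".join(output)


print(translate_tagged("<I>parsing</I>"))
print(translate_tagged("<H2>The Latest Technological Trend</H2>"))
```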

[0037] Next, the operation of the output document generating means 7 is described with reference to the drawings. FIG. 8 is a flowchart showing the operation of the output document generating means 7.

[0038] First, the output document generating means 7 receives the translation result transferred by the translation means 6 in step 703 (step 801). It then stores the received translation result in the output document 8, referring to the format of the input document 3 (step 802). Steps 801 and 802 are repeated until the tag identifying means 4 has processed the entire input document 3, the output document generating means 7 has received all the translation results from the translation means 6, and the formatting of the output document 8 is complete (step 505).

[0039] When the entire input document 3 has been processed and the formatting of the output document 8 is complete (step 505), the output document generating means 7 displays the output document 8 via the input/output means 2 (step 506) and ends its operation (step 803).
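The text does not spell out how the output document generating means 7 uses the format of the input document when storing results. One plausible reading, shown here purely as an assumption, is that translated fragments are placed in the order in which their source fragments appear in the input document. The function `generate_output` and the sample data are hypothetical.

```python
def generate_output(input_document, translated_pairs):
    """Order translated fragments by where their source fragment occurs in the input.

    `translated_pairs` is a list of (source_fragment, translated_fragment) tuples.
    Assumes each source fragment occurs in the input document; repeated
    fragments would all map to the first occurrence.
    """
    ordered = sorted(translated_pairs, key=lambda pair: input_document.find(pair[0]))
    return "\n".join(translated for _, translated in ordered)


input_document = "<H1>Recent Trend</H1> The <I>parsing</I> stage comes first."
pairs = [
    ("<I>parsing</I>", "<I>解析</I>"),                # arrived first (pattern 201)
    ("<H1>Recent Trend</H1>", "<H1>最近の動向</H1>"),  # arrived second (pattern 202)
]
print(generate_output(input_document, pairs))
```

In this sketch the heading is emitted before the italic term even though it was translated later, because it appears earlier in the input document.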

[0040] The operation of the machine translation system of the first embodiment is described concretely below, taking the tagged document 301 shown in FIG. 3 as the input document 3 and assuming that the translation target tag information storage means 5 stores the translation target tag information 200 shown in FIG. 2.

[0041] First, the tag identifying means 4 obtains the input document 301 (step 501). Next, it obtains the first tag pattern 201 of the translation target tag information 200 stored in the translation target tag information storage means 5 (step 601). Since this tag pattern is "<I>.+</I>", the tag identifying means 4 searches the input document 301 for matching character strings (step 602). As a result, "<I>parsing</I>", "<I>transferring</I>", "<I>generating</I>", and "<I>transfer method</I>" are extracted as strings matching tag pattern 201 (step 603), and the tag identifying means 4 transfers them to the translation means 6 as translation targets (step 604).

[0042] The translation means 6 receives these four translation targets (step 701) and translates them, obtaining the translation results "<I>解析</I>", "<I>変換</I>", "<I>生成</I>", and "<I>トランスファ方式</I>", respectively (step 702). The translation means 6 transfers these four translation results to the output document generating means 7 (step 703).

[0043] The output document generating means 7 stores the four translation results received from the translation means 6 at the appropriate positions in the output document 900 (see FIG. 9), referring to the format of the input document 301 (step 802).

[0044] Since tag patterns remain in the translation target tag information 200 (step 605), the tag identifying means 4 then obtains the second tag pattern 202 (step 601). Since this tag pattern is "<H[0-9]>.+</H[0-9]>", the tag identifying means 4 searches the input document 301 for matching character strings (step 602). As a result, "<H1>Recent Trend of Research And Development</H1>" and "<H2>The Latest Technological Trend</H2>" are extracted as strings matching tag pattern 202 (step 603), and the tag identifying means 4 transfers them to the translation means 6 as translation targets (step 604).

[0045] The translation means 6 receives these two translation targets (step 701) and translates them, obtaining the translation results "<H1>最近の研究開発動向</H1>" and "<H2>最新技術動向</H2>", respectively (step 702). The translation means 6 transfers these two translation results to the output document generating means 7 (step 703).

[0046] The output document generation means 7 stores the two translation results obtained from the translation means 6 at the predetermined positions of the output document 900 (see FIG. 9), referring to the format of the input document 301 (step 802).

[0047] Thereafter, in the same manner, the character strings of the input document 301 that match the tag patterns 203, ..., 209 are translated in turn and stored in the output document 900. Finally, the display screen 1000 of the output document shown in FIG. 10 is displayed via the input/output means 2 (step 506).
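The overall flow of the first embodiment (obtain a pattern, search, translate, store) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the pattern list is a two-entry subset, `TOY_DICTIONARY` is a hypothetical stand-in for the translation means 6, and "referring to the format of the input document" is simplified to keeping the results in input order.

```python
import re

# Two of the tag patterns quoted in the text; the full list of FIG. 2 is not reproduced.
TAG_PATTERNS = [r"<I>.+</I>", r"<H[0-9]>.+</H[0-9]>"]

# Hypothetical toy dictionary standing in for translation means 6.
TOY_DICTIONARY = {
    "parsing": "解析",
    "Recent Trend of Research And Development": "最近の研究開発動向",
}


def translate(tagged: str) -> str:
    """Translate the text between an opening and a closing tag, keeping the tags."""
    m = re.fullmatch(r"(<[^>]+>)(.+)(</[^>]+>)", tagged)
    if m is None:
        return tagged
    open_tag, text, close_tag = m.groups()
    return open_tag + TOY_DICTIONARY.get(text, text) + close_tag


def summarize_and_translate(document: str) -> str:
    found = []  # (position in the input document, translated string)
    for pattern in TAG_PATTERNS:                    # steps 601 and 605
        for m in re.finditer(pattern, document):    # steps 602 and 603
            found.append((m.start(), translate(m.group(0))))  # steps 604, 701-703
    # Step 802, simplified: keep the results in the order of the input document.
    return "\n".join(text for _, text in sorted(found))


sample_document = (
    "<H1>Recent Trend of Research And Development</H1>\n"
    "<P>Plain paragraph that no tag pattern covers.</P>\n"
    "<I>parsing</I>\n"
)
output_document = summarize_and_translate(sample_document)
print(output_document)
```

The paragraph that matches no pattern is absent from the output, which is the summarizing effect the text describes.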

[0048] (A-3) Effects of the First Embodiment
As described above, in the machine translation system of the first embodiment, translation target tag information that specifies the portions of the input document 3 eligible for translation is described and stored in advance, and only the portions of the input document 3 that correspond to this translation target tag information are translated. An output document 8 that is a summarized translation of the input document 3 can therefore be obtained.

[0049] Moreover, by describing information that defines figures and tables in the translation target tag information, non-linguistic information such as figures and tables can be included in the translation result. Likewise, by describing information that designates link destination documents in the translation target tag information, information on the link destination documents can be included in the translation result. In other words, a translation result summarized to the minimum necessary information can be obtained without impairing the intention of the author of the input document 3.

[0050] Comparing the input document 301 shown in FIG. 3 with the output document 900 shown in FIG. 9, only the portions limited to expressions associated with the tag information of the input document 301 have been extracted and translated. It can be seen that the translation is summarized to the minimum necessary information without impairing the intention of the author of the input document 301. This is even clearer from the display screen 1000 of the output document 900 shown in FIG. 10.

[0051] The display area 1001 in FIG. 10 corresponds to the portion "<TITLE>最近の研究開発動向</TITLE>" of the output document 900. The display area 1002 corresponds to the description enclosed between "<BODY>" and "</BODY>" in the output document 900. The figure 302 in FIG. 10 is the same figure (image) information as in FIG. 3 and is displayed in correspondence with "<IMG SRC="TRANSFER.gif">" in the output document 900. FIG. 10 also shows, as the display corresponding to the link destination document 303 of FIG. 3, the underlined text "曖昧性の問題" (the problem of ambiguity), which corresponds to "<A HREF="ambiguity.html">曖昧性の問題</A>" in the output document 900.

[0052] (B) Second Embodiment
Next, a second embodiment, in which the document processing system according to the present invention is applied to a machine translation system for tagged documents, is described in detail with reference to the drawings.

[0053] Like the first embodiment, the machine translation system of the second embodiment obtains a translation result that summarizes a tagged document. In addition, it includes in the translation result the translations of words that appear in the figures and tables contained in the summary.

[0054] (B-1) Configuration of the Second Embodiment
FIG. 11 is a block diagram of the machine translation system of the second embodiment. Parts identical or corresponding to those of FIG. 1 of the first embodiment described above are given the same reference numerals.

[0055] As is clear from a comparison of FIG. 11 with FIG. 1, the machine translation system of the second embodiment includes a link object acquisition means 9 and an encoding means 10 in addition to the configuration of the machine translation system of the first embodiment.

[0056] Using an image tag extracted by the tag identification means 4, the link object acquisition means 9 acquires from the network 1, via the input/output means 2, the entity that the image tag refers to (hereinafter called a link object). The encoding means 10 recognizes character information in the image of the link object obtained by the link object acquisition means 9. An existing character recognition device that recognizes the characters contained in image information consisting of a dot pattern can be used as the encoding means 10.

[0057] The tag identification means 4 of the second embodiment not only identifies the tag information contained in the input document 3 and extracts the translation target expressions containing that tag information, but also extracts the tag information used to refer to non-linguistic information (figures, tables, and image information) contained in the input document 3 (hereinafter called an image tag).

[0058] By adding the link object acquisition means 9 and the encoding means 10 described above to the system configuration of the first embodiment, the machine translation system of the second embodiment can convert the non-linguistic information (figures, tables, and image information) of the input document 3 into linguistic information and translate it.

[0059] FIG. 16 shows the output document 8 (1600) obtained by processing the input document 301 (3) shown in FIG. 3 with the machine translation system of the second embodiment, and FIG. 17 shows the display screen of that output document 8 (1600). Comparing FIGS. 16 and 17 with FIGS. 9 and 10 described above shows that the machine translation system of the second embodiment can convert the non-linguistic information (figures, tables, and image information) of the input document 3 into linguistic information and translate it, which was not possible in the first embodiment.

[0060] (B-2) Operation of the Second Embodiment
Next, the operation of the machine translation system of the second embodiment, which translates tagged documents as described above, is explained.

[0061] FIG. 12 is a flowchart showing the overall operation of the machine translation system of the second embodiment.

[0062] As described above, the machine translation system of the second embodiment also assumes that some intention of the author is at work in the portions of a tagged document to which tag information has been attached, and it extracts and translates the minimum necessary information of the input document 3 by restricting the summary to those portions. To that end, it focuses on the tag information contained in the input document 3, treats the expressions to which tag information is attached as translation targets, translates them, and generates and displays the output document 8. In addition, it recognizes and translates linguistic information contained in the non-linguistic information (figures, tables, and image information) of the input document 3, so that the author's intention is conveyed more accurately.

[0063] First, the tag identification means 4 receives the tagged input document 3 from the network 1 via the input/output means 2 (step 501). Because the input document 3 contains tag information, the tag identification means 4 obtains the translation target tag information 200 from the translation target tag information storage means 5 and identifies the translation target portions of the input document 3 (step 1201).

[0064] If the identified tag is an image tag that refers to non-linguistic information (figures, tables, or image information) (step 1202), the link object acquisition means 9 obtains the link object corresponding to that link tag from the network 1 via the input/output means 2 (step 1203). The encoding means 10 then recognizes linguistic information, that is, character strings, in the link object (step 1204). The tag identification means 4 obtains the character strings recognized by the encoding means 10 via the link object acquisition means 9 and passes the character string of the link tag, together with the recognized character strings, to the translation means 6 as translation targets (step 1205).

[0065] If, on the other hand, the determination in step 1202 finds that the tag information identified by the tag identification means 4 is not an image tag, the tag identification means 4 passes it to the translation means 6 (step 1206).
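The branch at step 1202 can be sketched as a small dispatch function. The sketch assumes that an image tag can be detected by looking for `<IMG SRC="...">`; the callables `fetch` and `recognize` are hypothetical stand-ins for the link object acquisition means 9 and the encoding means 10, and the stubs below return canned data rather than performing network access or character recognition.

```python
import re
from typing import Callable

# Assumed shape of an image tag; the text only shows <IMG SRC="TRANSFER.gif">.
IMAGE_TAG = re.compile(r'<IMG\s+SRC="([^"]+)"', re.IGNORECASE)


def collect_translation_targets(
    matched: str,
    fetch: Callable[[str], bytes],
    recognize: Callable[[bytes], list[str]],
) -> list[str]:
    """Return the strings to hand to the translation means for one matched string."""
    m = IMAGE_TAG.search(matched)            # step 1202: is it an image tag?
    if m is None:
        return [matched]                     # step 1206: pass the string on as is
    link_object = fetch(m.group(1))          # step 1203: obtain the link object
    recognized = recognize(link_object)      # step 1204: recognize character strings
    return [matched, *recognized]            # step 1205: tag string plus recognized strings


# Hypothetical stubs: no network access and no real character recognition.
def fake_fetch(src: str) -> bytes:
    return b"image bytes for " + src.encode()


def fake_recognize(image: bytes) -> list[str]:
    return ["Original Sentence", "Parsing"]


image_case = collect_translation_targets(
    '<center><IMG SRC="TRANSFER.gif"></center>', fake_fetch, fake_recognize
)
text_case = collect_translation_targets("<I>parsing</I>", fake_fetch, fake_recognize)
print(image_case)
print(text_case)
```

Passing the two means in as callables keeps the dispatch logic testable without a network or a character recognition device.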

[0066] The translation means 6 translates the translation targets obtained from the tag identification means 4 and passes the result to the output document generation means 7 (step 503). The output document generation means 7 formats the translation result obtained from the translation means 6, referring to the input document 3, and stores it in the output document 8 (step 504).

[0067] The operations of steps 501 to 504 above are repeated until the tag identification means 4 has processed all of the input document 3, the output document generation means 7 has received all translation results from the translation means 6, and the formatting of the output document 8 is complete (step 505).

[0068] When all of the input document 3 has been processed and the formatting of the output document 8 is complete, the output document generation means 7 displays the output document 8 via the input/output means 2 (step 506), and the machine translation system of the second embodiment ends its operation (step 1207).

[0069] Next, the operation of the tag identification means 4 is described with reference to the drawings. FIG. 13 is a flowchart showing the operation of the tag identification means 4.

[0070] The tag identification means 4 first obtains the input document 3 from the network 1 via the input/output means 2 (step 501).

[0071] Next, to extract the expressions associated with tag information in the obtained input document 3, it obtains one tag pattern from the translation target tag information 200 stored by the translation target tag information storage means 5 (step 601). It then searches the input document 3 for character strings that match the obtained tag pattern (step 602). If a character string matching the tag pattern exists in the input document 3 (step 603), it further checks whether the tag information contained in the matching character string is an image tag (step 1202).

[0072] If the tag information is an image tag, the character string is transferred to the link object acquisition means 9 (step 1301). The tag identification means 4 then receives from the link object acquisition means 9 the character strings recognized through the operation of the link object acquisition means 9 and the encoding means 10 (step 1302). Finally, it transfers the character string of the image tag and the received recognized character strings to the translation means 6 as translation targets (step 1205).

[0073] If, on the other hand, the determination in step 1202 finds that the tag information identified by the tag identification means 4 is not an image tag, the tag identification means 4 passes it to the translation means 6 (step 1206).

[0074] If no matching character string exists in the input document 3 in step 603 above, step 1206 and steps 1301 to 1205 are skipped.

[0075] The tag identification means 4 then checks whether the input document 3 has been searched with every tag pattern of the translation target tag information 200 stored by the translation target tag information storage means 5 (step 605). If a tag pattern that has not yet been searched remains, steps 601 to 1205, or steps 601 to 1206, are repeated.

[0076] When the search with all tag patterns is complete (affirmative result in step 605), the tag identification means 4 ends its operation (step 1303).

[0077] Next, the operation of the link object acquisition means 9 is described with reference to the drawings. FIG. 14 is a flowchart showing the operation of the link object acquisition means 9.

[0078] The link object acquisition means 9 first receives the character string indicating the reference destination of the object (the link information) that the tag identification means 4 transferred in step 1301 (step 1401). Next, it obtains the link object corresponding to the link information from the network 1 via the input/output means 2 (step 1203). It then transfers the obtained link object to the encoding means 10 (step 1402) and obtains from the encoding means 10 the character strings that the encoding means 10 recognized in the link object (step 1403). Finally, the link object acquisition means 9 returns the recognized character strings to the tag identification means 4 (step 1404) and ends its operation (step 1405).

[0079] The tag identification means 4 and the link object acquisition means 9 operate in synchronization with each other at steps 1301 and 1401 and at steps 1404 and 1302.
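One way to picture the synchronization between the two means is a pair of blocking queues, with each send step paired with a receive step. The text does not say how the synchronization is realized, so the threads and queues here are an assumption made purely for illustration; `fetch` and `recognize` are hypothetical stubs, and the call to `recognize` collapses steps 1402 and 1403 into one function call.

```python
import queue
import threading
from typing import Callable


def run_link_object_acquirer(
    request_queue: queue.Queue,
    reply_queue: queue.Queue,
    fetch: Callable[[str], bytes],
    recognize: Callable[[bytes], list[str]],
) -> None:
    """Serve link information requests until a None shutdown signal arrives."""
    while True:
        link_info = request_queue.get()       # step 1401: blocks until step 1301 sends
        if link_info is None:                 # hypothetical shutdown signal
            return
        link_object = fetch(link_info)        # step 1203
        recognized = recognize(link_object)   # steps 1402-1403, collapsed into one call
        reply_queue.put(recognized)           # step 1404


request_queue: queue.Queue = queue.Queue()
reply_queue: queue.Queue = queue.Queue()
worker = threading.Thread(
    target=run_link_object_acquirer,
    args=(request_queue, reply_queue, lambda src: src.encode(), lambda image: ["Parsing"]),
)
worker.start()

# The tag identification side of the exchange.
request_queue.put("TRANSFER.gif")     # step 1301: transfer the link information
recognized = reply_queue.get()        # step 1302: blocks until step 1404 replies
request_queue.put(None)               # stop the worker
worker.join()
print(recognized)
```

The same request and reply pairing would apply between the link object acquisition means 9 and the encoding means 10 if they were modelled as separate workers.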

[0080] Next, the operation of the encoding means 10 is described with reference to the drawings. FIG. 15 is a flowchart showing the operation of the encoding means 10.

[0081] The encoding means 10 first receives the link object that the link object acquisition means 9 transferred in step 1402 (step 1501). Next, it recognizes the character strings in the link object (step 1204). Character string recognition is realized by any existing character recognition method. Finally, the encoding means 10 returns the recognized character strings to the link object acquisition means 9 (step 1502) and ends its operation (step 1503).

[0082] The link object acquisition means 9 and the encoding means 10 operate in synchronization with each other at steps 1402 and 1501 and at steps 1502 and 1403.

[0083] The translation means 6 in the second embodiment operates in the same way as the translation means 6 in the first embodiment, following the flowchart shown in FIG. 7. Likewise, the output document generation means 7 in the second embodiment operates in the same way as the output document generation means 7 in the first embodiment, following the flowchart shown in FIG. 8.

[0084] In the following, the operation of the machine translation system of the second embodiment is described concretely, taking the tagged document 301 shown in FIG. 3 as an example of the input document 3 and assuming that the translation target tag information storage means 5 stores the translation target tag information 200 shown in FIG. 2.

[0085] First, the tag identification means 4 obtains the input document 301 (step 501). Next, it obtains the first tag pattern 201 of the translation target tag information 200 stored by the translation target tag information storage means 5 (step 601). The obtained tag pattern is "<I>.+</I>", so the input document 301 is searched for character strings that match it (step 602). As a result, "<I>parsing</I>", "<I>transferring</I>", "<I>generating</I>", and "<I>transfer method</I>" are extracted as character strings matching the tag pattern 201 (step 603). Because none of them is an image tag, the tag identification means 4 transfers them to the translation means 6 as translation targets (step 1206).

[0086] The translation means 6 receives these four translation targets (step 701), translates them, and obtains the translation results "<I>解析</I>", "<I>変換</I>", "<I>生成</I>", and "<I>トランスファ方式</I>", respectively (step 702). The translation means 6 then transfers these four translation results to the output document generation means 7 (step 703).

[0087] The output document generation means 7 stores the four translation results obtained from the translation means 6 at the predetermined positions of the output document 1600 (see FIG. 16), referring to the format of the input document 301 (step 802).

[0088] Thereafter, because tag patterns of the translation target tag information 200 remain unprocessed in the tag identification means 4 (step 605), the character strings of the input document 301 that match the tag patterns 202, ..., 207 are translated in turn in the same manner and stored in the output document 1600.

[0089] Next, the tag identification means 4 obtains the eighth tag pattern 208 of the translation target tag information 200 stored by the translation target tag information storage means 5 (step 601). As a result, "<center><IMG SRC="TRANSFER.gif"></center>" is extracted as a character string matching the tag pattern 208 (step 603). Because the tag contained in this character string is an image tag (<IMG ...>) (step 1202), the character string is transferred to the link object acquisition means 9 (step 1301).

[0090] The link object acquisition means 9 obtains the link object 302 corresponding to "<center><IMG SRC="TRANSFER.gif"></center>" from the network 1 via the input/output means 2 (step 1203), and the encoding means 10 then obtains the character strings "Original Sentence", "Parsing", ..., "Transfer Method" from the link object 302 (step 1204). The tag identification means 4 transfers the six recognized character strings obtained in this way, together with the character string containing the link tag, "<center><IMG SRC="TRANSFER.gif"></center>", to the translation means 6 (step 1205).

[0091] At this point, a tag that instructs the translation means 6 not to translate (in HTML, <PRE>...</PRE>) may be attached to the recognized character strings so that the relationship between each recognized character string and the corresponding image becomes clearer. In that case, "<center><IMG SRC="TRANSFER.gif"></center>", "<PRE>[Original Sentence:</PRE>Original Sentence<PRE>]</PRE>", "<PRE>[Parsing:</PRE>Parsing<PRE>]</PRE>", ..., "<PRE>[Transfer Method:</PRE>Transfer Method<PRE>]</PRE>" are passed to the translation means 6.

[0092] The translation means 6 receives these seven translation targets (step 701), translates them, and obtains the translation results "<center><IMG SRC="TRANSFER.gif"></center>", "[Original Sentence:原文]", "[Parsing:解析]", ..., "[Transfer Method:トランスファ方式]", respectively (step 702). The translation means 6 then transfers these seven translation results to the output document generation means 7 (step 703).

[0093] The output document generation means 7 stores the seven translation results obtained from the translation means 6 at the predetermined position of the output document 1600 (see reference numeral 1601 in FIG. 16), referring to the format of the input document 301 (step 802).
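The handling of the recognized character strings, from attaching the non-translation markers to obtaining bracketed results, can be sketched as follows. The sketch assumes that the translation means copies the content of each <PRE>...</PRE> span verbatim and drops the markers, which is what the quoted results suggest; `TOY_DICTIONARY` and all function names are hypothetical, and only three of the six recognized strings are used.

```python
import re

# Hypothetical toy dictionary standing in for translation means 6.
TOY_DICTIONARY = {
    "Original Sentence": "原文",
    "Parsing": "解析",
    "Transfer Method": "トランスファ方式",
}

PROTECTED_SPAN = re.compile(r"<PRE>(.*?)</PRE>", re.DOTALL)


def mark_recognized(text: str) -> str:
    """Wrap a recognized string so that its source form survives translation."""
    return f"<PRE>[{text}:</PRE>{text}<PRE>]</PRE>"


def translate_plain(text: str) -> str:
    # Strings missing from the toy dictionary are returned unchanged.
    return TOY_DICTIONARY.get(text, text)


def translate_target(target: str) -> str:
    """Translate the text outside <PRE> spans; copy the span contents verbatim."""
    pieces = []
    position = 0
    for span in PROTECTED_SPAN.finditer(target):
        pieces.append(translate_plain(target[position:span.start()]))
        pieces.append(span.group(1))    # protected text, with the markers dropped
        position = span.end()
    pieces.append(translate_plain(target[position:]))
    return "".join(pieces)


image_tag = '<center><IMG SRC="TRANSFER.gif"></center>'
recognized = ["Original Sentence", "Parsing", "Transfer Method"]
targets = [image_tag] + [mark_recognized(s) for s in recognized]
results = [translate_target(t) for t in targets]
for result in results:
    print(result)
```

The image tag string contains no protected span and no dictionary entry, so it passes through unchanged, which matches the first of the quoted translation results.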

[0094] Thereafter, in the same manner, the character strings of the input document 301 that match the tag pattern 209 are translated in turn and stored in the output document 1600. Finally, the display screen 1700 of the output document shown in FIG. 17 is displayed via the input/output means 2 (step 506).

[0095] (B-3) Effects of the Second Embodiment
Because the machine translation system of the second embodiment retains the technical idea of the first embodiment unchanged, it also achieves the effects of the first embodiment.

[0096] In addition, according to the second embodiment, an output document 8 can be obtained in which the translated content of the character string information in image portions (see reference numeral 1601 in FIG. 16) is added to the summarized translation of the input document 3, making the author's intention still clearer.

[0097] Comparing the input document 301 shown in FIG. 3 with the output document 1600 shown in FIG. 16, the character strings of the image portion 302, which are displayed in the source language (English), have also been translated and output in Japanese, so the author's intention is clearer. This is even more apparent from the display screen 1700 of the output document 1600 shown in FIG. 16.

[0098] The display area 302 in FIG. 17 is the same figure (image) information as in FIG. 3. Below it, a display area 1601 containing the translation of the character string portions of this display area 302 (the figure) is inserted.

[0099] (C) Third Embodiment
Next, a third embodiment, in which the document processing system according to the present invention is applied to a machine translation system for tagged documents, is described in detail with reference to the drawings.

[0100] In addition to functions similar to those of the second embodiment, the machine translation system of the third embodiment has a function of also displaying, when the translation result of an input document is displayed, the translation results of the link destination documents specified in that input document.

[0101] (C-1) Configuration of the Third Embodiment
FIG. 18 is a block diagram of the machine translation system of the third embodiment. Parts identical or corresponding to those of FIG. 11 of the second embodiment described above are given the same reference numerals.

[0102] As is clear from a comparison of FIG. 18 with FIG. 11, the machine translation system of the third embodiment includes an object storage means 11 in addition to the configuration of the machine translation system of the second embodiment.

[0103] The object storage means 11 stores a link object obtained by the link object acquisition means 9 so that it can be used as a new input document.

[0104] The tag identification means 4 of the third embodiment not only identifies the tags contained in the input document 3 and extracts the translation target expressions containing tags and the image tag information contained in the input document 3, but also extracts tag information that indicates links to other documents (hereinafter called link tags).

[0105] By adding the object storage means 11 to the configuration of the second embodiment, the machine translation system of the third embodiment can translate a document referred to by the input document 3 (a link destination document) as a new input document.

[0106] FIG. 22 shows the output document 8 (2200) obtained by processing the input document 301 (3) shown in FIG. 3 with the machine translation system of the third embodiment, and FIG. 23 shows the display screen of that output document 8 (2200). Comparing FIGS. 22 and 23 with FIGS. 16 and 17 described above shows that the machine translation system of the third embodiment can display the translation results of multiple linked documents at the same time, which was not possible in the second embodiment.

[0107] (C-2) Operation of the Third Embodiment
Next, the operation of the machine translation system of the third embodiment, which translates tagged documents as described above, is explained.

[0108] FIG. 19 is a flowchart showing the overall operation of the machine translation system of the third embodiment.

【０１０９】第３の実施形態の機械翻訳システムは、第
２の実施形態の機械翻訳システムが有する機能を実現す
るように動作するだけでなく、加えて、入力文書３のリ
ンクタグを参照することによりリンクしている文書を獲
得し、それを新たな入力文書として翻訳して、第２の実
施形態ではリンク情報だけしか示されないリンク先の文
書について、その要約翻訳結果もリンク元の入力文書の
翻訳結果に付加して表示するように動作する。The machine translation system of the third embodiment not only realizes the functions of the machine translation system of the second embodiment but also acquires the linked document by referring to the link tags of the input document 3 and translates it as a new input document. For a link destination document, for which the second embodiment shows only the link information, it also displays the summary translation result appended to the translation result of the linking input document.

【０１１０】まず、タグ識別手段４が、入出力手段２を
介してネットワーク１からタグ付きの入力文書３を入力
する（ステップ５０１）。入力文書３にはタグ情報が含
まれているので、タグ識別手段４は、翻訳対象タグ情報
格納手段５から翻訳対象タグ情報２００を得て、入力文
書３の翻訳対象を識別する（ステップ１２０１）。First, the tag identifying means 4 receives the tagged input document 3 from the network 1 via the input/output means 2 (step 501). Because the input document 3 contains tag information, the tag identifying means 4 obtains the translation target tag information 200 from the translation target tag information storage means 5 and identifies the translation targets in the input document 3 (step 1201).

【０１１１】ここで、識別されたタグが非言語情報（図
表やイメージ情報）を参照するイメージタグであれば
（ステップ１２０２）、リンクオブジェクト獲得手段９
が、入出力手段２を介して、ネットワーク１から当該リ
ンクタグに対応するリンクオブジェクトを得る（ステッ
プ１２０３）。しかる後に、符号化手段１０が、当該リ
ンクオブジェクトから言語情報、すなわち、文字列を認
識する（ステップ１２０４）。タグ識別手段４は、符号
化手段１０が認識した文字列をリンクオブジェクト獲得
手段９を介して得て、リンクタグの文字列と認識文字列
を翻訳対象として翻訳手段６へ渡す（ステップ１２０
５）。If the identified tag is an image tag that refers to non-linguistic information (a figure, table, or image) (step 1202), the link object acquisition means 9 obtains the link object corresponding to the link tag from the network 1 via the input/output means 2 (step 1203). The encoding means 10 then recognizes the linguistic information, that is, the character string, in the link object (step 1204). The tag identifying means 4 obtains the character string recognized by the encoding means 10 via the link object acquisition means 9, and passes the character string of the link tag and the recognized character string to the translation means 6 as translation targets (step 1205).

【０１１２】一方、上述したステップ１２０２の判断で
否定結果を得たならば、リンクタグか否かを判断する
（ステップ１９０１）。タグ識別手段４が識別したタグ
情報がリンクタグであるならば（ステップ１９０１）、
リンクオブジェクト獲得手段９が、入出力手段２を介し
て、ネットワーク１から当該リンクタグに対応するリン
クオブジェクトを得る（ステップ１９０２）。しかる後
に、リンクオブジェクト獲得手段９が、獲得したリンク
オブジェクトをオブジェクト格納手段１１に転送し、オ
ブジェクト格納手段１１は、格納したオブジェクトを新
たな入力文書としてタグ識別手段４に渡す（ステップ１
９０３）。タグ識別手段４は、オブジェクト格納手段１
１からのリンクオブジェクトの転送を受けると、それま
での動作を保留して、転送されたオブジェクトを新たな
入力文書とした動作を開始する（ステップ１９０４）。
タグ識別手段４は、新たな入力文書に対する動作を終え
ると、保留していた状態に復帰し、動作を続行する（ス
テップ５０３へ進む）。On the other hand, if the determination in step 1202 is negative, it is determined whether the tag is a link tag (step 1901). If the tag information identified by the tag identifying means 4 is a link tag (step 1901), the link object acquisition means 9 obtains the link object corresponding to the link tag from the network 1 via the input/output means 2 (step 1902). The link object acquisition means 9 then transfers the acquired link object to the object storage means 11, and the object storage means 11 passes the stored object to the tag identifying means 4 as a new input document (step 1903). On receiving the link object from the object storage means 11, the tag identifying means 4 suspends its current operation and starts processing the transferred object as a new input document (step 1904). When it finishes processing the new input document, the tag identifying means 4 returns to the suspended state and continues (proceeding to step 503).

【０１１３】一方、上述したステップ１９０１の判断
で、タグ識別手段４が識別したタグ情報がリンクタグで
ないならば、すなわち、タグ識別手段４が識別したタグ
が、イメージタグでもリンクタグでもないならば、タグ
識別手段４は、それを翻訳手段６に渡す（ステップ１２
０６）。On the other hand, if the determination in step 1901 finds that the tag information identified by the tag identifying means 4 is not a link tag, that is, if the identified tag is neither an image tag nor a link tag, the tag identifying means 4 passes it to the translation means 6 (step 1206).
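The three-way branch of steps 1202, 1901, and 1206 can be sketched as a small classifier. This is only an illustrative sketch: the patent does not say how a tag is recognized as an image tag or a link tag, so the regular expressions `IMAGE_TAG` and `LINK_TAG` and the function name `classify_tag` are assumptions introduced here.

```python
import re

# Hypothetical recognizers; the patent does not specify how tags are classified.
IMAGE_TAG = re.compile(r"<IMG\b[^>]*>", re.IGNORECASE)
LINK_TAG = re.compile(r"<A\b[^>]*\bHREF\s*=", re.IGNORECASE)


def classify_tag(matched: str) -> str:
    """Return 'image', 'link', or 'other' for a matched tagged string."""
    if IMAGE_TAG.search(matched):   # corresponds to the check in step 1202
        return "image"
    if LINK_TAG.search(matched):    # corresponds to the check in step 1901
        return "link"
    return "other"                  # passed straight to translation (step 1206)
```

Under these assumptions, a string such as `<I>parsing</I>` falls through both checks and is classified as `other`, which corresponds to the path that hands the string directly to the translation means 6.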

【０１１４】翻訳手段６は、タグ識別手段４から得た翻
訳対象を翻訳し、その結果を出力文書生成手段７に渡す
（ステップ５０３）。出力文書生成手段７は、翻訳手段
６から得た翻訳結果を入力文書３を参照しながら整形し
て出力文書８に格納する（ステップ５０４）。The translation means 6 translates the translation object obtained from the tag identification means 4 and passes the result to the output document generation means 7 (step 503). The output document generation means 7 shapes the translation result obtained from the translation means 6 while referring to the input document 3 and stores it in the output document 8 (step 504).

【０１１５】以上のステップ５０１〜ステップ５０４の
動作は、タグ識別手段４が入力文書３の全てを処理し、
出力文書生成手段７が翻訳手段６から全ての翻訳結果を
受け取り、出力文書８の整形が完了するまで繰り返され
る（ステップ５０５）。入力文書３が全て処理され、出
力文書８の整形が完了したならば、出力文書生成手段７
は、出力文書８を入出力手段２を介して表示し（ステッ
プ５０６）、この機械翻訳システムは動作を終了する
（ステップ１９０５）。Steps 501 to 504 above are repeated until the tag identifying means 4 has processed all of the input document 3, the output document generation means 7 has received all translation results from the translation means 6, and the formatting of the output document 8 is complete (step 505). Once the input document 3 has been fully processed and the output document 8 is formatted, the output document generation means 7 displays the output document 8 via the input/output means 2 (step 506), and the machine translation system ends its operation (step 1905).

【０１１６】次に、タグ識別手段４の動作を図面を参照
しながら説明する。ここで、図２０が、第３の実施形態
のタグ識別手段４の動作を示すフローチャートである。Next, the operation of the tag identifying means 4 will be described with reference to the drawings. Here, FIG. 20 is a flowchart showing the operation of the tag identifying means 4 of the third exemplary embodiment.

【０１１７】タグ識別手段４は、まず、オブジェクト格
納手段１１からのオブジェクトの転送があるか否かを判
断する（ステップ２００１）。オブジェクトの転送があ
れば、それまでの動作を保留して、オブジェクト格納手
段１１から文書を入力し、それを新たな入力文書として
新規に動作を開始するためにステップ６０１へ進む（ス
テップ２００２）。なお、タグ識別手段４は、新たな入
力文書に対する動作を終えると、保留していた状態に復
帰し、動作を続行する。The tag identifying means 4 first determines whether or not an object is transferred from the object storage means 11 (step 2001). If there is an object transfer, the operation up to that point is suspended, a document is input from the object storage means 11, and the operation proceeds to step 601 in order to newly start the operation as a new input document (step 2002). It should be noted that when the tag identifying means 4 finishes the operation for the new input document, it returns to the suspended state and continues the operation.

【０１１８】また、オブジェクト格納手段１１からのオ
ブジェクトの転送がない間は（ステップ２００１）、タ
グ識別手段４は、まず、入出力手段２を介して、ネット
ワーク１から入力文書３を得る（ステップ５０１）。While no object is being transferred from the object storage means 11 (step 2001), the tag identifying means 4 first obtains the input document 3 from the network 1 via the input/output means 2 (step 501).

【０１１９】次に、得られた入力文書３（オブジェクト
格納手段１１からの文書のこともあり得る）のタグ情報
に関わる表現を抽出するために、翻訳対象タグ情報格納
手段５によって格納されている翻訳対象タグ情報２００
のタグパターンを得る（ステップ６０１）。次に得られ
たタグパターンに適合する文字列を入力文書３から検索
し（ステップ６０２）、適合する文字列の存在の有無を
判断する（ステップ６０３）。Next, to extract the expressions related to the tag information of the obtained input document 3 (which may be a document from the object storage means 11), the tag identifying means 4 obtains a tag pattern from the translation target tag information 200 stored by the translation target tag information storage means 5 (step 601). It then searches the input document 3 for character strings matching the obtained tag pattern (step 602) and determines whether a matching character string exists (step 603).

【０１２０】ここで、当該タグパターンに適合する文字
列が入力文書３に存在するならば（ステップ６０３で肯
定結果）、適合した文字列に含まれるタグ情報がイメー
ジタグ又はリンクタグであるか否かを検査し（ステップ
２００３）、当該タグ情報がイメージタグ又はリンクタ
グであるならば、当該文字列をリンクオブジェクト獲得
手段９に転送する（ステップ１３０１）。If a character string matching the tag pattern exists in the input document 3 (affirmative result in step 603), the tag identifying means 4 checks whether the tag information contained in the matching character string is an image tag or a link tag (step 2003); if it is, the character string is transferred to the link object acquisition means 9 (step 1301).

【０１２１】次に、リンクオブジェクト獲得手段９及び
符号化手段１０の動作を経て認識された文字列をリンク
オブジェクト獲得手段９から受け取る（ステップ１３０
２）。そして、イメージタグの文字列と受け取った認識
文字列を翻訳対象として翻訳手段６へ転送する（ステッ
プ１２０５）。なお、タグ情報がイメージタグの場合に
これらステップ１３０２及びステップ１２０５が有効に
機能し、タグ情報がリンクタグの場合には、リンクオブ
ジェクト獲得手段９の動作により、上述したステップ２
００２が実行され、これらステップ１３０２及びステッ
プ１２０５は機能しない。Next, the tag identifying means 4 receives from the link object acquisition means 9 the character string recognized through the operations of the link object acquisition means 9 and the encoding means 10 (step 1302). It then transfers the character string of the image tag and the received recognized character string to the translation means 6 as translation targets (step 1205). Steps 1302 and 1205 take effect only when the tag information is an image tag; when it is a link tag, the operation of the link object acquisition means 9 causes step 2002 described above to be executed, and steps 1302 and 1205 do not take effect.

【０１２２】一方、上述したステップ２００３の判断
で、タグ識別手段４が識別したタグがイメージタグでも
リンクタグでもないならば、タグ識別手段４は、それを
翻訳手段６に渡す（ステップ１２０６）。On the other hand, if the tag identified by the tag identifying means 4 is neither an image tag nor a link tag in the judgment of step 2003, the tag identifying means 4 passes it to the translating means 6 (step 1206).

【０１２３】また、上述したステップ６０３で、適合す
る文字列が入力文書３に存在しない場合には、ステップ
１２０６及びステップ１３０１〜１２０５の動作は省略
される。If no matching character string exists in the input document 3 in step 603, the operations of step 1206 and steps 1301 to 1205 are omitted.

【０１２４】しかる後に、翻訳対象タグ情報格納手段５
によって格納されている翻訳対象タグ情報２００の全て
のタグパターンで、入力文書３を検索したか否かをチェ
ックし（ステップ６０５）、未だ検索していないタグパ
ターンが存在するならば、ステップ６０１〜ステップ１
２０５、又は、ステップ６０１〜ステップ１２０６を繰
り返す。After that, the tag identifying means 4 checks whether the input document 3 has been searched with all tag patterns of the translation target tag information 200 stored by the translation target tag information storage means 5 (step 605). If a tag pattern remains that has not yet been searched, steps 601 to 1205, or steps 601 to 1206, are repeated.

【０１２５】そして、全てのタグパターンによる検索が
完了したならば、タグ識別手段４は動作を終了する（ス
テップ１３０３）。When the search using all the tag patterns is completed, the tag identifying means 4 ends the operation (step 1303).
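The pattern loop of steps 601 to 605, together with the "suspend, process the linked document, resume" behavior of step 2002, can be sketched with a recursive function in which the call stack plays the role of the suspended state. This is a minimal sketch under stated assumptions: `collect_targets`, `fetch`, and the `seen` set are hypothetical names, the `seen` guard against cyclic links is an addition not described in the patent, and the ordering of the collected strings is illustrative only.

```python
import re
from typing import Callable, List, Optional, Set

# Hypothetical recognizer for a link tag and its reference destination.
LINK = re.compile(r'<A\s+HREF="([^"]+)">(.+?)</A>', re.IGNORECASE)


def collect_targets(doc: str, patterns: List[str],
                    fetch: Callable[[str], str],
                    seen: Optional[Set[str]] = None) -> List[str]:
    """Collect translation targets from doc and from documents it links to."""
    seen = set() if seen is None else seen
    targets: List[str] = []
    for pattern in patterns:                     # steps 601 and 605
        for m in re.finditer(pattern, doc):      # steps 602 and 603
            text = m.group(0)
            targets.append(text)
            link = LINK.fullmatch(text)          # link-tag case of step 2003
            if link and link.group(1) not in seen:
                seen.add(link.group(1))          # assumed guard against cycles
                # The recursive call stands in for "suspend and start anew"
                # (step 2002); returning from it resumes the outer document.
                targets.extend(
                    collect_targets(fetch(link.group(1)), patterns, fetch, seen))
    return targets
```

For example, with `fetch` backed by a dictionary that maps `ambiguity.html` to a small tagged document, the strings of the linked document are collected right after the link tag that refers to it, and the outer loop then continues with the remaining patterns.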

【０１２６】次に、第３の実施形態のリンクオブジェク
ト獲得手段９の動作を図面を参照しながら説明する。こ
こで、図２１が、第３の実施形態のリンクオブジェクト
獲得手段９の動作を示すフローチャートである。Next, the operation of the link object acquisition means 9 of the third embodiment will be described with reference to the drawings. Here, FIG. 21 is a flow chart showing the operation of the link object acquisition means 9 of the third exemplary embodiment.

【０１２７】リンクオブジェクト獲得手段９は、まず、
タグ識別手段４が、ステップ１３０１で転送した当該オ
ブジェクトの参照先を示す文字列（リンク情報）を受け
取り（ステップ１４０１）。イメージタグかリンクタグ
かを識別する（ステップ２１０１）。The link object acquisition means 9 first receives the character string indicating the reference destination of the object (the link information) that the tag identifying means 4 transferred in step 1301 (step 1401), and identifies whether it is an image tag or a link tag (step 2101).

【０１２８】ここで、転送されてきたリンク情報がイメ
ージタグであるならば、リンクオブジェクト獲得手段９
は、入出力手段２を介して、ネットワーク１から当該リ
ンク情報に対応するリンクオブジェクトを得る（ステッ
プ１２０３）。この場合は、リンクオブジェクトとして
イメージ情報が得られる。そして、得られたリンクオブ
ジェクトを符号化手段１０に転送し（ステップ１４０
２）、符号化手段１０がリンクオブジェクトから認識し
た認識文字列を符号化手段１０から得る（ステップ１４
０３）。最後に、リンクオブジェクト獲得手段９は、当
該認識文字列を、タグ識別手段４に返送し（ステップ１
４０４）、動作を終了する（ステップ２１０３）。If the transferred link information is an image tag, the link object acquisition means 9 obtains the link object corresponding to the link information from the network 1 via the input/output means 2 (step 1203). In this case, image information is obtained as the link object. The obtained link object is transferred to the encoding means 10 (step 1402), and the character string that the encoding means 10 recognized from the link object is obtained from the encoding means 10 (step 1403). Finally, the link object acquisition means 9 returns the recognized character string to the tag identifying means 4 (step 1404) and ends its operation (step 2103).

【０１２９】一方、転送されてきたリンク情報がイメー
ジタグでなく別の文書を参照するリンクタグであるなら
ば（ステップ２１０１）、リンクオブジェクト獲得手段
９は、入出力手段２を介して、ネットワーク１から当該
リンク情報に対応するリンクオブジェクトを得る（ステ
ップ１２０３）。この場合は、リンクオブジェクトとし
て参照先のタグ付き文書が得られる。次に、リンクオブ
ジェクト獲得手段９は、得られたリンクオブジェクトを
オブジェクト格納手段１１に転送し（ステップ２１０
２）、動作を終了する（ステップ２１０３）。On the other hand, if the transferred link information is not an image tag but a link tag referring to another document (step 2101), the link object acquisition means 9 obtains the link object corresponding to the link information from the network 1 via the input/output means 2 (step 1203). In this case, the referenced tagged document is obtained as the link object. The link object acquisition means 9 then transfers the obtained link object to the object storage means 11 (step 2102) and ends its operation (step 2103).
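The two paths through the link object acquisition means 9 (image tag versus link tag) can be sketched as follows. The function name `acquire_link_object`, the callables `fetch` and `recognize`, and the use of a `deque` for the object storage means 11 are all assumptions made for illustration; the patent does not describe the network access or character recognition at this level.

```python
from collections import deque
from typing import Callable, Deque, Optional


def acquire_link_object(kind: str, url: str,
                        fetch: Callable[[str], bytes],
                        recognize: Callable[[bytes], str],
                        object_store: Deque[bytes]) -> Optional[str]:
    """Fetch a link object and route it according to the tag kind."""
    obj = fetch(url)              # obtain the link object (step 1203)
    if kind == "image":           # image tag or link tag? (step 2101)
        # Hand the image to the recognizer and return the recognized
        # character string to the caller (steps 1402 to 1404).
        return recognize(obj)
    # Link tag: store the document for later processing (step 2102).
    object_store.append(obj)
    return None
```

In this sketch an image tag yields a recognized string for translation, while a link tag yields nothing directly and instead queues the fetched document so that it can be picked up as a new input document.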

【０１３０】なお、タグ識別手段４とリンクオブジェク
ト獲得手段９とは、ステップ１３０１とステップ１４０
１、及び、ステップ１４０４とステップ１３０２で互い
に同期して動作する。また、リンクオブジェクト獲得手
段９とオブジェクト格納手段１１とは、ステップ２１０
２で同期し、オブジェクト格納手段１１は、オブジェク
トが格納された時点で、タグ識別手段４とステップ２０
０２で同期して動作する。The tag identifying means 4 and the link object acquisition means 9 operate in synchronization with each other at steps 1301 and 1401, and at steps 1404 and 1302. The link object acquisition means 9 and the object storage means 11 synchronize at step 2102, and the object storage means 11 synchronizes with the tag identifying means 4 at step 2002 once the object has been stored.

【０１３１】この第３の実施形態における翻訳手段６
は、図７に示すフローチャートに従って、第１の実施形
態における翻訳手段６と同様に動作する。また、第３の
実施形態における出力文書生成手段７も、図８に示すフ
ローチャートに従って、第１の実施形態における出力文
書生成手段７と同様に動作する。さらに、第３の実施形
態における符号化手段１０は、図１５に示すフローチャ
ートに従って、第２の実施形態における符号化手段１０
と同様に動作する。The translation means 6 of the third embodiment operates in the same way as the translation means 6 of the first embodiment, following the flowchart of FIG. 7. The output document generation means 7 of the third embodiment likewise operates in the same way as that of the first embodiment, following the flowchart of FIG. 8. The encoding means 10 of the third embodiment operates in the same way as that of the second embodiment, following the flowchart of FIG. 15.

【０１３２】以下では、図３に示したタグ付き文書３０
１を入力文書３の例として、また、図２に示した翻訳対
象タグ情報２００を翻訳対象タグ情報格納手段５が格納
しているとして、第３の実施形態の機械翻訳システムの
動作を具体的に説明する。The operation of the machine translation system of the third embodiment is described concretely below, taking the tagged document 301 shown in FIG. 3 as an example of the input document 3 and assuming that the translation target tag information storage means 5 stores the translation target tag information 200 shown in FIG. 2.

【０１３３】まず、タグ識別手段４は入力文書３０１を
得る（ステップ５０１）。次に、タグ識別手段４は、翻
訳対象タグ情報格納手段５が格納する翻訳対象タグ情報
２００の第１のタグパターン２０１を得る（ステップ６
０１）。得られたタグパターンは「＜Ｉ＞．＋＜／Ｉ
＞」であるので、これに適合する文字列を入力文書３０
１から検索する（ステップ６０２）。その結果、「＜Ｉ
＞ｐａｒｓｉｎｇ＜／Ｉ＞」、「＜Ｉ＞ｔｒａｎｓｆｅ
ｒｒｉｎｇ＜／Ｉ＞」、「＜Ｉ＞ｇｅｎｅｒａｔｉｎｇ
＜／Ｉ＞」及び「＜Ｉ＞ｔｒａｎｓｆｅｒｍｅｔｈｏ
ｄ＜／Ｉ＞」がタグパターン２０１に適合する文字列と
して抽出され（ステップ６０３）、さらに、それらは、
イメージタグでもリンクタグはないので、タグ識別手段
４は、それらの文字列を翻訳対象として翻訳手段６に転
送する（ステップ１２０６）。First, the tag identifying means 4 obtains the input document 301 (step 501). Next, it obtains the first tag pattern 201 of the translation target tag information 200 stored in the translation target tag information storage means 5 (step 601). The obtained tag pattern is "<I>.+</I>", so the input document 301 is searched for character strings matching it (step 602). As a result, "<I>parsing</I>", "<I>transferring</I>", "<I>generating</I>", and "<I>transfer method</I>" are extracted as character strings matching the tag pattern 201 (step 603). Because they contain neither an image tag nor a link tag, the tag identifying means 4 transfers these character strings to the translation means 6 as translation targets (step 1206).
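The matching of the tag pattern "<I>.+</I>" can be tried with an ordinary regular expression engine. The sketch below uses Python's `re` module as a stand-in; the patent does not state which pattern-matching engine is used, and the sample sentence is hypothetical rather than taken from FIG. 3. One point worth noting is that in Python a greedy `.+` spans from the first `<I>` to the last `</I>` on a line, so a non-greedy `.+?` is assumed here in order to obtain one match per tagged expression, as in the result described above.

```python
import re

# Hypothetical sample line; not the actual content of the document in FIG. 3.
text = ("Translation consists of <I>parsing</I>, "
        "<I>transferring</I> and <I>generating</I>.")

# Greedy form, as the pattern is literally written in the description.
greedy = re.findall(r"<I>.+</I>", text)

# Non-greedy form, assumed here to get one match per tagged expression.
lazy = re.findall(r"<I>.+?</I>", text)

print(len(greedy))  # the greedy pattern yields a single long match
print(lazy)         # the non-greedy pattern yields each tagged expression
```

Whether the original system relied on a non-greedy matcher or on each tagged expression appearing on its own line is not stated in the description.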

【０１３４】翻訳手段６は、これらの４つの翻訳対象を
得て（ステップ７０１）、それらを翻訳し、それぞれ
「＜Ｉ＞解析＜／Ｉ＞」、「＜Ｉ＞変換＜／Ｉ＞」、
「＜Ｉ＞生成＜／Ｉ＞」、「＜Ｉ＞トランスファ方式＜
／Ｉ＞」なる翻訳結果を得る（ステップ７０２）。翻訳
手段６は、これら４つの翻訳結果を出力文書生成手段７
に転送する（ステップ７０３）。The translation means 6 obtains these four translation targets (step 701) and translates them, yielding the translation results "<I>解析</I>" (analysis), "<I>変換</I>" (conversion), "<I>生成</I>" (generation), and "<I>トランスファ方式</I>" (transfer method), respectively (step 702). The translation means 6 transfers these four translation results to the output document generation means 7 (step 703).

【０１３５】出力文書生成手段７は、翻訳手段６から得
た４つの翻訳結果を、入力文書３０１の書式を参照しな
がら出力文書２２００の所定の位置（図２２参照）に格
納する（ステップ８０２）。その後、タグ識別手段４に
おいて、翻訳対象タグ情報２００の次のタグパターンが
残されているので（ステップ６０５で肯定結果）、以下
同様にして、入力文書３０１のタグパターン２０２及び
２０３に適合する文字列が順次翻訳され、出力文書２２
００に格納される。The output document generation means 7 stores the four translation results obtained from the translation means 6 at predetermined positions in the output document 2200 (see FIG. 22) while referring to the format of the input document 301 (step 802). Because tag patterns of the translation target tag information 200 still remain in the tag identifying means 4 (affirmative result in step 605), the character strings in the input document 301 that match the tag patterns 202 and 203 are then translated in turn in the same way and stored in the output document 2200.

【０１３６】次に、タグ識別手段４は、翻訳対象タグ情
報格納手段５が格納する翻訳対象タグ情報２００の第４
のタグパターン２０４を得る（ステップ６０１）。その
結果、「＜ＡＨＲＥＦ＝”ａｍｂｉｇｕｉｔｙ．ｈｔ
ｍｌ”＞ａｍｂｉｇｕｉｔｙｐｒｏｇｒａｍ＜／Ａ＞」
がタグパターン２０４に適合する文字列として抽出され
る（ステップ６０３）。ここで、当該文字列に含まれる
タグはリンクタグ（＜ＡＨＲＥＦ．．．＞．．．＜／
Ａ＞）であるので（ステップ２００３）、当該文字列
「＜ＡＨＲＥＦ＝”ａｍｂｉｇｕｉｔｙ．ｈｔｍｌ”
＞ａｍｂｉｇｕｉｔｙｐｒｏｂｌｅｍ＜／Ａ＞」が、
リンクオブジェクト獲得手段９に転送される（ステップ
１３０１）。Next, the tag identifying means 4 obtains the fourth tag pattern 204 of the translation target tag information 200 stored in the translation target tag information storage means 5 (step 601). As a result, "<A HREF="ambiguity.html">ambiguity problem</A>" is extracted as a character string matching the tag pattern 204 (step 603). Because the tag contained in this character string is a link tag (<A HREF...>...</A>) (step 2003), the character string "<A HREF="ambiguity.html">ambiguity problem</A>" is transferred to the link object acquisition means 9 (step 1301).

【０１３７】リンクオブジェクト獲得手段９は、「＜Ａ
ＨＲＥＦ＝”ａｍｂｉｇｕｉｔｙ．ｈｔｍｌ”＞ａｍ
ｂｉｇｕｉｔｙｐｒｏｂｌｅｍ＜／Ａ＞」に適合する
リンクオブジェクト３０３をネットワーク１から入出力
手段２を介して得て（ステップ１９０２）、得られたリ
ンクオブジェクト３０３をオブジェクト格納手段１１に
転送する（ステップ２１０２）。The link object acquisition means 9 obtains the link object 303 corresponding to "<A HREF="ambiguity.html">ambiguity problem</A>" from the network 1 via the input/output means 2 (step 1902), and transfers the obtained link object 303 to the object storage means 11 (step 2102).

【０１３８】しかる後に、オブジェクト格納手段１１
は、タグ識別手段４にオブジェクトを転送するので（ス
テップ２００１）、タグ識別手段４はそれまでの動作
（すなわち、タグパターン２０４までを検査した状態で
動作）を保留して、オブジェクト格納手段１１からオブ
ジェクト３０３を入力し、それを新たな入力文書として
新規に動作を開始するためにステップ６０１へ進む（ス
テップ２００２）。The object storage means 11 then transfers the object to the tag identifying means 4 (step 2001), so the tag identifying means 4 suspends its current operation (that is, its state after checking up to the tag pattern 204), receives the object 303 from the object storage means 11, and proceeds to step 601 to start a fresh operation with it as a new input document (step 2002).

【０１３９】以下、タグ識別手段４は、翻訳対象タグ情
報格納手段５が格納する翻訳対象タグ情報２００の第１
のタグパターン２０１から第９のタグパターン２０９の
それぞれに適合する翻訳対象を得て、翻訳手段６によ
り、新規入力文書３０３のタグに関わる表現の翻訳が実
施され（ステップ５０３）、さらに、出力文書生成手段
７により、翻訳結果が出力文書２２００の所定の位置
（図２２の符号２２０１参照）に格納される（ステップ
５０４）。Thereafter, the tag identifying means 4 obtains the translation targets matching each of the first tag pattern 201 through the ninth tag pattern 209 of the translation target tag information 200 stored in the translation target tag information storage means 5, the translation means 6 translates the tag-related expressions of the new input document 303 (step 503), and the output document generation means 7 stores the translation results at the predetermined position in the output document 2200 (see reference numeral 2201 in FIG. 22) (step 504).

【０１４０】この時点で、タグ識別手段４のオブジェク
ト３０３に対する動作が完了したので、タグ識別手段４
は、上記で保留していたタグパターン２０４までを検査
した状態から動作を再開する。At this point the tag identifying means 4 has finished processing the object 303, so it resumes from the suspended state in which the tag patterns up to 204 had been checked.

【０１４１】以下同様にして、入力文書３０１のタグパ
ターン２０５〜２０７に適合する文字列が順次翻訳さ
れ、出力文書２２００に格納され、次に、タグ識別手段
４は、翻訳対象タグ情報格納手段５が格納する翻訳対象
タグ情報２００の第８のタグパターン２０８を得る（ス
テップ６０１）。その結果、「＜ｃｅｎｔｅｒ＞＜ＩＭ
ＧＳＲＣ＝”ＴＲＡＮＳＦＥＲ．ｇｉｆ”＞＜／ｃｅ
ｎｔｅｒ＞」がタグパターン２０８に適合する文字列と
して抽出される（ステップ６０３）。ここで、当該文字
列に含まれるタグはイメージタグ（＜ＩＭＧ．．．＞）
であるので（ステップ２００３）、当該文字列「＜ｃｅ
ｎｔｅｒ＞＜ＩＭＧＳＲＣ＝”ＴＲＡＮＳＦＥＲ．ｇ
ｉｆ”＞＜／ｃｅｎｔｅｒ＞」が、リンクオブジェクト
獲得手段９に転送される（ステップ１３０１）。In the same way, the character strings in the input document 301 that match the tag patterns 205 to 207 are translated in turn and stored in the output document 2200. Next, the tag identifying means 4 obtains the eighth tag pattern 208 of the translation target tag information 200 stored in the translation target tag information storage means 5 (step 601). As a result, "<center><IMG SRC="TRANSFER.gif"></center>" is extracted as a character string matching the tag pattern 208 (step 603). Because the tag contained in this character string is an image tag (<IMG...>) (step 2003), the character string "<center><IMG SRC="TRANSFER.gif"></center>" is transferred to the link object acquisition means 9 (step 1301).

【０１４２】リンクオブジェクト獲得手段９は、「＜ｃ
ｅｎｔｅｒ＞＜ＩＭＧＳＲＣ＝”ＴＲＡＮＳＦＥＲ．
ｇｉｆ”＞＜／ｃｅｎｔｅｒ＞」に適合するリンクオブ
ジェクト３０２をネットワーク１から入出力手段２を介
して得る（ステップ１２０３）。この得られたリンクオ
ブジェクト（図）に対する、タグ識別手段４、翻訳手段
６、出力文書生成手段７、リンクオブジェクト獲得手段
９及び符号化手段１０による具体的動作は、第２の実施
形態と同様であり、出力文書生成手段７によって、翻訳
手段６から得られた翻訳結果が、入力文書３０１の書式
を参照しながら出力文書２２００の所定の位置（図２２
の符号１６０１参照）に格納される。The link object acquisition means 9 obtains the link object 302 corresponding to "<center><IMG SRC="TRANSFER.gif"></center>" from the network 1 via the input/output means 2 (step 1203). The specific operations of the tag identifying means 4, the translation means 6, the output document generation means 7, the link object acquisition means 9, and the encoding means 10 on the obtained link object (a figure) are the same as in the second embodiment: the output document generation means 7 stores the translation result obtained from the translation means 6 at the predetermined position in the output document 2200 (see reference numeral 1601 in FIG. 22) while referring to the format of the input document 301.

【０１４３】以下同様にして、入力文書３０１のタグパ
ターン２０９に適合する文字列が順次翻訳され、出力文
書２２００に格納され、最終的に図２３に示す出力文書
の表示画面２３００が入出力手段２を介して表示される
（ステップ５０６）。In the same way, the character strings in the input document 301 that match the tag pattern 209 are translated in turn and stored in the output document 2200, and finally the display screen 2300 of the output document shown in FIG. 23 is displayed via the input/output means 2 (step 506).

【０１４４】（Ｃ−３）第３の実施形態の効果この第３の実施形態の機械翻訳システムによっても、第
２の実施形態の技術的思想をそのまま有するので、第２
の実施形態が有していた効果を奏することができる。(C-3) Effects of the Third Embodiment: Because the machine translation system of the third embodiment retains the technical idea of the second embodiment unchanged, it provides the same effects as the second embodiment.

【０１４５】これに加えて、第３の実施形態によれば、
入力文書３０１からリンクタグによって参照されるリン
ク先文書３０３のタグ情報に関わる表現のみに限定され
た部分が抽出され翻訳されており、従来、利用者が能動
的にリンクタグを辿ってみなければ獲得できなっかた情
報を得ることが可能となり、入力文書３０１の作者の意
図がより明確になる。このことは、図２２に示す出力文
書２２００の表示画面２３００を参照することで明らか
である。In addition, according to the third embodiment, only the portions of the link destination document 303 (referred to by a link tag in the input document 301) that relate to tag information are extracted and translated. This makes available information that a user could previously obtain only by actively following the link tag, and makes the intention of the author of the input document 301 clearer. This is evident from the display screen 2300 of the output document 2200 shown in FIG. 22.

【０１４６】なお、図２３には、図３の文書３０３に対
応する表示として、リンクタグ情報２２００の「＜Ａ＞
ＨＲＥＦ＝”ａｍｂｉｇｕｉｔｙ．ｈｔｍｌ”＞曖昧
性の問題＜／Ａ＞」に対応した下線が施された「曖昧性
の問題」の部分２３０２と、リンク先の文書３０３を要
約して翻訳した内容２２０１も表示されている。In FIG. 23, as the display corresponding to the document 303 of FIG. 3, the underlined portion 2302 reading 「曖昧性の問題」 ("ambiguity problem"), which corresponds to "<A HREF="ambiguity.html">曖昧性の問題</A>" in the link tag information 2200, is displayed together with the content 2201 obtained by summarizing and translating the link destination document 303.

【０１４７】（Ｄ）第４の実施形態次に、本発明による文書処理システムを、タグ付き文書
の機械翻訳システムに適用した第４の実施形態を図面を
参照しながら詳述する。(D) Fourth Embodiment: Next, a fourth embodiment, in which the document processing system according to the present invention is applied to a machine translation system for tagged documents, will be described in detail with reference to the drawings.

【０１４８】この第４の実施形態の機械翻訳システム
は、第３の実施形態と同様な機能に加えて、入力文書の
翻訳結果における文字スタイルの変換機能を有するもの
である。ここで、文字スタイルとは、文章を構成する字
体に対する下線、太字、斜体、強調等の修飾をいう。In addition to the functions of the third embodiment, the machine translation system of the fourth embodiment has a function for converting character styles in the translation result of the input document. Here, a character style means a decoration applied to the typeface of the text, such as underline, bold, italic, or emphasis.

【０１４９】タグ付き文書では、その文章を構成する字
体が下線、太字、斜体、強調等により修飾されることが
多く、このような文書を要約して翻訳した結果も、文字
スタイルの情報を有する。しかし、字体に対する文字ス
タイルは、要約されていない文書の表示、印刷等を意識
しており、要約した翻訳結果に対しては適していないこ
とも生じる。そのため、この第４の実施形態において
は、文字スタイルの変換機能を持たせている。In tagged documents, the typeface of the text is often decorated with underline, bold, italic, emphasis, and so on, and the result of summarizing and translating such a document also carries this character style information. However, these character styles were chosen with the display and printing of the unsummarized document in mind and may not suit the summarized translation result. The fourth embodiment therefore provides a character style conversion function.

【０１５０】（Ｄ−１）第４の実施形態の構成この第４の実施形態の機械翻訳システムは、第３の実施
形態における出力文書生成手段７及び出力文書（バッフ
ァ）８間に、図２４に示す詳細構成を有する文字スタイ
ル変換手段１２を設けたものである。(D-1) Configuration of the Fourth Embodiment: The machine translation system of the fourth embodiment adds a character style conversion means 12, whose detailed configuration is shown in FIG. 24, between the output document generation means 7 and the output document (buffer) 8 of the third embodiment.

【０１５１】図２４において、文字スタイル変換手段１
２は、文字スタイル処理制御部２０、文字スタイル登録
・編集テーブル２１、文字スタイル変換参照テーブル２
２、文字スタイル変換照合テーブル２３及び文字スタイ
ル変換判定処理部２４から構成されている。In FIG. 24, the character style conversion means 12 comprises a character style processing control unit 20, a character style registration/edit table 21, a character style conversion reference table 22, a character style conversion collation table 23, and a character style conversion determination processing unit 24.

【０１５２】文字スタイル変換判定処理部２４は、要約
翻訳結果を入力し、文字スタイル変換照合テーブル２３
から抽出したデータを用いて、利用者が登録した文字ス
タイルの変更指定した内容を判定し、該当する文字スタ
イルが入力された翻訳結果中に存在している場合には、
利用者の指定する文字スタイルへと変更し、変更後の要
約翻訳結果を出力文書８とするものである。The character style conversion determination processing unit 24 receives the summary translation result and, using the data extracted from the character style conversion collation table 23, determines which character style changes the user has registered. If a corresponding character style is present in the received translation result, the unit changes it to the character style specified by the user and outputs the changed summary translation result as the output document 8.

【０１５３】文字スタイル処理制御部２０は、入出力手
段２を介して利用者から起動され、利用者が文字スタイ
ル登録・編集テーブル２１を用いて翻訳結果における文
字スタイルの変更指定する内容を登録したり編集したり
するための処理や、登録された変更指定内容を用いて文
字スタイル変換参照テーブル２２を参照し、文字スタイ
ル変換照合テーブル２３にデータをセットしたりする処
理を制御するものである。The character style processing control unit 20 is activated by the user via the input/output means 2. It controls the processing by which the user registers and edits, in the character style registration/edit table 21, the character style changes to apply to the translation result, and the processing that looks up the registered changes in the character style conversion reference table 22 and sets the resulting data in the character style conversion collation table 23.

【０１５４】文字スタイル登録・編集テーブル２１は、
利用者が翻訳結果中の文字スタイルを変更指定する内容
を登録したり編集したりするために用いるテーブルであ
る。このテーブルへの登録又は編集では、利用者はタグ
付き文書で扱われる特殊な形態を意識せずに、通常の文
字スタイルの名称で登録することができるようになされ
ている。例えば、文書ではボールド（太字体）はタグ情
報文字列「＜Ｂ＞」、「＜／Ｂ＞」で表されるが、この
登録・編集時には、利用者が文字スタイルの種別名「ボ
ールド」と入力すれば良いようになされている。The character style registration/edit table 21 is the table the user uses to register and edit the character style changes to apply to the translation result. When registering or editing entries in this table, the user can use ordinary character style names without being aware of the special notation used in tagged documents. For example, bold is represented in a document by the tag information character strings "<B>" and "</B>", but when registering or editing, the user only needs to enter the character style name "bold".

【０１５５】文字スタイル変換参照テーブル２２は、利
用者が文字スタイル登録・編集テーブル２１に登録・編
集した翻訳結果中の文字スタイルの変更指定内容を、こ
れに対応するタグ情報（文字列）に変換するための情報
を格納しているものである。この文字スタイル変換参照
テーブル２２の格納内容は、文字スタイル処理制御部２
０が、利用者が文字スタイル登録・編集テーブル２１の
格納内容に応じたデータを、文字スタイル変換照合テー
ブル２３にセットする際に参照される。The character style conversion reference table 22 stores the information for converting the character style changes that the user registered or edited in the character style registration/edit table 21 into the corresponding tag information (character strings). The character style processing control unit 20 refers to the contents of the character style conversion reference table 22 when it sets data corresponding to the contents of the character style registration/edit table 21 in the character style conversion collation table 23.

【０１５６】文字スタイル変換照合テーブル２３は、利
用者が登録・編集した文字スタイルの変更指定内容に対
応する、タグ付き文書で特殊な形態として扱われるタグ
情報がセットされるものである。セットされたデータ
は、文字スタイル変換判定処理部２４で利用される。The character style conversion and collation table 23 is set with tag information which is treated as a special form in a tagged document, which corresponds to the content of change specification of the character style registered / edited by the user. The set data is used by the character style conversion determination processing unit 24.

【０１５７】図２５は、上述した文字スタイル変換参照
テーブル２２の構成例を示す説明図である。FIG. 25 is an explanatory diagram showing a configuration example of the character style conversion reference table 22 described above.

【０１５８】文字スタイル変換参照テーブル２２は、文
字スタイル参照見出し項目２２Ａと文字スタイル変換参
照見出し項目２２Ｂとから構成されている。The character style conversion reference table 22 is composed of a character style reference heading item 22A and a character style conversion reference heading item 22B.

【０１５９】文字スタイル参照見出し項目２２Ａには、
通常の文字スタイルの名称（例えば、「ボールド」、
「イタリック」、「アンダーライン」等）が格納されて
おり、文字スタイル変換参照見出し項目２２Ｂには、文
字スタイル参照見出し項目２２Ａのデータに対応するタ
グ付文書の中で特殊な形態で用いられるタグ情報の見出
し（例えば、「＜Ｂ＞」、「＜／Ｂ＞」等）が格納され
ている。The character style reference heading item 22A stores ordinary character style names (for example, "bold", "italic", and "underline"), and the character style conversion reference heading item 22B stores the tag information, in the special notation used in tagged documents, that corresponds to the data in item 22A (for example, "<B>" and "</B>").

【０１６０】図２５の例では、文字スタイル参照見出し
項目２２Ａのデータとして、「デフォルト文字」、「ボ
ールド」、「イタリック」、「アンダーライン」、「強
調」、「強い強調」が格納され、これに対応する文字ス
タイル変換参照見出し項目２２Ｂのデータとして、「デ
フォルト文字」ではデータがなく、「ボールド」では
「＜Ｂ＞」と「＜／Ｂ＞」、「イタリック」では「＜Ｉ
＞」と「＜／Ｉ＞」、「アンダーライン」では「＜Ｕ
＞」と「＜／Ｕ＞」、「強調」では「＜ＥＭ＞」と「＜
／ＥＭ＞」、「強い強調」では「＜ＳＴＲＯＮＧ＞」と
「＜／ＳＴＲＯＮＧ＞」が格納されている。また、この
中で、「デフォルト文字」とは文字スタイル（字体修
飾）の指定がないものである。なお、「＜Ｂ＞」はその
直後の文字からボールド（太文字）とすることを表すタ
グ情報であり、「＜／Ｂ＞」はその直前の文字までボー
ルドとすることを表すタグ情報であり、他の記号（タグ
情報）も同様である。In the example of FIG. 25, "default character", "bold", "italic", "underline", "emphasis", and "strong emphasis" are stored as the data of the character style reference headline item 22A. As the data of the character style conversion reference heading item 22B corresponding to, there is no data in "default character", "" and "" in "bold", and " ”And“ ”, and“ ”And“ ”, and“ ”and“ and “strong emphasis”, “” and “” are stored. In addition, the "default character" has no designation of a character style (character style modification). Note that “” is tag information indicating that the character immediately after it is bolded (bold), and “” is tag information indicating that the character immediately before it is bolded. , And other symbols (tag information).

【０１６１】 FIG. 26 is an explanatory diagram showing the configuration of the character style registration/edit table 21, which the user uses to register and edit the character style changes to be applied to the translation result, together with an example of its registered contents.

【０１６２】 The character style registration/edit table 21 is composed of a character style heading item 21A and a character style conversion heading item 21B. The character style heading item 21A stores the name, entered by the user, of a character style in the translation result that the user wants to change. The character style conversion heading item 21B stores the name, also entered by the user, of the character style after the change, corresponding to the name registered in the character style heading item 21A.

【０１６３】 In the example of FIG. 26, "italic" and "bold" are registered as character style heading data, and "bold" and "emphasis" are registered as the respective character style conversion heading data.

【０１６４】 FIG. 27 is an explanatory diagram showing a configuration example of the character style conversion collation table 23 and an example of its stored contents. FIG. 27 shows the contents of the character style conversion collation table 23 when the character style conversion reference table 22 holds the contents shown in FIG. 25 and the character style registration/edit table 21 holds the contents shown in FIG. 26.

【０１６５】 When the user has finished specifying character style changes in the character style registration/edit table 21 shown in FIG. 26, the character style processing control unit 20 refers to the character style conversion reference table 22 shown in FIG. 25 and converts the character style changes that the user registered or edited in the character style registration/edit table 21 into the corresponding tag information, which is handled in a special form in tagged documents. The converted data is set in the character style conversion collation table 23.

【０１６６】 The character style conversion collation table 23 is composed of a character style collation heading item 23A and a character style conversion collation heading item 23B.

【０１６７】 The character style collation heading item 23A is set with the data of the character style heading item 21A of the character style registration/edit table 21 after it has been converted, by reference to the character style conversion reference table 22, into the symbols used in a special form in tagged documents. The character style conversion collation heading item 23B is set with the data of the character style conversion heading item 21B of the character style registration/edit table 21 after the same conversion by reference to the character style conversion reference table 22.

【０１６８】 In the example of FIG. 27, the character style collation heading item 23A holds "<I>" and "</I>" for "italic", which is data of the character style heading item 21A of the character style registration/edit table 21 in FIG. 26, obtained by referring to the character style conversion reference table 22 of FIG. 25. Likewise, it holds "<B>" and "</B>" for "bold".

【０１６９】 The character style conversion collation heading item 23B holds "<B>" and "</B>" for "bold", which is data of the character style conversion heading item 21B of the character style registration/edit table 21 in FIG. 26, obtained by referring to the character style conversion reference table 22 of FIG. 25. Likewise, it holds "<EM>" and "</EM>" for "emphasis".
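The derivation of FIG. 27 from FIGS. 25 and 26 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the disclosed implementation: the table variables and build_collation_table are hypothetical names, and only the three styles needed for the FIG. 26 example are included.

```python
from typing import Dict, List, Tuple

TagPair = Tuple[str, str]  # (opening tag, closing tag)

# Subset of the character style conversion reference table 22 (FIG. 25).
REFERENCE_TABLE_22: Dict[str, TagPair] = {
    "bold": ("<B>", "</B>"),
    "italic": ("<I>", "</I>"),
    "emphasis": ("<EM>", "</EM>"),
}

# Character style registration/edit table 21 as registered in FIG. 26:
# (item 21A: style to change, item 21B: style after the change).
REGISTRATION_TABLE_21: List[Tuple[str, str]] = [
    ("italic", "bold"),
    ("bold", "emphasis"),
]


def build_collation_table(
    registrations: List[Tuple[str, str]],
    reference: Dict[str, TagPair],
) -> List[Tuple[TagPair, TagPair]]:
    """Convert each style name into its tag pair (items 23A and 23B)."""
    return [(reference[src], reference[dst]) for src, dst in registrations]


COLLATION_TABLE_23 = build_collation_table(REGISTRATION_TABLE_21, REFERENCE_TABLE_22)
# COLLATION_TABLE_23[0] is (("<I>", "</I>"), ("<B>", "</B>"))
```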

【０１７０】 (D-4) Operation of the Fourth Embodiment
The operations of the character style processing control unit 20 and the character style conversion determination processing unit 24, which execute the processing characteristic of the fourth embodiment, are described below in order.

【０１７１】 FIGS. 28 and 29 are flowcharts of the processing performed by the character style processing control unit 20.

【０１７２】 When the user starts the character style registration/editing process via the input/output means 2, the character style processing control unit 20 begins the series of processes shown in the flowcharts of FIGS. 28 and 29.

【０１７３】 First, the existing character style registration/edit table data is extracted from the character style registration/edit table 21 and displayed so that the user can register new character style data or edit existing data (step 2801).

【０１７４】 Then, through interactive data exchange with the user, the user registers or edits the information that specifies character style changes in the displayed character style registration/edit table data. The unit guides this work to completion and, on receiving a completion command from the user, passes control to step 2804 (steps 2802 and 2803). As noted above, during registration and editing the user can work with ordinary character style names, without having to be aware of the special symbol forms used in tagged documents.

【０１７５】 In step 2804, the character style registration/edit table data for which registration and editing have finished is saved to the character style registration/edit table 21.

【０１７６】 Next, the saved data of the character style registration/edit table 21 is collated with the data of the character style reference heading item 22A of the character style conversion reference table 22 (step 2805). The collation result is then checked to determine whether the data of the character style registration/edit table 21 corresponds to data of the character style reference heading item 22A (step 2806).

【０１７７】 If the determination finds even one entry in the character style registration/edit table 21 that has no corresponding data (that is, no data of the character style reference heading item 22A matches it), that entry is deleted from the character style registration/edit table 21 (step 2807) and control returns to step 2801. The table data is then extracted from the character style registration/edit table 21, in which the entries absent from the character style reference heading item 22A are now empty table cells, and displayed to prompt the user to register correct data. Processing then proceeds through steps 2802 to 2806, where the same determination is made, and this loop repeats until every entry of the character style registration/edit table 21 matches.
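The check of steps 2805 to 2807 can be sketched as below. This is a hedged Python illustration, not the disclosed implementation: validate_registrations is a hypothetical name, an emptied table cell is represented by None, and redisplaying the table to the user (steps 2801 to 2803) is left to the caller.

```python
from typing import List, Optional, Set, Tuple

Row = Tuple[Optional[str], Optional[str]]  # (item 21A, item 21B)


def validate_registrations(
    registrations: List[Row],
    known_styles: Set[str],
) -> Tuple[List[Row], bool]:
    """Collate table 21 against item 22A (steps 2805 and 2806).

    Any style name not found in known_styles is deleted, leaving an
    empty cell (None), as in step 2807. Returns the cleaned rows and a
    flag that is True only when every cell matched.
    """
    cleaned: List[Row] = []
    all_valid = True
    for src, dst in registrations:
        new_src = src if src in known_styles else None
        new_dst = dst if dst in known_styles else None
        if new_src is None or new_dst is None:
            all_valid = False
        cleaned.append((new_src, new_dst))
    return cleaned, all_valid


rows, ok = validate_registrations(
    [("italic", "bold"), ("blink", "emphasis")],
    {"bold", "italic", "emphasis"},
)
# ok is False, so the caller would redisplay rows and ask the user to
# fill in the emptied cell before running the check again.
```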

【０１７８】 When all the data (style names) in the character style registration/edit table 21 corresponds to data of the character style reference heading item 22A, one item of not-yet-extracted data is extracted from the data of the character style registration/edit table 21 saved in step 2804 (step 2808), and it is determined whether unextracted data existed (step 2809).

【０１７９】 The extraction of unextracted data in step 2808 is performed once each time the processing loop of steps 2808 to 2812 is repeated. The order in which the unextracted data is extracted on each pass is, for example, as follows.

【０１８０】 First, an item of data of the character style heading item 21A of the character style registration/edit table 21 is extracted. The next time control reaches step 2808, the data of the character style conversion heading item 21B that corresponds to the extracted data of the character style heading item 21A is extracted. After that, the next item of data of the character style heading item 21A is extracted, and so on.

【０１８１】 If no unextracted data can be extracted from the character style registration/edit table 21 (that is, all data has been extracted and processed and no unextracted data remains), the character style processing control unit 20 ends the series of processes.

【０１８２】 If, on the other hand, unextracted data was extracted from the character style registration/edit table 21, the extracted data (a character style name) is collated with the data (character style names) of the character style reference heading item 22A of the character style conversion reference table 22 (step 2810). The data (tags) of the character style conversion reference heading item 22B that corresponds to the matching data (character style name) of the character style reference heading item 22A is then extracted, and the target data (character style name) is replaced with the extracted data (tag information) (step 2811).

【０１８３】 The replaced data (tag information) is then set in the character style collation heading item 23A or the character style conversion collation heading item 23B of the character style conversion collation table 23 (step 2812), and processing returns to step 2808 described above.

【０１８４】 The order in which data is set in the character style conversion collation table 23 corresponds to the order in which data is extracted from the character style registration/edit table 21 and is, for example, as follows.

【０１８５】 Data is first set in the character style collation heading item 23A. The next time step 2812 is reached, data is set in the character style conversion collation heading item 23B; the time after that, data is set in the character style collation heading item 23A again. Thereafter, items 23B and 23A alternate in the same way.

【０１８６】 As a result, the replacement data for data extracted from the character style heading item 21A of the character style registration/edit table 21 is set in the character style collation heading item 23A of the character style conversion collation table 23. The replacement data for data extracted from the character style conversion heading item 21B of the character style registration/edit table 21 is set in the character style conversion collation heading item 23B, in correspondence with the character style collation heading item 23A.
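The alternating extraction and setting order of steps 2808 to 2812 can be sketched as follows. This is an illustrative Python sketch only: extraction_order and fill_collation_table are hypothetical names, and items 23A and 23B are modelled as two parallel lists.

```python
from typing import Dict, Iterator, List, Tuple

TagPair = Tuple[str, str]  # (opening tag, closing tag)


def extraction_order(table_21: List[Tuple[str, str]]) -> Iterator[str]:
    """Yield cells in the order of step 2808: 21A, its 21B, next 21A, ..."""
    for heading, conversion in table_21:
        yield heading
        yield conversion


def fill_collation_table(
    table_21: List[Tuple[str, str]],
    table_22: Dict[str, TagPair],
) -> Tuple[List[TagPair], List[TagPair]]:
    """Run the loop of steps 2808 to 2812 and return (item 23A, item 23B)."""
    column_23a: List[TagPair] = []
    column_23b: List[TagPair] = []
    for index, style_name in enumerate(extraction_order(table_21)):
        tags = table_22[style_name]  # steps 2810 and 2811: name -> tags
        # Step 2812: even-numbered extractions go to 23A, odd ones to 23B.
        target = column_23a if index % 2 == 0 else column_23b
        target.append(tags)
    return column_23a, column_23b


item_23a, item_23b = fill_collation_table(
    [("italic", "bold"), ("bold", "emphasis")],
    {"bold": ("<B>", "</B>"), "italic": ("<I>", "</I>"), "emphasis": ("<EM>", "</EM>")},
)
```

Because the extraction order and the setting order alternate in step, row n of item 23A always lines up with row n of item 23B.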

【０１８７】 The processing loop of steps 2808 to 2812 described above is repeated as many times as there are items of data in the character style registration/edit table 21. When no unextracted data remains to be extracted from the character style registration/edit table 21, the character style processing control unit 20 ends the series of processes.

【０１８８】 Next, the operation of the character style conversion determination processing unit 24, which uses the contents of the character style conversion collation table 23 set as described above, is explained with reference to the drawings.

【０１８９】 FIGS. 30 and 31 are flowcharts of the processing performed by the character style conversion determination processing unit 24. They show the processing for one sentence of the tagged translation data output from the output document generation means 7.

【０１９０】 Broadly speaking, the character style conversion determination processing unit 24 receives the output document (tagged translation data) from the output document generation means 7 and, using the character style conversion data extracted from the character style conversion collation table 23, checks it against the character style changes that the user registered. Where a matching character style (tag information) exists in the translation data, the unit changes it to the character style specified by the user. The resulting data (or, when no change is needed, the data output from the output document generation means 7 as it is) becomes the output document 8.

【０１９１】 When the character style conversion determination processing unit 24 starts the processing shown in FIGS. 30 and 31, it first reads the tagged translation data output from the output document generation means 7 (step 3001).

【０１９２】 It then attempts to extract one item of data (tag information relating to a character style) from the character style collation heading item 23A of the character style conversion collation table 23 (step 3002). Each subsequent time control reaches step 3002, it attempts to extract the next item of data from the character style collation heading item 23A in a predetermined order.

【０１９３】 After this extraction attempt, it is determined whether unextracted data was obtained, in other words, whether the character style collation heading item 23A of the character style conversion collation table 23 still contained data that had not been extracted (step 3003).

【０１９４】 If no unextracted data could be obtained (that is, the character style collation heading item 23A contains no data that has not been extracted), processing proceeds to step 3009, described later.

【０１９５】 If, on the other hand, data (tag information relating to a character style) was extracted from the character style collation heading item 23A of the character style conversion collation table 23, processing moves to the routine of steps 3004 to 3008. This routine replaces the character strings in the tagged translation data read from the output document generation means 7 that correspond to the extracted data with other data (tag information relating to another character style).

【０１９６】 In this routine, character data is first extracted from the head of the tagged translation data read from the output document generation means 7 (step 3004). After confirming that the extracted character data is not the final character data (end-of-sentence data) (step 3005), the routine collates it with the data (tag information relating to a character style) of the character style collation heading item 23A of the character style conversion collation table 23 extracted in step 3002 (step 3006), and determines whether the two match and no execution symbol has been attached (step 3007). If they do not match, or if they match but an execution symbol is attached, processing returns to step 3004 and the next character data is extracted from the tagged translation data.

【０１９７】 Here, the execution symbol is a symbol attached to character data processed in the subsequent step 3008 to indicate explicitly that processing has already been carried out on it. This makes it possible to rule out conflicting processing of the same data. That is, the symbol is attached at the first replacement so that character data replaced in step 3008 (for example, "<I>" replaced with "<B>") is not replaced again on a later pass through step 3008 (for example, "<B>" replaced with "<EM>").
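The conflict that the execution symbol prevents can be seen in a small Python sketch. This is an illustration of the problem only, not part of the disclosure; naive_convert is a hypothetical name, and the rules correspond to the FIG. 27 example (italic to bold, bold to emphasis) applied one after another without any marking.

```python
from typing import List, Tuple


def naive_convert(text: str, rules: List[Tuple[str, str]]) -> str:
    """Apply each (old tag, new tag) rule in turn with no execution symbol."""
    for old, new in rules:
        text = text.replace(old, new)
    return text


rules = [("<I>", "<B>"), ("</I>", "</B>"), ("<B>", "<EM>"), ("</B>", "</EM>")]
result = naive_convert("<I>a</I> <B>b</B>", rules)
# result == "<EM>a</EM> <EM>b</EM>"
# The italic text was meant to become bold, but the tags produced by the
# first rule were replaced again by the second, so it ended up as emphasis.
```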

【０１９８】 When repeating the processing loop of steps 3004 to 3007 finds character data in the translation data that matches the data (tag information relating to a character style) of the character style collation heading item 23A extracted in step 3002 and that has no execution symbol attached (affirmative result in step 3007), the data (tag information relating to a character style) of the character style conversion collation heading item 23B that corresponds to the matched data of the character style collation heading item 23A is extracted from the character style conversion collation table 23. The matched part of the translation data is replaced with this data, and an execution symbol is attached to indicate explicitly that processing has been carried out (step 3008).

【０１９９】 Even after this processing, matching character data may remain toward the end of the sentence, so processing returns to step 3004 and the next character data is extracted.

【０２００】 By repeating the processing loop of steps 3004 to 3007, or of steps 3004 to 3008, in this way, every item of character data between the head and the end of the sentence that matches the data extracted from the character style collation heading item 23A in the current pass of step 3002 is replaced. When this is finished, step 3005 yields an affirmative result (the end of the sentence has been reached), and processing returns to step 3002 to extract the next unextracted data from the character style conversion collation table 23.

【０２０１】 When the above processing has been repeated for all the data stored in the character style collation heading item 23A of the character style conversion collation table 23 and no more data can be extracted in step 3002, the execution symbols attached in step 3008 are removed from the data, the data is output (step 3009), and the series of processes ends.
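The overall flow of steps 3002 to 3009 can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the disclosed implementation: convert_styles and MARK are hypothetical names, the execution symbol is modelled as a private-use character placed in front of each replaced tag, and a regular expression stands in for the character-by-character scan of steps 3004 to 3007.

```python
import re
from typing import List, Tuple

TagPair = Tuple[str, str]  # (opening tag, closing tag)

MARK = "\uE000"  # hypothetical execution symbol (a private-use character)


def convert_styles(sentence: str, collation_table: List[Tuple[TagPair, TagPair]]) -> str:
    """Replace item 23A tags with item 23B tags, at most once per tag."""
    for old_tags, new_tags in collation_table:  # step 3002: next 23A entry
        for old, new in zip(old_tags, new_tags):
            # Steps 3004 to 3007: find tags equal to `old` that do not
            # already carry the execution symbol.
            pattern = re.compile("(?<!" + MARK + ")" + re.escape(old))
            # Step 3008: replace the tag and attach the execution symbol.
            sentence = pattern.sub(lambda m, new=new: MARK + new, sentence)
    # Step 3009: remove the execution symbols before output.
    return sentence.replace(MARK, "")


table_23 = [
    (("<I>", "</I>"), ("<B>", "</B>")),
    (("<B>", "</B>"), ("<EM>", "</EM>")),
]
converted = convert_styles("<I>a</I> <B>b</B>", table_23)
# converted == "<B>a</B> <EM>b</EM>"
```

With the marking in place, text that was italic becomes bold and stays bold, while text that was originally bold becomes emphasis.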

【０２０２】 Although a concrete example is omitted, the operation of the character style conversion means 12, provided between the output document generation means 7 and the output document (buffer) 8, converts character styles that are unsuitable for a summarized translation result into character styles suited to it.

【０２０３】 (D-3) Effects of the Fourth Embodiment
The machine translation system of the fourth embodiment retains the technical idea of the third embodiment unchanged, so it also provides the effects of the third embodiment.

【０２０４】 In addition, according to the fourth embodiment, the character style conversion means 12 is provided between the output document generation means 7 and the output document (buffer) 8. Character styles in the summary translation result output from the output document generation means 7 that are unsuitable for a summary can therefore be converted into character styles suited to a summary.

【０２０５】 (E) Other Embodiments
The embodiments above always execute summary translation processing, but the user may be allowed to choose between summary translation processing and normal translation processing via the input/output means 2. In that case, for the second embodiment, the user may also be allowed to specify whether character strings in images are translated even when summary translation processing is selected. For the third embodiment, the user may likewise be allowed to specify whether character strings in images are translated even when summary translation processing is selected, and may also be allowed to specify whether images and/or link destination documents are translated. Further, for the third embodiment, when link destination documents are translated, the user may be allowed to specify the link depth down to which they are translated. Furthermore, for the third embodiment, the original input document may be translated in full without summarization while link destination documents are translated in summary.

【０２０６】 Such switching of processing can be achieved by reading in the processing method selected by the user and providing, where appropriate in the series of processes shown in the flowcharts above, branch steps that route the processing according to the selected method.

【０２０７】 In the embodiments above, link objects are fetched via a network. When the machine translation system is built on a stand-alone information processing system, however, a link document storage means may be provided and link objects fetched from it. Both may also be supported, that is, fetching link objects from the network and fetching them from a link document storage means provided in the machine translation system.

【０２０８】 In the second and third embodiments above, whenever a non-language document that is a link destination document is included in the output document, translations of the character strings in that non-language document are also included. However, the non-language document may simply be included as it is.

【０２０９】 Furthermore, in the embodiments above, the translation target tag information storage means 5 can be registered and edited by the user. It may instead be provided by the system in fixed form, or both a user-editable store and a system-fixed store may be provided.

【０２１０】 The embodiments above show a machine translation system for tagged documents. However, the technical idea of the present invention can be applied to machine translation of any document accompanied by information that specifies the output format or link destination documents and serves the same function as tag information in a tagged document, even if that information differs from tag information, which consists of text data strings like the body text.

【０２１１】 Furthermore, although the embodiments above apply the present invention to a machine translation system, the invention can also be applied to other document processing systems. Many document processing systems (natural language processing systems) perform analysis similar to that of a machine translation system without converting to a target language. The invention can be applied to such systems, and tag information can be used to limit which parts of a document are analyzed. A document processing system (summary creation system) may also be configured with only the part of the machine translation system of the embodiments above that extracts predetermined portions of a document using tag information. In this case too, the same processing may be applied to link destination documents to extract predetermined portions.

【０２１２】 The fourth embodiment above converts the character styles of a summary document based on tag information, but styles of units larger than a character may also be made convertible.

【０２１３】[0213]

[Effects of the Invention] As described above, according to the present invention, a document processing system that processes a document accompanied by format designation information designating the format for display and print output comprises: (1) extraction target specific information storage means that stores specific patterns of the format designation information for extracting character strings of a predetermined type from an input document; (2) format designation information identification and extraction means that extracts the portions of the input document matching the specific patterns of the format designation information stored in the extraction target specific information storage means; and (3) output document generation means that generates an output document by arranging the extracted portions, or by arranging the extracted portions after subsequent processing. Consequently, depending on what is stored in the extraction target specific information storage means, the system can accommodate configurations that retrieve not only predetermined portions of the input document but also non-language documents and link destination documents, and can produce an output document that also reflects this special document information.
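The three means listed above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the implementation of the embodiments: the pattern store is reduced to a list of HTML-like tag names, tags with attributes are not handled, and `EXTRACTION_PATTERNS`, `extract_matching`, and `generate_output` are hypothetical names. The optional `process` callback stands in for the subsequent processing (for example, translation).

```python
import re

# (1) Hypothetical pattern store: tag names whose enclosed text is extracted.
# The tag names are assumptions chosen for illustration.
EXTRACTION_PATTERNS = ["TITLE", "H1", "P"]


def extract_matching(document, tag_names):
    """(2) Return (tag, text) pairs for each region enclosed by a stored tag."""
    alternation = "|".join(re.escape(name) for name in tag_names)
    pattern = re.compile(
        rf"<({alternation})>(.*?)</\1>", re.IGNORECASE | re.DOTALL
    )
    return [
        (m.group(1).upper(), m.group(2).strip())
        for m in pattern.finditer(document)
    ]


def generate_output(parts, process=None):
    """(3) Arrange the extracted parts, optionally post-processed, as output."""
    lines = []
    for tag, text in parts:
        if process is not None:
            text = process(text)   # e.g. a translation step
        lines.append(f"<{tag}>{text}</{tag}>")
    return "\n".join(lines)


if __name__ == "__main__":
    source = (
        "<HTML><TITLE>Hello</TITLE><BODY><H1>News</H1>"
        '<IMG SRC="a.gif"><P>Some text</P></BODY></HTML>'
    )
    parts = extract_matching(source, EXTRACTION_PATTERNS)
    # str.upper is only a placeholder for real processing such as translation.
    print(generate_output(parts, process=str.upper))
```

Changing what the system extracts only requires changing the contents of the pattern store, which mirrors the point that the behavior depends on what the storage means holds.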

[Brief description of drawings]

FIG. 1 is a block diagram showing the overall configuration of the first embodiment.

FIG. 2 is an explanatory diagram showing a configuration example of the translation target tag information storage means of the first embodiment.

FIG. 3 is a diagram showing an input document used to describe a specific operation example of the first embodiment.

FIG. 4 is a diagram showing a display screen of the input document of FIG. 3.

FIG. 5 is a flowchart showing the overall operation of the first embodiment.

FIG. 6 is a flowchart showing the operation of the tag identification means of the first embodiment.

FIG. 7 is a flowchart showing the operation of the translation means of the first embodiment.

FIG. 8 is a flowchart showing the operation of the output document generation means of the first embodiment.

FIG. 9 is a diagram showing an example of an output document (corresponding to the input document of FIG. 3) produced by the operation of the first embodiment.

FIG. 10 is a diagram showing a display screen of the output document of FIG. 9.

FIG. 11 is a block diagram showing the overall configuration of the second embodiment.

FIG. 12 is a flowchart showing the overall operation of the second embodiment.

FIG. 13 is a flowchart showing the operation of the tag identification means of the second embodiment.

FIG. 14 is a flowchart showing the operation of the link object acquisition means of the second embodiment.

FIG. 15 is a flowchart showing the operation of the encoding means of the second embodiment.

FIG. 16 is a diagram showing an example of an output document (corresponding to the input document of FIG. 3) produced by the operation of the second embodiment.

FIG. 17 is a diagram showing a display screen of the output document of FIG. 16.

FIG. 18 is a block diagram showing the overall configuration of the third embodiment.

FIG. 19 is a flowchart showing the overall operation of the third embodiment.

FIG. 20 is a flowchart showing the operation of the tag identification means of the third embodiment.

FIG. 21 is a flowchart showing the operation of the link object acquisition means of the third embodiment.

FIG. 22 is a diagram showing an example of an output document (corresponding to the input document of FIG. 3) produced by the operation of the third embodiment.

FIG. 23 is a diagram showing a display screen of the output document of FIG. 22.

FIG. 24 is a block diagram showing the detailed configuration of the characteristic part of the fourth embodiment.

FIG. 25 is an explanatory diagram of a configuration example of the character style conversion reference table of the fourth embodiment.

FIG. 26 is an explanatory diagram of a registration example of the character style registration/editing table of the fourth embodiment.

FIG. 27 is an explanatory diagram of the configuration and a registration example of the character style conversion matching table of the fourth embodiment.

FIG. 28 is a processing flowchart (1) of the character style processing control unit of the fourth embodiment.

FIG. 29 is a processing flowchart (2) of the character style processing control unit of the fourth embodiment.

FIG. 30 is a processing flowchart (1) of the character style conversion determination processing unit of the fourth embodiment.

FIG. 31 is a processing flowchart (2) of the character style conversion determination processing unit of the fourth embodiment.

[Explanation of symbols]

1 ... network, 2 ... input/output means, 3 ... input document (buffer), 4 ... tag identification means, 5 ... translation target tag information storage means, 6 ... translation means, 7 ... output document generation means, 8 ... output document (buffer), 9 ... link object acquisition means, 10 ... encoding means, 11 ... object storage means, 12 ... character style conversion means.

Claims

[Claims]

1. A document processing system for processing a document accompanied by format designation information that designates the format for display and print output, the system comprising: extraction target specific information storage means that stores a specific pattern of the format designation information for extracting a character string of a predetermined type from an input document; format designation information identification and extraction means that extracts a portion of the input document matching the specific pattern of the format designation information stored in the extraction target specific information storage means; and output document generation means that generates an output document by arranging the extracted portion, or by arranging the extracted portion after subsequent processing.

2. The document processing system according to claim 1, further comprising style conversion means that recognizes format designation information for style designation, which is contained in the output document and defines the decoration applied at the time of display and print output, and that, when the recognized information designates a predetermined style, converts it into format designation information designating another style or deletes the format designation information designating the predetermined style.

3. The document processing system according to claim 1, further comprising link destination document acquisition means that, when activated, acquires a link destination document designated in a document, wherein: the format designation information includes, as one type, format designation information that designates a link destination document; the extraction target specific information storage means stores a specific pattern of the format designation information that makes it possible to detect whether the input document contains format designation information designating such a link destination document; and the format designation information identification and extraction means, when the input document contains a portion matching the specific pattern of the format designation information designating a link destination document stored in the extraction target specific information storage means, activates acquisition of the link destination document by the link destination document acquisition means.

4. The document processing system according to claim 1, wherein, when the link destination document acquired by the link destination document acquisition means is a language document, the format designation information identification and extraction means also performs the extraction operation on that language document, and the output document generation means generates the output document by arranging the portions extracted from the input document and the acquired link destination document, or by arranging those extracted portions after predetermined processing has been applied to them following extraction.

5. The document processing system according to any one of claims 1 to 4, wherein, when the link destination document acquired by the link destination document acquisition means is a non-language document such as a figure or a table, the output document generation means includes the non-language document as it is in the output document.

6. The document processing system according to claim 1, further comprising processing means that executes predetermined processing on the portion of the input document extracted by the format designation information identification and extraction means.

7. The document processing system according to claim 1, wherein the processing means is a translation means for translating a source language into a target language.

8. The document processing system according to claim 7, further comprising encoding means that recognizes, in the image data of a non-language document acquired by the link destination document acquisition means, a character string pattern that may be linguistic information and encodes that character string pattern, wherein the translation means also translates the encoded character string, and the output document generation means includes the translation of the encoded character string in the output document.

9. The document processing system according to any one of claims 1 to 8, wherein the document accompanied by format designation information that designates the format for display and print output is a tagged document in which the format designation information is expressed by character code strings in the same manner as the body text of the document.