JPH11272666A

JPH11272666A - System and method for editing document and record medium

Info

Publication number: JPH11272666A
Application number: JP7272798A
Authority: JP
Inventors: Yasuyuki Fujikawa; 泰之藤川
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1998-03-20
Filing date: 1998-03-20
Publication date: 1999-10-08
Anticipated expiration: 2018-03-20
Also published as: JP3737629B2

Abstract

PROBLEM TO BE SOLVED: To obtain a document editing system that reconstructs areas extracted from a plurality of documents according to specific patterns and displays them in a state in which a user can edit them. SOLUTION: A pair of general-purpose start and end patterns, together with document structure information, is used to extract from a plurality of documents the areas lying between the locations that match the pair of patterns. The results are gathered, reconstructed according to the document structure information and layout information 122, and displayed in a state in which the user can edit them. The areas edited by the user are written back to the documents from which they were extracted. Further, the document structure information and layout information 122 are registered and held together. The document structure information includes information on the hierarchical structure (inclusion relations) of the areas in the documents, their order of appearance, the presence of repeated parts, and so on. The layout information includes information on the display position, font style, and font size of each area, the alignment and color of character strings, and so on.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a system for editing documents, and more particularly to a document editing system that extracts areas belonging to the same pattern from a plurality of documents, reconstructs the extracted areas and displays them on a display device in an editable form, and further reflects the edited areas in the documents from which they were extracted.

[0002]

2. Description of the Related Art

SGML (Standard Generalized Markup Language) has long been widely known as a technique for efficiently displaying or printing documents according to fixed rules. With SGML, a user can set in advance a document type definition, which defines the structure of a document, and a document layout definition, which is used for the display layout or print layout. Using these definitions, the user can efficiently reconstruct a document according to fixed rules and display or print it. The document type definition is called a DTD (Document Type Definition). When a document consists of a plurality of areas, for example, the DTD holds information on the hierarchical structure (inclusion relations), order of appearance, repetition, and so on of those areas. Each area is associated with an arbitrary pair of tags (identifiers).

[0003] When creating a document, the user writes each part that corresponds to an area defined in the DTD so that it is enclosed by the pair of tags associated with that area. For example, the title area of a text is associated with the tag pair “title” and “/title”, and the user encloses the character string to be displayed as the title between this pair of tags. In some cases, tags need not be specified in pairs.

[0004] The document layout definition is set for each pair of tags defined in the DTD. It specifies, for example, the font style and font size with which the area of the document enclosed by that pair of tags is output. The document layout definition is normally held by output software called a formatter. When the formatter processes a document, each area enclosed by a pair of tags appearing in the document is displayed or printed according to what the document layout definition specifies for that tag. For example, an area enclosed by the tag pair “title” and “/title” is displayed in 12-point Gothic type.
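The tag-driven formatting described for the related art can be illustrated roughly as follows. This is only a sketch in Python, not how any real SGML formatter works: the angle-bracket tag syntax, the `LAYOUT` table, and the function name `format_tagged` are assumptions made for illustration.

```python
import re

# Hypothetical layout definition: tag name -> (font style, point size).
LAYOUT = {"title": ("Gothic", 12)}

def format_tagged(text, layout=LAYOUT):
    """Return (content, style, size) for each tag-enclosed area that has a layout entry."""
    out = []
    # Assumed tag syntax: <name> ... </name>, with no nesting of the same tag.
    for m in re.finditer(r"<(\w+)>(.*?)</\1>", text, flags=re.S):
        tag, content = m.group(1), m.group(2)
        if tag in layout:
            style, size = layout[tag]
            out.append((content, style, size))
    return out

sample = "<title>Weekly report</title><body>All tasks on schedule.</body>"
print(format_tagged(sample))
```

Areas whose tag has no layout entry (here `body`) are simply skipped in this sketch.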

[0005] Japanese Patent Application Laid-Open No. 7-98708, “Document Processing System and Method”, discloses a document processing system that selects a document, or a part of a document, as the user intends, and reconstructs and displays its contents. When the user enters a request as a sentence, for example “I want to read documents useful for recognition technology”, the system analyzes the intent and selects at least one document or other material that matches it. All or part of that document is then reconstructed, processed into a form that is easy for the user to read, and displayed. To select the documents or parts of documents that match the user's intent, the system uses the keywords in the sentence entered by the user, or synonyms of those keywords.

[0006] Furthermore, Japanese Patent Application Laid-Open No. 8-202711, “Document Editing Operation Electronic Device”, discloses a device that detects a specific character pattern in the range first specified by the user as the editing target and, if other parts of the same document have a similar pattern, automatically registers those parts as additional editing targets. The purpose of the device is to select, in one operation, the headings, title parts, annotations, and the like scattered through a single document as editing targets. When the user first selects one range to be edited, the device determines whether that range contains a specific symbol often used in headings, such as “・” or “§”. If any such symbol is present, the device registers the symbol and its display column position as a pattern. The device then detects the parts of the document that have the same pattern as the registered one, that is, parts having a symbol such as “・” or “§” at the same column position, and automatically selects all of them as editing targets.
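The symbol-and-column matching attributed to this prior-art device can be sketched as below. The function names are hypothetical, and the column is taken to be the character index within the line, which is an assumption (a real display column may differ for full-width characters).

```python
SYMBOLS = ("・", "§")  # heading symbols named in the text

def register_pattern(selected_line):
    """Return (symbol, column) for the first heading symbol in the selected line, or None."""
    for col, ch in enumerate(selected_line):
        if ch in SYMBOLS:
            return ch, col
    return None

def select_matching(lines, pattern):
    """Return the indices of lines that have the same symbol at the same column."""
    symbol, col = pattern
    return [i for i, line in enumerate(lines)
            if len(line) > col and line[col] == symbol]

doc = ["§ Overview", "Body text.", "§ Schedule", "  § Indented note"]
pattern = register_pattern(doc[0])
print(select_matching(doc, pattern))
```

The indented line is not selected because its symbol sits at a different column, which is exactly the rigidity the later paragraphs criticize.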

[0007]

Problems to Be Solved by the Invention

In SGML, the first prior art described above, pairs of tags must be deliberately embedded in a document before parts of it can be extracted and reconstructed, so creating a document takes considerable effort. As a result, tags that have no direct relation to the content are scattered throughout the document, its overall appearance suffers, and in some cases the tags are mistaken for document content. In addition, tags are fixed character strings; they cannot be generalized into patterns and associated with the document layout definition used for display or printing.

[0008] Furthermore, although SGML allows partial extraction and reconstruction within a single document, it cannot extract, from a plurality of documents, the areas enclosed by character strings of the same pattern and gather them into a single list-like reconstruction. SGML is also mainly intended for laying out documents according to fixed rules for display or printing. It has no function for making the reconstructed content editable and writing the edited content back to the original document.

[0009] In SGML, moreover, the DTD that defines the structure of the document and the layout definition held in the formatter must each be set separately, which is very cumbersome for the user. Tag names and the like must also be kept consistent between the two definitions, and creating them separately makes inconsistencies in tag names and structure more likely.

[0010] In the second prior art described above, Japanese Patent Application Laid-Open No. 7-98708, “Document Processing System and Method”, parts of documents are extracted and reconstructed by a character string search that uses keywords analyzed from the sentence entered as the user's request, or synonyms of those keywords. With such a search, even a slight change to a source document can cause the changed part to no longer be extracted, so the search results change greatly. Because synonyms of the keywords are also searched, the results can become so large that the system is impractical. Furthermore, because the user enters the request as a sentence, extraction cannot be performed with character strings that have been generalized into predetermined patterns.

[0011] This technique can extract from a plurality of documents, but because the documents themselves are selected using keywords analyzed from the user's input, the documents that will actually be selected cannot be determined in advance. Nor is there any function for editing the content of the reconstructed document and writing the result back to the original documents.

[0012] In the third prior art described above, Japanese Patent Application Laid-Open No. 8-202711, “Document Editing Operation Electronic Device”, parts having the same pattern as the range first selected by the user are searched for in order to extract and reconstruct parts of a document. However, the pattern consists of a specific character string often used in headings, such as “・” or “§”, and its display column, so the search cannot use a general-purpose pattern.

[0013] Furthermore, this technique targets multiple parts within a single document, and only the sentences having the pattern are selected as editing targets. It also has no function for reconstructing the selected sentences using layout information or the like.

[0014] An object of the present invention is to provide a document editing system that uses a pair of general-purpose patterns, consisting of a start pattern and an end pattern, together with document structure information defining the structure of a document, to extract from each of a plurality of documents the area between the location matching the start pattern and the location matching the end pattern. The system gathers and reconstructs the extraction results according to the document structure information and layout information, and displays them in a state in which the user can edit them.

[0015] A further object of the present invention is to provide a document editing system that, after the user has edited an area reconstructed and displayed in an editable state, writes the edited area back to the document from which it was extracted, so that the user's edits are reflected in the source document.

[0016] A still further object of the present invention is to provide a document editing system in which the document structure information and the layout information can be registered and held together. The document structure information defines information on document structure for each area in the document, such as the area name, hierarchical structure (inclusion relations), order of appearance, presence or absence of repetition, and whether omission is permitted. The layout information defines layout properties of each area, such as the display position, font style, font size, character string alignment, and character color.

[0017]

Means for Solving the Problems

The document editing system according to claim 1, which solves the above problems, extracts an arbitrary area from at least one document and displays the extracted area in an editable state. The system comprises: area definition information registration means for registering, as patterns, the area definition information used to extract a desired area from each document; document structure/layout information registration means for registering document structure information on the structure of the areas in each document and layout information used to reconstruct those areas; document extraction means for searching for a first location and a second location, using the area definition information registered by the area definition information registration means and the document structure information registered by the document structure/layout information registration means, and extracting the area between the two locations; and document reconstruction means for reconstructing and displaying the at least one extracted area, using the document structure information and the layout information registered by the document structure/layout information registration means. As a result, areas are extracted stably using the extraction patterns, and the contents of the areas extracted from at least one document can be gathered and displayed in an easily readable form.

[0018] The document editing system according to claim 2 is the system according to claim 1, further comprising document editing control means for performing control so that the at least one reconstructed and displayed area can be edited. As a result, the contents of the areas extracted from at least one document are gathered and displayed in an easily readable form and can be edited in that state.

[0019] The document editing system according to claim 3 is the system according to claim 1, wherein the document editing control means further restricts the editing function for the at least one reconstructed and displayed area to the contents of the areas extracted from the documents. As a result, when editing the contents of the extracted areas, the user does not accidentally edit decorative parts of the display that are not area content, and editing becomes easier to operate.

[0020] The document editing system according to claim 4 is the system according to claim 1, further comprising document editing result reflecting means for reflecting, when at least one of the extracted areas has been edited, the result of the editing in the document from which the area was extracted. As a result, there is no need to search for similar parts of a plurality of documents individually and edit each one; the related parts of the documents can be edited together.

[0021] The document editing system according to claim 9 is the system according to claim 1, wherein the document structure information on the structure of the areas in each document and the layout information used to reconstruct those areas are registered together by the document structure/layout information registration means. As a result, the document structure information and the layout information are registered and held as one unit, and both can easily be created together without inconsistencies.

[0022]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Embodiments of the present invention are described below with reference to the drawings. In the drawings, identical or similar elements are given the same reference numerals or symbols.

[0023] FIG. 1 is a block diagram showing the system configuration of one embodiment of the document editing system 100 of the present invention. The document editing system 100 includes a processing device 110, a storage device 120, an input device 130, and a display device 140. The processing device 110 reads the necessary data stored in the storage device 120, performs processing based on the information obtained from that data, and displays the processing result on the display device 140. It then writes the displayed content, as necessary, to the related data stored in the storage device 120.

[0024] The storage device 120 is typically a secondary storage device such as a hard disk or a floppy disk. It stores the data required by the document editing system 100, each described in detail later: area definition information 121, document structure/layout information 122, area-document structure correspondence information 123, and a plurality of source documents 124. These data can, however, also be read from a storage device of another computer via a network and written back by the reverse route, in which case they need not be stored in the storage device 120.

[0025] The input device 130 is an input device such as a mouse or keyboard for entering and changing, as necessary, the data stored in the storage device 120. The user also uses the input device 130 to edit the reconstruction result displayed on the display device 140. Here, editing means using the input device 130 to add, change, or delete an object, such as a character string, displayed on the display device 140.

[0026] The display device 140 is a display device such as a CRT display for displaying the data stored in the storage device 120 and the results of processing by the processing device 110. The reconstruction result produced by the processing device 110 is first displayed on the display device 140. A printing device can also be provided to print the reconstruction result or the result of editing it.

[0027] This configuration of devices is common in personal computers, UNIX workstations, and the like. The document editing system 100 of the present invention can therefore be implemented on a single ordinary personal computer or similar machine.

[0028] The processing device 110 further includes an area definition information registration unit 111, a document structure/layout information registration unit 112, a document extraction unit 113, a document reconstruction unit 114, a document editing control unit 115, and a document editing result reflection unit 116.

[0029] The area definition information registration unit 111 provides a function for registering pairs of patterns, each consisting of a start pattern and an end pattern, that are used to extract the areas to be reconstructed from each of the documents the user wishes to process. The registered patterns are stored as the area definition information 121 in the storage device 120. Because the area definition information 121 can be plain text, the area definition information registration unit 111 can be implemented with any text editor, but it is preferably implemented as a dedicated application with predefined input fields and checking functions.
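One way to picture text-based area definition information with a simple checking function is sketched below. The tab-separated line format, the use of regular expressions as the pattern language, and the name `parse_area_definitions` are assumptions for illustration; the source does not specify a file format or pattern syntax.

```python
import re

def parse_area_definitions(text):
    """Parse 'name<TAB>start<TAB>end' lines into {name: (start_regex, end_regex)}."""
    areas = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t")
        if len(parts) != 3:
            raise ValueError(f"line {lineno}: expected name, start pattern, end pattern")
        name, start, end = parts
        if name in areas:
            raise ValueError(f"line {lineno}: duplicate area name {name!r}")
        # re.compile raises re.error on a malformed pattern, which acts as a basic check.
        areas[name] = (re.compile(start, re.M), re.compile(end, re.M))
    return areas

definitions = (
    "# name\tstart\tend\n"
    "progress\t^Progress:\\s*\t^Issues:\n"
    "issues\t^Issues:\\s*\t^End of report"
)
areas = parse_area_definitions(definitions)
print(sorted(areas))
```

Rejecting malformed lines and duplicate names at registration time is the kind of check a plain text editor would not provide.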

[0030] The document structure/layout information registration unit 112 provides a function for registering two kinds of information for each area corresponding to an extraction pattern registered by the area definition information registration unit 111. The first is document structure information, such as the area name, hierarchical structure (inclusion relations), order of appearance, presence or absence of repetition, and whether omission is permitted. The second is layout information used at reconstruction, such as the display position, font style, font size, character string alignment, and character color of each area. The registered information is stored as the document structure/layout information 122 in the storage device 120. In this embodiment, the document structure information and the layout information are registered together, so the registration mechanism is also shown as a single unit. It can, however, be configured as two separate mechanisms, a document structure information registration unit and a layout information registration unit. In that case, the document structure/layout information 122 in the storage device 120 can likewise be stored separately as document structure information and layout information.
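Holding structure and layout in one record per area might look like the following sketch. The class name `AreaSpec`, its field names, and the default values are hypothetical; only the kinds of information (name, hierarchy, order, repetition, omission, font, alignment, color) come from the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class AreaSpec:
    # Document structure information
    name: str
    order: int                      # order of appearance among siblings
    repeatable: bool = False        # may the area occur more than once?
    optional: bool = False          # may the area be omitted?
    # Layout information (defaults are illustrative assumptions)
    heading: Optional[str] = None   # fixed string added at reconstruction
    font_style: str = "Mincho"
    font_size: int = 10
    align: str = "left"
    color: str = "black"
    # Hierarchical structure (inclusion relations)
    children: List["AreaSpec"] = field(default_factory=list)

def area_names(spec):
    """List area names depth first, siblings sorted by order of appearance."""
    names = [spec.name]
    for child in sorted(spec.children, key=lambda c: c.order):
        names.extend(area_names(child))
    return names

report = AreaSpec("report", 1, children=[
    AreaSpec("issues", 2, optional=True, heading="Issues"),
    AreaSpec("progress", 1, repeatable=True, heading="Progress",
             font_style="Gothic", font_size=12),
])
print(area_names(report))
```

Because each area name appears exactly once, with its structure and its layout side by side, the two definitions cannot drift apart the way a separate DTD and formatter definition can.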

[0031] The document extraction unit 113 reads at least one document 124 in the storage device 120 specified by the user. It extracts the desired areas of the document 124 using the area definition information 121 registered by the area definition information registration unit 111 and the document structure/layout information 122 registered by the document structure/layout information registration unit 112. In each document 124 read, the document extraction unit 113 finds the locations that match the pair of extraction patterns, consisting of a start pattern and an end pattern, registered in the area definition information 121. It then checks the area enclosed between those locations against the document structure information defined in the document structure/layout information 122 for each area in the document, such as the area name, hierarchical structure (inclusion relations), order of appearance, presence or absence of repetition, and whether omission is permitted. If the area does not conflict with that information, it is extracted. At extraction, information recording which span of the source document 124 each area was taken from is output to the area-document structure correspondence information 123 in the storage device 120. This information is needed because editing by the document editing control unit 115, described later, may change the length of an extracted area. When the areas are later reflected individually in the source document 124 by the document editing result reflection unit 116, described later, the other areas of the document 124 that were not extracted must not be overwritten.
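The extraction step, including the recording of where each area came from, can be sketched as follows. Regular expressions stand in for the start and end patterns, and the function name and tuple layout are assumptions. The check against the document structure information is omitted here for brevity, and only the first occurrence of each area is extracted.

```python
import re

def extract_areas(doc_id, text, areas):
    """Return [(area_name, content, (doc_id, start, end))] for each start/end pattern pair.

    start and end are character offsets into the original text, playing the role
    of the area-document structure correspondence information.
    """
    found = []
    for name, (start_pat, end_pat) in areas.items():
        start = re.compile(start_pat, re.M).search(text)
        if start is None:
            continue
        # Look for the end pattern only after the start pattern's match.
        end = re.compile(end_pat, re.M).search(text, start.end())
        if end is None:
            continue
        begin, stop = start.end(), end.start()
        found.append((name, text[begin:stop], (doc_id, begin, stop)))
    return found

areas = {
    "progress": (r"^Progress: ", r"\nIssues:"),
    "issues": (r"^Issues: ", r"\nEnd of report"),
}
text = "Weekly report\nProgress: design done\nIssues: none\nEnd of report\n"
for name, content, span in extract_areas("a.txt", text, areas):
    print(name, repr(content), span)
```

The area content excludes the matched start and end patterns themselves, so later edits cannot damage the markers that make the area findable.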

[0032] The document extraction unit 113 is set up so that, when the user specifies documents 124, the contents of each document 124 are displayed in a window of its own, after which the user instructs extraction. It may, however, be configured so that extraction is specified without displaying the contents of the documents 124.

[0033] The document reconstruction unit 114 reconstructs the areas of the documents 124 extracted by the document extraction unit 113 on the display device 140, using the area definition information 121 registered by the area definition information registration unit 111 and the document structure/layout information 122 registered by the document structure/layout information registration unit 112. Each extracted area carries a name (area name) corresponding to the extraction pattern by which it was extracted, and the entry in the document structure/layout information 122 corresponding to that area name is used to reconstruct the area. During reconstruction, the layout information also adds fixed character strings that are not part of the extracted areas, such as headings, and decorations such as borders and underlines, so that the result is easy for the user to read.
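A rough sketch of gathering extracted areas from several documents under layout-supplied headings is shown below. The segment dictionaries, the `editable` flag, and the bracketed document labels are illustrative assumptions, not the display mechanism of the embodiment.

```python
def reconstruct(extracted, layout):
    """Build display segments; only segments with editable=True come from source documents."""
    segments = []
    for name, spec in layout.items():  # layout order is taken as display order
        rows = [e for e in extracted if e["area"] == name]
        if not rows:
            continue
        # Fixed heading supplied by the layout information, not by any document.
        segments.append({"text": spec["heading"] + "\n", "editable": False})
        for row in rows:
            segments.append({"text": f"[{row['doc']}] ", "editable": False})
            segments.append({"text": row["content"], "editable": True,
                             "doc": row["doc"], "area": name})
            segments.append({"text": "\n", "editable": False})
    return segments

layout = {"progress": {"heading": "Progress"}, "issues": {"heading": "Issues"}}
extracted = [
    {"doc": "a.txt", "area": "progress", "content": "design done"},
    {"doc": "b.txt", "area": "progress", "content": "testing started"},
    {"doc": "a.txt", "area": "issues", "content": "none"},
]
segments = reconstruct(extracted, layout)
print("".join(s["text"] for s in segments))
```

Keeping track of which segments are decoration and which are document content at this stage is what later allows editing to be limited to the extracted areas.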

[0034] The document reconstruction unit 114 can also output the area-document structure correspondence information 123 in place of the document extraction unit 113. In that case, the document extraction unit 113 must output to the document reconstruction unit 114, in addition to the contents of the extracted areas, intermediate information indicating which span of the source document 124 each area corresponds to.

[0035] As the reconstruction result, the at least one area extracted from the plurality of documents is displayed in a single window according to the layout information. In this embodiment, the word processor Word is run on a multi-window system such as Microsoft Windows 95, and the reconstruction result is displayed in one of its windows. Alternatively, the at least one area extracted from each document may be displayed in one window according to the layout information, so that each document has its own window.

【００３６】The document editing control unit 115 controls editing within the window displayed on the display device 140 by the document reconstruction unit 114. As described above, the window shows the result of reconstruction across a plurality of documents, but it mixes the extracted areas with information, such as headings, newly added by the layout information. The document editing control unit 115 therefore makes only the extracted areas editable in this window and prohibits editing of decorative areas such as headings added by the layout information. The editing permitted here may change the length of an extracted area.
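
The editing rule described above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the embodiment: the `Segment` class, its `editable` flag, and the `edit_segment` function are hypothetical names introduced here, and the window content is assumed to be modelled as an ordered list of text segments.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    text: str
    editable: bool       # True: extracted area; False: text added by the layout
    area_name: str = ""  # set only for extracted areas

def edit_segment(segments, index, new_text):
    """Replace the text of one segment, refusing to touch fixed layout text."""
    segment = segments[index]
    if not segment.editable:
        raise PermissionError("text added by the layout information is read-only")
    segment.text = new_text  # the new text may be longer or shorter than the old

# Hypothetical window content: a fixed title followed by an extracted area.
window = [
    Segment("Function name: ", editable=False),
    Segment("ZaikoHikiate", editable=True, area_name="function name"),
]
edit_segment(window, 1, "ZaikoHikiateKakunin")
```

Keeping the area name on each editable segment is one way to let a later write-back step find the source position that belongs to the edited text.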

【００３７】When the user instructs that the editing result be reflected in the extraction source document 124, the document editing result reflecting unit 116 updates the contents of the extraction source document 124 to the edited contents in accordance with the correspondence information stored in the area-document structure correspondence information 123 in the storage device 120. Only the areas that were extracted from the documents and edited are written back; information such as headings newly added by the layout information is not.

【００３８】The description so far has referred to FIG. 1, but the illustrated configuration is merely one embodiment of the present invention, and the invention is not limited to it. For example, the units within the processing device 110 can be distributed across a plurality of computers. Furthermore, the storage device 120, the input device 130, and the display device 140 can be connected to one or more processing devices 110 via a network, allowing a user to operate the document editing system 100 of the present invention remotely.

【００３９】Next, the processing flow of FIG. 2 is described with reference to FIGS. 4 to 12. FIG. 2 shows the flow of the document extraction and reconstruction processing performed by the document extraction unit 113 and the document reconstruction unit 114. In FIG. 2, the left third shows the information created by that processing, the center third shows the flow of the processing, and the right third shows the information defined in advance by the user.

【００４０】In the first step S10 of FIG. 2, the document extraction unit 113 reads the area definition information 121, which the user has registered in advance using the area definition information registration unit 111.

【００４１】FIG. 4 shows an example of the area definition information 121 registered by the user. The solid black brackets that appear in the drawings are written as “〔〕” in the text. In this example, the definition extracts the area of the document 124 that lies between character strings matching a pair of character string patterns (a start pattern and an end pattern). Specifically, the pattern for the string immediately before the area to be extracted (the start pattern) is 〔＠＠〕, the pattern for the string immediately after it (the end pattern) is 〔＠＠-end〕, and the area between the string matching 〔＠＠〕 and the string matching 〔＠＠-end〕 is extracted. The symbol shown in 【外１】 (hereinafter, in the text,

【００４２】[0042]

【外１】 [Outside 1]

【００４３】this solid black square is written as “(black square)”) is defined as a pattern indicating the end of any area. The end pattern 〔＠＠-end〕 is not mandatory: if a string matching 〔＠＠〕 is followed by another string matching 〔＠＠〕, the area extracted for the first 〔＠＠〕 extends up to just before the second 〔＠＠〕. Here ＠＠ denotes a character string of arbitrary length, and the ＠＠ in 〔＠＠〕 and in 〔＠＠-end〕 of a pair of patterns is the same character string. The string matching ＠＠ should coincide with one of the area names (component names) in the document structure information of FIG. 5, described below.
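
The start and end pattern rule can be illustrated with a short sketch. It rests on several assumptions that are not part of the source: ASCII brackets `[name]` and `[name-end]` stand in for the black brackets, `■` stands in for the black square, an area still open at the end of the text is closed there, and the order-of-appearance check against the document structure information is left out. The function name `extract_areas` is hypothetical.

```python
import re

# Assumed syntax: "[name]" opens an area, "[name-end]" closes it, and the
# black square ends whatever area is currently open.
EVENT = re.compile(r"\[([^\[\]]+)\]|■")

def extract_areas(text, area_names):
    """Return (area_name, content) pairs in order of appearance."""
    found = []
    current, start = None, 0

    def close(at):
        nonlocal current
        if current is not None:
            found.append((current, text[start:at]))
            current = None

    for match in EVENT.finditer(text):
        name = match.group(1)
        if name is None:                  # black square: ends any area
            close(match.start())
        elif name in area_names:          # start pattern of a known area
            close(match.start())          # an omitted end marker is implied
            current, start = name, match.end()
        elif current is not None and name == current + "-end":
            close(match.start())          # explicit end pattern
        # any other bracketed text is treated as ordinary content
    close(len(text))                      # assumption: end of text closes an area
    return found

sample = (
    "[function name]ZaikoHikiate[function name-end]\n"
    "★This line is not extracted★\n"
    "[processing outline]\nDecides whether stock can be allocated.\n"
    "[call format]ZaikoHikiate(ShohinCD, ZaikoSu, Kekka)■"
)
areas = extract_areas(sample, {"function name", "processing outline", "call format"})
```

Text that falls outside every start and end pair, such as the starred line in the sample, is simply skipped.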

【００４４】In the next step S11 of FIG. 2, the document extraction unit 113 reads the document structure / layout information 122, which the user has registered in advance using the document structure / layout information registration unit 112. Step S11 uses only the document structure information within the document structure / layout information 122, which defines, for each area in the document 124, its area name, hierarchical structure (inclusion relation), order of appearance, whether it repeats, and whether it may be omitted. In this embodiment the document structure / layout information 122 is defined as a single unit, which is described in detail later with reference to FIG. 10. To simplify the explanation, FIG. 5, which shows an example of the document structure information conceptually, is used here.

【００４５】The document structure information shown in FIG. 5 is an example that defines the document structure of a program specification, and mainly shows the hierarchical structure (inclusion relation) of the document 200. The blocks at the lowest level of the hierarchy (heading, function name, processing outline, call format, interface, and note) represent actual areas in the document. The document 200 consists of a heading 210 and a processing specification 220. There is exactly one heading 210. The processing specification 220 is marked as repeating, so more than one may exist. The processing specification 220 defines an inclusion relation: it contains a function name 221, a processing outline 222, a call format 223, an interface 224, and a note 225. The processing specification 220 only defines the set of these elements and does not itself correspond to an actual area of the document 200. Because the interface 224 is marked as repeating, a single processing specification may contain more than one; the note 225 is optional and may be absent.
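
One way to picture this document structure information is as a small tree. The sketch below is an assumption made for illustration: the `Element` class and its `repeats` and `optional` fields are hypothetical names, and the English area names are translations chosen here.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Element:
    name: str
    repeats: bool = False    # may occur more than once
    optional: bool = False   # may be omitted
    children: List["Element"] = field(default_factory=list)

    def leaves(self):
        """Yield the lowest-level elements, which map to real document areas."""
        if not self.children:
            yield self
        for child in self.children:
            yield from child.leaves()

# The structure described for FIG. 5.
document = Element("document", children=[
    Element("heading"),
    Element("processing specification", repeats=True, children=[
        Element("function name"),
        Element("processing outline"),
        Element("call format"),
        Element("interface", repeats=True),
        Element("note", optional=True),
    ]),
])

area_names = [leaf.name for leaf in document.leaves()]
required = [leaf.name for leaf in document.leaves() if not leaf.optional]
```

Only leaves appear in `area_names`, which mirrors the statement that the processing specification groups elements without corresponding to an area of its own.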

【００４６】Conversely, the heading 210, the function name 221, the processing outline 222, the call format 223, and at least one interface 224 are mandatory in the document 200. If any of them is missing, one possible response is to display an error message for the document concerned at a predetermined time in step S12 of FIG. 2, described below.

【００４７】Next, in step S12 of FIG. 2, the document extraction unit 113 reads at least one document 124 specified by the user and extracts at least one area from each document 124 based on the area definition information 121 of FIG. 4 and the document structure information of FIG. 5. As described above, each area name (component name) in the document structure information of FIG. 5 corresponds to the character string matching ＠＠ in the patterns of the area definition information 121 of FIG. 4. That is, from the area definition information 121 of FIG. 4 and the document structure information of FIG. 5 that it has already read, the document extraction unit 113 interprets each document 124 specified by the user as containing, in this order, at least the areas 〔heading〕 to 〔heading-end〕, 〔function name〕 to 〔function name-end〕, 〔processing outline〕 to 〔processing outline-end〕, 〔call format〕 to 〔call format-end〕, 〔interface〕 to 〔interface-end〕, and 〔note〕 to 〔note-end〕, and as possibly repeating, in the same order, the run from the 〔function name〕 area through the 〔note〕 area.

【００４８】As noted above, however, the strings that mark the end of an area, such as 〔heading-end〕, 〔function name-end〕, 〔processing outline-end〕, 〔call format-end〕, 〔interface-end〕, and 〔note-end〕, may be omitted from the document 124. In that case the end of the area is determined by the appearance of the string that marks the start of the next area, or of a predetermined string such as (black square).

【００４９】Consider such extraction using the example document 124 shown in FIG. 6. According to the description above, the area for the heading 210 is mandatory in the document 124, but it is omitted here. When the document extraction unit 113 reads the document 124 of FIG. 6, it first detects the string “〔function name〕” and then the string “〔function name-end〕”. These match the pattern in the area definition information 121 of FIG. 4 and the order of appearance of the function name 221 in the document structure information of FIG. 5 (as noted, the heading 210 is assumed to be absent here, so the function name 221 is the first valid area to appear). The string “ZaikoHikiate” between “〔function name〕” and “〔function name-end〕” is therefore extracted in association with the area name “function name”.

【００５０】Next, as the document extraction unit 113 continues reading, it encounters the string “★This line is not extracted★”. This string is neither enclosed by the patterns defined in the area definition information 121 of FIG. 4 nor preceded by such a pattern, so it is not extracted.

【００５１】Next, the document extraction unit 113 detects the string “〔processing outline〕” and then the string “〔call format〕”. The string “〔processing outline〕” matches the pattern in the area definition information 121 of FIG. 4, and because the area name that follows the function name 221 in the document structure information of FIG. 5 is the processing outline 222, the order-of-appearance condition is also satisfied. The string between “〔processing outline〕” and “〔call format〕”, namely “(line feed)Based on the given product code and order quantity, determines whether stock (line feed)can be allocated.”, is extracted in association with the area name “processing outline”. Although no string “〔processing outline-end〕” marks the end of the area, the string “〔call format〕”, which marks the start of the next valid area “call format”, has appeared, so “〔processing outline-end〕” is judged to have been omitted. To reproduce the source faithfully, the extracted string includes control information for line feeds and page breaks in addition to the character data. In the processing outline example, the line feed control information appears in the extracted string as “(line feed)”.

【００５２】Repeating the same processing thereafter extracts the following strings, each associated with its area name.
Area name: call format = “ZaikoHikiate(ShohinCD, ZaikoSu, Kekka)”
Area name: interface(1) = “Int ShohinCD, // key value of the product code to be allocated”
Area name: interface(2) = “Int ZaikoSu, // quantity to be allocated”
Area name: interface(3) = “Int Kekka // set to whether allocation is possible. (line feed) // 1 if allocation is possible, 0 if not.”
Area name: note = “If the stock quantity (ZaikoSu) holds a negative value, the result of this processing (line feed) is undefined.”

【００５３】Here the processing specification 220 does not repeat, but the interface 224 is repeated three times. In such a case, (1) to (3) are appended after “interface” in the area name, as shown above. This makes each repeated element uniquely identifiable; various other methods could be used instead.
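
The numbering of repeated elements can be sketched in a few lines. The function `label_repeats` and its parameters are hypothetical names; the source only states that an index in parentheses is appended, and notes that other schemes would work as well.

```python
from collections import Counter

def label_repeats(names, repeating):
    """Append (1), (2), ... to every occurrence of a repeating area name."""
    counts = Counter()
    labelled = []
    for name in names:
        if name in repeating:
            counts[name] += 1
            labelled.append(f"{name}({counts[name]})")
        else:
            labelled.append(name)
    return labelled

labels = label_repeats(
    ["call format", "interface", "interface", "interface", "note"],
    repeating={"interface"},
)
```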

【００５４】FIG. 7 shows a second example of the area definition information 121. In this example, the definition extracts areas that begin at character strings in the document 124 having a fairly regular pattern. Specifically, the string immediately before an area to be extracted is ^[0-9]+\.＠＠$, and there is no pattern for the string immediately after the area. ^[0-9]+\.＠＠$ denotes a string consisting of, from the start of a line, a run of at least one digit, one period, a character string of arbitrary length corresponding to the area name, and a line feed. The pattern indicating the end of any area is the one shown in 【外２】. Here, ^$ denotes a line

【００５５】[0055]

【外２】 [Outside 2]

【００５６】consisting only of a line feed. Accordingly, any area ends when a line consisting only of a line feed, or the string (black square), appears. As described above, ＠＠ is the same as an area name (component name) in the document structure information.
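
The numbered-heading pattern maps naturally onto a regular expression. The following sketch makes several assumptions: the heading line is matched with Python's `re` module, `■` stands in for the black square, the function name `extract_numbered` is hypothetical, and, unlike the embodiment, the surrounding line feeds are dropped from the extracted content rather than preserved.

```python
import re

# A line made of digits, a period, and then the area name.
HEADING = re.compile(r"^[0-9]+\.(.+)$")

def extract_numbered(text, area_names):
    """Map each known area name to the lines that follow its numbered heading."""
    areas = {}
    current = None
    for line in text.split("\n"):
        match = HEADING.match(line)
        if match and match.group(1) in area_names:
            current = match.group(1)     # a heading opens a new area
            areas[current] = []
        elif line == "" or line == "■":
            current = None               # an empty line or black square ends it
        elif current is not None:
            areas[current].append(line)
    return {name: "\n".join(lines) for name, lines in areas.items()}

sample = (
    "1.Title of the invention\n"
    "Electronic file editing device\n"
    "\n"
    "2.Claims\n"
    "A device that reconstructs an electronic file.\n"
    "It also lets the user edit the result.\n"
    "\n"
    "stray line\n"
)
result = extract_numbered(sample, {"Title of the invention", "Claims"})
```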

【００５７】Consider extraction based on the area definition information 121 of FIG. 7, using the second example document 124 shown in FIG. 8. Document structure information like that of FIG. 5 is omitted here; instead, a simple document structure is assumed in which the areas “Title of the invention”, “Claims”, and “Detailed description of the invention”, with no hierarchy and no repetition, appear in this order.

【００５８】When the document extraction unit 113 reads the document 124 of FIG. 8, it first detects the string “1. Title of the invention” and then a line consisting only of a line feed. From the area definition information 121 of FIG. 7, the part corresponding to ＠＠ is “Title of the invention”, which matches the area “Title of the invention” that should come first in the document under the assumed document structure information. The string “(line feed)Electronic file editing device(line feed)” lying between “1. Title of the invention” and the line consisting only of a line feed is therefore extracted in association with the area name “Title of the invention”. Repeating the extraction in the same way then yields the following strings, each associated with its area name.

【００５９】Area name: Claims = “(line feed)A device that reconstructs and edits an electronic file by defining rules for designating areas of the electronic file and information for laying out those areas(line feed)”; area name: Detailed description of the invention = “(line feed)The present invention, a plurality ... (line feed) is undefined.”

【００６０】FIG. 9 shows a third example of the area definition information 121. In this example, the definition extracts areas enclosed by character strings in the document 124 having a fairly fixed pattern. Specifically, the text between the string “1. Title of the invention” and either ^$ (a line consisting only of a line feed) or (black square) is extracted under the area name “invention title”; the text between “2. Claims” and either ^$ or (black square) is extracted under the area name “claim scope”; and the text between “3. Detailed description of the invention” and either ^$ or (black square) is extracted under the area name “detailed description”. Here ^$ denotes a line consisting only of a line feed.

【００６１】In the examples of area definition information shown in FIG. 4 and FIG. 7, part of the pattern in the document (the part corresponding to ＠＠) corresponded to the area name; in this third example the area name is fixed for each pattern. Document structure information is again omitted, and a simple document structure is assumed in which the areas “invention title”, “claim scope”, and “detailed description”, with no hierarchy and no repetition, appear in this order. Note that the name of each area is set per extraction pattern in the area definition information 121 and differs from the area names in the document structure information assumed in the description of FIG. 8.

【００６２】When extraction from the document 124 of FIG. 8 is performed again using the area definition information 121 of FIG. 9, the document extraction unit 113 first detects the string “1. Title of the invention” and then a line consisting only of a line feed. From the area definition information 121 of FIG. 9, the corresponding area name is “invention title”, which matches the area “invention title” that should come first in the document under the assumed document structure information. The string “(line feed)Electronic file editing device(line feed)” from “1. Title of the invention” up to the line consisting only of a line feed is therefore extracted in association with the area name “invention title”. Repeating the extraction in the same way then yields the following strings, each associated with its area name.

【００６３】Area name: claim scope = “(line feed)A device that reconstructs and edits an electronic file by defining rules for designating areas of the electronic file and information for laying out those areas(line feed)”; area name: detailed description = “(line feed)The present invention, a plurality ... (line feed) is undefined.”

【００６４】The method by which the document extraction unit 113 extracts the target areas from the document 124 in step S12 of FIG. 2 has been described above. Along with this extraction, the document extraction unit 113 outputs, to the area-document structure correspondence information 123 in the storage device 120, information recording which span of the extraction source document 124 each area was extracted from. FIG. 10 shows an example of the contents of the area-document structure correspondence information 123.

【００６５】The area-document structure correspondence information 123 holds, for each area name in the document structure information, extraction source document information indicating which position in the extraction source document 124 the area corresponds to. Assuming a document 124 having the document structure information of FIG. 5, the extraction source document information for “heading” is “"C:\文書\source-1.c", From(2,10), To(2,20)”. This indicates that the extraction source document is the file “source-1.c” in the directory “文書” (“document”) on drive C in the storage device 120, and that the extracted “heading” area was taken from line 2, column 10 through line 2, column 20 of that document. The same applies to the other areas. In FIG. 10 the processing specification and the interface repeat, and in the area names each repeated element is shown in parentheses. To express both the repetition and the hierarchical (inclusion) relation, the area names use a concatenated notation such as “processing specification(2).interface(1)”.
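
An entry of this kind can be read into a simple record. The sketch below is an assumption about how such text might be parsed, not a description of FIG. 10's actual storage format: `SourceSpan`, `parse_span`, and `parse_area_key` are hypothetical names, and the English area names are translations chosen here.

```python
import re
from typing import NamedTuple, Tuple

class SourceSpan(NamedTuple):
    path: str
    start: Tuple[int, int]  # (line, column)
    end: Tuple[int, int]    # (line, column)

ENTRY = re.compile(r'"([^"]+)",\s*From\((\d+),\s*(\d+)\),\s*To\((\d+),\s*(\d+)\)')
KEY_PART = re.compile(r"(.*?)(?:\((\d+)\))?")

def parse_span(entry):
    """Parse one extraction source entry into a SourceSpan."""
    match = ENTRY.fullmatch(entry.strip())
    if match is None:
        raise ValueError(f"unrecognised entry: {entry!r}")
    path, l1, c1, l2, c2 = match.groups()
    return SourceSpan(path, (int(l1), int(c1)), (int(l2), int(c2)))

def parse_area_key(key):
    """Split a dotted area name into (name, repeat index or None) pairs."""
    parts = []
    for part in key.split("."):
        name, index = KEY_PART.fullmatch(part).groups()
        parts.append((name, int(index) if index else None))
    return parts

span = parse_span(r'"C:\文書\source-1.c", From(2,10), To(2,20)')
path_parts = parse_area_key("processing specification(2).interface(1)")
```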

【００６６】In this example, the position of each extracted area in the extraction source document 124 is stored as the area's start line and column and its end line and column. Alternatively, the position may be stored as the extraction order for each area name, that is, for each search pattern (in which case writing back requires the same pattern search on the document 124 as the extraction processing by the document extraction unit 113), or as the start address in the extraction source document 124 together with the size of the extracted area.

【００６７】Because the position of each extracted area in the extraction source document 124 is stored in this way, that is, because the area-document structure correspondence information 123 is retained, an area extracted by the document extraction unit 113 can be written back to the document 124 even after it has been edited independently of the extraction source document 124, while remaining consistent with the other, unextracted areas of the document 124.
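
Writing edited areas back by stored position might look like the sketch below. It assumes that line and column numbers are 1-based, that the end column is inclusive, and that edits do not overlap; none of these details is stated in the source, and `to_offset` and `write_back` are hypothetical names. Replacements are applied from the end of the document toward the start so that an edit which changes the length of one area does not invalidate the stored positions of areas before it.

```python
def to_offset(lines, line, column):
    """Convert a 1-based (line, column) pair to a 0-based character offset."""
    return sum(len(l) for l in lines[: line - 1]) + (column - 1)

def write_back(text, edits):
    """Apply edits given as ((start_line, start_col), (end_line, end_col), new_text)."""
    lines = text.splitlines(keepends=True)
    spans = [
        (to_offset(lines, *start), to_offset(lines, *end) + 1, new_text)
        for start, end, new_text in edits
    ]
    # Work backwards so earlier offsets stay valid after each replacement.
    for begin, stop, new_text in sorted(spans, reverse=True):
        text = text[:begin] + new_text + text[stop:]
    return text

source = "/* spec */\n[heading]Stock[heading-end]\nint x;\n"
updated = write_back(source, [
    ((2, 10), (2, 14), "Inventory allocation"),  # replaces "Stock"
    ((3, 5), (3, 5), "count"),                   # replaces "x"
])
```

Everything outside the recorded spans, including the markers themselves, is left untouched, which is what keeps the unextracted parts of the source document consistent.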

【００６８】Returning to step S13 of FIG. 2, the document reconstruction unit 114 reads the information on the areas that the document extraction unit 113 extracted from the document 124 and, based on the document structure / layout information 122, reconstructs and displays that information on the display device 140.

【００６９】FIG. 11 shows an example of the document structure / layout information 122. This example illustrates a way of defining, at the same time, the conceptual document structure information shown in FIG. 5 and the layout information used for the reconstruction. As described above, the document structure information and the layout information may instead be defined separately.

【００７０】In this example, Microsoft's word processor Word is used to define, for each area, layout information such as the display position, font style, font size, text alignment, and character color, together with document structure information giving the area name, inclusion relation, whether the area repeats, and whether it may be omitted. Word processor decorations such as enclosing lines and separator lines are also added. Of this information, the layout information such as font style, font size, text alignment, and character color, and the document structure information giving the area name, repetition, and optionality, are set for each area using Word's bookmark function.

【００７１】However, any other means may be used as long as layout information and document structure information can be defined for each area in this way and that information can be controlled (including input and output) by a computer. The example of the present invention using Microsoft Word is given for illustration only, and the invention is not limited to such specific means.

【００７２】The document structure / layout information 122 shown in FIG. 11 is first divided broadly into a visible part and an invisible part. The visible part is further divided into fixed parts, which the user is prohibited from editing, and editing parts, in which the extracted areas are displayed and which the user can edit. The separator line 326, the fixed titles 327, 328, 329, and 32A, and the lines enclosing each area are fixed parts. The areas 310, 321, 322, 323, 324, and 325 are the editing parts corresponding to the area names heading, function name, processing outline, call format, interface, and note, respectively.

【００７３】As described above, the document extraction unit 113 outputs the areas extracted from the document 124 as extracted-area information, each associated with an area name such as heading or function name. The document reconstruction unit 114 outputs the extracted-area information received from the document extraction unit 113 at the position of the area in FIG. 11 that has the corresponding area name. These areas can be edited by the user: through the editing control unit 115, the displayed content can be deleted, changed, or added to. The editing control unit 115 prohibits the user from editing any other area. Enabling editing only in specified areas is easily achieved with the functions of a conventional word processor, but it can also be done by preparing a special application and embedding it in the word processor, or by building an entirely separate application.

【００７４】The invisible part comprises an area 320, which corresponds to the processing specification and is a higher-level area, and definition parts 330, 331, 332, 333, 334, 335, and 336, which define the layout information and document structure information for each area (these definitions are set using Word's bookmark function). The area 320 corresponds to the processing specification 220 in the document structure information of FIG. 5 and is drawn so as to contain the set of editing areas 321, 322, 323, 324, and 325. This reveals the hierarchical structure (inclusion relation) of the areas 320 and 321 to 325, which is itself a piece of document structure information. The area 320 is defined by the definition part 336, which specifies that the area 320 has the area name “processing specification” and that it repeats. That the area 320 contains the areas 321 to 325 is indicated by the area 320 being set so as to enclose all of the areas 321 to 325.

【００７５】The other definition parts 330, 331, 332, 333, 334, and 335 relate to the editing areas 310, 321, 322, 323, 324, and 325, respectively, and contain layout information and document structure information for the character strings displayed for editing.

【００７６】For example, the definition part 330 corresponds to the area 310 and specifies that the displayed character string uses the Mincho font style, a font size of 22 points, BOLD character decoration, centered alignment, and black as its color, and that the area name of the corresponding area 310 is "heading".

【００７７】The definition part 334, which corresponds to the area 324, specifies that the area name of the area 324 is "interface" and that the area may be repeated. The definition part 335, which corresponds to the area 325, specifies that the area name of the area 325 is "caution" and that the area may be omitted.
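The combined document structure and layout definitions described above can be pictured as a small tree of records. The following is only a sketch: the `Layout` and `AreaDef` classes and their field names are hypothetical, and only the attribute values stated in the text (area names, repetition, omission, and the layout of the heading) are filled in.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Layout:
    """Layout information held in a definition part (hypothetical fields)."""
    font_style: str
    font_size_pt: int
    bold: bool
    align: str
    color: str


@dataclass
class AreaDef:
    """Document structure information for one area (hypothetical fields)."""
    number: int                      # reference numeral of the area
    name: str                        # area name
    repeat: bool = False             # area may be repeated
    optional: bool = False           # area may be omitted
    layout: Optional[Layout] = None  # None where the text gives no values
    children: List["AreaDef"] = field(default_factory=list)

    def walk(self, depth=0):
        """Yield (depth, area) pairs, parents before the areas they contain."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


heading = AreaDef(310, "heading",
                  layout=Layout("Mincho", 22, bold=True, align="center", color="black"))
spec = AreaDef(320, "processing specification", repeat=True, children=[
    AreaDef(321, "function name"),
    AreaDef(322, "processing outline"),
    AreaDef(323, "call format"),
    AreaDef(324, "interface", repeat=True),
    AreaDef(325, "caution", optional=True),
])

for depth, area in spec.walk():
    print("  " * depth + f"{area.number} {area.name}")
```

Keeping structure and layout in one record per area mirrors the idea of registering both kinds of information together, so that neither can drift out of step with the other.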

【００７８】FIG. 12 shows a display example 400 in which the document reconstruction unit 114 has reconstructed the extracted-area information received from the document extraction unit 113 in this way and displayed it on the display device 140. The display example 400 of FIG. 12 is the result of extracting, reconstructing, and displaying the document 124 shown in FIG. 6 based on the area definition information 121 shown in FIG. 4 and the document structure / layout information 122 shown in FIG. 11 (the heading "A Project - Function Definition Specification" is, however, omitted from the document 124 in FIG. 6).

【００７９】The display example 400 shows the case where multiple areas are extracted from a single document. When areas are extracted from multiple documents, they are either appended below this display or presented in separate windows. This allows the user, for example, to list the processing outlines, call formats, and similar content of the specifications of multiple programs in the same format.
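Listing the areas extracted from several documents in one common format might look like the sketch below. The fixed display order, the bracketed titles, the use of a form feed as the boundary between documents, and the sample area contents are all assumptions made for illustration; only the file names and area names come from the text.

```python
PAGE_BREAK = "\f"  # assumed marker for the boundary between source documents

# Assumed display order of the area names within one processing specification.
ORDER = ["function name", "processing outline", "call format", "interface", "caution"]


def render(extracted_by_file):
    """Lay out the extracted areas of every file in the same fixed format."""
    pages = []
    for filename, areas in extracted_by_file.items():
        lines = []
        for name in ORDER:
            if name in areas:            # an optional area may be absent
                lines.append(f"[{name}]")
                lines.append(areas[name])
        pages.append("\n".join(lines))
    return PAGE_BREAK.join(pages)


# Hypothetical extraction results; the area contents are invented sample text.
extracted = {
    "ZaikoHikiate.c": {"function name": "ZaikoHikiate",
                       "processing outline": "Reserves stock."},
    "ZaikoKousin.c": {"function name": "ZaikoKousin",
                      "processing outline": "Updates stock.",
                      "caution": "Call after reservation."},
}
listing = render(extracted)
```

Splitting `listing` on the page break gives one page per source document, which corresponds to the single-window variant; showing each page in its own window would be the other variant mentioned above.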

【００８０】FIG. 3 shows the flow of the document editing result reflecting process performed by the document editing result reflecting unit 116. In FIG. 3, the left third shows the information created by the reconstruction process and by the user's editing, the center third shows the flow of the document editing result reflecting process, and the right third shows the information defined in advance by the user.

【００８１】When the user instructs the system to reflect the editing results in the extraction source document 124, the document editing result reflecting unit 116 reads the document structure / layout information 122 in step S20, the first step of FIG. 3. This is done to obtain the area name and other attributes of each editing area. In this embodiment, because Word functions are used to save and edit the document structure / layout information 122 as described above, the information is read from a Word-related file or from memory.

【００８２】Next, in step S21, the document editing result reflecting unit 116 reads the area-document structure correspondence information 123 output by the document extraction unit 113 and determines the position in the extraction source document 124 at which the content of each area edited by the user on the display device 140 is to be reflected.

【００８３】Next, in step S22, the document editing result reflecting unit 116 reflects the content of each area edited by the user on the display device 140 at the corresponding position in the extraction source document 124. When the user has deleted some number of characters from an area on the display device 140, address A, the end position of that area once it has been reflected in the extraction source document 124, is smaller than address B, the end position of that area at extraction time as held in the area-document structure correspondence information 123. Therefore, after the area has been reflected in the extraction source document 124, the content of the extraction source document 124 from address B onward must be moved, by the difference between the two addresses, to just after address A.

【００８４】When the user has added some number of characters to an area on the display device 140, address A, the end position of that area once it has been reflected in the extraction source document 124, is larger than address B, the end position of that area at extraction time as held in the area-document structure correspondence information 123. Therefore, before the area is reflected in the extraction source document 124, the content of the extraction source document 124 from address B onward must first be moved, by the difference between the two addresses, to after address A.

【００８５】When the user has not edited an area on the display device 140, or has edited it without changing the number of characters, address A, the end position of that area once it has been reflected in the extraction source document 124, equals address B, the end position of that area at extraction time as held in the area-document structure correspondence information 123. In that case the area only needs to be reflected at its original position in the extraction source document 124.
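The three cases (deletion, addition, unchanged length) reduce to one splice when the document is modeled as a string with logical offsets. The sketch below makes that assumption; `reflect_area` and its parameter names are hypothetical, and the end addresses are taken to point one past the last character of the area.

```python
def reflect_area(doc, start, end_b, new_text):
    """Write an edited area back into the source text.

    start  -- offset of the first character of the area at extraction time
    end_b  -- "address B": end offset of the area at extraction time (exclusive)
    Returns the updated text and the shift (address A minus address B) applied
    to everything from address B onward.
    """
    end_a = start + len(new_text)   # "address A": end of the area after write-back
    shift = end_a - end_b           # < 0 deletion, > 0 addition, 0 same length
    tail = doc[end_b:]              # content from address B onward
    return doc[:start] + new_text + tail, shift


doc = "AAA[one]BBB"
shorter, shift_del = reflect_area(doc, 4, 7, "o")       # characters deleted
longer, shift_add = reflect_area(doc, 4, 7, "three")    # characters added
same, shift_none = reflect_area(doc, 4, 7, "two")       # length unchanged
```

With an in-place buffer the order of operations matters, as the text notes: the tail is moved first when the area grows and afterwards when it shrinks. Building a new string, as here, sidesteps that ordering.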

【００８６】As shown in the area-document structure correspondence information 123 illustrated in FIG. 10, address A and address B can be expressed as a line and a column in the document 124, but they can also be expressed as logical addresses counted from the beginning of the document 124.
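The two address forms are interchangeable. As a sketch, assuming 1-based lines and columns, a 0-based logical address, and lines separated by a single newline character (none of which the text specifies), the conversion could be written as:

```python
def to_offset(doc, line, column):
    """Convert a 1-based (line, column) pair to a 0-based logical address."""
    lines = doc.split("\n")
    # Each preceding line contributes its length plus one newline character.
    return sum(len(text) + 1 for text in lines[:line - 1]) + (column - 1)


def to_line_column(doc, offset):
    """Convert a 0-based logical address back to a 1-based (line, column) pair."""
    before = doc[:offset]
    line = before.count("\n") + 1
    column = offset - (before.rfind("\n") + 1) + 1
    return line, column
```

For the text "ab", newline, "cde", the character "c" sits at line 2, column 1, which corresponds to logical address 3.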

【００８７】Next, a series of operations of the document editing system of the present invention is described more concretely with reference to FIGS. 13 to 20. The area definition information 121 used is that shown in FIG. 4, and the document structure / layout information 122 used is that shown in FIG. 11.

【００８８】FIG. 13 shows a screen 410 for specifying that the extraction source document 124 be read. As this screen suggests, the document editing system of the present invention can be realized by embedding the functions unique to the invention in, for example, Microsoft's word processor Word; it may also be built as a new application, or realized in combination with an existing application by any method.

【００８９】When "Display (V)" is selected from the menu of the screen 410 with a mouse click, the keyboard, or the like, the submenu 411 shown in the figure appears. Selecting "Specify extraction file (S)" from the submenu 411 then displays a screen 421 for selecting the extraction source document 124, as shown in FIG. 14.

【００９０】FIG. 14 shows the screen 421 for selecting the extraction source document 124. The screen 421 is displayed as a pop-up window from the screen 420. In this example, three files are selected: Title.txt (422), ZaikoHikiate.c (423), and ZaikoKousin.c (424). All of these are text files. Because the present invention is a system that performs character string pattern searches on the extraction source document 124 and allows the extraction results to be edited, there is no point in specifying here a binary file or other file that contains no text information. When the extraction source documents 124 to be processed have been selected on the screen 421 and the "Open (O)" button is selected, a screen 430 displaying the contents of the selected extraction source documents 124 appears, as shown in FIG. 15.

【００９１】FIG. 15 shows the screen 430 displaying the contents of the selected extraction source documents 124. The contents of each specified document 124 are displayed in a separate window: window 431 shows the contents of ZaikoKousin.c, window 432 shows the contents of ZaikoHikiate.c, and window 433 shows the contents of Title.txt.

【００９２】FIG. 16 shows a screen 440 for instructing the reconstruction process from the state of FIG. 15. When "Display (V)" is selected from the menu of the screen 440, the submenu 444 shown in the figure appears. Selecting "Source → Specification (S)" from the submenu 444 starts the reconstruction process for all of the selected documents 124. Depending on the design, the content display screen of FIG. 15 may be omitted.

【００９３】FIG. 17 shows a screen 450 that displays the reconstruction result when reconstruction has been instructed in FIG. 16. The reconstruction result on the screen 450 is displayed in a single window, and the boundaries between extraction source documents 124 are represented by page breaks. For example, the content extracted from ZaikoHikiate.c and reconstructed is shown in area 451, and the content extracted from ZaikoKousin.c and reconstructed is shown in area 452. Instead of representing the boundaries between extraction source documents 124 by page breaks in this way, each extraction source document 124 may be displayed in its own window. On the screen 450, the user can freely edit the parts corresponding to the areas 310, 321, 322, 323, 324, and 325 shown in FIG. 11 in the same way as in a word processor; the cursor does not move into any other area, so the user cannot edit it.

【００９４】FIG. 18 shows a screen 460 on which the content extracted from ZaikoHikiate.c and reconstructed is being edited on the reconstruction result display screen 450 of FIG. 17. The edit here adds text to the description of the processing outline. The part 461 added by the user's editing is preferably shown in reverse video, as in this example, to make clear that it was edited after reconstruction.

【００９５】FIG. 19 shows a screen 470 for instructing that the editing results be reflected in the extraction source document 124. When "Display (V)" is selected from the menu of the screen 470, the submenu 472 shown in the figure appears. Selecting "Specification → Source (D)" from the submenu 472 reflects the editing results in the original document 124. In this example, the part whose content was actually added (the reverse-video area 471) is reflected in the extraction source document ZaikoHikiate.c.

【００９６】FIG. 20 shows a screen 480 presenting the contents of the extraction source document 124 in which the editing results have been reflected. When the document ZaikoHikiate.c, in which the editing results were reflected by the operation on the screen 470 of FIG. 19, is displayed again, a sentence ("At that time ... is checked.") has been added to the content of the processing outline, as shown in window 481. In addition, if the document editing result reflecting unit 116 records, when it reflects the user's editing results in the extraction source document 124 in step S22 of FIG. 3, the position in the extraction source document 124 at which each edited part finally ended up, the areas added or otherwise changed by the user can be shown in reverse video, as in window 481.
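Recording where each edited part ends up can be done during write-back itself. The sketch below is one way to do it under stated assumptions: the document is a string addressed by logical offsets, the edited areas do not overlap, and the recorded span covers the whole reflected area rather than only the characters the user changed. The function name and the sample text are hypothetical.

```python
def reflect_with_spans(doc, edits):
    """Apply several area edits and record where each one lands.

    edits -- list of (start, end_b, new_text) tuples, with offsets measured in
             the original document and end_b exclusive; areas must not overlap.
    Returns the updated text and a list of (start, end) spans in that text,
    which a viewer could use for reverse-video display.
    """
    out, shift, spans = doc, 0, []
    for start, end_b, new_text in sorted(edits):
        s = start + shift                       # position after earlier shifts
        out = out[:s] + new_text + out[end_b + shift:]
        spans.append((s, s + len(new_text)))
        shift += len(new_text) - (end_b - start)
    return out, spans


source = "AAA[one]BBB[two]CCC"
updated, spans = reflect_with_spans(source, [(12, 15, ""), (4, 7, "first")])
```

Processing the edits in ascending order of start offset lets a single running shift account for every earlier change in length.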

【００９７】FIG. 21 shows an example of the hardware configuration of a computer 600 used to build the document editing system 100 described above. The computer 600 comprises a CPU 610, a storage unit 620, a memory unit 630, a display unit 640, an input unit 650, a printing unit 660, and a network interface unit 670, each connected to a bus 680.

【００９８】The CPU 610 corresponds to the processing device 110 of the document editing system 100 of FIG. 1 and executes the area definition information registration unit 111, the document structure / layout information registration unit 112, the document extraction unit 113, the document reconstruction unit 114, the document editing control unit 115, and the document editing result reflecting unit 116.

【００９９】The storage unit 620 corresponds to the storage device 120 of the document editing system 100 of FIG. 1 and stores the programs that implement the functions executed by the CPU 610, as well as the area definition information 121, the document structure / layout information 122, the area-document structure correspondence information 123, and the multiple extraction source documents 124. The memory unit 630 is loaded with the programs that implement the units executed by the CPU 610 and, as needed, with the content displayed on the display unit 640, including the user's edits, and with the contents of the various information held in the storage unit 620.

【０１００】The display unit 640 corresponds to the display device 140 of the document editing system 100 of FIG. 1. It is needed to display the reconstruction result and to let the user edit it, and is typically a display device such as a CRT or an LCD.

【０１０１】The input unit 650 corresponds to the input device 130 of the document editing system 100 of FIG. 1. It is used to enter input and instructions in accordance with the screen displayed on the display unit 640, and typically consists of input devices such as a keyboard and a mouse, a touch panel, a voice input device, or the like.

【０１０２】The printing unit 660 is a printing device, such as a laser printer, that prints data stored in the storage unit 620 or the memory unit 630 in response to instructions from the user or others. The printing unit 660 is not an essential component for implementing the document editing system 100.

【０１０３】The network interface unit 670 serves mainly to connect to a remotely located storage unit 620 or to another CPU. It is not required when the document editing system 100 is implemented on a single computer.

【０１０４】The bus 680 is a common transmission path for sending and receiving data, commands, and the like among the components 610 to 670.

【０１０５】[0105]

[Effects of the Invention] According to the document editing system of the present invention, a pair of patterns consisting of a general-purpose start pattern and end pattern is used together with document structure information that defines the structure of a document to extract, from each of multiple documents, the area between a location matching the start pattern and a location matching the end pattern. The extraction results are aggregated and reconstructed according to the document structure information and layout information and displayed in a state the user can edit. As a result, the user can edit information from multiple documents while it is displayed in the layout that is easiest to read, which makes editing far more efficient than searching multiple documents for the required areas and editing them directly.

【０１０６】Moreover, by specifying the extraction target not by matching a character string but by character string patterns that mark the start and end of the extraction area, a wide area containing multiple sentences can be specified. With methods such as character string matching, even a small change to the extraction source document changes the extraction result greatly, whereas in the system of the present invention the extracted area itself does not change unless the parts matching the start or end pattern are changed. As a result, the extraction result the user intended is obtained consistently, and a stable reconstruction result can be displayed.
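Extraction between a start pattern and an end pattern can be sketched with regular expressions, which cover the combination of fixed character strings and arbitrary characters. This is an illustration only: the `extract_areas` function, the comment-style patterns, and the sample source text are hypothetical and are not the patterns of FIG. 4, and the check against the document structure information is left out.

```python
import re


def extract_areas(doc, start_pattern, end_pattern):
    """Return (start, end, text) for each area lying between a match of
    start_pattern and the next match of end_pattern."""
    start_re, end_re = re.compile(start_pattern), re.compile(end_pattern)
    areas, pos = [], 0
    while True:
        s = start_re.search(doc, pos)
        if s is None:
            break
        e = end_re.search(doc, s.end())
        if e is None:
            break
        areas.append((s.end(), e.start(), doc[s.end():e.start()]))
        pos = max(e.end(), pos + 1)   # always advance, even on empty matches
    return areas


# Hypothetical patterns: a fixed marker opens the area, the comment end closes it.
START, END = r"/\* OUTLINE:", r"\*/"

before = "/* OUTLINE: reserve stock */\nint a;\n/* OUTLINE: update stock */\n"
after = before.replace("int a;", "int b; /* renamed */")

texts_before = [text for _, _, text in extract_areas(before, START, END)]
texts_after = [text for _, _, text in extract_areas(after, START, END)]
```

In this sample, editing the code between the marked comments leaves the extracted text unchanged, which is the stability property the paragraph describes; only the recorded offsets of later areas move.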

【０１０７】Furthermore, according to the present invention, after the user edits an area that was reconstructed and displayed in an editable state, the edited area is written back to the extraction source document without conflicting with other areas of that document.

【０１０８】Furthermore, according to the present invention, the document structure information and the layout information are registered and held together. The document structure information includes information on the document structure, such as the area name, hierarchical structure (inclusion relation), order of appearance, presence or absence of repetition, and whether omission is permitted for each area in the document. The layout information includes the display position, font style, font size, character string alignment, character color, and the like of each area. Conventionally, the two kinds of information had to be created carefully while cross-checking them so that they did not contradict each other; the present invention makes it easy to create both as a single, consistent unit.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the system configuration of an embodiment of the document editing system of the present invention.

FIG. 2 is a diagram showing the flow of the document extraction and reconstruction process.

FIG. 3 is a diagram showing the flow of the document editing result reflecting process.

FIG. 4 is a diagram showing an example of area definition information.

FIG. 5 is a diagram conceptually showing an example of document structure information.

FIG. 6 is a diagram showing an example of an extraction source document.

FIG. 7 is a diagram showing a second example of area definition information.

FIG. 8 is a diagram showing a second example of an extraction source document.

FIG. 9 is a diagram showing a third example of area definition information.

FIG. 10 is a diagram showing an example of area-document structure correspondence information.

FIG. 11 is a diagram showing an example of document structure / layout information.

FIG. 12 is a diagram showing a display example of a reconstruction result.

FIG. 13 is a diagram showing an example of a screen for specifying that an extraction source document be read.

FIG. 14 is a diagram showing an example of a screen for selecting an extraction source document.

FIG. 15 is a diagram showing an example of a screen displaying the contents of the selected extraction source documents.

FIG. 16 is a diagram showing an example of a screen for instructing the reconstruction process.

FIG. 17 is a diagram showing an example of a screen displaying a reconstruction result.

FIG. 18 is a diagram showing an example of a screen on which the displayed reconstruction result has been edited.

FIG. 19 is a diagram showing an example of a screen for instructing that the editing results be output to the extraction source document.

FIG. 20 is a diagram showing an example of a screen presenting the contents of an extraction source document in which the editing results have been reflected.

FIG. 21 is a diagram showing the hardware configuration of a computer that runs the document editing system.

[Explanation of symbols]

100 Document editing system
110 Processing device
111 Area definition information registration unit
112 Document structure / layout information registration unit
113 Document extraction unit
114 Document reconstruction unit
115 Document editing control unit
116 Document editing result reflecting unit
120 Storage device
121 Area definition information
122 Document structure / layout information
123 Area-document structure correspondence information
124 Document
130 Input device
140 Display device

Claims

[Claims]

1. A document editing system for extracting an arbitrary area from at least one document and displaying the extracted area in an editable state, the system comprising: area definition information registration means for registering, as a pattern, area definition information used to extract a desired area from each document; document structure / layout information registration means for registering document structure information relating to the structure of the areas in each document and layout information used to reconstruct each area; document extraction means for extracting an area using the area definition information registered by the area definition information registration means and the document structure information registered by the document structure / layout information registration means; and document reconstruction means for reconstructing and displaying the extracted at least one area using the document structure information and the layout information registered by the document structure / layout information registration means.

2. The document editing system according to claim 1, further comprising document editing control means for controlling at least one of the reconstructed and displayed areas so that it can be edited.

3. The document editing system according to claim 2, wherein the document editing control means further performs control so as to limit the editing function for at least one of the reconstructed and displayed areas to only the content of the area extracted from the document.

4. The document editing system according to claim 2, further comprising document editing result reflecting means for, when the at least one extracted area has been edited, reflecting the editing result in the document from which the area was extracted.

5. The document editing system according to claim 1, wherein: the area definition information registered by the area definition information registration means includes a first pattern specifying the location immediately before the desired area; the document structure information registered by the document structure / layout information registration means includes information on the area name, hierarchical structure, order of appearance, presence or absence of repetition, and permissibility of omission of each area in the document; and the document extraction means determines a location in the document to be the start location of an area if the location matches the first pattern and the area name corresponding to the location satisfies the appearance condition for that area name obtained from the document structure information, determines a location matching a pattern identical to or different from the first pattern to be the end location of the area if no location matching any pattern lies between the start location and that location, and extracts the area between the start location and the end location in association with the area name obtained from the document structure information.

6. The document editing system according to claim 1, wherein: the area definition information registered by the area definition information registration means includes a pair of patterns consisting of a first pattern and a second pattern, the first pattern specifying the location immediately before the desired area and the second pattern specifying the location immediately after the desired area; the document structure information registered by the document structure / layout information registration means includes information on the area name, hierarchical structure, order of appearance, presence or absence of repetition, and permissibility of omission of each area in the document; and the document extraction means determines a location in the document to be a first location if the location matches the first pattern and the area name corresponding to the location satisfies the appearance condition for that area name obtained from the document structure information, determines a location that follows the first location and matches the second pattern to be a second location if no location matching any pattern lies between the first location and that location, and extracts the area between the first location and the second location in association with the area name obtained from the document structure information.

7. The document editing system according to claim 1, wherein: the area definition information registered by the area definition information registration means includes one or more pairs of a pattern of the character string at the start of an area and a pattern of the character string at the end of the area; the document structure information registered by the document structure / layout information registration means includes information on the area name, hierarchical structure, order of appearance, presence or absence of repetition, and permissibility of omission of each area in the document; and the document extraction means determines a location in the document to be the start location of an area if the location matches the pattern of the starting character string obtained from the area definition information and the area name corresponding to the location satisfies the appearance condition for that area name obtained from the document structure information, determines a location matching the pattern of the ending character string to be the end location of the area if no location matching any pattern lies between the start location and that location, and extracts the area between the start location and the end location in association with the area name obtained from the document structure information.

8. The document editing system according to claim 5, wherein the pattern is defined by a combination of a fixed character string and an arbitrary number of arbitrary characters.
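
As an illustration of such a pattern, the following Python sketch converts a pattern made of fixed character strings and wildcards into a regular expression. The `*` wildcard syntax and the function name `compile_pattern` are assumptions made for this example; the claim does not prescribe a notation.

```python
import re

def compile_pattern(pattern: str) -> re.Pattern:
    """Compile a pattern such as '<h1*>' into a regular expression.

    Assumption: '*' stands for any number of arbitrary characters, and
    every other character is part of a fixed character string.
    """
    fixed_parts = [re.escape(part) for part in pattern.split("*")]
    # Non-greedy, so the wildcard stops at the nearest following fixed string.
    return re.compile(".*?".join(fixed_parts), re.DOTALL)

title_start = compile_pattern("<h1*>")
match = title_start.search('<body><h1 class="top">Report</h1></body>')
matched_text = match.group(0)
```

Escaping the fixed parts with `re.escape` keeps characters such as `.` or `(` in a registered pattern from being read as regular expression syntax.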

9. The document editing system according to claim 1, wherein the document structure / layout information registration means registers, as a single unit, the document structure information relating to the structure of the areas in each document and the layout information used for reconstructing each area.

10. A document editing method for extracting an arbitrary area from at least one document and displaying the extracted area in an editable state, the method comprising: an area definition information registration step of registering, as a pattern, area definition information used to extract a desired area from each document; a document structure / layout information registration step of registering document structure information relating to the structure of the areas in each document and layout information used to reconstruct each area; a document extraction step of searching for a first location and a second location using the registered area definition information and the registered document structure information, and extracting the area between the two locations; and a document reconstruction step of reconstructing and displaying the at least one extracted area using the registered document structure information and the registered layout information.

11. A computer-readable recording medium recording a program for implementing a document editing method for extracting an arbitrary area from at least one document and displaying the extracted area in an editable state, wherein the program causes a computer to execute: an area definition information registration step of registering, as a pattern, area definition information used to extract a desired area; a document structure / layout information registration step of registering document structure information relating to the structure of the areas in each document and layout information used to reconstruct each area; a document extraction step of searching for a first location and a second location using the registered area definition information and the registered document structure information, and extracting the area between the two locations; and a document reconstruction step of reconstructing and displaying the at least one extracted area using the registered document structure information and the registered layout information.

12. The recording medium according to claim 11, wherein the program stored on the computer-readable recording medium for implementing the document editing method further includes a document editing result reflecting step of, when the at least one extracted area has been edited, reflecting the editing result in the document from which the area was extracted.
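
The reflecting step of claim 12 can be pictured with a short Python sketch. It assumes that each extracted area was recorded as a pair of character offsets into the source document; the function name `write_back` and the offset representation are hypothetical and are not taken from the specification.

```python
def write_back(text, edits):
    """Replace each (start_offset, end_offset) span of text with edited content.

    edits is a list of (start_offset, end_offset, new_content) tuples whose
    spans do not overlap. Spans are applied from the end of the document
    backwards so that earlier offsets stay valid while the text changes length.
    """
    for start, end, new_content in sorted(edits, reverse=True):
        text = text[:start] + new_content + text[end:]
    return text

original = "<title>Q1 report</title><p>first</p>"
updated = write_back(original, [(7, 16, "Q2 report"), (27, 32, "revised")])
```

Only the text inside each recorded span changes, so the surrounding start and end patterns in the source document are left intact.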