JPH08329108A

JPH08329108A - Method for converting text into hypertext

Info

Publication number: JPH08329108A
Application number: JP7134915A
Authority: JP
Inventors: Minoru Ashizawa (芦沢 実)
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1995-06-01
Filing date: 1995-06-01
Publication date: 1996-12-13

Abstract

PURPOSE: To create a link when part of a node's content is related to another node, by creating links based on matching stems, which represent the semantic content of a phrase. CONSTITUTION: After text is input in a node table creation step 101, the nodes, anchors, and links that make up a hypertext are created through a sentence division step 102, a word division step 103, an anchor extraction step 104, an important-part extraction step 105, an anchor marking step 106, and a link creation step 107. A hypertext display step 108 determines the node to be displayed and displays the text of that node. Anchors are displayed with emphasis such as reverse video. When the user selects an anchor with a mouse or the like, the node that a link associates with the anchor is displayed in place of the currently displayed node.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a method for creating a hypertext structure based on the content of text, and to methods for creating, searching, and displaying a text database.

[0002]

[Prior Art] Hypertext is electronic text in which units of text such as chapters and sections serve as nodes, and nodes that are related in content are connected by links. When an endpoint of a link is a part within a node, that part is called an anchor.

[0003] A known technique for automatically creating links that indicate content relations in text is the method based on the word vector space model described in Salton, G.: "Automatic Text Processing," Addison-Wesley (1989). It represents the distribution of word occurrences in a node as a vector and creates a link indicating a content relation for each pair of nodes whose vectors have a large inner product. Another technique is described in Shiraishi et al.: "Online Manual Creation Support Tool," IPSJ 45th National Convention 5C-7, pp. 3-273--3-274 (1992). It creates links between noun parts that appear two or more times in a document, and determines the direction of each link by judging the importance of a noun part from the particle that follows it.
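The inner-product comparison attributed to the vector space model above can be illustrated with a minimal sketch. The word lists and the function names `word_vector` and `inner_product` are hypothetical; the cited book also covers term weighting and normalization, which this sketch omits.

```python
from collections import Counter

def word_vector(words):
    """Represent a node as a bag-of-words count vector."""
    return Counter(words)

def inner_product(u, v):
    """Sum of products of counts for words shared by both vectors."""
    return sum(count * v[word] for word, count in u.items())

# Hypothetical nodes, already divided into words.
node_a = word_vector(["database", "backup", "database", "restore"])
node_b = word_vector(["database", "backup", "schedule"])
node_c = word_vector(["printer", "paper"])

print(inner_product(node_a, node_b))  # 3
print(inner_product(node_a, node_c))  # 0
```

A link would be proposed for pairs such as `node_a` and `node_b` whose inner product exceeds some threshold; the threshold itself is not specified in the text.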

[0004]

[Problems to be Solved by the Invention] Of the conventional techniques above, the method based on Salton's word vector space model can be expected to work when the contents of two nodes are related as a whole, but not when, as in a manual, only part of a node's content is related to another node. The method of Shiraishi et al. cannot be expected to create links accurately. It may link noun parts that represent subordinate rather than main content of a node, or create links between nodes that share the same noun part but are unrelated in content. It also cannot create links for things such as operating procedures, which are expressed by verbs and the like.

[0005] An object of the present invention is to provide a method for automatically creating links that connect a place where some content is referred to with the node whose main subject is that content, so that a reader of a hypertext can follow a link from a place where more detail is wanted to a place suited to explaining it.

[0006]

[Means for Solving the Problems] The above object is achieved by a method for converting text into hypertext that comprises a node table creation step, a sentence division step, a word division step, an anchor extraction step, an important-part extraction step, an anchor marking step, a link creation step, and a hypertext display step.

[0007] Here, the anchor extraction step consists of a keyword extraction substep and a key phrase extraction substep. The important-part extraction step consists of a provisional important-part recognition substep, an excluded-part extraction substep, and an extracted-part integration substep. The link creation step consists of a key-phrase-type link creation substep and a keyword-type link creation substep. The hypertext display step consists of a layout substep, an anchor display position recalculation substep, and a node text display substep.

[0008]

[Operation] The node table creation step creates nodes, which are units of text that cohere in content, and records in a node table the connection relations between nodes and the purpose type of each node's description. Units of content are recognized from data inserted in the input text that mark the boundaries of document components such as chapters, sections, and subsections. The connection relations between nodes are the order of the nodes in the text and their hierarchical relations as chapters, sections, subsections, and so on. The purpose type of a node's description is the kind of explanation the node is intended to give, such as an overview or a detailed explanation. Data indicating the purpose type is assumed to be inserted in the text for each node. A document component that has no such data is assumed to have the same purpose type as the document component above it.

[0009] The sentence division step cuts titles and sentences out of the text in a node, using blanks, full stops, and the like as clues. This sentence division technique is well known in connection with machine translation systems.

[0010] The word division step divides the titles and sentences cut out in the sentence division step into words and recognizes each word's part of speech, stem, and semantic concept. This word division technique is likewise well known in connection with machine translation systems.

[0011] In the anchor extraction step, the keyword extraction substep extracts common nouns and compound nouns as keywords by comparing keyword extraction patterns, which are patterns over sequences of parts of speech and characters, with the word division result. Next, the key phrase extraction substep compares key phrase extraction patterns of the same kind with the word division result and extracts, as a key phrase, a pair consisting of a verb or sahen noun (a noun that forms a verb with "suru") and a common noun or compound noun that represents the content of a case element semantically connected to that verb or sahen noun.

[0012] In the important-part extraction step, the provisional important-part extraction substep extracts provisional important parts by comparing provisional important-part extraction patterns with the word division result. The excluded-part extraction substep extracts excluded parts by comparing excluded-part extraction patterns with the word division result. The extracted-part integration substep discards the provisional important parts that overlap an excluded part and takes what remains as the important parts.
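The integration substep amounts to filtering one list of spans by overlap with another. The following is a minimal sketch, assuming spans are half-open `(start, end)` character offsets; the function names and the sample offsets are hypothetical and do not come from the figures.

```python
def overlaps(a, b):
    """True if half-open spans a and b share at least one character."""
    return a[0] < b[1] and b[0] < a[1]

def integrate(provisional, excluded):
    """Keep provisional important parts that overlap no excluded part."""
    return [
        span for span in provisional
        if not any(overlaps(span, ex) for ex in excluded)
    ]

provisional = [(0, 10), (20, 30), (40, 50)]
excluded = [(25, 35)]
print(integrate(provisional, excluded))  # [(0, 10), (40, 50)]
```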

[0013] The anchor marking step attaches, to each keyword and key phrase that overlaps an important part, a flag indicating that it lies in an important part.
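Anchor marking can be sketched in the same style. This assumes each anchor is a dictionary with half-open `start` and `end` offsets and that the flag is stored under a key named `important`; these field names are illustrative, not taken from the anchor table in the figures.

```python
def mark_anchors(anchors, important_parts):
    """Set anchor['important'] when the anchor overlaps any important part."""
    for anchor in anchors:
        anchor["important"] = any(
            anchor["start"] < end and start < anchor["end"]
            for start, end in important_parts
        )
    return anchors

anchors = [
    {"stem": "file", "start": 3, "end": 7},
    {"stem": "backup", "start": 40, "end": 46},
]
marked = mark_anchors(anchors, important_parts=[(0, 12)])
print([a["important"] for a in marked])  # [True, False]
```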

[0014] In the link creation step, the key-phrase-type link creation substep creates a link for two key phrases when all of the following hold: the verbs or sahen nouns and the nouns or compound nouns that constitute the key phrases match in surface form, stem, and semantic concept; one key phrase has the important-part flag and the other does not; the combination of purpose types of the nodes containing the two key phrases is permitted; and the two nodes are different and are not in a direct ancestor-descendant relation. The link takes the key phrase without the flag as its source anchor and the node containing the key phrase with the flag as its destination. The link data consists of the display range of the source anchor, information on the words that constitute the source anchor, and the position of the destination node.
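The conditions for a key-phrase-type link can be written out as a single predicate over pairs of anchors. The sketch below is one reading of those conditions, not the patent's implementation: it compares stems only, it represents the hierarchy as a hypothetical `parent` mapping, and it represents the permitted purpose-type combinations as a hypothetical set of `(source, destination)` pairs.

```python
from dataclasses import dataclass
from itertools import permutations

@dataclass(frozen=True)
class Anchor:
    node: int          # number of the node containing the key phrase
    noun_stem: str     # stem of the noun part
    pred_stem: str     # stem of the predicate part
    important: bool    # flag set by the anchor marking step
    purpose: str       # purpose type of the containing node

def is_ancestor(parent, a, b):
    """True if node a is a direct-line ancestor of node b."""
    current = parent.get(b)
    while current is not None:
        if current == a:
            return True
        current = parent.get(current)
    return False

def create_links(anchors, parent, allowed):
    """Return (source anchor, destination node) pairs meeting every condition."""
    links = []
    for src, dst in permutations(anchors, 2):
        if (src.noun_stem == dst.noun_stem
                and src.pred_stem == dst.pred_stem
                and not src.important and dst.important
                and (src.purpose, dst.purpose) in allowed
                and src.node != dst.node
                and not is_ancestor(parent, src.node, dst.node)
                and not is_ancestor(parent, dst.node, src.node)):
            links.append((src, dst.node))
    return links

parent = {1: None, 2: 1, 3: 1}  # nodes 2 and 3 are children of node 1
a = Anchor(2, "file", "delete", False, "overview")
b = Anchor(3, "file", "delete", True, "detailed explanation")
c = Anchor(1, "file", "delete", False, "overview")
allowed = {("overview", "detailed explanation")}

links = create_links([a, b, c], parent, allowed)
print([(src.node, dst) for src, dst in links])  # [(2, 3)]
```

The anchor in node 1 produces no link to node 3 because node 1 is an ancestor of node 3. Dropping the `pred_stem` comparison gives the keyword-type variant under the same assumptions.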

[0015] In the keyword-type link creation substep, the common nouns and compound nouns extracted as keywords, together with the common nouns and compound nouns extracted as constituents of key phrases, are all treated as keywords, and links are created for pairs of keywords under the same conditions and procedure as for key-phrase-type links. That is, where the key-phrase-type link creation substep creates a link when both the verb or sahen noun and the noun or compound noun match, this substep creates a link when the noun or compound noun matches.

[0016] In both the key-phrase-type and the keyword-type link creation substeps, the display range of an anchor is the smallest contiguous range that contains the words constituting the key phrase or keyword of the source anchor. However, if another link has an overlapping anchor display range and a different destination, the display range of each link's source anchor is the smallest contiguous range that contains the words of its key phrase remaining after the shared words are removed. If another link has an overlapping anchor display range and the same destination, no new link is created. Instead, the display range of the existing link's source anchor becomes the union of the two anchors' display ranges.
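Two of the display-range rules can be sketched briefly: the smallest contiguous range covering an anchor's words, and the merge that replaces a new link when an overlapping link already points to the same destination. The sketch assumes half-open `(start, end)` word spans and hypothetical dictionary keys `range` and `destination`; it deliberately leaves out the rule for overlapping links with different destinations.

```python
def display_range(word_spans):
    """Smallest contiguous span that contains every word of the anchor."""
    return (min(s for s, _ in word_spans), max(e for _, e in word_spans))

def add_link(links, word_spans, destination):
    """Add a link, or widen an overlapping link that has the same destination."""
    new_start, new_end = display_range(word_spans)
    for link in links:
        start, end = link["range"]
        same_destination = link["destination"] == destination
        if start < new_end and new_start < end and same_destination:
            # Union of two overlapping ranges is itself contiguous.
            link["range"] = (min(start, new_start), max(end, new_end))
            return links
    links.append({"range": (new_start, new_end), "destination": destination})
    return links

links = []
add_link(links, [(10, 14), (18, 20)], destination=5)
add_link(links, [(18, 20), (22, 25)], destination=5)  # overlaps, same destination
add_link(links, [(30, 33)], destination=7)
print(links)
# [{'range': (10, 25), 'destination': 5}, {'range': (30, 33), 'destination': 7}]
```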

[0017] In the hypertext display step, the layout substep determines the node to display, lays out that node's text for display, deletes the document component information and layout information contained in the document, and creates a correspondence table of character positions before and after layout. The anchor display position calculation substep then converts the anchor display positions, which are expressed relative to the text before layout, into positions relative to the text after layout by referring to this correspondence table. Finally, the node text display substep displays the anchor parts with emphasis such as reverse video.
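The correspondence table of character positions can be illustrated with a minimal sketch that models layout as nothing more than deleting markup. Real layout would also reflow the text. The tag syntax, including the closing tag in the sample, and the names `layout` and `convert_anchor` are assumptions made for illustration.

```python
import re

def layout(raw_text):
    """Delete tags; return the displayed text and a raw-to-displayed index map."""
    removed = set()
    for tag in re.finditer(r"<[^>]*>", raw_text):
        removed.update(range(tag.start(), tag.end()))
    displayed = []
    position_map = {}
    for index, char in enumerate(raw_text):
        if index not in removed:
            position_map[index] = len(displayed)
            displayed.append(char)
    return "".join(displayed), position_map

def convert_anchor(anchor_range, position_map):
    """Convert (first, last) raw character positions to displayed positions."""
    first, last = anchor_range
    return position_map[first], position_map[last]

raw = "<SECT>Delete the file.</SECT>"
text, position_map = layout(raw)
print(text)                                      # Delete the file.
print(convert_anchor((17, 20), position_map))    # (11, 14)
```

In the sample, raw positions 17 to 20 hold the word "file", which sits at positions 11 to 14 once the tags are gone.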

[0018] As described above, part of the link creation step resolves overlaps when the display ranges of the source anchors of two or more links overlap. Alternatively, this resolution can be omitted. In that case, when the user selects with a mouse or the like a place where anchor display ranges overlap in order to follow a link, the node text display substep can switch the emphasis, such as reverse video, among those anchors in turn.
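The alternative behavior, cycling the emphasis among overlapping anchors on each selection, reduces to stepping through a list. A minimal sketch, assuming anchors are identified by hypothetical integer ids:

```python
def next_highlight(anchor_ids, current):
    """Return the anchor to emphasize after one more selection."""
    if current not in anchor_ids:
        return anchor_ids[0]
    position = anchor_ids.index(current)
    return anchor_ids[(position + 1) % len(anchor_ids)]

overlapping = [4, 9]  # ids of anchors whose display ranges overlap
first = next_highlight(overlapping, None)
second = next_highlight(overlapping, first)
third = next_highlight(overlapping, second)
print(first, second, third)  # 4 9 4
```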

[0019]

[Embodiment] An embodiment of the method of the present invention for converting text into hypertext will now be described. FIG. 1 is a flow chart of this embodiment, and FIG. 2 is a data flow diagram.

[0020] First, an overview is given.

[0021] FIG. 12 shows an example of the text that this embodiment takes as input. After the text is input in node table creation step 101, the nodes, anchors, and links that make up the hypertext are created through sentence division step 102, word division step 103, anchor extraction step 104, important-part extraction step 105, anchor marking step 106, and link creation step 107. FIG. 13 shows a conceptual image of the result. Thin-line frames such as thin-line frame 1301 represent nodes, reverse-video portions such as portion 1302 represent anchors, and arrows such as arrow 1303 represent links.

[0022] In hypertext display step 108, the node of this hypertext to be displayed is determined and the text of that node is displayed. Anchors are displayed with emphasis, for example in reverse video. When the user selects an anchor with a mouse or the like, the node that a link associates with that anchor is displayed in place of the currently displayed node.

[0023] Next, the operation of this embodiment is described in detail.

[0024] In node table creation step 101, text is input and a node table is created. A node is a unit of text that coheres in content. In this embodiment, a node is created for each document component such as a chapter, section, or subsection. Each node also has a purpose of explanation, such as "overview" or "detailed explanation". The node table records, for each node, the range of text that constitutes it, the purpose type of its description, and a list of its child node numbers.

[0025] FIG. 14 shows the node table created for the text of FIG. 12. In the figure, the "…" in "…" 1408 indicates that characters or items have been omitted. The same applies to the other figures and tables.

[0026] "#" 1401 is the serial number of the node. "Start" 1404 and "end" 1405 give the range that constitutes the node as character counts from the first character of the input text. The node ranges are obtained from the data inserted in the text that mark the boundaries of document components, such as "<DOC>" 1201, "<CHAP>" 1202, "<SECT>" 1203, and "<SUBSECT>" 1204.

[0027] "Tag" 1402 is the name of the data that marks a document component boundary and indicates the kind of document component. "Chapter/section number" 1403 is the number of the document component.

[0028] "Purpose type" 1406 is the purpose type of the node's description. Data indicating the purpose type is inserted in the text with NTYPE tags, as in "<NTYPE>" 1205 and "</NTYPE>" 1206. The character string between these tags is therefore taken as the purpose type of the node's description. A node for which no purpose type is given is assumed to have the same purpose type as the node above it.
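The inheritance rule for purpose types can be sketched as a single pass over the nodes. The sketch assumes the nodes have already been parsed into dictionaries with a hypothetical `parent` index and a `purpose` value that is `None` when no NTYPE data was present, and that every parent precedes its children in the list. The sample values are illustrative and are not taken from FIG. 12 or FIG. 14.

```python
def resolve_purpose_types(nodes):
    """Fill in missing purpose types from the node above."""
    resolved = []
    for node in nodes:
        purpose = node["purpose"]
        if purpose is None and node["parent"] is not None:
            # The parent comes earlier in the list, so it is already resolved.
            purpose = resolved[node["parent"]]
        resolved.append(purpose)
    return resolved

nodes = [
    {"tag": "CHAP", "parent": None, "purpose": "overview"},
    {"tag": "SECT", "parent": 0, "purpose": None},
    {"tag": "SECT", "parent": 0, "purpose": "detailed explanation"},
    {"tag": "SUBSECT", "parent": 2, "purpose": None},
]
print(resolve_purpose_types(nodes))
# ['overview', 'overview', 'detailed explanation', 'detailed explanation']
```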

[0029] "Lower node # list" 1407 is the list of nodes below the node. For the node corresponding to a chapter, the nodes corresponding to the sections that make up the chapter are lower nodes. Seen from a section's node, the chapter's node is the upper node.

[0030] FIG. 15 shows schematically the hierarchy of nodes indicated by "lower node # list" 1407 of FIG. 14. The number accompanying "#", as in "#1" 1501, is the node number and corresponds to the value of "#" 1401.

[0031] In sentence division step 102, titles and sentences are cut out of the text in each node and a sentence table is created. The sentence table records the start position and number of characters of each title and sentence. FIG. 16 shows an example of the sentence table. "#" 1601 is the serial number of the sentence, and "start" 1602 is the start position of the sentence as a character count from the beginning of the text. "Number of characters" 1603 is the number of characters in the extracted title or sentence, and "sentence" 1604 is its character string. Techniques for cutting titles and sentences out of running text are well known in connection with machine translation.
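A sentence table with start position, number of characters, and sentence string could be built as follows. This is a simplified sketch that splits only on line breaks and the Japanese full stop, and it counts positions from 0; the text does not say whether counting starts at 0 or 1, and the sample text is hypothetical rather than taken from FIG. 16.

```python
import re

def build_sentence_table(text, offset=0):
    """Cut titles and sentences out of text and record where each one starts."""
    rows = []
    for match in re.finditer(r"[^。\n]+。?", text):
        sentence = match.group()
        if sentence.strip():
            rows.append({
                "start": offset + match.start(),
                "length": len(sentence),
                "sentence": sentence,
            })
    return rows

text = "1.1 概要\nこれは文です。次の文です。"
table = build_sentence_table(text)
print([(row["start"], row["length"]) for row in table])
# [(0, 6), (7, 7), (14, 6)]
```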

[0032] In word division step 103, each sentence and title is divided into words, the stem and part of speech of each word are recognized, and a word table is created. FIG. 17 shows the result of dividing sentence 1605 into words. "#" 1701 is the serial number of the word within the sentence, and "start" 1702 is the start position of the word as a character count from the beginning of the text. "Number of characters" 1703 is the number of characters in the word, and "surface form" 1704 is its character string. "Stem" 1705 is the stem of the word and records the dictionary headword. For a word not registered in the dictionary, the stem is set to "-", as shown at "-" 1707. "Part of speech" 1706 is the part of speech. Word division is also a well-known technique in connection with machine translation systems.

[0033] In anchor extraction step 104, anchors are extracted by matching the word division result against anchor extraction patterns written in advance, and an anchor table is created.

[0034] FIG. 18 shows examples of anchor extraction patterns. "#" 1801 is the serial number of the pattern, "name" 1802 is its name, and "content" 1803 is its content. Anchor extraction patterns extend the notation for character-string patterns known as regular expressions and are written as sequences of parts of speech and characters.

[0035] FIG. 19 shows the anchor table. "#" 1901 is the serial number of the anchor. "#" 1902 is the number of the node in which the anchor was found and corresponds to "#" 1401 in the node table. "Purpose type" 1903 is the purpose type of that node's description and corresponds to "purpose type" 1406 in the node table. "Start" 1904 and "end" 1905 give the start and end positions of the extracted keyword, or of the noun part of the extracted key phrase, as character counts from the beginning of the text. "Stem" 1906 is the concatenation of the stems or surface forms of the words in the keyword or in the noun or compound-noun part of the key phrase. Hereafter, the noun or compound-noun part of a key phrase is called the noun part. For words registered in the dictionary the stem is concatenated; for words whose stem is "-" the surface form is concatenated. "Start" 1907, "end" 1908, and "stem" 1909 record the same kind of information as for keywords, for the words in the verb or sahen-noun part of a key phrase. Hereafter, the verb or sahen-noun part of a key phrase is called the predicate part. "Pattern" 1910 is the name of the anchor extraction pattern that extracted the anchor and corresponds to "name" 1802.
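The rule for building the "stem" field, using the dictionary stem where one exists and the surface form where the stem is "-", is a one-line join. A minimal sketch with hypothetical word records:

```python
def anchor_stem(words):
    """Concatenate stems, falling back to the surface form for unknown words."""
    return "".join(
        word["stem"] if word["stem"] != "-" else word["surface"]
        for word in words
    )

words = [
    {"surface": "DBS2", "stem": "-"},     # not in the dictionary
    {"surface": "文法", "stem": "文法"},   # dictionary headword
]
print(anchor_stem(words))  # DBS2文法
```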

[0036] As shown in FIG. 3, anchor extraction step 104 consists of keyword extraction substep 301 and key phrase extraction substep 302.

[0037] In keyword extraction substep 301, common nouns and compound nouns are extracted using keyword extraction patterns. Patterns whose names begin with "W", as in "W1" 1804, are keyword extraction patterns. These patterns are applied in turn to extract keywords.

[0038] The pattern notation is briefly described here. It extends the notation for patterns of character sequences known in the field of string processing as regular expressions, adding descriptions of parts of speech to sequences of characters. Parts of speech are enclosed in "{" and "}" and character strings in double quotation marks, which distinguishes the two. Character strings are written as conventional regular expressions. A part of speech and a character string are each a subpattern, and subpatterns can be combined by concatenation or joined with AND "&" or OR "|". "(" and ")" delimit a group of subpatterns. "*" denotes zero or more repetitions of the subpattern immediately to its left, and "+" denotes one or more repetitions of the subpattern immediately to its left. "@" designates an extraction range within the pattern. The single character following "@" names the extraction range, and "@@" marks the end of the range.

[0039] Pattern 1805 is described as an example of a keyword extraction pattern. Pattern 1805 extracts the document names of manuals. It means the following: if the first word is a common noun or alphabetic, it is followed by one or more words that are common nouns, numerals, or alphabetic, then by one blank character, and then by the string "文法／操作書" (grammar/operation manual), extract the whole range under the name "N". In keyword extraction substep 301, the extraction range named "N" is extracted as the part corresponding to the keyword.

[0040] With pattern 1805, "ＤＢＳ２△文法／操作書" is extracted, as shown in anchor 1911, and recorded in the noun part. Because keyword extraction patterns do not specify a predicate part, each predicate-part field of an anchor extracted as a keyword is filled with "-" in the anchor table.

[0041] When the repetition operators "*" and "+" allow more than one candidate for the extraction range designated by "@", backtracking is performed while matching the pattern against the word division result, and all candidates are extracted. However, when "*" or "+" lies inside the extraction range designated by "@" and several overlapping ranges are candidates, only the largest of the overlapping ranges is extracted. For example, a compound noun is a run of one or more common nouns, but for a compound noun made up of three words, the one-word and two-word parts within it are not extracted; the whole three-word compound noun is extracted.
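The longest-match rule for compound nouns can be illustrated without a full pattern engine by collecting maximal runs of common nouns. The following sketch covers only that special case, assuming words arrive as `(surface, part_of_speech)` pairs with hypothetical part-of-speech labels; it is not the pattern matcher with backtracking that the text describes.

```python
def compound_nouns(words, noun_tag="common noun"):
    """Return each maximal run of consecutive common nouns as one candidate."""
    runs = []
    current = []
    for surface, part_of_speech in words:
        if part_of_speech == noun_tag:
            current.append(surface)
        else:
            if current:
                runs.append(current)
            current = []
    if current:
        runs.append(current)
    return runs

words = [
    ("データ", "common noun"),
    ("ベース", "common noun"),
    ("管理", "common noun"),
    ("を", "particle"),
    ("行う", "verb"),
]
print(compound_nouns(words))  # [['データ', 'ベース', '管理']]
```

Only the three-word run is returned; its one-word and two-word sub-runs are never emitted.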

[0042] In key phrase extraction substep 302, key phrase extraction patterns are used to extract a verb or sahen noun together with the common nouns and compound nouns that represent the content of case elements semantically connected to that verb or sahen noun. Patterns whose names begin with "P", as in "P1" 1806, are key phrase extraction patterns. These patterns are applied in turn to extract keywords. In a pattern, the extraction range named "N" corresponds to the noun part and the range named "P" corresponds to the predicate part.

[0043] When a keyword extracted by a keyword extraction pattern corresponds to the noun part of a key phrase extraction pattern, that keyword is deleted from the anchor table.
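One way to read this deletion rule is that a keyword anchor is dropped when a key phrase anchor has a noun part covering exactly the same character range. The sketch below follows that reading, which is an assumption, and uses hypothetical field names; keyword anchors are recognized by a predicate stem of "-".

```python
def drop_shadowed_keywords(anchors):
    """Remove keyword anchors whose span equals a key phrase's noun part."""
    phrase_spans = {
        (a["start"], a["end"]) for a in anchors if a["pred_stem"] != "-"
    }
    return [
        a for a in anchors
        if a["pred_stem"] != "-" or (a["start"], a["end"]) not in phrase_spans
    ]

anchors = [
    {"start": 10, "end": 14, "noun_stem": "file", "pred_stem": "-"},
    {"start": 10, "end": 14, "noun_stem": "file", "pred_stem": "delete"},
    {"start": 30, "end": 36, "noun_stem": "backup", "pred_stem": "-"},
]
kept = drop_shadowed_keywords(anchors)
print([(a["noun_stem"], a["pred_stem"]) for a in kept])
# [('file', 'delete'), ('backup', '-')]
```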

[0044] As a result of the above processing in anchor extraction step 104, the anchor table shown in FIG. 19 is created.

[0045] In important-part extraction step 105, important parts are extracted by matching the word division result against important-part extraction patterns written in advance, and an important-part table is created.

[0046] FIG. 20 shows examples of important-part extraction patterns. "#" 2001 is the serial number of the pattern, "name" 2002 is its name, and "content" 2003 is its content. Like anchor extraction patterns, important-part extraction patterns are written as sequences of parts of speech and characters.

【００４７】アンカー抽出用パターンの説明時に無かっ
た記号について説明する。パターンの始めの「＾」は、
文頭を表す。「”」で示された文字列内の「［」
と「］」は文字クラス、つまり、「［」と「］」で囲ま
れた文字のいづれかの文字とマッチすることを表す。
「［」の次に「＾」がある場合は、否定文字クラス、つ
まり、「＾」から「］」ではさまれた文字以外の文字と
マッチすることを表す。「｛．｝」は、任意の品詞を表
す。The symbols that did not appear in the description of the anchor extraction patterns are explained here. A "^" at the beginning of a pattern represents the beginning of a sentence. Within a character string delimited by double quotation marks, "[" and "]" represent a character class, that is, a match with any one of the characters enclosed between "[" and "]". When "[" is immediately followed by "^", the pair represents a negated character class, that is, a match with any character other than those between "^" and "]". "{.}" represents an arbitrary part of speech.

【００４８】重要箇所テーブルを図２１に示す。「＃」
２１０１は重要箇所の通し番号を表す。「開始」２１０
２、「終了」２１０３は、抽出した重要箇所の開始位
置、終了位置をテキストの先頭からの文字数によって表
した値である。「重要箇所」２１０４は、重要箇所とし
て抽出した部分の文字列である。FIG. 21 shows the important point table. "#" 2101 represents the serial number of an important point. "Start" 2102 and "end" 2103 are the start position and end position of the extracted important point, expressed as the number of characters from the beginning of the text. "Important point" 2104 is the character string of the portion extracted as an important point.

【００４９】重要箇所抽出ステップ１０５は、図４に示
すように、重要箇所仮抽出サブステップ４０１と除外箇
所抽出サブステップ４０２と抽出箇所統合サブステップ
４０３とから構成される。As shown in FIG. 4, the important part extraction step 105 is composed of an important part temporary extraction substep 401, an excluded part extraction substep 402, and an extracted part integration substep 403.

【００５０】重要箇所仮抽出サブステップ４０１におい
て、重要箇所仮抽出用パターンによって、文字列を抽出
する。「Ｃ１」２００４に示すように、名称が「Ｃ」で
始まるパターンを重要箇所仮抽出用パターンとする。そ
れらのパターンを順次適用して文字列を仮重要箇所とし
て抽出する。In the important point temporary extraction sub-step 401, a character string is extracted by the important point temporary extraction pattern. As shown in “C1” 2004, a pattern whose name starts with “C” is set as an important portion temporary extraction pattern. The character strings are extracted as temporary important points by sequentially applying those patterns.

【００５１】重要箇所仮抽出用パターンの例として、パ
ターン２００５の内容について簡単に説明する。このパ
ターンは、文頭から「０」から「９」の数字または小数
点「．」が１文字以上繰り返し、その次に空白文字
「△」がある場合に、その次の単語から文末まで抽出範
囲名称「Ｃ」で取り出す。つまり、章、節、項のタイト
ルの行の、番号に続く章、節、項の名称を抽出する。As an example of an important point temporary extraction pattern, the contents of pattern 2005 are briefly described. This pattern applies when, from the beginning of a sentence, one or more characters that are digits "0" to "9" or the decimal point "." occur in succession and are followed by a blank character "△"; the span from the next word to the end of the sentence is then taken out under the extraction range name "C". In other words, from the title line of a chapter, section, or item, the pattern extracts the name that follows the number.
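As a rough illustration of pattern 2005, the following Python sketch substitutes an ordinary regular expression for the part-of-speech pattern notation of FIG. 20. The name `extract_title` and the use of an ASCII space for the blank character "△" are assumptions made for this example; the sketch matches on characters only and does not model the word division result.

```python
import re

# Assumed regex analogue of pattern 2005: one or more digits or "." at the
# start of the line, one blank, then the rest of the line captured as range "C".
TITLE_PATTERN = re.compile(r"^[0-9.]+ (?P<C>.+)$")


def extract_title(line):
    """Return the text captured as range "C", or None if the line does not match."""
    match = TITLE_PATTERN.match(line)
    return match.group("C") if match else None
```

A numbered title line yields the name after the number, while a line without a leading number yields nothing.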

【００５２】仮重要箇所として抽出した文字列を仮重要
箇所および除外箇所テーブルに記録する。仮重要箇所お
よび除外箇所テーブルを図２２に示す。「＃」２２０１
は仮重要箇所および除外箇所の通し番号を表す。「開
始」２２０２、「終了」２２０３は、抽出した仮重要箇
所および除外箇所の開始位置、終了位置をテキストの先
頭からの文字数によって表した値である。「仮重要箇所
Ｃおよび除外箇所Ｅ」２２０４は、仮重要箇所および除
外箇所として抽出した部分の文字列である。「パター
ン」２２０５は、その仮重要箇所および除外箇所を抽出
した重要箇所抽出用パターンの名称であり、「名称」２
００２に対応する。The character strings extracted as temporary important points are recorded in the temporary important point and excluded point table, which is shown in FIG. 22. "#" 2201 represents the serial number of a temporary important point or excluded point. "Start" 2202 and "end" 2203 are the start position and end position of the extracted temporary important point or excluded point, expressed as the number of characters from the beginning of the text. "Temporary important point C and excluded point E" 2204 is the character string of the portion extracted as a temporary important point or excluded point. "Pattern" 2205 is the name of the important point extraction pattern that extracted the temporary important point or excluded point, and corresponds to "name" 2002.

【００５３】図２２において、仮重要箇所２２０６から
仮重要箇所２２０７までが、重要箇所仮抽出サブステッ
プによって抽出した仮重要箇所を表す。In FIG. 22, temporary important points 2206 to 2207 represent temporary important points extracted by the important point temporary extraction sub-step.

【００５４】除外箇所抽出サブステップ４０２におい
て、除外箇所抽出用パターンによって、文字列を抽出す
る。「Ｅ１」２００６に示すように、名称が「Ｅ」で始
まるパターンを除外箇所抽出用パターンとする。それら
のパターンを順次適用して文字列を除外要箇所として抽
出し、仮重要箇所および除外箇所テーブルに記録する。
図２２において、除外箇所２２０８が、除外箇所抽出サ
ブステップによって抽出した除外箇所を表す。In the excluded part extraction sub-step 402, a character string is extracted by the excluded part extraction pattern. As shown in “E1” 2006, a pattern whose name starts with “E” is set as an exclusion point extraction pattern. These patterns are sequentially applied to extract a character string as an exclusion point, and record it in a temporary important point and exclusion point table.
In FIG. 22, an excluded part 2208 represents an excluded part extracted by the excluded part extraction sub-step.

【００５５】抽出箇所統合サブステップ４０３におい
て、除外箇所と「開始」２２０１、「終了」２２０２で
示される範囲が重複する仮重要箇所を削除する。更に除
外箇所を削除する。残った仮重要箇所を重要箇所テーブ
ルに記録する。重要箇所については既に説明した通りで
ある。この抽出箇所統合サブステップ４０３において、
仮重要箇所２２０７と除外箇所２２０８が重複するので
この仮重要箇所２２０７を削除し、また、除外箇所２２
０８を削除する。In the extracted point integration sub-step 403, every temporary important point whose range, given by "start" 2201 and "end" 2202, overlaps an excluded point is deleted. The excluded points themselves are then deleted, and the remaining temporary important points are recorded in the important point table. The important points are as already described. In this example, temporary important point 2207 overlaps excluded point 2208, so temporary important point 2207 is deleted, and excluded point 2208 is deleted as well.
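The integration sub-step can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Span` record, the function names, and the treatment of start and end positions as inclusive character offsets are all hypothetical and are not taken from the figures.

```python
from dataclasses import dataclass


@dataclass
class Span:
    start: int    # characters from the beginning of the text
    end: int      # assumed inclusive
    text: str
    pattern: str  # "C..." = temporary important point, "E..." = excluded point


def overlaps(a, b):
    """True if two inclusive ranges share at least one character position."""
    return a.start <= b.end and b.start <= a.end


def integrate(spans):
    """Drop temporary important points that overlap an excluded point,
    drop the excluded points, and return what remains."""
    excluded = [s for s in spans if s.pattern.startswith("E")]
    candidates = [s for s in spans if s.pattern.startswith("C")]
    return [c for c in candidates
            if not any(overlaps(c, e) for e in excluded)]
```

The returned list plays the role of the important point table.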

【００５６】アンカーマーキングステップ１０６におい
て、重要箇所テーブルに記録された重要箇所と範囲が重
複するアンカーに重要フラグを付け、重要箇所認識済み
アンカーテーブルを作成する。範囲の重複の有無は、重
要箇所テーブルの「開始」２１０２、「終了」２１０
３、アンカーテーブルの名詞部の「開始」１９０４、
「終了」１９０５、述語部の「開始」１９０７、「終
了」１９０８の値を調べることによって判定する。In the anchor marking step 106, an important flag is attached to every anchor whose range overlaps an important point recorded in the important point table, and the important point recognized anchor table is created. Whether the ranges overlap is determined by examining "start" 2102 and "end" 2103 of the important point table, "start" 1904 and "end" 1905 of the noun part of the anchor table, and "start" 1907 and "end" 1908 of the predicate part.
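The overlap test used for anchor marking can be sketched as below. The dictionary layout of an anchor, the inclusive ranges, and the rule that an anchor is flagged when either its noun part or its predicate part overlaps an important point are assumptions made for this example.

```python
def ranges_overlap(s1, e1, s2, e2):
    """True if inclusive ranges [s1, e1] and [s2, e2] intersect."""
    return s1 <= e2 and s2 <= e1


def mark_anchors(anchors, important_points):
    """Return copies of the anchors with an added 'important' flag (1 or 0).

    anchors: dicts with 'noun' = (start, end) and 'pred' = (start, end) or None.
    important_points: list of (start, end) ranges from the important point table.
    """
    marked = []
    for anchor in anchors:
        parts = [anchor["noun"]]
        if anchor.get("pred") is not None:  # keyword type anchors have no predicate
            parts.append(anchor["pred"])
        flag = int(any(ranges_overlap(s, e, ps, pe)
                       for (s, e) in parts
                       for (ps, pe) in important_points))
        marked.append({**anchor, "important": flag})
    return marked
```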

【００５７】重要箇所認識済みアンカーテーブルを図２
３に示す。これは、アンカーテーブル図１９の「パター
ン」１９１０の項目を削り、「重要フラグ」２３１０の
項目を追加したものである。重要フラグが付いているア
ンカーは「重要フラグ」２３１０の項目の値が１であ
り、フラグが無いアンカーはその値は０である。FIG. 23 shows the important point recognized anchor table. It is the anchor table of FIG. 19 with the "pattern" 1910 item deleted and the "important flag" 2310 item added. An anchor with the important flag has the value 1 in the "important flag" 2310 item, and an anchor without the flag has the value 0.

【００５８】リンク作成ステップ１０７において、アン
カーテーブルにおける名詞部の語幹または述語部の語幹
が一致するアンカーを結ぶリンクを作成し、リンクテー
ブルを作成する。その結果、図１３に示すハイパーテキ
ストを作成する。In the link creating step 107, a link connecting the anchors in which the stems of the noun part or the predicate part in the anchor table match is created and a link table is created. As a result, the hypertext shown in FIG. 13 is created.

【００５９】リンクテーブルを図２４に示す。「＃」２
４０１はリンクの通し番号を表す。リンクデータは、始
点アンカー情報、終点ノード情報、語幹情報からなる。FIG. 24 shows the link table. "#" 2401 represents the serial number of a link. The link data consists of starting point anchor information, end point node information, and stem information.

【００６０】始点アンカー情報の構成は、次のとおりで
ある。「始点アンカー」の下位項目である「＃」２４０
２は、アンカーの通し番号であり、重要箇所認識済みア
ンカーテーブルの通し番号「＃」２３０１に対応する。
「始点アンカー」の下位項目の「ノード」の下位項目で
ある「＃」２４０３は、アンカーが存在するノードの通
し番号であり、重要箇所認識済みアンカーテーブルの
「ノード」の下位項目の通し番号「＃」２３０２に対応
する。「ノード」の下位項目「開始」２４０４は、ノー
ドの開始位置をテキストの先頭からの文字数で表現した
値であり、ノードの通し番号「＃」２４０３が表すノー
ドの、ノードテーブルにおける「開始」１４０４の値を
コピーしたものである。「相対位置」の下位項目である
「開始」２４０５、「終了」２４０６は、アンカーの表
示範囲をノードの先頭からの文字数で表現した値であ
り、重要箇所認識済みアンカーテーブルの「名詞部Ｎ」
の下位項目「開始」２３０４、「終了」２３０５、「述
語部Ｐ」の下位項目「開始」２３０７、「終了」２３０
８の値から「ノード」の下位項目の「開始」２４０４の
値を減じた値である。The starting point anchor information is structured as follows. "#" 2402, a subordinate item of "starting point anchor", is the serial number of the anchor and corresponds to the serial number "#" 2301 of the important point recognized anchor table. "#" 2403, a subordinate item of "node" under "starting point anchor", is the serial number of the node in which the anchor exists and corresponds to the serial number "#" 2302 under "node" in the important point recognized anchor table. "Start" 2404 under "node" is the start position of the node expressed as the number of characters from the beginning of the text; it is a copy of the "start" 1404 value in the node table for the node identified by the node serial number "#" 2403. "Start" 2405 and "end" 2406, subordinate items of "relative position", give the display range of the anchor as the number of characters from the beginning of the node. They are obtained by subtracting the "start" 2404 value under "node" from the values of "start" 2304 and "end" 2305 under "noun part N" and "start" 2307 and "end" 2308 under "predicate part P" in the important point recognized anchor table.

【００６１】終点ノード情報の構成は次のとおりであ
る。「終点ノード」の下位項目である「＃」２４０７、
「開始」２４０８、「終了」２４０９は、リンクの終点
ノードの通し番号、開始位置、終了位置を表す。これら
はノードテーブルの「＃」１４０１、「開始」１４０
４、「終了」１４０５に対応する。「開始」２４０８、
「終了」２４０９の値は、テキストの先頭からの文字数
によって表現する。The end point node information is structured as follows. "#" 2407, "start" 2408, and "end" 2409, subordinate items of "end point node", represent the serial number, start position, and end position of the end point node of the link. They correspond to "#" 1401, "start" 1404, and "end" 1405 of the node table. The values of "start" 2408 and "end" 2409 are expressed as the number of characters from the beginning of the text.

【００６２】語幹情報の構成は次のとおりである。「名
詞部語幹Ｎ」２４１０は、重要箇所認識済みアンカーテ
ーブルの「名詞部」の下位項目の「語幹」２３０６に対
応し、「述語部語幹Ｐ」２４１１は、重要箇所認識済み
アンカーテーブルの「述語部」の下位項目の「語幹」２
３０９に対応する。The stem information is structured as follows. "Noun part stem N" 2410 corresponds to "stem" 2306 under "noun part" in the important point recognized anchor table, and "predicate part stem P" 2411 corresponds to "stem" 2309 under "predicate part" in the same table.

【００６３】リンク作成ステップ１０７は、図５に示す
ようにキーフレーズタイプリンク作成サブステップ５０
１とキーワードタイプリンク作成サブステップ５０２と
からなる。As shown in FIG. 5, the link creation step 107 consists of a key phrase type link creation sub-step 501 and a keyword type link creation sub-step 502.

【００６４】キーフレーズタイプリンク作成サブステッ
プ５０１において、キーフレーズタイプのアンカー、す
なわち、重要箇所認識済みアンカーテーブルの「述語
部」の下位項目の「語幹」２３０９が「−」でないアン
カーについてリンクを作成する。キーフレーズタイプの
アンカーの各々を始点アンカーとして仮定し、名詞部、
述語部のそれぞれの語幹が一致するアンカーをサーチ
し、そのアンカーが存在するノードを終点とするリンク
を作成し、リンクテーブルに記録する。その詳細を図６
に示す。In the key phrase type link creation sub-step 501, links are created for anchors of the key phrase type, that is, anchors whose "stem" 2309 under "predicate part" in the important point recognized anchor table is not "-". Each key phrase type anchor is assumed in turn to be a starting point anchor, an anchor whose noun part stem and predicate part stem both match is searched for, and a link whose end point is the node containing that anchor is created and recorded in the link table. The details are shown in FIG. 6.

【００６５】始点アンカー仮定ステップ６０１におい
て、始点のアンカーとして重要箇所認識済みアンカーテ
ーブルの行の一つを、ＳＡとする。次の重要フラグ判定
ステップ６０２において、ＳＡが表すアンカーの重要フ
ラグを調べる。重要フラグが１である場合は、そのアン
カーを始点とするリンクは作成しない。At the starting point anchor assumption step 601, one of the rows in the important point recognized anchor table as the starting point anchor is set to SA. In the next important flag determination step 602, the important flag of the anchor represented by SA is checked. If the important flag is 1, no link starting from that anchor is created.

【００６６】重要フラグが０である場合は、参照先アン
カーのサーチステップ６０３において、ＳＡの参照先と
するアンカーを重要箇所認識済みアンカーテーブルから
サーチする。ここで、参照先アンカーの候補をＴＡとす
ると、次に示す条件（１）〜（４）に適合するＴＡを参
照先アンカーとする。If the important flag is 0, in step 603 of searching for a reference destination anchor, the anchor referred to by the SA is searched from the important point recognized anchor table. Here, if the candidate of the reference destination anchor is TA, the TA that meets the following conditions (1) to (4) is set as the reference destination anchor.

【００６７】（１）ＳＡとＴＡで名詞部語幹および述
語部語幹が一致する。(1) The noun part stem and the predicate part stem of SA match those of TA.

【００６８】（２）ＴＡの重要フラグ＝１である。(2) TA important flag = 1.

【００６９】（３）ＳＡがあるノードとＴＡがあるノ
ードのノード目的タイプの組合せが許可されている。(3) A combination of node purpose types of a node with SA and a node with TA is permitted.

【００７０】（４）ＳＡがあるノードとＴＡがあるノ
ードは異なり、かつ、直系の上下関係にはない。(4) The node containing SA and the node containing TA are different and are not in a direct ancestor-descendant relationship.

【００７１】条件（１）において「名詞部語幹」とは、
重要箇所認識済みアンカーテーブルにおける「名詞部
Ｎ」の「語幹」２３０６を表し、「述語部語幹」は「述
語部Ｐ」の「語幹」２３０９を表す。同様に条件（２）
における「重要フラグ」とは「重要フラグ」２３１０を
表す。条件（３）における「ノード目的タイプ」とは
「目的タイプ」２３０３を表す。このノード目的タイプ
の組合せの一例を、図２５に示す。「１」２５０１は、
リンク作成を許可することを表し、「０」２５０２は、
リンク作成を許可しないことを表す。ノード目的タイプ
の組合せを限定することで、不要なリンクの作成を抑制
する。条件（４）において、ノードが直系の上下関係の
有無を調べることは、図１５に示したノードの上下関係
を調べることであり、「＃」２３０２の値をキーにし
て、ノードテーブル図１４の「下位ノード＃リスト」１
４０７を参照することによって判定する。In condition (1), the "noun part stem" is "stem" 2306 under "noun part N" in the important point recognized anchor table, and the "predicate part stem" is "stem" 2309 under "predicate part P". Likewise, the "important flag" in condition (2) is "important flag" 2310, and the "node purpose type" in condition (3) is "purpose type" 2303. FIG. 25 shows an example of the combinations of node purpose types. "1" 2501 indicates that link creation is permitted, and "0" 2502 indicates that it is not. Limiting the combinations of node purpose types suppresses the creation of unnecessary links. In condition (4), checking whether the nodes are in a direct ancestor-descendant relationship means checking the node hierarchy shown in FIG. 15; this is determined by using the value of "#" 2302 as a key to refer to the "lower node # list" 1407 of the node table in FIG. 14.
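Conditions (1) to (4) can be combined into a single predicate, sketched below in Python. The dictionary keys, the purpose type names used with it, and the representation of the permission matrix of FIG. 25 as a dictionary keyed by a pair of purpose types are hypothetical; `children` stands in for the "lower node # list" of the node table.

```python
def is_ancestor(children, upper, lower):
    """True if `lower` can be reached from `upper` through lower-node lists."""
    stack = list(children.get(upper, []))
    seen = set()
    while stack:
        node = stack.pop()
        if node == lower:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(children.get(node, []))
    return False


def is_reference_target(sa, ta, allowed, children):
    """Check conditions (1) to (4) for a candidate reference destination anchor TA."""
    same_stems = (sa["noun_stem"] == ta["noun_stem"]
                  and sa["pred_stem"] == ta["pred_stem"])              # (1)
    important = ta["important"] == 1                                   # (2)
    permitted = allowed.get((sa["purpose"], ta["purpose"]), 0) == 1    # (3)
    unrelated = (sa["node"] != ta["node"]
                 and not is_ancestor(children, sa["node"], ta["node"])
                 and not is_ancestor(children, ta["node"], sa["node"]))  # (4)
    return same_stems and important and permitted and unrelated
```

For the keyword type search of FIG. 8, the same sketch would compare only the noun part stems in condition (1).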

【００７２】サーチが失敗した場合には、サーチ結果判
定ステップ６０４を経て現在のＳＡを始点とするリンク
作成処理を終了する。If the search is unsuccessful, the link creation process starting from the current SA as the starting point is terminated through the search result determination step 604.

【００７３】サーチが成功した場合は、始点アンカーの
表示範囲設定ステップ６０５において、始点アンカーの
表示範囲ＳＡＤをＳＡの名詞部から述語部までの連続す
る範囲とする。すなわち、重要箇所認識済みアンカーテ
ーブルの名詞部Ｎの「開始」２３０４、「終了」２３０
５、述語部Ｐの「開始」２３０７、「終了」２３０８の
内の最小の値から最大の値の間の範囲をＳＡＤとする。If the search succeeds, the display range SAD of the starting point anchor is set, in the starting point anchor display range setting step 605, to the continuous range from the noun part to the predicate part of SA. That is, SAD is the range from the minimum to the maximum of "start" 2304 and "end" 2305 of the noun part N and "start" 2307 and "end" 2308 of the predicate part P in the important point recognized anchor table.
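The computation of SAD amounts to taking the minimum and maximum of four positions. A minimal Python sketch, with a hypothetical function name and (start, end) tuples standing in for the table columns:

```python
def start_anchor_display_range(noun, pred):
    """Return SAD as (start, end): the smallest and largest of the noun part
    and predicate part positions, so the order of the two parts in the text
    does not matter."""
    positions = [noun[0], noun[1], pred[0], pred[1]]
    return min(positions), max(positions)
```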

【００７４】終点ノード設定ステップ６０６において、
終点ノードＴＮを参照先アンカーＴＡが存在するノード
とする。In the end point node setting step 606, the end point node TN is set to the node in which the reference destination anchor TA exists.

【００７５】そして、キーフレーズタイプのリンク登録
ステップ６０７において、始点アンカーの表示範囲ＳＡ
Ｄを調整してから、ＳＡＤを始点としＴＮを終点とする
リンクをリンクテーブルに登録する。キーフレーズタイ
プのリンク登録ステップ６０７の詳細を図７に示す。Then, in the key phrase type link registration step 607, the display range SAD of the starting point anchor is adjusted, and a link with SAD as its starting point and TN as its end point is registered in the link table. FIG. 7 shows the details of the key phrase type link registration step 607.

【００７６】リンクテーブルサーチステップ７０１にお
いて、始点の表示範囲がＳＡＤと重複するリンクをリン
クテーブルからサーチする。In the link table search step 701, a link whose starting point display range overlaps with SAD is searched from the link table.

【００７７】サーチが失敗した場合は、始点アンカーの
表示範囲の調整は不要であり、サーチ結果判定ステップ
７０２を経てリンク登録ステップ７０３において、ＳＡ
ＤとＴＮのリンクを表す情報として、始点アンカー情
報、終点ノード情報、および語幹をリンクテーブルに登
録する。If the search fails, the display range of the starting point anchor needs no adjustment. Processing passes through the search result determination step 702 to the link registration step 703, where the starting point anchor information, the end point node information, and the stems are registered in the link table as the information representing the link between SAD and TN.

【００７８】リンクテーブルサーチステップ７０１にお
けるサーチが成功した場合は、始点アンカーの表示範囲
の調整を行う。When the search in the link table search step 701 is successful, the display range of the starting point anchor is adjusted.

【００７９】サーチ結果のアンカー設定ステップ７０４
において、ＳＡ１をリンクテーブルサーチステップ７０
１におけるサーチ結果のリンクの始点アンカーとし、サ
ーチ結果のノード設定ステップ７０５において、ＴＮ１
をサーチ結果のリンクの終点ノードとし、ノード判定ス
テップ７０６において、ＴＮとＴＮ１とを比較する。In the search result anchor setting step 704, SA1 is set to the starting point anchor of the link found in the link table search step 701. In the search result node setting step 705, TN1 is set to the end point node of that link. In the node determination step 706, TN and TN1 are compared.

【００８０】ＴＮとＴＮ１とが一致しない場合は、始点
アンカーの名詞部判定ステップ７０７においてＳＡの名
詞部とＳＡ１の名詞部を比較する。名詞部とはリンクテ
ーブル２おける「名詞部語幹Ｎ」２４１０である。If TN and TN1 do not match, the noun part of SA and the noun part of SA1 are compared in the starting point anchor noun part determination step 707. Here the noun part means "noun part stem N" 2410 in the link table.

【００８１】名詞部が異なる場合は、アンカー表示範囲
設定ステップ７０８において、ＳＡの表示範囲ＳＡＤを
ＳＡの名詞部の範囲に設定しなおす。続くアンカー表示
範囲変更ステップ７０９において、ＳＡ１のアンカーの
表示範囲を名詞部の範囲に変更する。ここの変更にあた
っては、リンクテーブルの始点アンカーの番号２４０２
をキーにして重要箇所認識済みアンカーテーブルを参照
し、名詞部の範囲を得る。If the noun parts differ, the display range SAD of SA is reset to the range of the noun part of SA in the anchor display range setting step 708. In the subsequent anchor display range changing step 709, the display range of the SA1 anchor is changed to the range of its noun part. For this change, the starting point anchor number 2402 of the link table is used as a key to refer to the important point recognized anchor table and obtain the range of the noun part.

【００８２】この変更は、例えば、「データ、テーブル
を変更する」というテキストから（データ、変更）、
（テーブル、変更）という２個のキーフレーズを抽出
し、それぞれのキーフレーズが始点アンカーとなる場合
に、動詞部が共通するためにキーフレーズの範囲が重複
するので、アンカーの表示範囲としては名詞部、つま
り、「データ」、「テーブル」の部分とするものであ
る。This change addresses the following case. From the text "データ、テーブルを変更する" ("change the data and the table"), two key phrases, (data, change) and (table, change), are extracted. When each key phrase becomes a starting point anchor, the key phrase ranges overlap because the verb part is shared, so the display range of each anchor is set to its noun part, that is, "data" and "table" respectively.

【００８３】そして、リンク登録ステップ７０３におい
て、ＳＡＤとＴＮのリンクを表す情報として、始点アン
カー情報、終点ノード情報、および語幹をリンクテーブ
ルに登録する。Then, in the link registration step 703, the start point anchor information, the end point node information, and the stem are registered in the link table as the information representing the link between the SAD and the TN.

【００８４】始点アンカーの名詞部判定ステップ７０７
においてＳＡの名詞部とＳＡ１の名詞部が等しい場合
は、アンカー表示範囲設定ステップ７１０において、Ｓ
Ａの表示範囲ＳＡＤをＳＡの動詞部の範囲に設定しなお
す。続くアンカー表示範囲変更ステップ７１１におい
て、ＳＡ１のアンカーの表示範囲を動詞部の範囲に変更
する。そして、リンク登録ステップ７０３において、Ｓ
ＡＤとＴＮのリンクを表す情報として、始点アンカー情
報、終点ノード情報、および語幹をリンクテーブルに登
録する。If the noun part of SA and the noun part of SA1 are equal in the starting point anchor noun part determination step 707, the display range SAD of SA is reset to the range of the verb part of SA in the anchor display range setting step 710. In the subsequent anchor display range changing step 711, the display range of the SA1 anchor is changed to the range of its verb part. Then, in the link registration step 703, the starting point anchor information, the end point node information, and the stems are registered in the link table as the information representing the link between SAD and TN.

【００８５】ここのステップ７１０、ステップ７１１に
おける表示範囲の変更処理は、ステップ７０８、ステッ
プ７０９においてアンカーの表示範囲を「名詞部」とし
たことに対して「動詞部」とするものである。これは、
例えば、「データの登録、削除」というテキストから
（データ、登録）、（データ、削除）という２個のキー
フレーズを抽出しそれぞれのキーフレーズが始点アンカ
ーとなる場合に、名詞部が共通するためにキーフレーズ
の範囲が重複するので、アンカーの表示範囲としては述
語部、つまり、「登録」、「削除」の部分とするもので
ある。The display range changes in steps 710 and 711 set the display range of the anchor to the "verb part", whereas steps 708 and 709 set it to the "noun part". This addresses the following case. From the text "データの登録、削除" ("registration and deletion of data"), two key phrases, (data, registration) and (data, deletion), are extracted. When each key phrase becomes a starting point anchor, the key phrase ranges overlap because the noun part is shared, so the display range of each anchor is set to its predicate part, that is, "registration" and "deletion" respectively.

【００８６】ノード判定ステップ７０６において、ＴＮ
とＴＮ１とが一致する場合は、リンクを新たに登録せ
ず、サーチ結果のリンクの始点アンカーＳＡ１の表示範
囲をＳＡ１の表示範囲とＳＡＤの表示範囲の和とする。If TN and TN1 match in the node determination step 706, no new link is registered; instead, the display range of the starting point anchor SA1 of the link found by the search is set to the union of the display range of SA1 and SAD.
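The registration flow of FIG. 7 can be sketched in Python as follows. This is a simplified illustration, not the patent's implementation: the dictionary layout is hypothetical, only the first overlapping link is considered, ranges are treated as inclusive, and the "sum" of two display ranges is approximated by the single range that spans both.

```python
def overlaps(a, b):
    """True if two inclusive (start, end) ranges share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def register_keyphrase_link(links, sa, tn):
    """Register a link from anchor `sa` to end node `tn`, adjusting display ranges.

    sa: dict with 'noun' and 'pred' (start, end) tuples and 'noun_stem'.
    links: list of dicts with 'anchor', 'display', and 'node'; modified in place.
    """
    bounds = sa["noun"] + sa["pred"]
    sad = (min(bounds), max(bounds))                     # step 605
    hit = next((l for l in links if overlaps(l["display"], sad)), None)  # step 701
    if hit is None:                                      # no overlap: register as is
        links.append({"anchor": sa, "display": sad, "node": tn})
        return
    if hit["node"] == tn:                                # same end node: widen the range
        hit["display"] = (min(hit["display"][0], sad[0]),
                          max(hit["display"][1], sad[1]))
        return
    # Different end nodes: display only the part that tells the anchors apart.
    part = "noun" if hit["anchor"]["noun_stem"] != sa["noun_stem"] else "pred"
    hit["display"] = hit["anchor"][part]                 # step 709 or 711
    links.append({"anchor": sa, "display": sa[part], "node": tn})  # 708 or 710, then 703
```

With two anchors that share a verb part but lead to different nodes, each link ends up displaying only its noun part; with a shared noun part, each displays only its predicate part.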

【００８７】以上、図７の処理を重要箇所認識済みアン
カーテーブルの各キーフレーズタイプのアンカーを始点
アンカーＳＡに仮定しながら繰り返す。この結果、リン
ク２４１２、リンク２４１３、リンク２４１４を作成す
る。ただし、図２４において、リンク２４１２、リンク
２４１３の始点アンカーの開始位置２４０５、終了位置
２４０６の値は、上記の説明した処理の結果の値ではな
く、この後のキーワードタイプリンク作成サブステップ
５０２によって変更された結果を示している。以上でキ
ーフレーズタイプリンク作成サブステップ５０１の説明
を終わる。As described above, the processing of FIG. 7 is repeated while assuming that the anchor of each key phrase type of the anchor table in which the important point has been recognized is the starting point anchor SA. As a result, a link 2412, a link 2413, and a link 2414 are created. However, in FIG. 24, the values of the start position 2405 and the end position 2406 of the start anchor of the link 2412 and the link 2413 are not the values obtained as a result of the above-described processing, but are changed by the subsequent keyword type link creation substep 502. The results are shown. This is the end of the description of the key phrase type link creation sub-step 501.

【００８８】次に、キーワードタイプリンク作成サブス
テップ５０２において、重要箇所認識済みアンカーテー
ブルの各アンカーについてリンクを作成する。「述語
部」の下位項目の「語幹」２３０９が「−」であるか否
かは問わず、アンカーの各々を始点アンカーとして仮定
し、名詞部の語幹が一致するアンカーをサーチし、その
アンカーが存在するノードを終点とするリンクを作成す
る。その詳細を図８に示す。Next, in the keyword type link creation sub-step 502, a link is created for each anchor in the important point recognized anchor table. Regardless of whether "stem" 2309 under "predicate part" is "-", each anchor is assumed in turn to be a starting point anchor, an anchor whose noun part stem matches is searched for, and a link whose end point is the node containing that anchor is created. The details are shown in FIG. 8.

【００８９】図８のフローは、図６のフローとほぼ同様
である。図６と図８の相違は、図６はキーフレーズとキ
ーワードの相違による部分だけであり、処理の考え方は
同じである。つまり、図８のフローにおいて、終点ノー
ド設定ステップ８０６までの処理は、次のとおりであ
る。始点アンカーＳＡと名詞部が一致して、かつ、その
他の条件を満たす参照先のアンカーをサーチして、その
参照先アンカーのあるノードをＴＮとする。The flow of FIG. 8 is almost the same as the flow of FIG. 6. The two differ only where key phrases and keywords differ, and the processing concept is the same. In the flow of FIG. 8, the processing up to the end point node setting step 806 is as follows: a reference destination anchor whose noun part matches that of the starting point anchor SA and which satisfies the other conditions is searched for, and the node containing that reference destination anchor is set as TN.

【００９０】図６と図８の相違点は、次のとおりであ
る。図６の参照先アンカーのサーチステップ６０３では
条件（１）で名詞部と述語部の一致を調べることに対し
て、図８の参照先アンカーのサーチステップ８０３にお
いては名詞部の一致を調べる。また、図６の始点アンカ
ーの表示範囲設定ステップ６０５において始点アンカー
の表示範囲ＳＡＤをＳＡの名詞部から述語部までの連続
する範囲とすることに対して、図８の始点アンカーの表
示範囲設定ステップ８０５において始点アンカーの表示
範囲ＳＡＤをＳＡの名詞部の範囲とする。また、図６で
はリンクの登録をキーフレーズタイプのリンク登録ステ
ップ６０７で行うことに対して、図８では、キーワード
タイプのリンク登録ステップ８０７で行う。The differences between FIG. 6 and FIG. 8 are as follows. The reference destination anchor search step 603 of FIG. 6 checks under condition (1) that both the noun part and the predicate part match, whereas the reference destination anchor search step 803 of FIG. 8 checks only that the noun part matches. The starting point anchor display range setting step 605 of FIG. 6 sets the display range SAD to the continuous range from the noun part to the predicate part of SA, whereas the starting point anchor display range setting step 805 of FIG. 8 sets SAD to the range of the noun part of SA. Finally, links are registered in the key phrase type link registration step 607 in FIG. 6 and in the keyword type link registration step 807 in FIG. 8.

【００９１】キーワードタイプのリンク登録ステップ８
０７の詳細を図９に示す。FIG. 9 shows the details of the keyword type link registration step 807.

【００９２】リンクテーブルサーチステップ９０１、サ
ーチ結果ステップ９０２、リンク登録ステップ９０３の
処理は、図７のキーフレーズタイプのリンク登録処理に
おけるリンクテーブルサーチステップ７０１、サーチ結
果ステップ７０２、リンク登録ステップ７０３と同様で
ある。つまり、始点アンカーの表示範囲が重複するリン
クが他に無い場合は、そのまま、ＳＡＤとＴＮが表すリ
ンクをリンクテーブルに登録する。ただし、ＳＡがキー
フレーズタイプのアンカーであっても、リンクテーブル
の述語部は、値が無いことを表す「−」とする。The link table search step 901, the search result step 902, and the link registration step 903 are the same as the link table search step 701, the search result step 702, and the link registration step 703 in the key phrase type link registration processing of FIG. 7. That is, if no other link has a starting point anchor display range that overlaps, the link represented by SAD and TN is registered in the link table as it is. However, even if SA is a key phrase type anchor, the predicate part in the link table is set to "-", which indicates that there is no value.

【００９３】サーチが成功した場合は、サーチ結果のア
ンカー設定ステップ９０４を経て、サーチ結果のノード
設定ステップ９０５に進む。ノード判定ステップ９０
６、ノード上下関係判定ステップ９０７、アンカー表示
範囲変更ステップ９０８、リンク登録ステップ９０３と
進む流れは、キーワードタイプリンクを作成すると共
に、ＳＡと始点アンカーの表示範囲が重複するリンクの
表示範囲を変更する処理である。ＳＡと始点アンカーの
表示範囲が重複するリンクとはキーフレーズタイプのリ
ンクであり、そのキーフレーズタイプのリンクの始点ア
ンカーの表示範囲を述語部の範囲に変更する。例えば、
「データを変更する」というテキストから（データ、変
更）というキーフレーズを抽出するが、「データを変更
する」を始点アンカーの表示範囲とするキーフレーズタ
イプリンクと「データ」を始点アンカーの表示範囲とす
るキーワードタイプのリンクを作成するとき、キーフレ
ーズタイプリンクの始点アンカーの表示範囲を述語部
「変更する」の範囲に変更する。If the search succeeds, processing passes through the search result anchor setting step 904 to the search result node setting step 905. The flow that continues through the node determination step 906, the node hierarchical relationship determination step 907, the anchor display range changing step 908, and the link registration step 903 creates a keyword type link and, at the same time, changes the display range of the link whose starting point anchor display range overlaps that of SA. Such an overlapping link is a key phrase type link, and the display range of its starting point anchor is changed to the range of the predicate part. For example, the key phrase (data, change) is extracted from the text "データを変更する" ("change the data"). When a key phrase type link whose starting point anchor display range is "データを変更する" and a keyword type link whose starting point anchor display range is "データ" ("data") are both created, the display range of the starting point anchor of the key phrase type link is changed to the range of the predicate part "変更する" ("change").

【００９４】この処理によって、リンク２４１５、リン
ク２４１６を作成し、リンク２４１２、リンク２４１３
の始点アンカーの表示範囲を変更する。図２４には、変
更した結果の値を示す。This processing creates links 2415 and 2416 and changes the display ranges of the starting point anchors of links 2412 and 2413. FIG. 24 shows the values after the change.

【００９５】ノード上下関係判定ステップ９０７で上下
関係が無いとした場合は、ＳＡＤとＴＮは内容的な関係
が無いものとして、リンクを作成しない。If the node hierarchical relationship determination step 907 finds no hierarchical relationship, SAD and TN are regarded as having no content relationship, and no link is created.

【００９６】ノード判定ステップ９０６でＴＮとサーチ
結果のノードとが一致する場合は、リンクを新たに登録
せず、サーチ結果のリンクの始点アンカーＳＡ１の表示
範囲をＳＡ１の表示範囲とＳＡＤの表示範囲の和とす
る。If TN matches the node of the search result in the node determination step 906, no new link is registered; instead, the display range of the starting point anchor SA1 of the link found by the search is set to the union of the display range of SA1 and SAD.

【００９７】以上で、リンク作成ステップ１０７の説明
を終わる。This is the end of the description of the link creation step 107.

【００９８】上記の処理によって図２４に示すリンクを
作成し、したがって図１３に示すハイパーテキストを作
成する。図２４のリンク２４１５は図１３のリンク１３
０３に対応し、リンク２４１２はリンク１３０４に対応
し、リンク２４１４はリンク１３０５に対応し、リンク
２４１６はリンク１３０６に対応し、リンク２４１３は
リンク１３０７に対応する。The above processing creates the links shown in FIG. 24 and hence the hypertext shown in FIG. 13. Link 2415 in FIG. 24 corresponds to link 1303 in FIG. 13, link 2412 to link 1304, link 2414 to link 1305, link 2416 to link 1306, and link 2413 to link 1307.

【００９９】次に、ハイパーテキスト表示ステップ１０
８において、ハイパーテキストのノードの一つを表示す
る。その詳細を図１０に示す。Next, in the hypertext display step 108, one of the nodes of the hypertext is displayed. The details are shown in FIG. 10.

【０１００】レイアウトステップ１００１においてハイ
パーテキストの表示対象のノードのテキストをレイアウ
トする。このとき、文書に含まれる文書構成要素情報や
レイアウト情報を削除すると同時に、レイアウト前後の
文字の対応表を作成する。対応表を図２６に示す。「レ
イアウト前」２６０１、「レイアウト後」２６０２は、
レイアウト前の文字の位置、レイアウト後の文字の位置
をそれぞれノードの先頭からの文字数で表した値であ
る。In the layout step 1001, the text of the hypertext node to be displayed is laid out. At this time, the document component information and layout information contained in the document are deleted and, at the same time, a correspondence table between character positions before and after layout is created. FIG. 26 shows the correspondence table. "Before layout" 2601 and "after layout" 2602 are the position of a character before layout and its position after layout, each expressed as the number of characters from the beginning of the node.

【０１０１】アンカー表示位置再計算ステップ１００２
において、図２６に示す対応表を参照してアンカーの表
示位置を再計算する。このとき、リンクテーブルにおけ
る始点アンカーの表示範囲のデータは、ノードの先頭か
らの文字数によって表しているので、リンクテーブルに
おける表示範囲の値、つまり「開始」２４０５および
「終了」２４０６の値とをそれぞれ対応表の「レイアウ
ト前」２６０１の値と対応付け、それと対応した「レイ
アウト後」２６０２の値を取り出すことでアンカー表示
位置の再計算を行うことができる。In the anchor display position recalculation step 1002, the display position of each anchor is recalculated with reference to the correspondence table shown in FIG. 26. The display range of a starting point anchor in the link table is expressed as the number of characters from the beginning of the node. The recalculation can therefore be done by matching the display range values in the link table, that is, the values of "start" 2405 and "end" 2406, against the "before layout" 2601 values of the correspondence table and taking the corresponding "after layout" 2602 values.
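The recalculation is a table lookup. In the Python sketch below, the correspondence table is assumed to be a list of (before layout, after layout) position pairs that contains an entry for every anchor boundary; the function name is hypothetical.

```python
def recalc_anchor_range(correspondence, start, end):
    """Map an anchor's (start, end), given as pre-layout positions within the
    node, to post-layout positions using the correspondence table."""
    after_layout = dict(correspondence)  # before-layout position -> after-layout position
    return after_layout[start], after_layout[end]
```

If positions 2 to 4 before layout held markup that was deleted, a character at position 5 before layout moves to position 2 afterwards, and an anchor stored as (5, 7) is displayed at (2, 4).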

【０１０２】次に、ノードテキスト表示ステップ１００
３においてレイアウト結果を表示しアンカーの範囲を反
転などで強調して表示する。Next, in the node text display step 1003, the layout result is displayed, and the anchor ranges are highlighted, for example by inversion.

【０１０３】以上で本実施例の説明を終わる。This is the end of the description of the present embodiment.

【０１０４】上記の説明では、図７および図９のリンク
の登録の際に、始点アンカーの表示範囲が重複する場合
はその重複を解消するとした。ここで、その変形例とし
て、リンクの登録の際の始点アンカーの表示範囲の重複
を解消せずにそのまま登録するものとし、ハイパーテキ
スト表示ステップにおいて対処する方法を説明する。In the above description, overlaps between the display ranges of starting point anchors are eliminated when links are registered in FIG. 7 and FIG. 9. As a modification, the following describes a method in which links are registered as they are, without eliminating the overlap of the starting point anchor display ranges, and the overlap is handled in the hypertext display step.

【０１０５】ノードテキスト表示ステップ１００３にお
いて、アンカーの範囲を反転などで強調して表示する際
に、範囲が重複するアンカーについてはその内の１個だ
けを表示する。そして、表示後に、マウスなどを介した
指示にしたがってアンカーの表所切り替える。In the node text display step 1003, when the anchor ranges are highlighted, for example by inversion, only one of the anchors whose ranges overlap is displayed. After display, the displayed anchor is switched according to an instruction given through a mouse or the like.

【０１０６】図１１は、アンカーの表示部分をマウスで
選択してマウスボタンが押された場合に、アンカーの表
所切り替える処理を示す。図１１において、アンカー表
示箇所のマウス入力ステップ１１０１、マウス入力判定
ステップ１１０２、アンカー表示取消ステップ１１０
３、リンクテーブルサーチステップ１１０４、サーチ結
果判定ステップ１１０５、重複アンカー表示取消ステッ
プ１１０６、アンカー表示ステップ１１０７と進む流れ
は、マウス入力の箇所を始点アンカーの範囲として含む
リンクについて、アンカーの表示を切り替える処理であ
る。この処理によって、例えば、図２７に示す表示にお
けるアンカー２７０１の表示と、図２８に示すアンカー
２７０３をマウス入力にしたがって切り替える。矢印２
７０２はマウスポインタを表す。FIG. 11 shows the processing that switches the displayed anchor when the display portion of an anchor is selected with the mouse and the mouse button is pressed. In FIG. 11, the flow through the anchor display portion mouse input step 1101, the mouse input determination step 1102, the anchor display cancellation step 1103, the link table search step 1104, the search result determination step 1105, the overlapping anchor display cancellation step 1106, and the anchor display step 1107 switches the anchor display among the links whose starting point anchor range includes the mouse input position. By this processing, for example, the display of anchor 2701 in FIG. 27 and the display of anchor 2703 in FIG. 28 are switched according to mouse input. Arrow 2702 represents the mouse pointer.

【０１０７】The anchor redisplay step 1108 handles the case where no other link includes the mouse input location in its start-point anchor range. The processing step 1109 in response to mouse input performs processing other than switching the anchor display, for example following a link.
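The switching flow of FIG. 11 can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Link` record, the function name `switch_anchor`, the inclusive offsets, and the choice to cycle through overlapping anchors in link table order are hypothetical and are not specified by the description.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Link:
    link_id: int
    start: int  # first character of the start-point anchor (inclusive, assumed)
    end: int    # last character of the start-point anchor (inclusive, assumed)


def switch_anchor(link_table: List[Link], shown: Link, click_pos: int) -> Link:
    """Return the link whose anchor should be displayed after a click at click_pos."""
    # Link table search (cf. step 1104): links whose anchor range contains the click.
    candidates = [l for l in link_table if l.start <= click_pos <= l.end]
    if shown not in candidates:
        # The click is outside the displayed anchor: nothing to switch here.
        return shown
    if len(candidates) == 1:
        # No other overlapping link (cf. step 1108): redisplay the same anchor.
        return shown
    # Cancel the current display and show the next overlapping anchor
    # (cf. steps 1106 and 1107); cycling order is an assumption.
    idx = candidates.index(shown)
    return candidates[(idx + 1) % len(candidates)]


a = Link(1, 0, 5)
b = Link(2, 3, 8)
c = Link(3, 20, 25)
links = [a, b, c]

first = switch_anchor(links, a, 4)       # a and b overlap at offset 4, so b is shown
second = switch_anchor(links, first, 4)  # clicking again returns to a
```

Calling the function once per mouse click steps through every anchor that covers the clicked character, which is one way to give the user access to each overlapping link in turn.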

【０１０８】This concludes the description of the modification.

【０１０９】[0109]

[Effects of the Invention] By creating links based on matching stems, taken as the semantic content of phrases, a link can be created when part of the description in one node is related to another node. Using as key phrases pairs of a verb or sahen noun and a common noun or compound noun that expresses the content of a case element semantically connected to that verb or sahen noun reduces the number of links mistakenly created to unrelated locations. Restricting the combinations of description purpose types of the document components for the node containing a link's start point and the node at its end point reduces inappropriate links.
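As a rough illustration of restricting links by purpose type, the sketch below keeps only candidate links whose start-node and end-node purpose types form an allowed pair. The purpose type names, the node identifiers, and the function name are hypothetical; the actual types and allowed combinations are those of the node purpose type matrix, which is not reproduced here.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical allowed (start-node type, end-node type) pairs.
ALLOWED_COMBINATIONS: Set[Tuple[str, str]] = {
    ("overview", "procedure"),
    ("procedure", "definition"),
}


def filter_links_by_purpose(
    candidates: List[Tuple[str, str]],
    purpose_of: Dict[str, str],
    allowed: Set[Tuple[str, str]] = ALLOWED_COMBINATIONS,
) -> List[Tuple[str, str]]:
    """Keep only (start node, end node) pairs whose purpose types are an allowed pair."""
    return [
        (start, end)
        for start, end in candidates
        if (purpose_of[start], purpose_of[end]) in allowed
    ]


# Hypothetical nodes and candidate links found by phrase matching.
purpose_of = {"n1": "overview", "n2": "procedure", "n3": "definition"}
candidates = [("n1", "n2"), ("n2", "n3"), ("n3", "n1")]

kept = filter_links_by_purpose(candidates, purpose_of)
# kept == [("n1", "n2"), ("n2", "n3")]
```

Representing the matrix as a set of pairs keeps the check to a single membership test per candidate link.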

【０１１０】Setting the display range of a link's start-point anchor to the minimum continuous range that includes the words constituting the anchor, eliminating overlaps between anchor ranges, or switching the display among overlapping anchors provides a user interface in which the user can anticipate the content at the link's end point.
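The minimum continuous display range and the overlap check can be sketched as follows. The span representation (inclusive character offsets) and both function names are assumptions for illustration only.

```python
from typing import Iterable, Tuple

Span = Tuple[int, int]  # (first character, last character), inclusive offsets (assumed)


def anchor_display_range(word_spans: Iterable[Span]) -> Span:
    """Smallest continuous range that contains every word constituting the anchor."""
    spans = list(word_spans)
    if not spans:
        raise ValueError("an anchor needs at least one word")
    return min(s for s, _ in spans), max(e for _, e in spans)


def ranges_overlap(a: Span, b: Span) -> bool:
    """True when two inclusive ranges share at least one character."""
    return a[0] <= b[1] and b[0] <= a[1]


# A key phrase whose two words are separated by other text:
# a noun at offsets 4..7 and a verb at offsets 10..13.
display = anchor_display_range([(10, 13), (4, 7)])  # (4, 13)
```

A range computed this way also covers the text between the words, which is why two anchors built from interleaved words can overlap and need either resolution at registration or switching at display time.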

【０１１１】Furthermore, by determining the end-point node of a link based on the result of extracting, from the expression of each sentence, the important points that carry important content within a node, a node whose main subject is the content of the start-point anchor can be selected as the end-point node.
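The role of the important point flag in choosing end-point nodes can be sketched as below. This is a simplified illustration: phrases are compared by exact string equality, whereas the method compares notation, stems, or semantic concepts and also allows partial matches; skipping links within the same node is likewise an assumption, and the `Anchor` record and function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Anchor:
    node: str        # node in which the phrase occurs
    phrase: str      # extracted keyword or key phrase
    important: bool  # True if the phrase overlaps an important point of its node


def create_links(anchors: List[Anchor]) -> List[Tuple[str, str, str]]:
    """Return (start node, phrase, end node) triples.

    Unflagged phrases become start-point anchors; nodes holding a flagged
    phrase that matches become end points.
    """
    end_nodes: Dict[str, List[str]] = {}
    for a in anchors:
        if a.important:
            end_nodes.setdefault(a.phrase, []).append(a.node)

    links: List[Tuple[str, str, str]] = []
    for a in anchors:
        if a.important:
            continue
        for node in end_nodes.get(a.phrase, []):
            if node != a.node:  # assumption: no link from a node to itself
                links.append((a.node, a.phrase, node))
    return links


anchors = [
    Anchor("n1", "link table", important=False),
    Anchor("n2", "link table", important=True),
    Anchor("n3", "link table", important=False),
    Anchor("n3", "node table", important=True),
]
links = create_links(anchors)
# links == [("n1", "link table", "n2"), ("n3", "link table", "n2")]
```

Only n2 treats "link table" as important content, so every passing mention elsewhere points to n2 rather than to another passing mention.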

【０１１２】When hypertext is displayed, recalculating the anchor display positions based on the correspondence between characters before and after layout allows the same pre-layout link data to be used in common even when different layout rules are applied.

[Brief description of drawings]

FIG. 1 is a flowchart showing an embodiment of the method for converting text into hypertext according to the present invention.

FIG. 2 is a data flow diagram showing an embodiment of the method for converting text into hypertext according to the present invention.

FIG. 3 is a flowchart showing the contents of the anchor extraction sub-step in the method for converting text into hypertext.

FIG. 4 is a flowchart showing the contents of the important point extraction sub-step in the method for converting text into hypertext.

FIG. 5 is a flowchart showing the contents of the link creation sub-step in the method for converting text into hypertext.

FIG. 6 is a flowchart showing the contents of the key phrase type link creation step in the link creation sub-step.

FIG. 7 is a flowchart showing the contents of the key phrase type link registration step in the key phrase type link creation step.

FIG. 8 is a flowchart showing the contents of the keyword type link creation step in the link creation sub-step.

FIG. 9 is a flowchart showing the contents of the keyword type link registration step in the keyword type link creation step.

FIG. 10 is a flowchart showing the contents of the hypertext display sub-step in the method for converting text into hypertext.

FIG. 11 is a flowchart showing the anchor display switching process in the hypertext display sub-step.

FIG. 12 is a diagram showing an example of text input to the method for converting text into hypertext according to the present invention.

FIG. 13 is a diagram showing an example of hypertext created by the method for converting text into hypertext according to the present invention.

FIG. 14 is a diagram showing an example of a node table that records the result of creating nodes from text.

FIG. 15 is a diagram schematically showing the hierarchical relationship of the nodes in the node table.

FIG. 16 is a diagram showing an example of a sentence table that records the result of extracting the sentences in the text.

FIG. 17 is a diagram showing an example of a word table that records the result of dividing the sentences in the text into words.

FIG. 18 is a diagram showing an example of patterns for extracting anchors.

FIG. 19 is a diagram showing an example of an anchor table that records the extracted anchors.

FIG. 20 is a diagram showing an example of patterns for extracting important points.

FIG. 21 is a diagram showing an example of an important point table that records the extracted important points.

FIG. 22 is a diagram showing an example of the provisional important point and excluded point tables created in the process of extracting important points.

FIG. 23 is a diagram showing an example of the important-point-recognized anchor table obtained by attaching an important point flag to each anchor in the anchor table based on the contents of the important point table.

FIG. 24 is a diagram showing an example of a link table that records the created links.

FIG. 25 is a diagram showing the node purpose type matrix, which specifies the combinations of description purpose types of the start-point and end-point nodes of a link when the link is created.

FIG. 26 is a diagram showing an example of the correspondence table of character positions before and after layout, which is created in the layout sub-step of the hypertext display step.

FIG. 27 is a diagram showing an example of the display state of anchors.

[Explanation of symbols]

101 ... node table creation step, 102 ... sentence division step, 103 ... word division step, 104 ... anchor extraction step, 105 ... important point extraction step, 106 ... anchor marking step, 107 ... link creation step, 108 ... hypertext display step.

Claims

[Claims]

1. A method for converting text into hypertext, in which text with inserted data indicating the boundaries of document components such as chapters, sections, and paragraphs is input; nodes, which are semantic units of the text, are created based on that data; phrases are extracted from each sentence; important points having important content within a node are extracted; phrases that overlap an important point are given an important point flag; and a link indicating a relationship is created with a phrase that does not have the important point flag as the start-point anchor and, as the end point, a node having a phrase with the important point flag that matches all or part of that phrase, the method being characterized in that: keywords consisting of common nouns and compound nouns, and key phrases consisting of words in a semantic connection, are extracted as the phrases from each sentence in a node; the important points having important content within a node are extracted based on the expression of each sentence; and whether phrases match is checked by dividing the phrases into words and comparing the notation, stem, or semantic concept of each word.

2. The method for converting text into hypertext according to claim 1, characterized in that a key phrase is extracted by dividing a sentence into words, recognizing the parts of speech, and extracting as a key phrase a portion in which the parts of speech of the words and the arrangement of characters match a predetermined pattern.

3. The method for converting text into hypertext according to claim 1, characterized in that the words in a semantic connection that form a key phrase are a verb or sahen noun and a common noun or compound noun representing the content of a case element that is semantically connected to the verb or sahen noun.

4. A method for converting text into hypertext, characterized in that text into which, in addition to data indicating the boundaries of document components such as chapters, sections, and paragraphs, data indicating the description purpose type of each document component has been inserted is input, and, when a link indicating a relationship is created, the link is created only when the combination of the purpose types of the node containing the link's start point and the node at the link's end point is a predetermined combination.

5. The method for converting text into hypertext according to claim 1, characterized in that, when a link is created, the display range of the anchor is set to the minimum continuous range that includes the words constituting the start-point anchor.

6. A hypertext display method characterized in that, when a part within the range of an anchor is selected with a mouse or the like in order to follow a link, and two or more links include the selected part in the display range of their start-point anchors, the display ranges of those anchors are displayed one after another, emphasized by inversion or the like, each time a selection operation is performed with the mouse or the like.

7. The method for converting text into hypertext according to claim 1, characterized in that, when a link is created, the minimum continuous range that includes the words constituting the start-point anchor is taken as the provisional display range of the anchor, and, when there is a link whose start point is another anchor with an overlapping range, the display range of each anchor is set to the minimum continuous range that includes those words constituting the anchor that do not overlap.

8. A hypertext display method characterized in that, when text is laid out and displayed with the document component information and layout information contained in the text deleted, the characters of the text before layout are associated with the characters of the text after layout, and the anchor is displayed after the display range information of the link's start-point anchor, which is expressed in terms of character positions, is converted into display range information based on the character positions of the text after layout.