JPH10232881A

JPH10232881A - Information visualization method

Info

Publication number: JPH10232881A
Application number: JP9270250A
Authority: JP
Inventors: Michihiro Nagaishi; 道博長石
Original assignee: Seiko Epson Corp
Current assignee: Seiko Epson Corp
Priority date: 1996-12-20
Filing date: 1997-10-02
Publication date: 1998-09-02

Abstract

PROBLEM TO BE SOLVED: To extract a logical structure from any digitized document data, even data not in a special format for visualization, treat each logical structure as a unit, visualize the mutual relationships among the units, and, when one of the units is designated, read out and display the document content corresponding to that unit. SOLUTION: Digitized document data is converted into a markup language system (step s1), the content of the converted data is analyzed to extract the logical structure (step s2), and the mutual relationships within the logical structure are represented as a tree structure and displayed on a screen (step s3). When a user views the display and selects one of the displayed items, the details of that item are displayed (step s4).

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to an information visualization method that extracts logical structures such as a table of contents, chapters, and sections from digitized document data, displays the relationships among those logical structures as a visual map, and, when a unit displayed on the map is designated, displays the content of that unit.

[0002]

[Prior art] Several methods have been proposed for information retrieval, that is, for obtaining the content a user needs from document data containing a large amount of information.

[0003] For example, one method of visualizing and displaying the logical structure of document data takes the character strings of the logical structure (chapters, paragraphs, and so on) in an outline processor and displays them as a visual map in the form of a two-dimensional tree structure. The two views can be switched, so the logical structure can be grasped both through the ordinary character strings and visually through the map display.

[0004] There is also Apple's HotSauce, which renders directory-structured WWW bookmarks in three dimensions and takes the user to the corresponding web page when one of them is clicked.

[0005] Thus, the idea of visualizing the logical structure of document data and the idea of accessing a desired document based on the visualized result each exist.

[0006]

[Problem to be solved by the invention] However, in prior art of this kind, visualization of the logical structure applies only to document data created in a special format with visualization in mind (for example, MFC for HotSauce, or the character strings of an outline processor). In other words, the logical structure of a document created with no thought of visualization (for example, document data created with DTP software, or general documents such as books) cannot be automatically extracted and visualized. Put another way, document data can be visualized only if it was created from the start in a format that presupposes visualization; for any other document data, the visualization must be created manually, item by item.

[0007] As for the idea of accessing a desired document based on the result of visualizing a logical structure, Apple's HotSauce mentioned above gives access to individual web pages, so its visualization can be regarded as a means of making the pointers to various web pages easier to find. Visualization of the logical structure of document data by an outline processor, for its part, can be regarded as a means of making document content, which is a serial sequence of character strings, easier to grasp visually.

[0008] Thus, the prior art does not visualize document data that is not based on a specific format while also providing pointers that give access to the actual document content.

[0009] Accordingly, the prior art could not perform a series of processes such as extracting the logical structure of document data created by DTP or the like (for example, the instruction manual of an electronic device), visualizing items such as chapters and sections, and then displaying the detailed content of a part selected from the visualized display.

[0010] An object of the present invention is therefore to realize an information visualization method that can extract and visualize a logical structure from any digitized document data, even data not in a special format for visualization; in which the visualization not only makes the logical structure of the original document data easier to grasp but also takes a concrete form that makes the actual document content easy to access; and in which selecting and designating an item in the display causes the corresponding content to be displayed.

[0011]

[Means for solving the problem] The invention of claim 1 is characterized by extracting the logical structure of digitized information from that information, displaying a visualization of the extracted logical structure, and, when a unit is designated from the visualized display, visualizing and displaying the content corresponding to the designated unit.

[0012] The invention of claim 2 is characterized by extracting the logical structure of digitized information as units of information, displaying the mutual relationships among the extracted units as a visualized hierarchical structure, and, when a unit is designated from the visualized hierarchical structure, visualizing and displaying the content corresponding to the designated unit.

[0013] The invention of claim 3 is the invention of claim 1 or 2, in which the digitized information is converted into a markup language system and the logical structure is then extracted from the converted data as units of information.

[0014] The invention of claim 4 is the invention of claim 1, 2, or 3, in which information summarizing the content of the corresponding information is added when the mutual relationships among the extracted units of information are visualized.

[0015] The invention of claim 5 is the invention of claim 4, characterized in that the information summarizing the content is a summary of the text representing the content of the corresponding information, or keywords.

[0016] The invention of claim 6 is the invention of claim 4, characterized in that the information summarizing the content is an image representing the content of the corresponding information.

[0017] Thus, the present invention can extract a logical structure from any digitized document data, even data not in a special format for visualization, treat each logical structure as a unit, visualize the mutual relationships among the units, and, when one of the units is designated, read out and display the document content corresponding to that unit.

[0018] Converting the digitized document into a markup language system such as HTML before processing simplifies extraction of the logical structure and also makes the method more widely applicable and more general.

[0019] Furthermore, when the mutual relationships among the extracted units are visualized, adding information that summarizes the corresponding content, such as a summary of the text, frequently occurring keywords, or pictures and photographs that accurately represent the content, makes the visualization more concrete.

[0020]

[Embodiments of the invention] Embodiments of the present invention are described below with reference to the drawings.

[0021] FIG. 1 is a flowchart outlining the processing procedure of an embodiment of the present invention. First, the digitized document data is converted into a markup language system (step s1). The content of the converted data is analyzed to extract the logical structure (step s2), and the mutual relationships within the logical structure are represented as a tree structure and displayed on the screen (step s3). When the user views the display and selects one of the displayed items, the details of that item are displayed (step s4).

[0022] This processing is described in detail below with reference to FIG. 2.

[0023] In FIG. 2, the digitized document data handled here is not limited to any particular kind. Examples include a markup language document 1 (SGML: Standard Generalized Markup Language, HTML: HyperText Markup Language, TeX, and the like, in which the logical structure of text and images is already described), general document data 2 created with a word processor or the like, document data 3 created with DTP software, and a plain text document 4.

[0024] Of these, the markup language document 1 already has its logical structure described. The word processor document data 2 and the DTP document 3 store text and images with their own layout information; although the applications differ from device to device, the logical structure can still be extracted.

[0025] The text document 4 normally consists of text only (images may be specified separately), and its logical structure can be extracted from the document data using paragraphs and similar cues.
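
As a minimal sketch of the paragraph-based extraction mentioned above, the following Python splits plain text into paragraph units. It assumes that paragraphs are separated by blank lines, which the text does not specify; the function name `split_paragraphs` and the sample text are hypothetical.

```python
import re


def split_paragraphs(text: str) -> list[str]:
    """Split plain text into paragraph units (assumed to be separated by blank lines)."""
    blocks = re.split(r"\n\s*\n", text.strip())
    # Join the wrapped lines of each paragraph into a single line.
    return [
        " ".join(line.strip() for line in block.splitlines())
        for block in blocks
        if block.strip()
    ]


# Hypothetical sample document, not taken from the patent.
sample = (
    "Setup\n\n"
    "Place the printer on a flat surface.\nKeep it away from heat.\n\n"
    "Connect the power cord."
)
units = split_paragraphs(sample)
```

Each element of `units` could then serve as one unit of the logical structure; a real document would likely need further cues such as indentation or numbering.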

[0026] In addition, the result of applying character recognition, layout recognition, and document understanding to a document image obtained by reading an ordinary document 5 (for example, a printed document) with a scanner or the like can also be handled as digitized document data 5a.

[0027] From digitized document data such as the above, units such as the table of contents, chapters, and sections contained in each set of data are extracted.

[0028] For the word processor document data 2, the document data 3 created with DTP software, the plain text document data 4, the scanned document data 5a, and the like, the content of each may be analyzed directly to extract the units. Alternatively, the data may first be converted into a markup language system such as HTML, and the converted data analyzed to extract the logical structure as units.

[0029] For example, because the document data 3 created with DTP software has a defined layout structure, it is possible to decide which layouts correspond to chapters and sections, so conversion to HTML or the like is feasible. To recognize and extract the logical structure from the converted data, units such as chapters, sections, and phrases are extracted using tags, modifiers, morphological analysis, and the like.
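
The tag-based part of this extraction can be sketched in Python with the standard `html.parser` module. This is only an illustration under the assumption that chapters and sections were converted to HTML heading tags (`h1` to `h6`); the patent does not say which tags are used, and the class name `HeadingExtractor` and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser


class HeadingExtractor(HTMLParser):
    """Collect (level, title) pairs from h1..h6 tags (assumed to mark chapters and sections)."""

    def __init__(self):
        super().__init__()
        self.units = []      # extracted (level, title) pairs, in document order
        self._level = None   # level of the heading currently being read, if any
        self._text = []      # text fragments of the current heading

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.units.append((self._level, "".join(self._text).strip()))
            self._level = None


# Hypothetical converted document.
html = "<h1>Printer X</h1><h2>Usage</h2><p>...</p><h3>Installation</h3><p>Place it.</p>"
parser = HeadingExtractor()
parser.feed(html)
parser.close()
```

After parsing, `parser.units` holds the headings with their nesting levels, which is enough to rebuild the hierarchy. Extraction by modifiers or morphological analysis is not covered by this sketch.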

[0030] Once the logical structure has been divided into units and each unit extracted in this way, the document content corresponding to each unit is identified and cut out. If necessary, a summary of the document content corresponding to each unit is produced and frequently occurring words and phrases are extracted as keywords.
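
Frequency-based keyword extraction of this kind might look like the following Python sketch. It assumes English text tokenized with a simple regular expression and a small hand-written stop-word list; the original Japanese text would need morphological analysis instead, and `STOPWORDS`, `top_keywords`, and the sample content are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "it", "into", "until"}


def top_keywords(text: str, n: int = 3) -> list[str]:
    """Return up to n of the most frequent non-stop-words in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(n)]


# Hypothetical content of one unit.
content = (
    "Open the cover. Insert the ink cartridge into the holder. "
    "Press the cartridge until it clicks. Close the cover."
)
keywords = top_keywords(content, 2)
```

Here "cover" and "cartridge" each occur twice and every other word once, so they are the two words returned.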

[0031] Converting the digitized document into a markup language system such as HTML in this way makes the logical structure easy to extract and improves applicability and generality. In particular, with the recent spread of the web, markup language documents have become a de facto standard, so converting to HTML or the like as an intermediate step is often advantageous for subsequent processing.

[0032] A unit here means the following: when, for example, lines n1 through n2 of page p1 of a document are represented by the concept "A", that range is treated as one unit. The concept "A" therefore corresponds to lines n1 through n2 of page p1 of the document. Alternatively, a paragraph can be treated as a unit.
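
One way to picture such a unit is as a small record that ties a concept label to a page and a line range. The Python below is a sketch under that reading; the `Unit` class, its field names, and the sample page are hypothetical, and line numbers are assumed to be 1-based and inclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    concept: str     # label of the unit, such as "A"
    page: int        # page number (p1 in the text)
    first_line: int  # first line n1, 1-based and inclusive (assumed)
    last_line: int   # last line n2, 1-based and inclusive (assumed)


def content_of(unit: Unit, pages: dict[int, list[str]]) -> list[str]:
    """Cut out the lines of the document that the unit corresponds to."""
    lines = pages[unit.page]
    return lines[unit.first_line - 1 : unit.last_line]


# Hypothetical one-page document.
pages = {1: ["Title", "Open the cover.", "Insert the cartridge.", "Close the cover."]}
unit_a = Unit("A", page=1, first_line=2, last_line=3)
extracted = content_of(unit_a, pages)
```

With this mapping, designating the concept "A" is enough to locate and cut out lines 2 and 3 of page 1.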

[0033] When the document content has been analyzed in this way and a plurality of units (denoted unit U1, unit U2, and so on) extracted, the mutual relationships among the units are represented as a tree structure according to the content of the document: for example, unit U1 represents a higher-level concept, units U2 and U3 represent lower-level concepts belonging to U1, and units U4, U5, U6, and so on represent still lower-level concepts belonging to U2 and U3.

[0034] As described above, each of the units U1, U2, and so on corresponds to the document content of that unit as extracted from the document. For example, unit U2 corresponds to content p2 in the document, unit U3 to content p3, and unit U4 to content p4. Likewise, unit U5 corresponds to content p5, unit U6 to content p6, and unit U7 to content p7. Each piece of document content may be text, a picture, or both.
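
The tree of units and their links to document content can be sketched as a simple recursive data structure in Python. The text does not state which of U4 to U7 belong to U2 and which to U3, so the grouping below (U4 and U5 under U2, U6 and U7 under U3) is an assumption, as are the names `Node`, `render`, and `find`.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str                       # unit label, such as "U2"
    content: Optional[str] = None   # label of the linked document content, such as "p2"
    children: list["Node"] = field(default_factory=list)


def render(node: Node, depth: int = 0) -> list[str]:
    """Return the tree as indented lines, one unit per line."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


def find(node: Node, name: str) -> Optional[Node]:
    """Depth-first search for the unit with the given name."""
    if node.name == name:
        return node
    for child in node.children:
        hit = find(child, name)
        if hit is not None:
            return hit
    return None


# Assumed grouping of the lower-level units.
tree = Node("U1", children=[
    Node("U2", "p2", [Node("U4", "p4"), Node("U5", "p5")]),
    Node("U3", "p3", [Node("U6", "p6"), Node("U7", "p7")]),
])
outline = render(tree)
selected = find(tree, "U5")
```

Joining `outline` with newlines gives an indented text map of the hierarchy, and `selected.content` is the content label (`"p5"`) to display when U5 is designated.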

[0035] For example, content p2 is the portion p2 (text) cut out of the original document 10, content p5 is the portion p5 (a picture) cut out of the original document 10, and content p7 is the portion p7 (text and a picture) cut out of the original document 10.

[0036] Concretely, the visualized tree structure in FIG. 2 is displayed on the screen 20 with content such as that shown in FIG. 3.

[0037] This is the document data of a printer instruction manual. The "model name" is displayed at the top level as the major item, followed by items such as "performance" and "usage" for that model.

[0038] Under the "usage" item, items such as "installation", "printer driver", and "color printing" are displayed, and under each of these the sub-items belonging to it are displayed.

[0039] Taking the "installation" item as an example, its sub-items are each expressed in a short phrase, such as "precautions for the installation location", "paper-related settings", "installing the ink cartridge", "power connection", "operation check", and "connection to a computer".

[0040] Suppose the user now clicks "installing the ink cartridge" with a mouse or the like on the screen 20 displayed in this way. The screen 20 then shows the content on installing the ink cartridge, as in FIG. 4.

[0041] When the display of FIG. 4 is shown, clicking a display portion such as "Return to installation" at the edge of the screen 20 switches the screen to a display relating to "installation", as shown for example in FIG. 5. The upper edge of this screen 20 in turn shows "Return to usage", and clicking it brings up the display relating to "usage".
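
The select-and-return behavior described for FIGS. 3 to 5 can be modeled, very roughly, as movement along parent links in the hierarchy. The Python below is a sketch only: the `PARENT` table covers just a fragment of the printer manual example, and the `Viewer` class and its method names are hypothetical rather than part of the described method.

```python
# Parent links for a fragment of the printer manual hierarchy in the example.
PARENT = {
    "installing the ink cartridge": "installation",
    "installation": "usage",
    "usage": "model name",
}


class Viewer:
    """Tracks which item is currently displayed on the screen."""

    def __init__(self, start: str):
        self.current = start

    def select(self, item: str) -> str:
        # Clicking an item in the map switches the display to that item.
        self.current = item
        return self.current

    def go_back(self) -> str:
        # "Return to ..." moves to the parent item; the top item has no
        # parent in the table, so the display stays where it is.
        self.current = PARENT.get(self.current, self.current)
        return self.current


viewer = Viewer("usage")
viewer.select("installing the ink cartridge")
first_back = viewer.go_back()    # corresponds to "Return to installation"
second_back = viewer.go_back()   # corresponds to "Return to usage"
```

Keeping only a parent table is enough for the "Return to" links; showing the sub-items of the current item would additionally need the child lists of the tree.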

[0042] The display of FIG. 3 may be two-dimensional or three-dimensional with depth. When there are many levels, the visible levels may be limited, or the display may take the form of parent and child directories.

[0043] In the display of FIG. 3, a summary of the document content or an image may be added alongside the item labels. A summary may be a general summary of the document content, keywords that occur frequently in it, or simply its opening portion. An image may be a picture or photograph related to the document content. For example, a picture or photograph of an ink cartridge may be added to "installing the ink cartridge", and a picture or photograph of a power cord to "connection to a computer".

[0044] As described above, in this embodiment, document data created with DTP software or the like (for example, an instruction manual) is first converted into document data of a markup language system such as HTML, and its logical structure is extracted. Items such as chapters and sections are then displayed with their mutual relationships visualized as a tree structure, and when an item relating to the desired content is designated from the display, its detailed content is displayed. Thus, the present invention can extract and visualize a logical structure from any digitized document data, even data not in a special format for visualization, and the display of the content can be obtained immediately by selecting and designating an item in the display.

[0045] The embodiment described above is an example of a preferred embodiment of the present invention, but the invention is not limited to it, and various modifications are possible without departing from the gist of the invention.

[0046] The processing program that performs the processing of the present invention can be stored on a storage medium such as a floppy disk, an optical disk, or a hard disk, and the present invention includes such storage media. The data may also be obtained from a network.

[0047]

[Effects of the invention] As described above, the present invention can extract a logical structure from any digitized document data, even data not in a special format for visualization, and visualize the mutual relationships within the logical structure. The visualization not only makes the logical structure of the original document data easier to grasp but also takes a concrete form that makes the actual document content easy to access, and selecting and designating an item in the display reads out and displays the actual document content. Desired information within text data of enormous volume can therefore be accessed easily and retrieved immediately.

[Brief description of the drawings]

FIG. 1 is a flowchart outlining the processing of the present invention.

FIG. 2 is a diagram explaining the processing procedure of the embodiment of the present invention in line with the flowchart of FIG. 1.

FIG. 3 is a diagram illustrating a specific example of the visualized content in the embodiment of the present invention.

FIG. 4 is a diagram showing an example of the document content displayed when a given item is designated from the visualized content in the embodiment of the present invention.

FIG. 5 is a diagram showing an example of the display after the displayed content is switched from the document content display of FIG. 4.

[Explanation of symbols]

1: Markup language document data
2: General document data created with a word processor
3: General document data created with DTP
4: Plain text data
5: General document
5a: Document data obtained by digitizing a general document
U1, U2, ...: Units corresponding to the logical structure
p1, p2, ...: Document content

Claims

[Claims]

1. An information visualization method characterized by extracting the logical structure of digitized information from that information, displaying a visualization of the extracted logical structure, and, when a unit is designated from the visualized display, visualizing and displaying the content corresponding to the designated unit.

2. An information visualization method characterized by extracting the logical structure of digitized information as units of information, displaying the mutual relationships among the extracted units as a visualized hierarchical structure, and, when a unit is designated from the visualized hierarchical structure, visualizing and displaying the content corresponding to the designated unit.

3. The information visualization method according to claim 1 or 2, wherein the digitized information is converted into a markup language system and the logical structure is then extracted from the converted data as units of information.

4. The information visualization method according to claim 1, 2, or 3, wherein information summarizing the content of the corresponding information is added when the mutual relationships among the extracted units of information are visualized.

5. The information visualization method according to claim 4, wherein the information summarizing the content of the information is a text summary or a keyword representing the content of the corresponding information.

6. The information visualization method according to claim 4, wherein the information summarizing the content of the information is an image representing the content of the corresponding information.