JP2006215681A - Document detail determination support system

Info

Publication number: JP2006215681A
Application number: JP2005025911A
Authority: JP
Inventors: Shu Saito; 周斉藤; Yuuya Sonoyama; 裕弥園山
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2005-02-02
Filing date: 2005-02-02
Publication date: 2006-08-17

Abstract

PROBLEM TO BE SOLVED: To provide a document detail determination support system that allows a user to determine the content of a document easily and quickly.

SOLUTION: From the data of a document, the system extracts a plurality of words contained in the document. The system searches a table that associates words with images for the images that match the extracted words. The data of the retrieved images is obtained from an image database that stores image data. Using the image data obtained, an illustration of the document is created and output.

COPYRIGHT: (C) 2006, JPO & NCIPI

Description

The present invention relates to a document content grasping support system for helping a user grasp the contents of a document.

Techniques for supporting the grasp of document contents are disclosed in, for example, Patent Document 1 and Patent Document 2. Patent Document 1 discloses a device that displays information from documents such as newspapers, magazines, papers, and books on a display. The device lays out the text and graphic data contained in a document within the display area of the display. For this layout, document data such as text and graphics is distributed among a plurality of blocks; text data can be split and allocated to several blocks. The blocks are arranged in a given sequence of rectangular areas according to the importance of the information corresponding to each block. Blocks of high importance are placed first, and the remaining space is filled with blocks of low importance. By displaying the document information in the layout determined in this way, the device makes it easier for the user to grasp the document information.

Patent Document 2 describes an information search processing method. This method searches data that contains both character codes and images, such as the data of published patent gazettes. The character codes and images of the retrieved information are displayed on the same screen, and when the user designates a character on the screen corresponding to a character code, the display switches, on that same screen, to the image identified by that character. This saves the effort of cross-referencing the text and the figures in the retrieved information and makes the text easier to grasp.

JP-A-8-255255; JP-A-8-339380

As described above, the conventional techniques let a user readily infer the content of a text from cues such as images and the position of the important parts of the text within the whole document. However, the user must still read and comprehend the text to grasp what its important parts actually state and any matters that are not illustrated. If the text is hard to read or long, comprehension becomes difficult and grasping the contents of the document takes time.

This problem becomes more serious as the number of documents to be grasped increases. For example, when a user searches for document information on a computer system and cannot specify suitable search conditions, the search results contain a large number of documents. Selecting the necessary documents from among them requires grasping the contents of many documents. If grasping each document takes time, the selection work takes an enormous amount of time to finish. If the user instead tries to finish quickly without adequately grasping the contents, the risk of overlooking important information increases.

The present invention has been made in view of these problems in the conventional techniques. Its object is to provide a document content grasping support system, a search system, a file management system, a document content grasping support method, and a document content grasping support program that allow the contents of a document to be grasped easily and quickly.

To achieve this object, the document content grasping support system of the present invention extracts, from the data of a document, a plurality of words contained in that document. An image database stores image data. A table associates words with images. An image search unit searches the table for the images corresponding to each of the extracted words. An image data acquisition unit acquires the data of the retrieved images from the image database. An illustration creation unit creates an illustration of the document using the acquired image data. An illustration output unit outputs the created illustration.

In this document content grasping support system, words may be extracted according to their frequency of appearance in the document.

The system may further include a layout determination unit. The layout determination unit determines the layout of the acquired image data, and the illustration creation unit lays out the acquired image data according to that determination.

The table may also associate each word with the group to which the word belongs. In that case, the illustration creation unit creates an illustration for each group of extracted words.

Furthermore, the illustration creation unit may determine the relative sizes of the illustrations according to the frequency of appearance of each group of extracted words.

In the document content grasping support system, a target portion may be set on part or all of the document. In this case, the system extracts a plurality of words contained in the target portion from the data of that target portion.

Further, the display area of the document may be identified, and the target portion may be set to the identified display area.

Further, when the data of the displayed document includes data indicating pages, a change of the displayed page may be detected and the display area identified on the basis of the result.

According to another aspect, the present invention provides a search system. In this search system, a document database stores document data. A document search unit searches the stored document data according to specified conditions. A plurality of words contained in each retrieved document are extracted based on the data of that document. An image search unit searches a table for the images corresponding to each of the extracted words. An image data acquisition unit acquires the data of the retrieved images from an image database. An illustration creation unit creates an illustration of the document using the acquired image data. A document list output unit outputs a list of the retrieved documents using the illustration created for each retrieved document.

In this search system, when the document search condition specifies a word contained in the document data, the words to be extracted may be selected based also on that specified word.

According to yet another aspect, the present invention provides a file management system. A file management unit manages file storage locations hierarchically. A file identification unit identifies, among the files stored in a selected location, the files that contain document data. A plurality of words contained in each document are extracted based on the document data of the identified file. An image search unit searches a table for the images corresponding to each of the extracted words. An image data acquisition unit acquires the data of the retrieved images from an image database. An illustration creation unit creates an illustration of the document using the acquired image data. A file list output unit outputs a list of the files stored in the selected location using the illustration created for each identified file.

According to still another aspect, the present invention provides a method of supporting the grasp of document contents using a computer. In this method, the computer extracts, from the data of a document, a plurality of words contained in that document. The computer searches a table that associates words with images for the images corresponding to each of the extracted words, and acquires the data of the retrieved images from an image database that stores image data. The computer creates an illustration of the document using the acquired image data and outputs the created illustration.

According to still another aspect, the present invention provides a program for causing a computer to execute the steps of this document content grasping support method, and a computer-readable recording medium on which the program is recorded.

By adopting the configuration described above, the present invention allows the contents of a document to be grasped easily and quickly.

Embodiments of the present invention are described in detail below with reference to the accompanying drawings. In this embodiment, the present invention is embodied as an electronic book browsing system that uses a computer. FIG. 1 shows the hardware configuration of the computer used in this browsing system.

A general-purpose computer can be used for this browsing system. The computer 101 includes a CPU (Central Processing Unit) 102 and a bus 103. The CPU 102 is connected through the bus 103 to a ROM (Read Only Memory) 104 and a RAM (Random Access Memory) 105. When the computer 101 starts up in accordance with the instructions of a program stored in the ROM 104, the CPU 102 runs part or all of an OS (Operating System) 106 on the RAM 105.

A video interface 107 and a USB (Universal Serial Bus) interface 108 are also connected to the bus 103. The video interface 107 displays images on a display 109 under the control of the CPU 102. A CRT (Cathode Ray Tube) or an LCD (Liquid Crystal Display) can be used as the display 109. The USB interface 108 can be used to connect peripheral devices to the main body of the computer 101. Here, an input device 110 is connected to the computer 101. In addition to a keyboard, a pointing device such as a mouse or a trackball can be used as the input device 110.

An HDD (Hard Disk Drive) 111 is also connected to the bus 103. The HDD 111 stores the file of an application program 112 and other files. Here, the application program 112 is a program for browsing electronic books. When the user instructs the use of the application program 112 through the input device 110, the CPU 102 reads the file of the application program 112 from the HDD 111 in accordance with the instructions of the OS 106 and runs the application program 112 on the RAM 105.

By operating in accordance with the instructions of the application program 112, the computer 101 is provided with a user interface unit 201, a document data acquisition unit 202, and a data analysis unit 203, as shown in FIG. 2.

The user interface unit 201 provides a GUI (Graphical User Interface) through which the user gives instructions using the input device 110 and browses electronic books on the display 109.

The document data acquisition unit 202 acquires, in accordance with the user's instructions received through the user interface unit 201, a file of electronic book data 113 stored, for example, on the HDD 111. Here, the electronic book data 113 includes data representing the pages of the book and data of the text written on those pages.

The data analysis unit 203 analyzes the acquired electronic book data 113 and creates display data according to the result. Using that display data, the user interface unit 201 displays a browsing screen for the electronic book designated by the user on the display 109. FIG. 3 shows an example of the browsing screen.

The browsing screen 301 displays the data of two consecutive pages 302L and 302R of the electronic book side by side, like an open book. To display the data of the next two pages, the user operates a button 303 with the input device 110; to display the data of the previous two pages, the user operates a button 304. With these operations the user can read ahead or go back through the text of the electronic book, just as when turning the pages of a printed book.

In addition to the files of the application program 112 and the electronic book data 113, the HDD 111 also stores the file of a text content grasping support program 114.

In this embodiment, the text content grasping support program 114 works together with the application program 112 to help the user grasp the contents of the electronic book. In accordance with the instructions of the program 114, the computer 101 executes procedures 401 to 405 shown in FIG. 4. By operating in accordance with the instructions of the program 114, the computer 101 is further provided with a keyword extraction unit 204, an image database 205, an image management table 206, an image search unit 207, an image data acquisition unit 208, an illustration creation unit 209, and an illustration output unit 210, as shown in FIG. 2.

The keyword extraction unit 204 extracts, from the data of a document, a plurality of words contained in that document. Here, the document data is the text data 113 of the electronic book displayed on the browsing screen 301. In accordance with the instructions of the program 114, the CPU 102 extracts a plurality of keywords for the document, for example from the nouns contained in the text (procedure 401). Known natural language processing algorithms can be used to extract the nouns. For Japanese text, morphological analysis can split the text into words, and syntactic analysis can identify structures such as dependency relations. Using these results, the nouns contained in the text are extracted, and those presumed to be important are used as keywords. FIG. 5 shows an example of the keyword extraction procedure.

After extracting the data of the nouns contained in the text from the text data of the electronic book (procedure 501), the CPU 102 calculates, for each of all or some of the extracted nouns, the frequency with which that noun appears in the text (procedure 502). It then extracts keywords according to this frequency (procedure 503). For example, the ten nouns with the highest frequency of appearance are extracted as keywords. The number of keywords extracted may be fixed or may vary with the length of the text.

FIG. 6 shows an example of text data. In this example, the data of the text 601 includes text data such as "The car has an engine. The car runs on the engine. The engine has an ignition device. The engine is driven by the ignition device." For this text 601, words such as "engine", "car", and "ignition device" are extracted as keywords.
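The frequency-based selection of procedures 502 and 503 can be sketched in Python as follows. This is a minimal sketch, not the patent's implementation: it assumes the nouns have already been extracted by a morphological analyzer (procedure 501), and the function name `extract_keywords` and the list `nouns_601` (the nouns of the English rendering of text 601) are illustrative names that do not appear in the source.

```python
from collections import Counter


def extract_keywords(nouns, max_keywords=10):
    """Return the most frequent nouns, most frequent first."""
    counts = Counter(nouns)  # procedure 502: appearance frequency of each noun
    # Procedure 503: keep the top-ranked nouns. Nouns with equal counts
    # keep the order in which they were first encountered.
    return [noun for noun, _ in counts.most_common(max_keywords)]


# Nouns of text 601 in order of appearance (assumed output of procedure 501).
nouns_601 = [
    "car", "engine",
    "car", "engine",
    "engine", "ignition device",
    "engine", "ignition device",
]

keywords = extract_keywords(nouns_601)
print(keywords)  # ['engine', 'car', 'ignition device']
```

The default of ten keywords follows the example in the description; passing a different `max_keywords` corresponds to varying the number of keywords with the length of the text.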

The image database 205 stores image data. Here, the image database 205 is built on the HDD 111. It stores image data corresponding to the common names of objects such as a car, an engine, an ignition device, and the earth. As shown in FIG. 7, the image data includes the data 701 to 704 of the images themselves and data for identifying each image. The data name of an image can be used as the data that identifies it.

The image management table 206 associates the images stored in the image database 205 with words; the table 206 can also be stored on the HDD 111. As shown in FIG. 8, in this embodiment the table 206 associates keywords with image data names. In this example, the keyword "car" is associated with the data name "Car.bmp", and the keyword "engine" with the data name "Engine.bmp". "Ignition device" is associated with "Fire.bmp", and "earth" with "Earth.bmp".

The image search unit 207 searches the table 206 for the images corresponding to each of the extracted keywords. Here, the CPU 102 accesses the HDD 111 in accordance with the instructions of the program 114 and searches the table 206 for the name of the image data corresponding to each keyword (procedure 402 in FIG. 4). If the extracted keywords are "car", "engine", and "ignition device", the search returns "Car.bmp", "Engine.bmp", and "Fire.bmp", respectively, as the name data.

The image data acquisition unit 208 acquires the data of the retrieved images from the image database 205. To do so, the CPU 102 accesses the HDD 111 through the bus 103 and uses the name data to acquire the data of the images themselves from the image database 205. For example, if "Car.bmp", "Engine.bmp", and "Fire.bmp" have been acquired as the name data, the CPU 102 acquires the image data 701 to 703 out of the image data 701 to 704 shown in FIG. 7 (procedure 403 in FIG. 4).
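Procedures 402 and 403 amount to two lookups: keyword to image data name, then image data name to image data. The Python sketch below illustrates this under stated assumptions. The table contents come from the FIG. 8 example in the description, but the dictionary representation, the placeholder byte strings standing in for bitmap data, the function names, and the choice to skip keywords that have no table entry are all assumptions made for illustration.

```python
# Image management table 206 (FIG. 8 example): keyword -> image data name.
IMAGE_TABLE = {
    "car": "Car.bmp",
    "engine": "Engine.bmp",
    "ignition device": "Fire.bmp",
    "earth": "Earth.bmp",
}

# Stand-in for image database 205: image data name -> image data.
# The byte strings are placeholders, not real bitmap content.
IMAGE_DB = {
    "Car.bmp": b"car pixels",
    "Engine.bmp": b"engine pixels",
    "Fire.bmp": b"ignition device pixels",
    "Earth.bmp": b"earth pixels",
}


def search_image_names(keywords, table=IMAGE_TABLE):
    """Procedure 402: look up the image data name for each keyword.

    Keywords without an entry are skipped (an assumption; the source
    does not say how missing entries are handled).
    """
    return [table[keyword] for keyword in keywords if keyword in table]


def fetch_image_data(names, database=IMAGE_DB):
    """Procedure 403: fetch the image data for each image data name."""
    return {name: database[name] for name in names}


names = search_image_names(["car", "engine", "ignition device"])
images = fetch_image_data(names)
print(names)  # ['Car.bmp', 'Engine.bmp', 'Fire.bmp']
```

Keeping the table and the database separate mirrors the description, where the table 206 holds only names and the image database 205 holds the image data itself.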

The illustration creation unit 209 uses the image data acquired from the image database 205 to create an image that explains the contents of the electronic book. Here, the CPU 102 creates an illustration by combining the image data acquired from the image database 205 (procedure 404 in FIG. 4). For example, the CPU 102 combines the image data 701 to 703 to create an illustration 901 as shown in FIG. 9.

The illustration output unit 210 outputs the created illustration 901. In accordance with the instructions of the program 114, the CPU 102 displays an output screen 1001 for the created illustration on the display 109, as shown in FIG. 10 (procedure 405 in FIG. 4). To display the output screen 1001 in association with the browsing screen 301, the CPU 102 calculates the display position of the output screen 1001 from the display position of the browsing screen 301. In FIG. 10, the output screen 1001 is displayed at a position partially overlapping the page 302R of the browsing screen 301. As in this example, the CPU 102 may also place the extracted keywords at the top of the output screen 1001.

Outputting an illustration of the electronic book in this way lets the user grasp its contents easily and quickly. Because illustrations are generated from the text, children can also enjoy documents more readily: a document consisting only of text can be enjoyed like a picture book.

In the computer system described above, the computer 101 can further include a layout determination unit 211 and an arrangement management table 212, as shown in FIG. 11, in order to arrange the image data used in the illustration.

The layout determination unit 211 determines the layout of the image data acquired by the image data acquisition unit 208. The illustration creation unit 209 combines the acquired image data according to the determined layout. Here, the CPU 102 calculates the arrangement of the acquired image data using the data of the arrangement management table 212.

The arrangement management table 212 is a table for managing the size and position of the image data in the illustration and can be stored on the HDD 111. FIG. 12 shows an example of the structure of the arrangement management table.

The arrangement management table 212 gives the size and position data of each image in the illustration for each arrangement level. The size data is used to determine the concrete size of each image in the illustration. When the image data stored in the image database 205 has a uniform size, the size of each image in the illustration can be given as a ratio to that uniform size. For example, if the size data represents 100%, the image data stored in the image database 205 is used at its original size. If the size data represents a value below 100%, the stored image data is reduced before use. If the size data represents a value above 100%, the stored image data is enlarged before use.

The position data gives the relative positional relationship between the images used in the illustration. This relationship can be expressed as the coordinates of each image relative to a reference position. When the image data represents a rectangular image, the position of the image can be defined by the position of the center or a corner of that rectangle. For example, if the position data represents (0, 0), the image is placed at the reference position. If the position data represents (+20, +20), the image is placed at a position shifted from the reference position by +20 in the x direction and by +20 in the y direction.

The arrangement level is used to associate this size and position data with image data. As shown in FIG. 8, each record of the image management table 206 includes arrangement level data. After extracting the keywords, the CPU 102 accesses the HDD 111 through the bus 103 and acquires from the table 206 the image data name and the arrangement level data corresponding to each keyword. Having acquired the arrangement level data for each image, it acquires the size and position data corresponding to that arrangement level from the arrangement management table 212 on the HDD 111. The CPU 102 then uses the acquired size and position data to calculate the size and position of each image in the illustration and creates the illustration according to the result.
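The two-step lookup described above (keyword to arrangement level, then arrangement level to size and position) can be sketched in Python. The specific values are hypothetical: the description does not give the contents of FIG. 12 or the arrangement levels of FIG. 8, so the levels, percentages, offsets, the uniform stored size `BASE_SIZE`, and all names below are assumptions chosen only to make the sketch concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arrangement:
    scale_percent: int  # size relative to the stored image (100 = unchanged)
    offset: tuple       # (x, y) shift from the reference position


# Hypothetical contents of arrangement management table 212:
# arrangement level -> size and position data.
ARRANGEMENT_TABLE = {
    1: Arrangement(scale_percent=100, offset=(0, 0)),
    2: Arrangement(scale_percent=60, offset=(20, 20)),
    3: Arrangement(scale_percent=30, offset=(40, 40)),
}

# Hypothetical excerpt of image management table 206:
# keyword -> (image data name, arrangement level).
TABLE_WITH_LEVELS = {
    "car": ("Car.bmp", 1),
    "engine": ("Engine.bmp", 2),
    "ignition device": ("Fire.bmp", 3),
}

BASE_SIZE = (200, 200)  # assumed uniform size of stored images, in pixels


def compute_layout(keywords, reference=(0, 0)):
    """Return the size and position of each image in the illustration."""
    placed = []
    for keyword in keywords:
        name, level = TABLE_WITH_LEVELS[keyword]
        arrangement = ARRANGEMENT_TABLE[level]
        # Size data is a percentage of the uniform stored size.
        size = tuple(side * arrangement.scale_percent // 100 for side in BASE_SIZE)
        # Position data is an offset from the reference position.
        position = (reference[0] + arrangement.offset[0],
                    reference[1] + arrangement.offset[1])
        placed.append({"name": name, "size": size, "position": position})
    return placed


for item in compute_layout(["car", "engine", "ignition device"]):
    print(item)
```

With these assumed values, the car image keeps its stored size at the reference position, while the engine and ignition device images are reduced and shifted. An actual renderer would then scale and paste each bitmap at the computed position.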

Determining the layout of the image data in this way makes it possible to represent the relationships between the images that make up the illustration of the document appropriately. This allows the user to grasp the contents of the text more accurately.

In the computer system described above, a plurality of illustrations may also be output from a single text. When one text covers several subjects, providing an illustration for each subject lets the user easily grasp the contents of the text in more detail. For example, the data of the text 1301 shown in FIG. 13 includes text data about a "car" and text data about the "earth". For such a text 1301, an illustration about the "earth" can be output in addition to an illustration about the "car".

Here, the illustration creation unit 209 creates a summary image for each group of extracted keywords. For this purpose, the image management table 206 can store data that associates each word with the group to which the word belongs. In the example of FIG. 8, each keyword is given data identifying its group. "Car", "engine", and "ignition device" are given the value "A" as their group data. "Earth" is given a different value, "B", as its group data.

The group data can be used by the image search unit 207. The image search unit 207 uses the group data to divide the extracted keywords into groups and then searches for images group by group. This function is also implemented by the CPU 102 in accordance with the instructions of the document content grasping support program 114. FIG. 14 shows an example of the procedure that the computer 101 executes in this case.

ＣＰＵ１０２はキーワードを抽出すると（手順４０１）、バス１０３を通じてＨＤＤ１１１にアクセスし、各キーワードに対応するグループのデータをテーブル２０６から取得する（手順１４０１）。ＣＰＵ１０２はプログラム１１４の指令に従って、データを取得したグループのうち、いずれかのグループに属するキーワードを選択する（手順１４０２）。キーワードを選択すると、ＣＰＵ１０２は、選択したキーワードに対応する画像データの名称のデータをＨＤＤ１１１上のテーブル２０６から取得する（手順４０２）。文章１３０１であれば、キーワードとして「エンジン」、「車」、「点火装置」、および「地球」といった単語を抽出する。これらのキーワードが属するグループは「Ａ」か「Ｂ」である。２つのグループのデータを取得すると、ＣＰＵ１０２は、例えばグループ「Ａ」に属するキーワードを選択する。これにより、抽出した４つのキーワードのうち、「車」、「エンジン」、「点火装置」が選択される。これらのキーワードを選択すると、画像データの名称のデータとして「Ｃａｒ．ｂｍｐ」、「Ｅｎｇｉｎｅ．ｂｍｐ」、「Ｆｉｒｅ．ｂｍｐ」をそれぞれ取得する。 When the CPU 102 extracts keywords (procedure 401), it accesses the HDD 111 through the bus 103, and acquires group data corresponding to each keyword from the table 206 (procedure 1401). The CPU 102 selects a keyword belonging to one of the groups from which data has been acquired in accordance with a command from the program 114 (procedure 1402). When the keyword is selected, the CPU 102 acquires data of the name of the image data corresponding to the selected keyword from the table 206 on the HDD 111 (procedure 402). In the case of the sentence 1301, words such as “engine”, “car”, “ignition device”, and “earth” are extracted as keywords. The group to which these keywords belong is “A” or “B”. When the data of the two groups is acquired, the CPU 102 selects, for example, a keyword belonging to the group “A”. Thus, “car”, “engine”, and “ignition device” are selected from the four extracted keywords. When these keywords are selected, “Car.bmp”, “Engine.bmp”, and “Fire.bmp” are acquired as data of the name of the image data.
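
The grouping in procedures 1401, 1402, and 402 can be sketched as follows. This is a minimal illustration, not the patented implementation: the dictionary `IMAGE_TABLE` is a hypothetical stand-in for the image management table 206, filled with the keyword, group, and file-name values from the example above, and the function name `group_image_names` is likewise an assumption.

```python
from collections import defaultdict

# Hypothetical stand-in for the image management table 206.
# The groups and image names follow the example in the text.
IMAGE_TABLE = {
    "car": {"group": "A", "image": "Car.bmp"},
    "engine": {"group": "A", "image": "Engine.bmp"},
    "ignition device": {"group": "A", "image": "Fire.bmp"},
    "earth": {"group": "B", "image": "Earth.bmp"},
}


def group_image_names(keywords, table=IMAGE_TABLE):
    """Return {group: [image names]} for the extracted keywords.

    Looks up each keyword's group (procedure 1401), collects the
    keywords of each group (procedure 1402), and records the image
    name registered for each keyword (procedure 402).
    """
    groups = defaultdict(list)
    for keyword in keywords:
        entry = table.get(keyword)
        if entry is None:
            continue  # assumption: unregistered keywords are skipped
        groups[entry["group"]].append(entry["image"])
    return dict(groups)


if __name__ == "__main__":
    extracted = ["engine", "car", "ignition device", "earth"]
    print(group_image_names(extracted))
```

Each resulting list of image names would then be passed to the image data acquisition and illustration creation steps, one group at a time.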

抽出したキーワードをグループ分けする場合、画像データ取得部２０８は、選択したキーワードに対応する画像データを画像データベース２０５から取得し、イラスト作成部２０９は、選択したキーワードの属するグループに対してイラストを作成する。そのためにＣＰＵ１０２は、選択したキーワードの画像データの名称のデータを取得すると、ＨＤＤ１１１にアクセスし、その名称のデータを用いて、画像自体のデータを画像データベース２０５から取得する(手順４０３)。取得した画像データを組合せ、選択したキーワードの属するグループに対してイラストを作成する（手順４０４）。例えば名称のデータとして「Ｃａｒ．ｂｍｐ」、「Ｅｎｇｉｎｅ．ｂｍｐ」、「Ｆｉｒｅ．ｂｍｐ」を取得していれば、ＣＰＵ１０２は画像データ７０１乃至７０３をＨＤＤ１１１上の画像データベース２０５から取得する。そして画像データ７０１乃至７０３を組合せ、イラスト９０１を作成する。 When the extracted keywords are grouped, the image data acquisition unit 208 acquires image data corresponding to the selected keywords from the image database 205, and the illustration creation unit 209 creates an illustration for the group to which the selected keywords belong. To this end, when the CPU 102 acquires the name data of the image data for the selected keywords, it accesses the HDD 111 and uses the name data to acquire the image data itself from the image database 205 (procedure 403). It then combines the acquired image data and creates an illustration for the group to which the selected keywords belong (procedure 404). For example, if “Car.bmp”, “Engine.bmp”, and “Fire.bmp” have been acquired as name data, the CPU 102 acquires the image data 701 to 703 from the image database 205 on the HDD 111. It then combines the image data 701 to 703 to create an illustration 901.

このようにしていずれかのグループに対してイラストを作成すると、残りのグループについても同様にイラストを作成する(手順１４０３)。グループ「Ａ」および「Ｂ」のうち、グループ「Ａ」に対してイラストを作成していれば、ＣＰＵ１０２はプログラム１１４の指令に従い、グループ「Ｂ」に対してもイラストを作成する。この例では、グループ「Ｂ」に属するキーワードとして、「地球」を選択する。そのキーワードを選択すると、画像データの名称のデータとして「Ｅａｒｔｈ．ｂｍｐ」を取得する。ＣＰＵ１０２はＨＤＤ１１１にアクセスし、その名称のデータを用いて、画像データベース２０５から画像データ７０４を取得する。そしてその画像データ７０４を用いて、グループ「Ｂ」に対するイラストを作成する。 When illustrations are created for any of the groups in this way, illustrations are created for the remaining groups in the same manner (step 1403). If an illustration is created for the group “A” in the groups “A” and “B”, the CPU 102 creates an illustration for the group “B” in accordance with the instruction of the program 114. In this example, “Earth” is selected as a keyword belonging to the group “B”. When the keyword is selected, “Earth.bmp” is acquired as the data of the name of the image data. The CPU 102 accesses the HDD 111 and acquires image data 704 from the image database 205 using the data of the name. Then, an illustration for the group “B” is created using the image data 704.

全てのグループに対しそれぞれイラストを作成すると、イラスト出力部２１０は、グループ毎に作成したイラストを出力する。例えばＣＰＵ１０２が、図１５に示すように、グループ「Ａ」に対するイラスト９０１の出力画面１５０１とグループ「Ｂ」に対するイラストの出力画面１５０２とをそれぞれディスプレイ１０９に表示する(手順４０５)。この例では、グループ「Ｂ」に対するイラストとして画像データ７０４をそのまま用いている。 When illustrations are created for all groups, the illustration output unit 210 outputs the illustrations created for each group. For example, as shown in FIG. 15, the CPU 102 displays an output screen 1501 of an illustration 901 for the group “A” and an output screen 1502 of an illustration for the group “B” on the display 109 (step 405). In this example, the image data 704 is used as it is as an illustration for the group “B”.

またこの例では、出力画面１５０１よりも下側に出力画面１５０２を配置している。この配置は、文章中でキーワードが出現する順序に従って、イラスト出力部２１０が定めることができる。そのためにキーワード抽出部２０４が、文書の先頭から末尾に向かってワードを抽出するとともに、その抽出の順序をワード毎に記録するようにしてもよい。イラスト出力部２１０は、その記録に基づいてキーワードの出現順序を定め、その出願順序に従って複数の出力画面の表示位置を定める。 In this example, the output screen 1502 is arranged below the output screen 1501. The illustration output unit 210 can determine this arrangement according to the order in which the keywords appear in the text. For that purpose, the keyword extraction unit 204 may extract words from the beginning of the document toward the end and record the extraction order for each word. The illustration output unit 210 determines the appearance order of the keywords based on that record, and determines the display positions of the output screens according to that appearance order.
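
One way to derive the top-to-bottom order of the output screens from the recorded extraction order is sketched below. The text does not say how a group with several keywords is positioned, so this sketch assumes a group is placed by the earliest position of any of its keywords; the function name and the `keyword_group` mapping are hypothetical.

```python
def order_groups_by_first_appearance(extracted, keyword_group):
    """Return group labels ordered by first appearance in the document.

    extracted:     keywords in the order they were extracted, from the
                   beginning of the document toward the end.
    keyword_group: mapping from keyword to group label (hypothetical
                   view of the group data in the image management table).
    """
    first_position = {}
    for position, keyword in enumerate(extracted):
        group = keyword_group.get(keyword)
        if group is not None and group not in first_position:
            # Assumption: a group is positioned by its earliest keyword.
            first_position[group] = position
    return sorted(first_position, key=first_position.get)


if __name__ == "__main__":
    groups = {"car": "A", "engine": "A", "ignition device": "A", "earth": "B"}
    # The first group in the result would be shown at the top.
    print(order_groups_by_first_appearance(
        ["engine", "car", "earth", "ignition device"], groups))
```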

また上述の例のように各グループに対するイラストを別個の出力画面で表示する代わりに、それらのイラストを一つの出力画面で表示するようにしてもよい。この場合には、各グループのイラストを作成したときに、イラスト作成部２０９がそれらのイラストを連結して一つのイラストを作成することができる。イラスト出力部２１０は、そのイラストを一つの出力画面に表示する。 Further, instead of displaying the illustrations for each group on separate output screens as in the above example, those illustrations may be displayed on one output screen. In this case, when creating illustrations for each group, the illustration creation unit 209 can create a single illustration by connecting the illustrations. The illustration output unit 210 displays the illustration on one output screen.

このように各グループに対するイラストを作成することで、ユーザがイラストから得られる情報が増え、文章の内容をより詳細に把握することが可能となる。 Thus, by creating an illustration for each group, the user can obtain more information from the illustration, and the contents of the sentence can be grasped in more detail.

各グループに対するイラストを作成する例においても、各イラストを構成する画像データのレイアウトをレイアウト決定部２１１が決定するようにしてもよい。 Also in the example of creating an illustration for each group, the layout determining unit 211 may determine the layout of the image data constituting each illustration.

またイラスト作成部２０９が、抽出したワードの属するグループ毎の出現頻度に従って、イラスト間のサイズの関係を決定することができる。例えばグループ毎の出現頻度は、そのグループに属するキーワードそれぞれの出現頻度の総計として求める。イラスト作成部２０９は、グループ毎の総計を比較し、その比較結果に基づいて各グループに対するイラストのサイズを決定する。この機能を実現するためにＣＰＵ１０２が、プログラム１１４の指令に従ってグループ毎に総計を計算する。グループ毎に総計を計算すると、ＣＰＵ１０２は、最大の総計で各グループの総計の値を正規化する演算を行う。そして正規化した値で各グループのイラストのサイズを調整する。全てのイラストを矩形の画像で表現し、その矩形のサイズを揃えていれば、例えばその矩形の一辺の長さに、正規化した値を乗算して、各イラストのサイズを調整する。 The illustration creation unit 209 can determine the size relationship between the illustrations according to the appearance frequency of each group to which the extracted word belongs. For example, the appearance frequency for each group is obtained as the sum of the appearance frequencies of the keywords belonging to the group. The illustration creation unit 209 compares the totals for each group, and determines the illustration size for each group based on the comparison result. In order to realize this function, the CPU 102 calculates a total for each group in accordance with a command from the program 114. When the total is calculated for each group, the CPU 102 performs an operation for normalizing the total value of each group with the maximum total. Then, the illustration size of each group is adjusted by the normalized value. If all the illustrations are represented by rectangular images and the sizes of the rectangles are the same, for example, the length of one side of the rectangle is multiplied by a normalized value to adjust the size of each illustration.
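
The size adjustment described above can be sketched as follows, assuming every illustration starts as a square of the same side length. The function name, the input mappings, and the sample frequencies are illustrative assumptions, not values from the embodiment.

```python
def illustration_sides(keyword_counts, keyword_group, base_side):
    """Return {group: side length} scaled by normalized group frequency.

    keyword_counts: mapping from keyword to its appearance frequency.
    keyword_group:  mapping from keyword to group label.
    base_side:      common side length of the unscaled illustrations.
    """
    # Total appearance frequency per group.
    totals = {}
    for keyword, count in keyword_counts.items():
        group = keyword_group.get(keyword)
        if group is None:
            continue
        totals[group] = totals.get(group, 0) + count

    largest = max(totals.values(), default=0)
    if largest == 0:
        return {}

    # Normalize each total by the largest total, then multiply the
    # side length by the normalized value.
    return {group: base_side * total / largest
            for group, total in totals.items()}


if __name__ == "__main__":
    counts = {"car": 3, "engine": 2, "ignition device": 1, "earth": 3}
    groups = {"car": "A", "engine": "A", "ignition device": "A", "earth": "B"}
    print(illustration_sides(counts, groups, base_side=200))
```

With these sample counts, group A totals 6 and group B totals 3, so the illustration for group B is drawn at half the side length of the illustration for group A.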

このようにしてイラスト間のサイズの関係を決定することで、ユーザは、その関係から各イラストから得られる情報の重要度を推測することができ、必要な情報を簡単に選別できる。特に、文章が様々な情報や文脈を含んでいるときに有用である。 By determining the size relationship between illustrations in this way, the user can estimate the importance of information obtained from each illustration from the relationship, and can easily select necessary information. This is particularly useful when the text contains various information and contexts.

キーワードを抽出する範囲は、文書の全体でも文書の一部でもよい。上述のようなコンピュータシステムにおいて、キーワードを抽出する対象部を設定するため、コンピュータ１０１は図１６に示すように、対象設定部２１３をさらに備えることができる。 The range from which keywords are extracted may be the whole document or a part of it. In the computer system described above, the computer 101 can further include a target setting unit 213, as shown in FIG. 16, to set the target part from which keywords are extracted.

対象設定部２１３は、文書の一部または全部に、ワードを抽出する対象部を設定する。この実施の形態におけるコンピュータシステムがユーザインターフェイス部２０１を備えているように、文書を表示する表示部をシステムが備えている場合には、対象設定部２１３は、表示領域特定部２１４を備えるようにしてもよい。 The target setting unit 213 sets, in part or all of a document, a target part from which words are extracted. When the system includes a display unit for displaying a document, as the computer system of this embodiment does with the user interface unit 201, the target setting unit 213 may include a display area specifying unit 214.

表示領域特定部２１４は文書の表示領域を特定する。対象設定部２１３は、特定した表示領域に対象部を設定する。表示領域の特定は、ユーザから指示を受けたときにすることができる。ここでは、表示領域を特定する指示をユーザから受けるため、図３に示すように、「イラスト」ボタン３０５を閲覧画面３０１上に配置している。 The display area specifying unit 214 specifies the display area of the document. The target setting unit 213 sets the target part in the specified display area. The display area can be specified when an instruction is received from the user. Here, in order to receive an instruction for specifying the display area from the user, an “illustration” button 305 is arranged on the browsing screen 301 as shown in FIG.

このような機能を実現するためＣＰＵ１０２はプログラム１１４の指令に従って、入力装置１１０を用いたアプリケーションプログラム１１２に対するユーザの操作を監視する。ボタン３０５を押す操作を検出すると、ＣＰＵ１０２はそのときに表示されている文書の領域を特定する。電子書籍のデータ１１３のように、表示する文書のデータがページを示すデータを含む場合には、文書の表示領域の特定にそのページを示すデータを利用することができる。表示中のページを示すデータをＲＡＭ１０５に一時的に記録していれば、ＣＰＵ１０２は、そのデータをＲＡＭ１０５から取得する。そしてそのデータを用いて表示しているページを特定する。図３の例ではページ番号表示部３０６および３０７に示すように、６ページ中の３ページ目と４ページ目とを表示している。この状態でボタン３０５を押す操作をユーザがすると、ＣＰＵ１０２は３ページ目と４ページ目を表示領域として特定し、ワードを抽出する対象部としてその表示領域を設定する。この場合、ＣＰＵ１０２は文書全体ではなく３ページ目および４ページ目のデータからキーワードを抽出する。そしてそのキーワードを使って、３ページ目および４ページ目に対するイラストを作成し、そのイラストをディスプレイ１０９に表示する。別のページについてイラストを既に表示している場合には、そのイラストの出力画面の表示を更新して、新たに作成したイラストを表示してもよいし、新たに作成したイラストを別の出力画面で表示するようにしてもよい。 In order to realize such a function, the CPU 102 monitors a user operation on the application program 112 using the input device 110 in accordance with an instruction of the program 114. When detecting an operation of pressing the button 305, the CPU 102 specifies the area of the document displayed at that time. When data of a document to be displayed includes data indicating a page like the electronic book data 113, data indicating the page can be used for specifying a display area of the document. If data indicating the currently displayed page is temporarily recorded in the RAM 105, the CPU 102 acquires the data from the RAM 105. And the page currently displayed is specified using the data. In the example of FIG. 3, as shown in page number display sections 306 and 307, the third and fourth pages of the six pages are displayed. When the user performs an operation of pressing the button 305 in this state, the CPU 102 specifies the third and fourth pages as display areas, and sets the display areas as target portions for extracting words. In this case, the CPU 102 extracts keywords from the data of the third and fourth pages, not the entire document. Then, using the keyword, illustrations for the third and fourth pages are created and displayed on the display 109. 
If an illustration is already displayed for another page, the output screen for that illustration may be updated to show the newly created illustration, or the newly created illustration may be displayed on a separate output screen.

また表示領域の特定は、ユーザから明示的に指示を受けたときだけでなく、システムが定めたときにもすることができる。ここでは、表示領域特定部２１４が、表示した文書のデータがページを示すデータを含む場合に、表示するページの変更を検出し、その検出結果に基づいて自動的に表示領域を特定する。このような機能を実現するためにＣＰＵ１０２は文書内容把握支援プログラム１１４の指令に従って、入力装置１１０を用いたアプリケーションプログラム１１２に対するユーザの操作を監視する。例えばボタン３０３や３０４を押す操作のように、表示ページを変更する操作をユーザが行うと、ＣＰＵ１０２はその操作を検出する。表示中のページを表すデータをＲＡＭ１０５に一時的に記録している場合、その操作が行われると、ＣＰＵ１０２はバス１０３を通じてＲＡＭ１０５にアクセスし、そのデータをＲＡＭ１０５から取得する。それまで表示していたページを表すデータをＲＡＭ１０５から取得すると、ＣＰＵ１０２はそのデータに基づいて、次に表示するページを表す数値を計算する。ボタン３０３を押す操作をユーザがすると、ＣＰＵ１０２は表示ページを示す値から、一度に表示するページ数、ここでは「２」を加算することで、次に表示するページを表す数値を計算する。ボタン３０４を押す操作がされていれば、ＣＰＵ１０２は表示ページを示す値に「２」を引くことで、次に表示するページを表す数値を計算する。図３の例のように３ページ目と４ページ目とを表示した状態でボタン３０３を押す操作がされていれば、次に表示するページを表す数値は、５および６である。またボタン３０４を押す操作がされていれば、その数値は１および２である。このようにして表示するページを特定することで、ＣＰＵ１０２は表示領域を特定することができる。ＣＰＵ１０２は、ワードを抽出する対象部としてその表示領域を設定する。この場合、ＣＰＵ１０２は文書全体ではなく、１ページ目および２ページのデータからキーワードを抽出するか、５ページ目および６ページ目のデータからキーワードを抽出する。そしてそのキーワードを使って、１ページ目および２ページ目に対するイラストか、５ページ目および６ページ目に対するイラストを作成し、そのイラストをディスプレイ１０９に表示する。このため、表示するページに変更があれば、自動的にイラストが更新されるか、変更後のページに対するイラストが追加される。 The display area can be specified not only when an instruction is explicitly received from the user, but also when the system determines it. Here, when the displayed document data includes data indicating a page, the display area specifying unit 214 detects a change in the page to be displayed, and automatically specifies the display area based on the detection result. In order to realize such a function, the CPU 102 monitors a user operation on the application program 112 using the input device 110 in accordance with a command from the document content grasp support program 114. For example, when the user performs an operation of changing the display page, such as an operation of pressing the buttons 303 and 304, the CPU 102 detects the operation. When data representing the page being displayed is temporarily recorded in the RAM 105, when the operation is performed, the CPU 102 accesses the RAM 105 through the bus 103 and acquires the data from the RAM 105. 
When the data representing the pages displayed so far is acquired from the RAM 105, the CPU 102 calculates, based on that data, numerical values representing the pages to be displayed next. When the user presses the button 303, the CPU 102 calculates those values by adding the number of pages displayed at one time, here “2”, to the value indicating the displayed page. If the user presses the button 304, the CPU 102 calculates them by subtracting “2” from the value indicating the displayed page. If the button 303 is pressed while the third and fourth pages are displayed, as in the example of FIG. 3, the values representing the pages to be displayed next are 5 and 6. If the button 304 is pressed, the values are 1 and 2. By specifying the pages to be displayed in this way, the CPU 102 can specify the display area. The CPU 102 sets that display area as the target part from which words are extracted. In this case, the CPU 102 extracts keywords not from the entire document but from the data of the first and second pages, or from the data of the fifth and sixth pages. Using those keywords, it creates an illustration for the first and second pages, or for the fifth and sixth pages, and displays the illustration on the display 109. Thus, whenever the displayed page changes, the illustration is automatically updated, or an illustration for the newly displayed pages is added.
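
The page calculation for the buttons 303 and 304 can be sketched as follows. The function name and the handling of a press that would leave the document (the press is ignored here) are assumptions; the text specifies only the addition and subtraction of the number of pages shown at one time.

```python
PAGES_PER_VIEW = 2  # pages shown at one time, "2" in the FIG. 3 example


def pages_after_button(first_displayed, total_pages, button,
                       step=PAGES_PER_VIEW):
    """Return the page numbers to display after a button press.

    first_displayed: number of the first page currently displayed.
    button:          303 adds `step` to the page value, 304 subtracts it.
    """
    if button == 303:
        new_first = first_displayed + step
    elif button == 304:
        new_first = first_displayed - step
    else:
        raise ValueError("unknown button: %r" % (button,))

    # Assumption: a press that would leave the document is ignored.
    if new_first < 1 or new_first > total_pages:
        new_first = first_displayed

    last = min(new_first + step - 1, total_pages)
    return list(range(new_first, last + 1))


if __name__ == "__main__":
    print(pages_after_button(3, 6, 303))  # pages 3-4 shown, button 303
    print(pages_after_button(3, 6, 304))  # pages 3-4 shown, button 304
```

The returned page numbers identify the display area, which then becomes the target part for keyword extraction.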

上述の対象部は、文書のデータがページを示すデータを含まない場合でも設定することができる。例えばＨＴＭＬ（ＨｙｐｅｒＴｅｘｔＭａｒｋｕｐＬａｎｇｕａｇｅ）やＸＭＬ（ｅＸｔｅｎｓｉｂｌｅＭａｒｋｕｐＬａｎｇｕａｇｅ）で記述したデータは、ページの区切りを示すデータを含まない。図１７はＨＴＭＬで記述したウェブページの閲覧システムの概略構成を示す図である。 The target portion described above can be set even when the document data does not include data indicating a page. For example, data described in HTML (HyperText Markup Language) or XML (extensible Markup Language) does not include data indicating page breaks. FIG. 17 is a diagram showing a schematic configuration of a web page browsing system described in HTML.

図１７に示す例では、コンピュータ１０１が通信インターフェイス１１５をさらに備えている。通信インターフェイス１１５はネットワーク１１６を通じて、別のコンピュータ１１７に接続されている。コンピュータ１１７は、ウェブページを公開するウェブサーバコンピュータである。またコンピュータ１０１において、通信インターフェイス１１５はバス１０３に接続されている。コンピュータ１０１はこの通信インターフェイス１１５を用いることでコンピュータ１１７と通信することができる。コンピュータ１１７が公開するウェブページを閲覧する場合には、ウェブブラウザのようなウェブクライアントプログラムをアプリケーションプログラム１１２として利用する。その場合、ＣＰＵ１０２はアプリケーションプログラム１１２の指令に従ってコンピュータ１１７と通信し、コンピュータ１１７からウェブページのデータをダウンロードする。通信インターフェイス１１５がウェブページのデータを受信すると、ＣＰＵ１０２はバス１０３を通じてＨＤＤ１１１にアクセスし、ダウンロードしたウェブページのデータ１１８をＨＤＤ１１１に一時的に記憶する。ＣＰＵ１０２はアプリケーションプログラム１１２の指令に従ってデータ１１８を解釈し、そのウェブページをディスプレイ１０９に表示する。図１８はウェブブラウザの画面の一例を示す。 In the example illustrated in FIG. 17, the computer 101 further includes a communication interface 115. The communication interface 115 is connected to another computer 117 through the network 116. The computer 117 is a web server computer that publishes a web page. In the computer 101, the communication interface 115 is connected to the bus 103. The computer 101 can communicate with the computer 117 by using the communication interface 115. When browsing a web page published by the computer 117, a web client program such as a web browser is used as the application program 112. In that case, the CPU 102 communicates with the computer 117 in accordance with a command from the application program 112 and downloads web page data from the computer 117. When the communication interface 115 receives the web page data, the CPU 102 accesses the HDD 111 through the bus 103 and temporarily stores the downloaded web page data 118 in the HDD 111. The CPU 102 interprets the data 118 in accordance with a command from the application program 112 and displays the web page on the display 109. FIG. 18 shows an example of a web browser screen.

図１８に示すように、ウェブブラウザの画面１８０１には、ＵＲＬ（ＵｎｉｆｏｒｍＲｅｓｏｕｒｃｅＬｏｃａｔｏｒ）を指定するための欄１８０２が配置されている。この例では、「http:www.hhh.com/aaa.html」というＵＲＬを指定することで、上述した文章６０１を含むウェブページが画面１８０１のエリア１８０３に表示されている。このウェブページは、自動車や自動二輪車などの仕組みを説明するページである。そのウェブページでは、説明のありかを示す画像１８０４や１８０５が文章６０１やその他の文章に対応するテキストの間に配置されている。 As shown in FIG. 18, a field 1802 for designating a URL (Uniform Resource Locator) is arranged on the screen 1801 of the web browser. In this example, by specifying the URL “http:www.hhh.com/aaa.html”, a web page including the above-described sentence 601 is displayed in an area 1803 of the screen 1801. This web page explains the mechanisms of automobiles, motorcycles, and the like. In the web page, images 1804 and 1805, which indicate where the explanations are located, are arranged between the text of the sentence 601 and the text of other sentences.

また画面１８０１には、スクロールバー１８０６が用意されている。このスクロールバー１８０６は、ウェブページの表示サイズがエリア１８０３のサイズよりも大きいときに、そのエリア１８０３に表示する部分をユーザが指定するのに用いることができる。ユーザが入力装置１１０を用いてこのスクロールバー１８０６を移動する操作をすると、ＣＰＵ１０２は移動後のスクロールバー１８０６の位置に基づいて計算を行う。その計算結果に従って、ウェブページのユーザが表示した部分をエリア１８０３に表示する。 The screen 1801 also provides a scroll bar 1806. When the display size of the web page is larger than the area 1803, the user can use the scroll bar 1806 to designate the portion to be displayed in the area 1803. When the user moves the scroll bar 1806 with the input device 110, the CPU 102 performs a calculation based on the position of the scroll bar 1806 after the movement. According to the calculation result, the portion of the web page designated by the user is displayed in the area 1803.

ＣＰＵ１０２は文章内容把握支援プログラム１１４の指令に従って、このような計算結果を利用して、ウェブページのデータのように文書のデータがページを示すデータを含まない場合でも、文書の表示領域を特定する。図１８に示す状態では、エリア１８０３に表示されているテキストは文章６０１のテキストのみである。この場合、ＣＰＵ１０２は文書内容把握支援プログラム１１４の指令に従って、文章６０１に対するイラストを作成する。ここでは、上述した手順に従い、文章６０１に対するイラストとしてイラスト９０１を作成し、そのイラスト９０１を出力画面１００１で表示する。 In accordance with commands from the document content grasp support program 114, the CPU 102 uses such calculation results to specify the display area of a document even when the document data, like web page data, does not include data indicating pages. In the state shown in FIG. 18, the only text displayed in the area 1803 is the text of the sentence 601. In this case, the CPU 102 creates an illustration for the sentence 601 in accordance with commands from the document content grasp support program 114. Here, following the procedure described above, the illustration 901 is created as the illustration for the sentence 601 and is displayed on the output screen 1001.

イラストの作成と文書の表示は別々のコンピュータで行うことができる。図１８の例であれば、イラストの作成はサーバ側のコンピュータ１１７が行い、文書およびその文書に対するイラストの表示はクライアント側のコンピュータ１０１が行うことができる。この場合には、コンピュータ１０１からウェブページの転送要求があると、例えばコンピュータ１１７が、転送するウェブページのデータにスクリプトを書き込む。このスクリプトを含むウェブページをコンピュータ１０１上のウェブブラウザが解釈すると、そのウェブブラウザの指令に従ってコンピュータ１０１は、そのウェブページに対するイラストのデータの転送要求をコンピュータ１１７に送信する。この要求を受信すると、コンピュータ１１７は、上述したような手順に従ってそのイラストのデータを作成し、作成したデータをコンピュータ１０１へ出力する。イラストのデータを受信すると、コンピュータ１０１は、ポップアップ画面やその他の出力画面でそのイラストを表示する。 Illustration creation and document display can be done on separate computers. In the example of FIG. 18, illustration creation can be performed by the server computer 117, and the document and illustrations for the document can be displayed by the client computer 101. In this case, when there is a web page transfer request from the computer 101, for example, the computer 117 writes a script in the data of the web page to be transferred. When the web browser on the computer 101 interprets the web page including the script, the computer 101 transmits a request to transfer illustration data for the web page to the computer 117 in accordance with an instruction from the web browser. Upon receiving this request, the computer 117 creates the illustration data in accordance with the procedure described above, and outputs the created data to the computer 101. When the illustration data is received, the computer 101 displays the illustration on a pop-up screen or other output screen.

表示するイラストは、表示するデータがページを示すデータを含む場合と同様、文書の表示領域に変更があったときに自動的に変更するようにしてもよい。表示するデータがページを示すデータを含まない場合、文書の表示領域の変更は、スクロールバー１８０６の位置の変動に基づいて検出することができる。例えばスクロールバー１８０６が移動する方向のエリア１８０３の長さとスクロールバー１８０６の位置の変動量とをＣＰＵ１０２が比較する。その変動量がその長さを上回ったことをＣＰＵ１０２が検出すると、そのときのスクロールバー１８０６の位置に基づいて文書の表示領域を特定し直す。文書の表示領域を特定し直すと、ＣＰＵ１０２はその表示領域に対してイラストを新たに作成し、新たに作成したイラストを使って出力画面の表示を自動的に更新する。 The illustration to be displayed may be automatically changed when the display area of the document is changed, as in the case where the data to be displayed includes data indicating a page. When the data to be displayed does not include data indicating a page, a change in the display area of the document can be detected based on a change in the position of the scroll bar 1806. For example, the CPU 102 compares the length of the area 1803 in the moving direction of the scroll bar 1806 with the amount of change in the position of the scroll bar 1806. When the CPU 102 detects that the fluctuation amount exceeds the length, the display area of the document is specified again based on the position of the scroll bar 1806 at that time. When the display area of the document is specified again, the CPU 102 creates a new illustration for the display area, and automatically updates the display on the output screen using the newly created illustration.
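
The scroll-based change detection can be sketched as follows. The sketch assumes that the scroll position and the length of the area 1803 are expressed in the same unit (for example, pixels of document offset) and that the amount of change is measured from the position at which the display area was last specified; the class and method names are hypothetical.

```python
class ScrollWatcher:
    """Decides when the display area should be specified again."""

    def __init__(self, area_length, position=0):
        # Length of the display area along the scroll direction.
        self.area_length = area_length
        # Scroll position at which the display area was last specified.
        self.last_specified = position

    def on_scroll(self, position):
        """Return True if the display area must be specified again.

        The change in scroll position is compared with the length of
        the display area; once the change exceeds that length, the
        current position becomes the new reference point.
        """
        if abs(position - self.last_specified) > self.area_length:
            self.last_specified = position
            return True
        return False


if __name__ == "__main__":
    watcher = ScrollWatcher(area_length=600)
    for pos in (300, 601, 900, 1300):
        print(pos, watcher.on_scroll(pos))
```

Whenever `on_scroll` returns True, the caller would specify the display area again, create a new illustration for it, and refresh the output screen.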

またこのようなシステムにおいて、ワードを抽出する対象部は、表示領域を特定しなくても設定することができる。例えば入力装置１１０を使ってユーザが選択した部分を対象部に設定することができる。 In such a system, the target part from which the word is extracted can be set without specifying the display area. For example, a portion selected by the user using the input device 110 can be set as the target portion.

本発明は、上述したシステムだけでなく、検索システムにも応用することができる。図１９は検索システムのハードウェア構成を説明する図である。 The present invention can be applied not only to the system described above but also to a search system. FIG. 19 is a diagram illustrating the hardware configuration of the search system.

この検索システムでは、コンピュータ１９０１および１９０２がネットワーク１９０３を通じて接続している。ここでは、コンピュータ１９０１は文書の検索サービスを提供するのに用いるコンピュータで、コンピュータ１９０２はそのサービスをユーザが受けるのに利用するコンピュータである。コンピュータ１９０１および１９０２には、汎用のコンピュータを利用することができる。 In this search system, computers 1901 and 1902 are connected through a network 1903. Here, a computer 1901 is a computer used to provide a document search service, and a computer 1902 is a computer used by a user to receive the service. A general-purpose computer can be used as the computers 1901 and 1902.

図１９に示すように、コンピュータ１９０１は、ＣＰＵ１９０４やバス１９０５を備えている。ＣＰＵ１９０４はバス１９０５を通じてＲＯＭ１９０６やＲＡＭ１９０７と接続される。ＲＯＭ１９０６に記憶されたプログラムの指令に従ってコンピュータ１９０１が起動すると、ＣＰＵ１９０４はＲＡＭ１９０７上でＯＳ１９０８の一部または全部を動作させる。 As illustrated in FIG. 19, the computer 1901 includes a CPU 1904 and a bus 1905. The CPU 1904 is connected to the ROM 1906 and the RAM 1907 through the bus 1905. When the computer 1901 is activated in accordance with a program instruction stored in the ROM 1906, the CPU 1904 operates a part or all of the OS 1908 on the RAM 1907.

バス１９０５には通信インターフェイス１９０９も接続されている。通信インターフェイス１９０９はコンピュータ１９０１をネットワーク１９０３に接続する。 A communication interface 1909 is also connected to the bus 1905. A communication interface 1909 connects the computer 1901 to the network 1903.

さらにバス１９０５には、ＨＤＤ１９１０も接続されている。このＨＤＤ１９１０には、文書検索プログラム１９１１のファイルやその他のファイルが格納される。文書検索プログラム１９１１は、文書を管理したり文書を検索したりするのに用いるプログラムである。また文書検索プログラム１９１１は、上述の文書内容把握支援プログラム１１４と同様に、文書内容の把握を支援するのに用いることができる。ここではさらに、ウェブサーバのように他のコンピュータと通信するのにも用いる。コンピュータ１９０１の起動時などサービスを開始するとき、ＣＰＵ１９０４はＯＳ１９０８の指令に従ってＨＤＤ１９１０からその文書検索プログラム１９１１のファイルを読み出し、ＲＡＭ１９０７上で文書検索プログラム１９１１を動作させる。 Further, an HDD 1910 is also connected to the bus 1905. The HDD 1910 stores a document search program 1911 file and other files. The document search program 1911 is a program used for managing documents and searching for documents. The document search program 1911 can be used to support grasping of the document contents, like the document content grasping support program 114 described above. It is also used here to communicate with other computers, such as a web server. When starting a service such as when the computer 1901 is started up, the CPU 1904 reads the file of the document search program 1911 from the HDD 1910 in accordance with a command from the OS 1908 and operates the document search program 1911 on the RAM 1907.

この文書検索プログラム１９１１の指令に従ってコンピュータ１９０１が動作することにより、コンピュータ１９０１は図２０に示すように、通信処理部２００１、文書管理部２００２、インデックステーブル２００３、文書データベース２００４、キーワード抽出部２００５、画像データベース２００６、画像管理テーブル２００７、画像検索部２００８、画像データ取得部２００９、イラスト作成部２０１０、および文書一覧出力部２０１１を備える。 When the computer 1901 operates in accordance with commands from the document search program 1911, the computer 1901 includes, as shown in FIG. 20, a communication processing unit 2001, a document management unit 2002, an index table 2003, a document database 2004, a keyword extraction unit 2005, an image database 2006, an image management table 2007, an image search unit 2008, an image data acquisition unit 2009, an illustration creation unit 2010, and a document list output unit 2011.

通信処理部２００１は、文書の検索サービスを提供するために、他のコンピュータとの通信に関する処理をする。ここでは、コンピュータ１９０２からの検索要求を受け付けたり、その要求に対する応答をコンピュータ１９０２に送信したりする。このような機能を実現するため、文書検索プログラム１９１１の指令に従ってＣＰＵ１９０４は処理を実行し、通信インターフェイス１９０９を用いてコンピュータ１９０２と通信する。 A communication processing unit 2001 performs processing related to communication with another computer in order to provide a document search service. Here, a search request from the computer 1902 is accepted, or a response to the request is transmitted to the computer 1902. In order to realize such a function, the CPU 1904 executes processing in accordance with a command from the document search program 1911 and communicates with the computer 1902 using the communication interface 1909.

文書管理部２００２は、文書の管理や検索をする。文書の検索は、書誌事項のような属性データについて行ってもよいし、文書の本文データについて行ってもよい。文書データは、テキスト形式かバイナリ形式の文字データを含む。新聞や雑誌、論文、書籍、法令集、業務書類の電子データやウェブページのデータのように文字データを含んでいれば、図形データやその他のデータを含んでいてもよい。バイナリ形式のデータの場合、テキスト形式の文字データを抽出する必要がある。本文データについて全文検索をするのであれば、Ｎ文字インデックス方式のような既に知られた検索方式を用いることができる。そのために、文書管理部２００２がインデックステーブル２００３を作成するようにしてもよい。 The document management unit 2002 manages and searches documents. A document search may be performed on attribute data such as bibliographic items, or on the body text data of the document. The document data includes character data in text or binary format. As long as it includes character data, as do electronic data of newspapers, magazines, papers, books, collections of laws, and business documents, or web page data, the document data may also include graphic data and other data. For binary data, character data in text format needs to be extracted. For a full-text search of the body text data, a known search method such as the N-character index method can be used. For that purpose, the document management unit 2002 may create the index table 2003.

インデックステーブル２００３は、索引キーとその索引キーを付与した文書のファイルを特定するＩＤデータとを含むインデックスを記憶している。また文書データベース２００４は、索引キーを付与した文書データのファイルを蓄積する。ここでは、インデックステーブル２００３や文書データベース２００４をＨＤＤ１９１０上に構築している。 The index table 2003 stores an index including an index key and ID data that identifies a file of a document to which the index key is assigned. The document database 2004 accumulates document data files to which index keys are assigned. Here, an index table 2003 and a document database 2004 are built on the HDD 1910.

文書管理部２００２は、指定された条件に従って、文書データベース２００４に蓄積された文書のデータを検索する。ここでは、コンピュータ１９０２から通信インターフェイス１９０９が検索要求を受信すると、プログラム１９１１の指令に従ってＣＰＵ１９０４がその検索要求から、コンピュータ１９０２のユーザが指定した条件を抽出する。文書の本文データについて検索条件がその本文データに含まれるワードを指定する場合、ＣＰＵ１９０４はバス１９０５を通じてＨＤＤ１９１０にアクセスし、その検索ワードに対応するＩＤデータをインデックステーブル２００３から取得する。取得したＩＤデータからリストを作成し、そのリストをＨＤＤ１９１０に一時的に記憶する。 The document management unit 2002 searches the document data stored in the document database 2004 according to the specified condition. Here, when the communication interface 1909 receives a search request from the computer 1902, the CPU 1904 extracts conditions specified by the user of the computer 1902 from the search request in accordance with an instruction of the program 1911. When the search condition for the text data of the document specifies a word included in the text data, the CPU 1904 accesses the HDD 1910 through the bus 1905 and acquires ID data corresponding to the search word from the index table 2003. A list is created from the acquired ID data, and the list is temporarily stored in the HDD 1910.
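
The lookup of ID data for a search word can be sketched with a simple N-character index. This is a minimal illustration under stated assumptions: the index is an in-memory dictionary rather than a table on the HDD 1910, N is 2, and the function names, document IDs, and sample texts are hypothetical.

```python
def build_index(documents, n=2):
    """Build {n-character key: set of document IDs} from {ID: text}."""
    index = {}
    for doc_id, text in documents.items():
        for i in range(len(text) - n + 1):
            index.setdefault(text[i:i + n], set()).add(doc_id)
    return index


def search(index, documents, word, n=2):
    """Return a sorted list of IDs of documents containing `word`."""
    if len(word) < n:
        # Too short for the index: fall back to scanning the texts.
        return sorted(d for d, t in documents.items() if word in t)

    keys = [word[i:i + n] for i in range(len(word) - n + 1)]
    # Only documents that contain every key can contain the word.
    candidates = set.intersection(*(index.get(k, set()) for k in keys))
    # The keys may occur apart from each other, so confirm each hit.
    return sorted(d for d in candidates if word in documents[d])


if __name__ == "__main__":
    docs = {"d1": "the car engine", "d2": "the earth", "d3": "engine oil"}
    idx = build_index(docs)
    print(search(idx, docs, "engine"))
```

The returned list corresponds to the list of ID data that is stored temporarily and later used to fetch the character data of each retrieved document.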

キーワード抽出部２００５は、検索した文書のデータから、その文書に含まれるワードを複数抽出する。ここでは、検索ワードに対応するＩＤデータを取得すると、ＣＰＵ１９０４がＨＤＤ１９１０にアクセスし、取得したＩＤデータによって特定される文書データのファイルから文字データを取得する。ＣＰＵ１９０４は、文書データに含まれる文字データから、複数のキーワードを抽出する。文書の検索条件が文書のデータに含まれるワードを指定する場合、その指定されたワードも、キーワードの抽出に用いることができる。その場合、ＣＰＵ１９０４は、文書データに含まれる文字データから複数のワードを選択し、選択したワードのうち、指定されたワードに対応するワードをキーワードとして抽出する。 The keyword extraction unit 2005 extracts a plurality of words included in the document from the retrieved document data. Here, when ID data corresponding to a search word is acquired, the CPU 1904 accesses the HDD 1910 and acquires character data from a file of document data specified by the acquired ID data. The CPU 1904 extracts a plurality of keywords from the character data included in the document data. When a document search condition specifies a word included in document data, the specified word can also be used for keyword extraction. In that case, the CPU 1904 selects a plurality of words from the character data included in the document data, and extracts a word corresponding to the designated word from the selected words as a keyword.

画像データベース２００６、画像管理テーブル２００７、画像検索部２００８、画像データ取得部２００９、およびイラスト作成部２０１０は、図２で示した画像データベース２０５、画像管理テーブル２０６、画像検索部２０７、画像データ取得部２０８、およびイラスト作成部２０９と基本的に同様である。ＣＰＵ１９０４はプログラム１９１１の指令に従い、上述の手順で、抽出したキーワードから文書のイラストを作成する。検索した文書が複数ある場合には、それぞれの文書に対しイラストを作成する。文書毎に作成したイラストは、ＨＤＤ１９１０に一時的に記憶する。また作成したイラストのファイルにもそのファイルを特定するＩＤデータを与え、ＨＤＤ１９１０に記憶したリストのＩＤデータに対応付ける。 The image database 2006, the image management table 2007, the image search unit 2008, the image data acquisition unit 2009, and the illustration creation unit 2010 are basically the same as the image database 205, the image management table 206, the image search unit 207, the image data acquisition unit 208, and the illustration creation unit 209 shown in FIG. 2. In accordance with commands from the program 1911, the CPU 1904 creates an illustration of the document from the extracted keywords by the procedure described above. If a plurality of documents are retrieved, an illustration is created for each document. The illustration created for each document is temporarily stored in the HDD 1910. ID data identifying the file is also given to each created illustration file and is associated with the ID data in the list stored in the HDD 1910.

The document list output unit 2011 uses the illustration created for each retrieved document to output data for a list of the retrieved documents to the communication processing unit 2001. The communication processing unit 2001 transmits that data to the computer 1902 that requested the search. To realize this function, the CPU 1904 here acquires the list from the HDD 1910 in accordance with the instructions of the program 1911. From the acquired list and the ID data contained in it, the CPU 1904 creates data for displaying a list of the retrieved documents. In addition to the illustration data, this data includes document attribute data such as the title and type of the document, its author, its date of issue or publication, and the location of its file. From this data, the CPU 1904 creates communication data addressed to the computer 1902 and transmits it to the computer 1902 using the communication interface 1909. Upon receiving the communication data via the network 1903, the computer 1902 displays a list of the documents based on the received data. FIG. 21 shows an example of a display screen for the document list.
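One conceivable way to assemble the list data sent to the requesting computer is sketched below. The field names, the use of JSON as the transfer format, and the function names are assumptions made for illustration only; the description does not specify a data format.

```python
import json

def build_document_list(doc_ids, attributes, illustration_ids):
    """Combine attribute data with the illustration ID for each retrieved document."""
    entries = []
    for doc_id in doc_ids:
        attrs = attributes[doc_id]
        entries.append({
            "doc_id": doc_id,
            "title": attrs["title"],
            "author": attrs["author"],
            "location": attrs["location"],
            # None if no illustration was created for this document
            "illustration_id": illustration_ids.get(doc_id),
        })
    return entries

def to_payload(entries):
    """Serialize the list for transmission (JSON is an assumed format)."""
    return json.dumps(entries, ensure_ascii=False)

attributes = {"d1": {"title": "Camera basics", "author": "A. Writer",
                     "location": "/docs/d1.txt"}}
entries = build_document_list(["d1"], attributes, {"d1": "ill001"})
print(to_payload(entries))
```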

The document list display screen 2101 has a search condition input field 2102, a search execution button 2103, and a list display section 2104. In this example, the user uses the search condition input field 2102 on the computer 1902 to enter or change a search keyword. The user uses the search execution button 2103 to instruct the computer 1901 to execute a search. The list display section 2104 displays a list of the retrieved documents.

For each retrieved document, the list includes attribute data 2105 and the corresponding illustration 2106. In this example, the title and author of the document are displayed from among the attribute data. The title of the document carries a link 2107 to the text data. When the user operates this link 2107 on the computer 1902, the computer 1902 requests the text data from the computer 1901. Upon receiving the text data from the computer 1901, the computer 1902 displays it. Instead of including the illustrations in the document list as in the example of FIG. 21, the corresponding illustration may be displayed together with the text data when the text data is displayed.

By displaying an illustration in association with each document in this way, the user can grasp the content of the retrieved documents easily and quickly. Consequently, even when a large number of documents are retrieved, the necessary documents can be selected efficiently.

Such a search system can be used not only for searching papers, magazines, and information on the Internet, but also for searching official publications relating to patents, utility models, designs, and other intellectual property. In the case of a published patent application, the attribute data of the publication includes data for items such as the applicant, IPC, application number, filing date, publication number, publication date, inventor, and title of the invention. The user of the computer 1902 can specify a search condition for one or more of these items. When the user specifies such item data for the attribute data as a search condition, the computer 1901 searches the attribute data of the publications for the specified items.

The text data may also be prepared separately for each of the documents attached to the application, namely the specification, the claims, and the abstract. To allow only some of these documents to be designated, their document names may be included among the search condition items. Furthermore, to allow part of the claims to be designated, data such as claim numbers may be included among the search condition items. Likewise, to allow particular parts of the specification to be designated, headings in the specification such as "Background Art", "Problem to Be Solved by the Invention", and "Means for Solving the Problem" may be included among the search condition items. When the user specifies such an item for the text data as a search condition, the computer 1901 searches the text data of the publications for the specified item.

When publications have been retrieved on the basis of attribute data or text data, the computer 1901 creates an illustration for each retrieved publication, as described above. Using the illustration created for each retrieved publication, the computer 1901 creates data for displaying a list of the retrieved publications and transmits that data to the computer 1902 that requested the search. Upon receiving the data via the network 1903, the computer 1902 displays a list of the retrieved publications based on that data. FIG. 22 shows an example of a display screen for the publication list.

The publication list display screen 2201 has a search condition display section 2202 and a list display section 2203. The search condition display section 2202 displays the search condition specified by the user. The list display section 2203 displays a list of the retrieved publications.

For each retrieved publication, the list includes, for example, a publication number 2204, a representative drawing 2205 of the publication, and the corresponding illustration 2206. The publication number 2204 carries a link 2207 to the text data. When the user operates this link 2207 on the computer 1902, the computer 1902 requests the text data of the publication from the computer 1901. Upon receiving the text data of the publication from the computer 1901, the computer 1902 displays it. Instead of including the representative drawing 2205 and the illustration 2206 in the publication list as in the example of FIG. 22, the representative drawing and the illustration may be displayed together with the text data when the text data is displayed.

From the representative drawing 2205 and the illustration 2206, the user can easily grasp the content of the publication. Because the illustration 2206 is created from the text data, it may resemble the representative drawing or differ greatly from it. In either case, the illustration gives the user, at a glance, information that cannot be obtained from the representative drawing alone. Consequently, even when a large number of publications are retrieved, the necessary publications can be selected efficiently.

The present invention can also be applied to a file management system that uses a computer. As in the other systems, a general-purpose computer can be used for this purpose. Here, the computer 101 shown in FIG. 1 is used.

As shown in FIG. 23, in this example the HDD 111 of the computer 101 stores the file of a file management module 2301 and other files. The file management module 2301 is used to provide the OS 106 with a function for hierarchically managing file storage locations. By using this file management module 2301, the OS 106 also gains a function that helps the user grasp the document content of files containing document data. When the computer 101 starts up, the CPU 102, in accordance with the instructions of the OS 106, reads the file of the file management module 2301 from the HDD 111 into the RAM 105 and realizes the functions provided by the file management module 2301.

When the computer 101 operates in accordance with the instructions of the OS 106 using the file management module 2301, the computer 101 comprises, as shown in FIG. 24, a user interface unit 2401, a file management unit 2402, a file storage unit 2403, a file specifying unit 2404, a keyword extraction unit 2405, an image database 2406, an image management table 2407, an image search unit 2408, an image data acquisition unit 2409, an illustration creation unit 2410, and a file list output unit 2411.

The user interface unit 2401 is basically the same as the user interface unit 201 shown in FIG. 2 in that it provides a GUI. The user interface unit 2401 provides a GUI with which the user, using the input device 110, specifies a file storage location or selects a file, and with which a list of files is displayed on the display 109.

The file management unit 2402 hierarchically manages file storage locations. Here, the file management unit 2402 manages the files stored in the file storage unit 2403. A storage area of the HDD 111 can be used as the file storage unit 2403. The file storage unit 2403 normally stores a large number of files, such as files required by the OS 106, drivers, and applications, and files that the user has created or acquired from other computers over a network. These files contain data in text or binary form. To manage such files, a management area is provided in the storage area of the HDD 111 in addition to the file storage unit 2403. This management area stores data representing a hierarchical directory. In accordance with the instructions of the OS 106, the CPU 102 uses the data in the management area to manage the file storage locations hierarchically.

The file specifying unit 2404 identifies, from among the files stored in a selected location, the files that contain document data. The user can select a file storage location using a file operation screen provided by the user interface unit 2401. FIG. 25 shows an example of the file operation screen. The file operation screen 2501 has a directory area 2502 and a file area 2503. Here, the directory area 2502 displays the root directory and the folders in the hierarchy below it. In this example, a root directory 2504 and a folder 2505 named "Documents" are illustrated. By operating a pointer 2506 with the input device 110, the user can instruct the computer 101 to select a folder displayed in the directory area 2502. Here, the file area 2503 displays the icons and names of the files in the selected folder. In this example, files 2507 to 2509 in the folder 2505 are illustrated. By operating the pointer 2506 with the input device 110, the user can instruct the computer 101 to select a file displayed in the file area 2503. When the user gives an instruction such as execute, delete, or rename for the selected file, the CPU 102 executes the corresponding processing in accordance with the instructions of the OS 106.

The file operation screen 2501 also has an illustration button 2510. When the user presses this button 2510, the CPU 102 changes the display format of the file area 2503. For that change, the CPU 102 first identifies, for example, the currently selected folder. Having identified the selected folder, it identifies, from among the files stored in that folder, the files that contain document data. This identification can be made according to file type. When the folder contains no file with document data, the CPU 102 may stop the processing. In the example of FIG. 25, the files 2507 and 2508 contain document data and the file 2509 does not. In this case, the CPU 102 identifies the files 2507 and 2508 as the files containing document data.
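Identification by file type might, for example, be approximated by checking file name extensions, as in the following minimal sketch. The set of extensions treated as "document" types and the function name are assumptions; the description only states that the identification can be made according to file type.

```python
from pathlib import PurePath

# Assumed set of extensions regarded as containing document data.
DOCUMENT_SUFFIXES = {".txt", ".doc", ".pdf", ".html"}

def files_with_document_data(file_names, suffixes=DOCUMENT_SUFFIXES):
    """Return the files in a folder listing whose type indicates document data."""
    return [name for name in file_names
            if PurePath(name).suffix.lower() in suffixes]

folder = ["report.txt", "photo.jpg", "Notes.DOC"]
targets = files_with_document_data(folder)
if not targets:
    print("no document files; processing may stop here")
else:
    print(targets)
```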

The keyword extraction unit 2405 extracts a plurality of words contained in a document based on the document data of an identified file. Here, it acquires the data of the identified file from the file storage unit 2403 and, based on the document data of that file, extracts a plurality of words contained in the document. To realize this function, the CPU 102 reads the data of the identified file from the HDD 111 in accordance with the instructions of the OS 106.

The image database 2406, the image management table 2407, the image search unit 2408, the image data acquisition unit 2409, and the illustration creation unit 2410 are basically the same as the image database 205, the image management table 206, the image search unit 207, the image data acquisition unit 208, and the illustration creation unit 209 shown in FIG. 2. In accordance with the instructions of the OS 106, the CPU 102 creates an illustration of the document from the extracted keywords by the procedure described above. When a plurality of files have been identified, an illustration is created for each file.

The file list output unit 2411 uses the illustration created for each identified file to output a list of the selected files. Here, the list of files is output through the file operation screen 2501 provided by the user interface unit 2401. To output the list, the CPU 102, having created an illustration for each identified file, creates an icon image for the corresponding file from that illustration. Having created the icon image, it uses that image to update the icon of the file displayed in the file area 2503 of the file operation screen 2501. Updating the icon display in this way completes the change of the file list display format. FIG. 26 shows an example of the file operation screen after the display format has been changed.

After the display format has been changed, the icons of some of the files displayed in the file area 2503 are updated, as shown in FIG. 26. In this example, the icon of the file 2509, which contains no document data, is unchanged. The icons of the files 2507 and 2508, which contain document data, have been updated using the corresponding illustration images.

By changing the display format of the file list in this way, the user can easily distinguish the files that contain document data and can grasp the content of those documents easily and quickly.

In a file management system such as the one described above, when attribute data such as a creation date and time is available, or when attribute data has been given to the selected storage location, that attribute data can be used for keyword extraction.

The document content grasping support program 114, the document search program 1911, and the file management module 2301 in such computer systems are not limited to the computers 101 and 1901 and can also be run on other computers and on devices such as mobile terminals and mobile phones. If a vector format such as SVG (Scalable Vector Graphics) is used as the data format of the illustration, the display size of the illustration can be changed smoothly to suit the display area of the device. Moreover, even on a device such as a mobile terminal or mobile phone, whose display area is too small for long passages of text, using an illustration makes it possible to grasp the content of the whole text easily and quickly. FIG. 27 shows an example in which an illustration is displayed on a mobile phone.
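The benefit of a vector format can be illustrated with a short sketch that wraps the same drawing content in SVG markup for two different display sizes. The function name, the 100 x 100 coordinate system, the example shape, and the display sizes are assumptions; `width`, `height`, and `viewBox` are standard attributes of the SVG root element, and the `viewBox` lets the renderer scale the unchanged content to the requested size.

```python
def make_svg(display_width, display_height, body, view_box=(0, 0, 100, 100)):
    """Wrap drawing content in an <svg> element sized for a given display.

    The body is authored once in view_box coordinates; only width and
    height change from device to device.
    """
    min_x, min_y, w, h = view_box
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{display_width}" height="{display_height}" '
        f'viewBox="{min_x} {min_y} {w} {h}">{body}</svg>'
    )

body = '<circle cx="50" cy="50" r="40" fill="none" stroke="black"/>'
phone_svg = make_svg(176, 176, body)      # small display (assumed size)
desktop_svg = make_svg(640, 640, body)    # large display (assumed size)
print(phone_svg)
```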

The mobile phone 2701 has a liquid crystal display 2702 and operation buttons 2703 and 2704. By operating in accordance with the instructions of a program corresponding to the programs and module described above, the mobile phone 2701 displays an illustration on the display 2702. In this example, the illustration 901 is displayed on the display 2702. For example, when the user uses the button 2703 to instruct the mobile phone 2701 to change the page, the mobile phone 2701, in accordance with the instructions of the program, creates an illustration corresponding to the destination page and displays the newly created illustration on the display 2702. In a limited area, it is easier to display an illustration recognizably than to display text legibly. The user can therefore grasp the content of the text easily and quickly even in a relatively small display area. Furthermore, using illustrations makes it easier to advance through pages than scrolling through the entire text. For example, pressing the button 2704 switches between displaying the text of the page and displaying the illustration. The user advances through the pages by updating the illustrations one after another and, on reaching a page whose text is of interest, switches the display. In this way, the user can read only the text that is needed.

The document content grasping support program 114, the document search program 1911, and the file management module 2301 can be provided to interested parties and third parties over a telecommunication line such as the Internet or by being stored on a computer-readable recording medium. For example, the instructions of a program can be expressed as electrical, optical, or magnetic signals and transmitted on a carrier wave, so that the program is provided over a transmission medium such as coaxial cable, copper wire, or optical fiber. As the computer-readable recording medium, optical media such as CD-ROM and DVD-ROM, magnetic media such as flexible disks, and semiconductor memory such as flash memory and RAM can be used.

The instructions of the document content grasping support program 114, the document search program 1911, and the file management module 2301 may also be processed in a distributed manner by a plurality of computers.

The embodiments described above do not limit the technical scope of the present invention, and various modifications and applications other than those already described are possible within the scope of the invention. For example, instead of a still-image illustration, an animated image or a moving image can be used as the illustration.

In the above description, words with a high appearance frequency are preferentially extracted as keywords, but the invention is not limited to this. Words contained in headings and words displayed with emphasis may be preferentially extracted as keywords. Alternatively, words with a low appearance frequency may be preferentially extracted as keywords. For example, in a document that includes a figure in which the important matters are already described, extracting low-frequency words as keywords makes it possible to create an illustration that complements the figure. In addition, not only nouns but also words of other parts of speech, such as adjectives, may be extracted.
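These alternative priorities can be expressed as different sort keys over the same word counts, as in the following minimal sketch. The function name, parameters, and the rule that heading or emphasized words always rank ahead of other words are assumptions chosen for illustration.

```python
from collections import Counter

def rank_words(words, heading_words=(), emphasized_words=(),
               prefer_rare=False, top_n=3):
    """Rank candidate keywords.

    Words found in headings or shown with emphasis come first; the rest are
    ordered by appearance frequency, descending by default or ascending when
    prefer_rare is True. Ties keep the order of first appearance.
    """
    counts = Counter(words)
    boosted = set(heading_words) | set(emphasized_words)

    def sort_key(word):
        freq = counts[word]
        return (word not in boosted, freq if prefer_rare else -freq)

    return sorted(counts, key=sort_key)[:top_n]

words = ["engine", "engine", "engine", "wheel", "wheel", "valve"]
print(rank_words(words))                           # frequent words first
print(rank_words(words, prefer_rare=True))         # rare words first
print(rank_words(words, heading_words=["valve"]))  # heading word first
```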

The images used for the illustration may also be searched for on the basis of words that are not contained in the text, in addition to the words extracted from it. For example, a database representing relationships between words is prepared. Words related to the words extracted from the text are then retrieved from that database, and the related words are also used to search for images. This can improve the completeness of the illustration and increase its variety.
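Expansion with related words might be sketched as follows, with a small dictionary standing in for the word-relationship database. The contents of `RELATED_WORDS` and the function name are hypothetical.

```python
# Hypothetical stand-in for a database of relationships between words.
RELATED_WORDS = {
    "camera": ["photograph", "lens"],
    "sea": ["wave", "beach"],
}

def expand_with_related(words, relations=RELATED_WORDS):
    """Append words related to the extracted words, without duplicates.

    The extracted words keep their order and come first; related words
    follow in the order in which they are found.
    """
    expanded = list(words)
    seen = set(words)
    for word in words:
        for related in relations.get(word, []):
            if related not in seen:
                seen.add(related)
                expanded.append(related)
    return expanded

print(expand_with_related(["camera", "lens"]))
# The expanded list can then be used for the image search in place of the
# extracted words alone.
```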

The document content grasping support system, document content grasping support method, and document content grasping support program according to the present invention have the effect of allowing the content of a document, including the content of its text, to be grasped easily and quickly, and are useful in text browsing systems as well as in search systems, file management systems, and the like.

FIG. 1: Diagram showing the hardware configuration of the browsing system
FIG. 2: Functional block diagram of the browsing system
FIG. 3: Diagram showing an example of an electronic book browsing screen
FIG. 4: Flowchart showing the procedure that the document content grasping support program causes the computer to execute
FIG. 5: Flowchart showing the procedure for extracting keywords
FIG. 6: Diagram showing an example of text data
FIG. 7: Diagram for explaining the image database
FIG. 8: Diagram showing an example of the image management table
FIG. 9: Diagram showing an example of a created illustration
FIG. 10: Diagram for explaining the illustration output screen
FIG. 11: Another functional block diagram of the browsing system
FIG. 12: Diagram for explaining the arrangement management table
FIG. 13: Diagram showing another example of text data
FIG. 14: Flowchart showing another procedure that the document content grasping support program causes the computer to execute
FIG. 15: Diagram showing another example of the browsing screen
FIG. 16: Yet another functional block diagram of the browsing system
FIG. 17: Diagram showing another hardware configuration of the browsing system
FIG. 18: Diagram showing an example of a web browser screen
FIG. 19: Diagram showing the hardware configuration of the search system
FIG. 20: Functional block diagram of the search system
FIG. 21: Diagram showing an example of the display screen for the list of retrieved documents
FIG. 22: Diagram showing an example of the display screen for the list of retrieved publications
FIG. 23: Diagram showing the hardware configuration of the file management system
FIG. 24: Functional block diagram of the file management system
FIG. 25: Diagram showing an example of the file operation screen
FIG. 26: Diagram for explaining another display format of the file operation screen
FIG. 27: Diagram showing an example in which an illustration is displayed on a mobile phone

Explanation of symbols

114 Document content grasping support program
201, 2401 User interface unit
202 Document data acquisition unit
203 Data analysis unit
204, 2005, 2405 Keyword extraction unit
205, 2006, 2406 Image database
206, 2007, 2407 Image management table
207, 2008, 2408 Image search unit
208, 2009, 2409 Image data acquisition unit
209, 2010, 2410 Illustration creation unit
210 Illustration output unit
211 Layout determination unit
212 Arrangement management table
213 Target setting unit
214 Display area specifying unit
1911 Document search program
2001 Communication processing unit
2002 Document management unit
2003 Index table
2004 Document database
2011 Document list output unit
2402 File management unit
2403 File storage unit
2404 File specifying unit
2411 File list output unit

Claims

1. A document content grasping support system comprising:
means for extracting, from data of a document, a plurality of words contained in the document;
an image database that stores image data;
a table that associates words with images;
means for searching the table for an image corresponding to each of the extracted words;
means for acquiring the data of the retrieved images from the image database;
means for creating an illustration of the document using the acquired image data; and
means for outputting the created illustration.

2. The document content grasping support system according to claim 1, wherein the means for extracting the word extracts the word according to the appearance frequency in the document.

3. The document content grasping support system according to claim 1, further comprising means for determining a layout of the acquired image data,
wherein the means for creating the illustration lays out the acquired image data according to the determination.

4. The document content grasping support system according to claim 1, wherein the table associates each word with a group to which the word belongs, and
the means for creating the illustration creates an illustration for each group of the extracted words.

5. The document content grasping support system according to claim 4, wherein the means for creating the illustration determines the size relationship between the illustrations according to the appearance frequency of each extracted word group.

6. The document content grasping support system according to claim 1, further comprising means for setting a target part in part or all of the document,
wherein the means for extracting the words extracts, from data of the set target part, a plurality of words contained in the target part.

7. The document content grasping support system according to claim 6, further comprising means for displaying the document,
wherein the means for setting the target part has means for specifying a display area of the document and sets the target part in the specified display area.

8. The document content grasping support system according to claim 7, wherein, when the data of the displayed document includes data indicating pages, the means for specifying the display area detects a change of the displayed page and specifies the display area based on the result.

9. A search system comprising:
a document database that stores document data;
document search means for searching the stored document data according to a specified condition;
means for extracting, based on the data of a retrieved document, a plurality of words contained in the document;
an image database that stores image data;
a table that associates words with images;
image search means for searching the table for an image corresponding to each of the extracted words;
means for acquiring the data of the retrieved images from the image database;
means for creating an illustration of the document using the acquired image data; and
means for outputting a list of the retrieved documents using the illustration created for each retrieved document.

10. The retrieval system according to claim 9, wherein the means for extracting the words selects the words to be extracted based on the designated word when the document search condition designates a word included in the document data.

11. A file management system comprising:
means for hierarchically managing file storage locations;
means for identifying a file containing document data from the files stored in a selected location;
means for extracting a plurality of words contained in the document based on the document data of the identified file;
an image database that stores image data;
a table that associates words with images;
means for searching the table for an image corresponding to each of the extracted words;
means for obtaining the retrieved image data from the image database;
means for creating an illustration of the document using the acquired image data; and
means for outputting a list of the files stored in the selected location using the illustration created for each identified file.

12. A document content grasping support method for supporting the grasp of the contents of a document using a computer, the method comprising:
a procedure in which the computer extracts a plurality of words contained in the document from the document data;
a procedure in which the computer searches a table associating words with images for the images corresponding to the plurality of extracted words;
a procedure in which the computer acquires the retrieved image data from an image database that stores image data;
a procedure in which the computer creates an illustration of the document using the acquired image data; and
a procedure in which the computer outputs the created illustration.
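The sequence of procedures in the method claim above (extract words, search the word-image table, acquire the image data, create the illustration, output it) can be sketched as a small pipeline. Everything concrete here is an assumption for illustration: the in-memory dictionaries standing in for the table and the image database, the placeholder byte strings, the function names, and the representation of an "illustration" as an ordered list of tiles.

```python
import re
from collections import Counter

# Hypothetical stand-ins for the word-image table and the image database.
WORD_TO_IMAGE = {"cat": "img_cat", "dog": "img_dog"}
IMAGE_DB = {"img_cat": b"<cat image bytes>", "img_dog": b"<dog image bytes>"}

def extract_words(document, top_n=3):
    """Procedure 1: extract a plurality of words (here, the most frequent)."""
    tokens = re.findall(r"[a-z]+", document.lower())
    return [w for w, _ in Counter(tokens).most_common(top_n)]

def search_images(words, table):
    """Procedure 2: look up the image identifier for each extracted word."""
    return [table[w] for w in words if w in table]

def fetch_image_data(image_ids, image_db):
    """Procedure 3: acquire the data of the retrieved images."""
    return [image_db[i] for i in image_ids if i in image_db]

def create_illustration(images):
    """Procedure 4: combine the image data into one illustration (a tile list)."""
    return {"tiles": [{"slot": i, "data": d} for i, d in enumerate(images)]}

def summarize(document):
    """Run procedures 1 to 4 and return the illustration for output."""
    words = extract_words(document)
    image_ids = search_images(words, WORD_TO_IMAGE)
    return create_illustration(fetch_image_data(image_ids, IMAGE_DB))
```

Calling `summarize("cat cat dog bird")` yields an illustration with two tiles, one for "cat" and one for "dog"; "bird" has no entry in the hypothetical table and is skipped.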

13. A program for causing a computer to execute the procedures of the method according to claim 12.

14. A computer-readable recording medium on which the program according to claim 13 is recorded.