JP2012226492A

JP2012226492A - Document information providing device, document browsing terminal and method, and computer program

Info

Publication number: JP2012226492A
Application number: JP2011092288A
Authority: JP
Inventors: Kazuhiko Yamashita; 和彦山下; Masaki Kugimiya; 政紀釘宮
Original assignee: Magic Software Japan KK
Current assignee: Magic Software Japan KK
Priority date: 2011-04-18
Filing date: 2011-04-18
Publication date: 2012-11-15

Abstract

PROBLEM TO BE SOLVED: To provide a device or a terminal that creates, as a document in data format, a document consisting only of the pages that include a specific keyword, so that only the desired passages can be read and the processing load on the terminal is lightened by a smaller file size.
SOLUTION: A document information storage part 1A stores division document data, generated by dividing document data in a predetermined file format into pages, in association with the text data included in each piece of division document data. On receiving a request to retrieve division document data from a document browsing terminal 2 together with a keyword, a division document data extraction part 105 refers to the document information storage part 1A and extracts the division document data associated with text data including the keyword. A merging processing part 106 merges the extracted division document data to generate merged document data, which is provided to the document browsing terminal 2.

Description

The present invention relates to a technique for browsing the pages of a data-format document that contain a specified keyword.

At present, various documents such as product catalogs and instruction manuals are converted into data and made available for viewing on a variety of terminals.
However, such documents often run to a very large number of pages, and the data is frequently too large to browse conveniently, particularly on a mobile terminal.

In particular, when negotiating with a business partner while displaying a product catalog on a terminal, there has been a demand to display the desired passage accurately and quickly.

In this regard, Patent Document 1 proposes a document search method for storing and searching information contained in a document part of at least one page, the method comprising: a step of generating image data representing the information contained in the document part; a step of storing the image data in a first storage unit; a step of generating text data representing the text portion of the information contained in the document part; a step of storing the text data in a second storage unit; a step of generating a table that holds data indicating the coordinate information of each word in the document part and associates the text data with the image data; a step of determining a search term within the text data according to search criteria defined by the operator; a step of determining coordinate information from the table according to the search term determined within the text data; and a step of partially displaying a page of the document part that contains at least part of the search term, based on the determined coordinate information.
Patent Document 2 proposes a document management system for a network system in which client terminals and a server are connected via a network. In this system, a client terminal generates, from a selected document, an image file corresponding to the output image of that document, transfers the original file of the document and the generated image file to the server, and, when an image file is transferred from the server, performs display based on that image file. The server stores the image file and the original file and sends them out in response to requests from the client terminals. The file format of the image file is unified across the client terminals.

JP-A-7-93374
JP-A-10-69476

According to Patent Document 1 or Patent Document 2, document data in a given data format can be unified into a predetermined format, and image data (files) can be provided in response to a client's request.

However, when document data consisting of a plurality of pages contains several pages that include a given keyword, opening and browsing each of those pages individually is cumbersome.

Accordingly, an object of the present invention is to provide an apparatus or a terminal that, for a data-format document, generates a document composed only of the pages containing a specified keyword, so that the viewer can read only the desired passages and the processing load on the terminal is reduced through a smaller file size.

To achieve the above object, a document information providing apparatus according to one aspect of the present invention is an apparatus that is configured to communicate, via a network, with a document browsing terminal for browsing data-format documents and that provides the relevant document for browsing in response to a search request from the document browsing terminal, the apparatus comprising: division processing means for dividing document data of a predetermined file format, consisting of one or more pages, into divided document data for each page; document information storage means for storing, for each piece of divided document data, the text data contained in that divided document data in association with it; search request receiving means for receiving, from the document browsing terminal, a search request for the divided document data together with a keyword; divided document data extracting means for referring to the document information storage means in response to the search request and extracting the divided document data associated with text data containing the keyword; combination processing means for combining the extracted divided document data to generate combined document data; and combined document data transmitting means for transmitting the combined document data to the document browsing terminal.

The apparatus may further comprise conversion processing means for converting original document data into document data of the predetermined file format.

The apparatus may further comprise: generation processing means for generating a thumbnail for each piece of divided document data; thumbnail extracting means for referring to the document information storage means in response to a search request from the document browsing terminal and extracting the thumbnails of the divided document data associated with text data containing the keyword; search result transmitting means for transmitting, to the document browsing terminal, a list of the extracted thumbnails as the search result for the search request; and combination request receiving means for receiving, from the document browsing terminal, a request to combine the divided document data indicated by the thumbnails. In this case, the document information storage means stores, for each piece of divided document data, the text data contained in that divided document data and the thumbnail of that divided document data in association with each other, and the divided document data extracting means refers to the document information storage means in response to the combination request and extracts the divided document data indicated by the thumbnails.

The apparatus may further comprise identification processing means for identifying, among the thumbnails extracted by the thumbnail extracting means, the thumbnails of the divided document data associated with text data containing the keyword. In this case, the thumbnail extracting means refers to the document information storage means in response to a search request from the document browsing terminal and extracts all of the thumbnails; the search result transmitting means transmits to the document browsing terminal, as the search result for the search request, a list of the extracted thumbnails in which the identified thumbnails are marked so as to be distinguishable from the other thumbnails; the combination request receiving means receives, from the document browsing terminal, a request to combine the divided document data indicated by the marked thumbnails; and the divided document data extracting means refers to the document information storage means and extracts the divided document data indicated by the marked thumbnails.

The apparatus may further comprise size information receiving means for receiving, from the document browsing terminal, size information on the display that outputs the data. In this case, the generation processing means generates thumbnails of a plurality of sizes for each piece of divided document data, and the thumbnail extracting means refers to the document information storage means in response to a search request from the document browsing terminal and extracts, from among the thumbnails or the thumbnails associated with text data containing the keyword, the thumbnails whose size corresponds to the size information of the display of the document browsing terminal.
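As an illustration of extracting a thumbnail "of a size corresponding to the display size information", the following minimal Python sketch picks one of several pre-generated thumbnail widths. The selection rule (the widest thumbnail that still fits the display, otherwise the smallest one) and the names `choose_thumbnail_width` and `STORED_WIDTHS` are assumptions made for this example; the source does not specify how the size is matched.

```python
def choose_thumbnail_width(available_widths: list[int], display_width: int) -> int:
    """Pick the widest stored thumbnail that still fits the display.

    Assumed rule for illustration only: the source states only that a
    thumbnail of a size corresponding to the display size is extracted.
    """
    if not available_widths:
        raise ValueError("no thumbnails were generated")
    fitting = [w for w in available_widths if w <= display_width]
    # Fall back to the smallest thumbnail when none of them fits.
    return max(fitting) if fitting else min(available_widths)


# Hypothetical widths (in pixels) generated for each page in advance.
STORED_WIDTHS = [120, 240, 480]
phone_choice = choose_thumbnail_width(STORED_WIDTHS, 320)
```

Generating the sizes in advance means the terminal's request is answered by a lookup, with no resizing needed at search time.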

A document information providing method according to another aspect of the present invention is a method in which a computer, configured to communicate via a network with a document browsing terminal for browsing data-format documents, provides the relevant document for browsing in response to a search request from the document browsing terminal, the method comprising executing: a process of dividing document data of a predetermined file format, consisting of one or more pages, into divided document data for each page; a process of storing in document information storage means, for each piece of divided document data, the text data contained in that divided document data in association with it; a process of receiving, from the document browsing terminal, a search request for the divided document data together with a keyword; a process of referring to the document information storage means in response to the search request and extracting the divided document data associated with text data containing the keyword; a process of combining the extracted divided document data to generate combined document data; and a process of transmitting the combined document data to the document browsing terminal.

A computer program according to yet another aspect of the present invention is a program for causing a computer, configured to communicate via a network with a document browsing terminal for browsing data-format documents, to function as a document information providing apparatus that provides the relevant document for browsing in response to a search request from the document browsing terminal, the program causing the computer to execute: a process of dividing document data of a predetermined file format, consisting of one or more pages, into divided document data for each page; a process of storing in document information storage means, for each piece of divided document data, the text data contained in that divided document data in association with it; a process of receiving, from the document browsing terminal, a search request for the divided document data together with a keyword; a process of referring to the document information storage means in response to the search request and extracting the divided document data associated with text data containing the keyword; a process of combining the extracted divided document data to generate combined document data; and a process of transmitting the combined document data to the document browsing terminal.

A document browsing terminal according to one aspect of the present invention is a terminal for browsing data-format documents, comprising: division processing means for dividing document data of a predetermined file format, consisting of one or more pages, into divided document data for each page; document information storage means for storing, for each piece of divided document data, the text data contained in that divided document data in association with it; search request accepting means for accepting a search request for the divided document data together with input of a keyword; divided document data extracting means for referring to the document information storage means in response to the search request and extracting the divided document data associated with text data containing the keyword; combination processing means for combining the extracted divided document data to generate combined document data; and display means for displaying the combined document data.

A document browsing method according to another aspect of the present invention is a method for browsing data-format documents by a computer, the method comprising executing: a process of dividing document data of a predetermined file format, consisting of one or more pages, into divided document data for each page; a process of storing in document information storage means, for each piece of divided document data, the text data contained in that divided document data in association with it; a process of accepting a search request for the divided document data together with input of a keyword; a process of referring to the document information storage means in response to the search request and extracting the divided document data associated with text data containing the keyword; a process of combining the extracted divided document data to generate combined document data; and a process of displaying the combined document data.

A computer program according to yet another aspect of the present invention is a program for causing a computer to function as a document browsing terminal for browsing data-format documents, the program causing the computer to execute: a process of dividing document data of a predetermined file format, consisting of one or more pages, into divided document data for each page; a process of storing in document information storage means, for each piece of divided document data, the text data contained in that divided document data in association with it; a process of accepting a search request for the divided document data together with input of a keyword; a process of referring to the document information storage means in response to the search request and extracting the divided document data associated with text data containing the keyword; a process of combining the extracted divided document data to generate combined document data; and a process of displaying the combined document data.

According to the present invention, even when document data consists of a very large number of pages, a document containing only the pages that include a specified keyword can be browsed, which is convenient for reading only the desired passages.
In addition, because the browsable document is generated solely from the pages that contain the specified keyword, the file size can be kept small, which reduces the processing load on the terminal.

A schematic diagram showing an overview of the processing performed by the document information providing apparatus and the document browsing terminal according to an embodiment of the present invention.
A functional block diagram showing the functions of the document information providing apparatus according to the first embodiment of the present invention.
A diagram showing an example of the data stored in the document information storage unit of the document information providing apparatus according to the present embodiment.
A process flow diagram showing the flow of processing executed by the document information providing apparatus according to the present embodiment.
A sequence diagram showing the flow of processing executed by the document information providing apparatus and the document browsing terminal according to the present embodiment.
A diagram showing an output example of the document information list provided by the document information providing apparatus according to the present embodiment.
A sequence diagram showing the flow of processing executed by the document information providing apparatus and the document browsing terminal according to the present embodiment.
A diagram showing an output example of the combined document data provided by the document information providing apparatus according to the present embodiment.
A functional block diagram showing the functions of the document information providing apparatus according to the second embodiment of the present invention.
A diagram showing an example of the data stored in the document information storage unit of the document information providing apparatus according to the present embodiment.
A process flow diagram showing the flow of processing executed by the document information providing apparatus according to the present embodiment.
A sequence diagram showing the flow of processing executed by the document information providing apparatus and the document browsing terminal according to the present embodiment.
A diagram showing an output example of the thumbnail list provided by the document information providing apparatus according to the present embodiment.
A sequence diagram showing the flow of processing executed by the document information providing apparatus and the document browsing terminal according to the present embodiment.
A sequence diagram showing the flow of processing executed by the document information providing apparatus and the document browsing terminal according to the present embodiment.
A functional block diagram showing the functions of the document information providing apparatus according to the third embodiment of the present invention.
A sequence diagram showing the flow of processing executed by the document information providing apparatus and the document browsing terminal according to the present embodiment.
A diagram showing an output example of the thumbnail list provided by the document information providing apparatus according to the present embodiment.
A sequence diagram showing the flow of processing executed by the document information providing apparatus and the document browsing terminal according to the present embodiment.
A functional block diagram showing the functions of the document information providing apparatus according to the fourth embodiment of the present invention.
A diagram showing an example of the data stored in the document information storage unit of the document information providing apparatus according to the present embodiment.
A sequence diagram showing the flow of processing executed by the document information providing apparatus and the document browsing terminal according to the present embodiment.
A functional block diagram showing the functions of the document browsing terminal according to the fifth embodiment of the present invention.
A process flow diagram showing the flow of processing executed by the document browsing terminal according to the present embodiment.
A process flow diagram showing the flow of processing executed by the document browsing terminal according to the present embodiment.

Next, a document information providing apparatus and a document browsing terminal according to embodiments of the present invention will be described with reference to the drawings.
FIG. 1 shows an overview of the processing performed by the document information providing apparatus according to the present embodiment.
Original data consisting of a data-format document is converted into full document data of a predetermined file format (S1).

The full document data is divided into divided document data for each page, and each piece of divided document data is associated with the text data it contains (divided text data) (S2).
The divided text data may be obtained by dividing text data extracted from the full document data so that it corresponds to the divided document data, or it may be extracted directly from the divided document data.

When a viewer who wishes to browse a document issues a document search request together with a keyword, the divided text data containing that keyword is identified, and the divided document data associated with that divided text data is extracted (S3).

The extracted divided document data is combined into a single piece of data and provided to the viewer as combined document data (S4).
The viewer can thus browse a document composed only of the pages that contain the desired keyword.
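Steps S2 to S4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation described in the patent: pages are represented by placeholder bytes, the "combined document" is simply the ordered list of extracted pages, keyword matching is a case-insensitive substring test, and the names `DividedPage`, `divide`, `extract`, and `merge` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DividedPage:
    """One page of divided document data with its divided text data (S2)."""
    page_number: int
    content: bytes  # stand-in for the single-page document data
    text: str       # divided text data associated with this page


def divide(full_document: list[tuple[bytes, str]]) -> list[DividedPage]:
    """S2: split a full document into per-page records with their text."""
    return [DividedPage(number, content, text)
            for number, (content, text) in enumerate(full_document, start=1)]


def extract(pages: list[DividedPage], keyword: str) -> list[DividedPage]:
    """S3: keep the pages whose divided text contains the keyword.

    Case-insensitive matching is an assumption made for this sketch.
    """
    needle = keyword.lower()
    return [page for page in pages if needle in page.text.lower()]


def merge(pages: list[DividedPage]) -> list[bytes]:
    """S4: combine the extracted pages, in page order, into one document."""
    return [page.content for page in sorted(pages, key=lambda p: p.page_number)]


# Hypothetical four-page catalog: (page data, page text).
full_document = [
    (b"<page 1>", "Product overview"),
    (b"<page 2>", "Battery specifications"),
    (b"<page 3>", "Warranty terms"),
    (b"<page 4>", "Replacing the battery"),
]
pages = divide(full_document)
merged = merge(extract(pages, "battery"))
```

Here `merged` holds only pages 2 and 4, so the data handed to the viewer is smaller than the full document.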

First, the document information providing apparatus according to the first embodiment of the present invention will be described.
As shown in FIG. 2, the document information providing apparatus 1 according to the first embodiment of the present invention is configured to communicate with the document browsing terminal 2 via a network NW such as the Internet.

The document information providing apparatus 1 is an apparatus that provides document data for browsing in response to requests from a viewer using the document browsing terminal 2.
The document information providing apparatus 1 comprises a CPU (Central Processing Unit), computer programs executed by the CPU, RAM (Random Access Memory) and ROM (Read Only Memory) that store the computer programs and predetermined data, and an external storage device such as a hard disk drive. Together these constitute the following functional blocks: a document information storage unit 1A, a conversion processing unit 101, a text data extraction unit 102, a division processing unit 103, a document information extraction unit 104, a divided document data extraction unit 105, a combination processing unit 106, and a communication processing unit 107.

The document information storage unit 1A is a storage unit capable of storing document information relating to the document data that is provided to viewers for browsing.
The document data is a document in electronic form and includes documents created with software such as Microsoft Word (registered trademark) or Microsoft PowerPoint (registered trademark) and saved in various file formats.
The document data is accepted by a predetermined acceptance method, such as via a network NW (for example, the Internet) or by direct input, and is registered in the document information storage unit 1A.

As shown in FIG. 3, for example, the document information storage unit 1A stores, for each item of document information, a document ID for identifying the document information, document summary information, original data, full document data, full text data, divided document data, thumbnails, and divided text data in association with one another.

The document summary information includes information outlining the document data, such as its author, creation date, and a summary of the document.
The original data is the original document data before any processing such as conversion or division is applied, and is held in the file format in which it was accepted by the predetermined acceptance method.

The full document data is data obtained by converting the original data into a predetermined file format by the conversion processing unit 101.
The full text data is the text data of the original data or of the full document data. It is extracted from the original data or the full document data by the text data extraction unit 102.

The divided document data is the per-page data of the full document data. It is obtained when the division processing unit 103 divides the full document data page by page, and all of the divided document data taken together make up the full document data.
The divided text data is the text data of each item of divided document data. It is obtained when the division processing unit 103 divides the full text data page by page in correspondence with the divided document data. The divided text data can also be obtained by extracting text data from the divided document data.
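
As a rough illustration of the associations described above, the stored items could be modeled as follows. This is a minimal sketch only: the class and field names (`DividedPage`, `DocumentInfo`, `store`) are hypothetical and do not appear in the specification, and an in-memory dictionary stands in for the document information storage unit 1A.

```python
from dataclasses import dataclass, field


@dataclass
class DividedPage:
    """One page: divided document data with its associated divided text data."""
    page_no: int
    document: bytes          # single-page file in the predetermined format
    text: str                # divided text data for this page
    thumbnail: bytes = b""   # optional reduced image of the page


@dataclass
class DocumentInfo:
    """Everything stored for one document, keyed by its document ID."""
    doc_id: str
    summary: dict            # e.g. creator, creation date, abstract
    original: bytes          # data as received
    full_document: bytes     # data after format conversion
    full_text: str           # text of the whole document
    pages: list = field(default_factory=list)   # list of DividedPage


# Hypothetical in-memory stand-in for the storage unit: document ID -> record.
store = {}
doc = DocumentInfo("doc-1", {"creator": "example"}, b"raw", b"converted",
                   "page one\fpage two")
doc.pages.append(DividedPage(1, b"p1", "page one"))
doc.pages.append(DividedPage(2, b"p2", "page two"))
store[doc.doc_id] = doc
```

Keeping each page's text next to its page data is what later allows a keyword match on the text to select the corresponding page.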

The conversion processing unit 101 converts the file format of the original data.
Through this conversion, the original data becomes full document data in a predetermined file format, consisting of one or more pages.
One example of the target file format is PDF (Portable Document Format).
When the target format is PDF and the original data is already a PDF, no conversion is needed and the conversion process may be skipped.

The text data extraction unit 102 extracts text-format data from the original data, the full document data, or the divided document data by removing layout information and formatting information.
In this way, full text data is extracted from the original data or the full document data, and divided text data is extracted from the divided document data.

When the full document data consists of multiple pages, the division processing unit 103 divides it into divided document data, one item per page.
The division processing unit 103 can also divide the full text data extracted from the full document data into per-page divided text data corresponding to the divided document data.
When the full document data consists of a single page, no division is needed and the division process may be skipped.
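
The division of full text data into per-page divided text data might look like the sketch below. It assumes that the extracted full text marks page boundaries with a form feed character (`\f`), as some text extractors do; the specification does not say how page boundaries are represented, and the function name `split_full_text` is hypothetical.

```python
def split_full_text(full_text: str) -> list:
    """Split full text data into one string per page.

    Assumption: pages are separated by a form feed character.
    """
    if not full_text:
        return []
    pages = full_text.split("\f")
    # A form feed at the very end would otherwise yield an empty last page.
    if pages[-1] == "":
        pages.pop()
    return pages
```

A single-page text contains no separator and comes back unchanged as a one-element list, which mirrors the case in which no division is needed.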

Based on a document ID, the document information extraction unit 104 refers to the document information storage unit 1A and extracts the document information identified by that document ID.

In response to a search request from the document browsing terminal 2, the divided document data extraction unit 105 refers to the document information storage unit 1A and extracts the divided document data associated with divided text data that contains the keyword of the search request.

When the divided document data extraction unit 105 has extracted multiple items of divided document data, the combination processing unit 106 combines them into a single item of data to generate combined document data.
When the divided document data extraction unit 105 has extracted only one item of divided document data, the combination process need not be executed.
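
The behavior of the combination step, including the single-page case, can be sketched as follows. This is illustrative only: `CombinedDocument` and `combine` are hypothetical names, a tuple of page payloads stands in for a real merged file (merging actual PDF pages would need a PDF library), and ordering the pages by page number is an assumption that the specification does not state.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class CombinedDocument:
    """Hypothetical stand-in for one merged file."""
    page_numbers: tuple   # source page numbers, in reading order
    payloads: tuple       # page payloads in the same order


def combine(pages: list) -> Optional[Union[bytes, CombinedDocument]]:
    """pages is a list of (page_no, payload) tuples for the extracted pages."""
    if not pages:
        return None                      # nothing was extracted
    if len(pages) == 1:
        return pages[0][1]               # single page: pass through, no merge
    ordered = sorted(pages, key=lambda item: item[0])
    return CombinedDocument(
        page_numbers=tuple(no for no, _ in ordered),
        payloads=tuple(payload for _, payload in ordered),
    )
```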

The communication processing unit 107 is a processing unit that sends data to and receives data from the document browsing terminal 2 over a network NW such as the Internet in accordance with a predetermined protocol, and is realized by a Web browser or the like.
For example, the communication processing unit 107 receives from the document browsing terminal 2 a search request for divided document data together with a keyword. It also sends to the document browsing terminal 2 the divided document data extracted by the divided document data extraction unit 105 and the combined document data generated by the combination processing unit 106.

The document browsing terminal 2 is the terminal used by a viewer who browses the document data provided by the document information providing apparatus 1.
It can be implemented by, for example, a mobile phone such as a smartphone or a PDA (Personal Digital Assistant), and comprises functional blocks consisting of an input/output processing unit 201 and a communication processing unit 202.

The input/output processing unit 201 is a functional unit for inputting and outputting data. It consists of output devices such as an LCD (Liquid Crystal Display) or other display and a speaker, and input devices such as a mouse and a keyboard.

The communication processing unit 202 is a processing unit that sends data to and receives data from the document information providing apparatus 1 over a network NW such as the Internet in accordance with a predetermined protocol, and is realized by a Web browser or the like.
For example, the communication processing unit 202 receives divided document data and combined document data from the document information providing apparatus 1. It also sends the document information providing apparatus 1 a search request for divided document data together with a keyword.

Next, the flow of processing executed by the document information providing apparatus 1 and the document browsing terminal 2 according to this embodiment is described.
First, the process of registering document information in the document information storage unit 1A is described with reference to FIG. 4.
When the document information providing apparatus 1 receives the original data of a document by any means, such as the network NW or direct input, the conversion processing unit 101 converts the original data into full document data in a predetermined file format (S101).

The text data extraction unit 102 extracts full text data from the full document data (S102).
The full document data from which the full text data has been extracted is then divided by the division processing unit 103 into divided document data for each page (S103).

Then, each item of divided document data is associated with the text data it contains (its divided text data) and registered in the document information storage unit 1A (S104).
The divided text data associated with each item of divided document data may be obtained by the division processing unit 103 dividing the full text data in correspondence with the divided document data, or by the text data extraction unit 102 extracting it from the divided document data.

At registration, a document ID for identifying the document information is also issued for the document information consisting of the original data from which the divided document data was derived, the full document data, the full text data, and the document summary information.
The document summary information consists of, for example, predetermined items such as document creator information and creation date and time extracted from the original data, the full document data, or the full text data, and a thumbnail generated from the first page of the document.
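
The registration steps S101 to S104, together with the issuing of a document ID, can be sketched as follows. This is a minimal sketch under stated assumptions: the converter, text extractor, and page splitter are passed in as callables because the specification does not name concrete tools, the record layout and the function name `register_document` are hypothetical, and a random UUID is only one possible way to issue a document ID.

```python
import uuid
from typing import Callable


def register_document(
    store: dict,
    original: bytes,
    convert: Callable[[bytes], bytes],
    extract_text: Callable[[bytes], str],
    split_pages: Callable[[bytes], list],
) -> str:
    """Register one document and return the document ID issued for it."""
    full_document = convert(original)            # S101: format conversion
    full_text = extract_text(full_document)      # S102: full text extraction
    page_documents = split_pages(full_document)  # S103: per-page division
    # S104: associate each page with its text. This uses the alternative in
    # which the text is extracted from each divided page.
    pages = [
        {"page_no": no, "document": page, "text": extract_text(page)}
        for no, page in enumerate(page_documents, start=1)
    ]
    doc_id = uuid.uuid4().hex                    # assumed ID scheme
    store[doc_id] = {
        "original": original,
        "full_document": full_document,
        "full_text": full_text,
        "pages": pages,
    }
    return doc_id
```

With toy callables, for example `lambda b: b.split(b"|")` as the splitter, the function can be exercised without any real document format.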

The flow of processing that provides the document information registered in the document information storage unit 1A in response to a search request from a viewer is described below.
First, the processing from the viewer's search request until a list of document information containing pages that match the request is provided to the viewer is described with reference to FIG. 5.
The viewer first uses the document browsing terminal 2 to send the document information providing apparatus 1 a search request together with the desired keyword (S111).

In response, the document information providing apparatus 1 refers to the document information storage unit 1A and extracts the document summary information of each document whose full text data contains the specified keyword (S112).
The extracted document summary information is compiled into a list, as the search result for documents containing the specified keyword, and sent to the document browsing terminal 2 (S113).
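
The document-level search in S112 and S113 amounts to a filter over the stored full text data. The sketch below assumes a plain substring match and a dictionary-based store with hypothetical `summary` and `full_text` keys; the specification does not define the matching rule or the storage layout.

```python
def search_documents(store: dict, keyword: str) -> list:
    """Return summary entries for documents whose full text contains keyword."""
    return [
        {"doc_id": doc_id, "summary": info["summary"]}
        for doc_id, info in store.items()
        if keyword in info["full_text"]   # assumed: simple substring match
    ]
```

Each entry carries the document ID so that a later acquisition request can identify the selected document.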

FIG. 6 shows an example of the output screen for the document summary information list displayed on the document browsing terminal 2 as the document search result.
The document summary information list screen lists the document summary information 100 of each document that contains the specified keyword. The document summary information 100 includes, for example, a thumbnail 101 of the document data and bibliographic information such as the document title and file name.
Each item of document summary information 100 has a selection button 103 for selecting it. Pressing the selection button 103 sends the document information providing apparatus 1 the document ID together with a request to acquire the document data for that document ID.

Next, the processing up to the point at which document data is provided to the document browsing terminal 2 in accordance with the viewer's selection is described with reference to FIG. 7.
When the viewer presses the selection button 103 of any item of document summary information in the document summary information list, the document ID and a request to acquire the document data for that document ID are sent to the document information providing apparatus 1 (S121).

In response, the document information providing apparatus 1 uses the divided document data extraction unit 105 to refer to the document information storage unit 1A and, from the divided document data associated with the document information for the received document ID, extracts those items whose divided text data contains the keyword specified by the viewer (S122).

The extracted divided document data is combined by the combination processing unit 106 to generate combined document data (S123). The combined document data is then sent to the document browsing terminal 2 (S124).
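
The page-level extraction in S122, and the resulting reduction in the amount of data to send, can be illustrated as follows. The record layout and the names `extract_matching_pages` and `total_size` are hypothetical, and a substring test again stands in for whatever matching rule an implementation would use.

```python
def extract_matching_pages(document: dict, keyword: str) -> list:
    """S122: keep only pages whose divided text data contains the keyword."""
    return [page for page in document["pages"] if keyword in page["text"]]


def total_size(pages: list) -> int:
    """Sum of the page payload sizes in bytes."""
    return sum(len(page["document"]) for page in pages)


document = {
    "pages": [
        {"page_no": 1, "document": b"x" * 100, "text": "introduction"},
        {"page_no": 2, "document": b"y" * 80, "text": "battery care"},
        {"page_no": 3, "document": b"z" * 120, "text": "battery safety"},
    ]
}
hits = extract_matching_pages(document, "battery")
print([page["page_no"] for page in hits], total_size(hits), total_size(document["pages"]))
```

In this toy data, only pages 2 and 3 are kept, so 200 of the original 300 payload bytes would be combined and sent.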

FIG. 8 shows an example of the combined document data as displayed on the document browsing terminal 2.
Every page of the combined document data displayed on the combined document data output screen 110 consists solely of divided document data 111 that contains the keyword specified by the viewer.

With this embodiment, even when the document data runs to a very large number of pages, the viewer can browse a document made up only of the pages that contain the specified keyword, which is convenient for reading just the desired passages.
In addition, because the document finally provided to the document browsing terminal 2 consists only of the necessary pages, its file size can be kept small. As a result, the processing load of downloading and displaying it is reduced, and it can be used comfortably.

Next, a document information providing apparatus according to a second embodiment of the present invention is described.
This embodiment is a modification of the first embodiment. A thumbnail is generated for each item of divided document data so that, before referring to the combined document data, the viewer can check from the thumbnails which items of divided document data contain the keyword.

As shown in FIG. 9, the document information providing apparatus 3 according to this embodiment is, as in the first embodiment, configured to communicate with the document browsing terminal 2 over a network NW such as the Internet, and provides document data for browsing in response to requests from a viewer using the document browsing terminal 2.
The configuration of the document browsing terminal 2 is the same as that of the document browsing terminal 2 according to the first embodiment described above.

The document information providing apparatus 3 uses a CPU (Central Processing Unit), a computer program executed by the CPU, RAM (Random Access Memory) and ROM (Read Only Memory) that store the computer program and predetermined data, and an external storage device such as a hard disk drive to implement the following functional blocks: a document information storage unit 3A, a conversion processing unit 301, a text data extraction unit 302, a division processing unit 303, a generation processing unit 304, a document information extraction unit 305, a thumbnail extraction unit 306, a divided document data extraction unit 307, a combination processing unit 308, and a communication processing unit 309.

The conversion processing unit 301, text data extraction unit 302, division processing unit 303, document information extraction unit 305, combination processing unit 308, and communication processing unit 309 are configured in the same way as the conversion processing unit 101, text data extraction unit 102, division processing unit 103, document information extraction unit 104, combination processing unit 106, and communication processing unit 107 of the first embodiment, respectively.

The document information storage unit 3A in this embodiment is a storage unit that can store document information about the document data provided to viewers for browsing.
In the document information storage unit 3A, for example, as shown in FIG. 10, a thumbnail for each item of divided document data is stored for each item of document information, in addition to and in association with the document ID for identifying the document information, document summary information, original data, full document data, full text data, divided document data, and divided text data.

Here, a thumbnail is a reduced image of an item of divided document data.
The thumbnail is generated by the generation processing unit 304 based on the divided document data.

The generation processing unit 304 generates a thumbnail for each item of divided document data.

In response to a search request from the document browsing terminal 2, the thumbnail extraction unit 306 refers to the document information storage unit 3A and extracts the thumbnails of the divided document data associated with divided text data that contains the keyword specified by the viewer.

The divided document data extraction unit 307 refers to the document information storage unit 3A and extracts the divided document data indicated by the thumbnails that the viewer has selected.

Next, the flow of processing executed by the document information providing apparatus 3 and the document browsing terminal 2 according to this embodiment is described.
First, the process of registering document information in the document information storage unit 3A is described with reference to FIG. 11.
When the document information providing apparatus 3 receives the original data of a document by any means, such as the network NW or direct input, the conversion processing unit 301 converts the original data into full document data in a predetermined file format (S201).

The text data extraction unit 302 extracts full text data from the full document data (S202).
The full document data from which the full text data has been extracted is then divided by the division processing unit 303 into divided document data for each page (S203).

The generation processing unit 304 generates a thumbnail for each item of divided document data (S204).
Then, each item of divided document data is associated with the text data it contains (its divided text data) and with its thumbnail, and registered in the document information storage unit 3A (S205).

As in the first embodiment, the divided text data associated with each item of divided document data may be obtained by the division processing unit 303 dividing the full text data in correspondence with the divided document data, or by the text data extraction unit 302 extracting it from the divided document data.
Likewise, at registration, a document ID for identifying the document information is also issued for the document information consisting of the original data from which the divided document data was derived, the full document data, the full text data, the document summary information, and the thumbnail of each item of divided document data.

The flow of processing that provides the document information registered in the document information storage unit 3A in response to a search request from a viewer is described below with reference to FIG. 12.
The processing from the viewer's search request until a list of document information containing pages that match the request is provided to the viewer is as described for the first embodiment with reference to FIG. 5. The description here therefore begins with the processing from the viewer's selection of a document until document data is provided to the document browsing terminal 2.

When the viewer selects any document from the document summary information list, the document ID and a request to acquire the document data for that document ID are sent to the document information providing apparatus 3 (S211).

In response, the document information providing apparatus 3 uses the thumbnail extraction unit 306 to refer to the document information storage unit 3A and, from the thumbnails of the divided document data associated with the document information for the received document ID, extracts those whose divided text data contains the keyword specified by the viewer (S212).
The extracted thumbnails are compiled into a list and sent to the document browsing terminal 2 (S213).
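
The thumbnail extraction in S212 differs from the first embodiment only in what is returned for each matching page. A minimal sketch, assuming a hypothetical per-page record with `page_no`, `thumbnail`, and `text` keys and a simple substring match:

```python
def thumbnails_for_keyword(document: dict, keyword: str) -> list:
    """S212: (page number, thumbnail) pairs for pages containing the keyword."""
    return [
        (page["page_no"], page["thumbnail"])
        for page in document["pages"]
        if keyword in page["text"]
    ]
```

Returning the page number with each thumbnail is one way to let a later selection on the terminal identify the corresponding divided document data.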

FIG. 13 shows an example of the thumbnail list as displayed on the document browsing terminal 2.
The thumbnail list 200 contains only thumbnails 201 of divided document data whose divided text data contains the keyword specified by the viewer.
Each thumbnail 201 is displayed so that it can accept a request to display the divided document data it indicates. For example, selecting one thumbnail 201 and double-clicking it displays the divided document data indicated by that thumbnail 201 individually.

The thumbnail list 200 also has a combined display button 202, which accepts a selection of thumbnails 201 and requests the document information providing apparatus 3 to provide combined document data made up of the divided document data indicated by the selected thumbnails 201. A thumbnail 201 is selected, for example, by clicking it. Once the selection is complete, pressing the combined display button 202 displays the combined document data made up of the divided document data indicated by the selected thumbnails.

The process of providing the divided document data indicated by a single thumbnail 201 that the viewer selects from the thumbnail list 200 is now described with reference to FIG. 14.
The viewer selects, from the thumbnail list 200, the thumbnail 201 of the divided document data that the viewer wishes to browse (S221).
The selection may be accepted by a double-click on the thumbnail 201, or a separate button may be provided for requesting individual display of the divided document data indicated by the thumbnail; various designs are possible.

On receiving the selection information for the thumbnail chosen by the viewer, the document information providing apparatus 3 refers to the document information storage unit 3A and extracts the divided document data associated with the selected thumbnail (S222).
The extracted divided document data is sent to the document browsing terminal 2, so that the viewer can browse the desired divided document data.

The process of accepting the viewer's free selection of thumbnails from the listed thumbnails and providing combined document data made up of the divided document data indicated by the selected thumbnails is described with reference to FIG. 15.
The viewer selects thumbnails 201 from the thumbnail list 200 and sends the document information providing apparatus 3 a request to provide combined document data made up of the divided document data indicated by the selected thumbnails (S231).

In response, the document information providing apparatus 3 uses the divided document data extraction unit 307 to refer to the document information storage unit 3A and extracts all of the divided document data associated with the thumbnails selected by the viewer (S232).
The extracted divided document data is combined by the combination processing unit 308 to generate combined document data (S233). The combined document data is then sent to the document browsing terminal 2 (S234).

In generating the combined document data in this embodiment, the viewer is asked to select the desired thumbnails from the thumbnail list, and the divided document data indicated by the selected thumbnails is extracted to generate the combined document data. The embodiment is not limited to this, however: all of the divided document data indicated by the listed thumbnails may be combined to generate the combined document data.

With this embodiment, the divided document data containing the chosen keyword is listed as thumbnails, so the viewer can easily select the desired divided document data from among the multiple items of divided document data that contain the keyword.
In addition, only the divided document data that the viewer wants to browse can be displayed together.

Next, a document information providing apparatus according to a third embodiment of the present invention is described.
This embodiment is a modification of the second embodiment. It lists the thumbnails of all the divided document data of the document selected by the viewer and, within that list, clearly marks the divided document data that contains the keyword specified by the viewer.

As shown in FIG. 16, the document information providing apparatus 4 according to this embodiment is, as in the first and second embodiments, configured to communicate with the document browsing terminal 2 over a network NW such as the Internet, and provides document data for browsing in response to requests from a viewer using the document browsing terminal 2.
The configuration of the document browsing terminal 2 is the same as that of the document browsing terminal 2 according to the first or second embodiment described above.

The document information providing apparatus 4 uses a CPU (Central Processing Unit), a computer program executed by the CPU, RAM (Random Access Memory) and ROM (Read Only Memory) that store the computer program and predetermined data, and an external storage device such as a hard disk drive to implement the following functional blocks: a document information storage unit 4A, a conversion processing unit 401, a text data extraction unit 402, a division processing unit 403, a generation processing unit 404, a document information extraction unit 405, a thumbnail extraction unit 406, a specifying processing unit 407, a divided document data extraction unit 408, a combination processing unit 409, and a communication processing unit 410.

The document information storage unit 4A, conversion processing unit 401, text data extraction unit 402, division processing unit 403, generation processing unit 404, document information extraction unit 405, combination processing unit 409, and communication processing unit 410 are configured in the same way as, respectively, the document information storage unit 3A; the conversion processing units 101 and 301; the text data extraction units 102 and 302; the division processing units 103 and 303; the generation processing unit 304; the document information extraction units 104 and 305; the combination processing units 106 and 308; and the communication processing units 107 and 309 of the first or second embodiment.

サムネイル抽出部４０６は、文書情報記憶部４Ａを参照して、閲覧者が選択した文書を構成する分割ドキュメントデータのサムネイルを全て抽出する。 The thumbnail extraction unit 406 refers to the document information storage unit 4A and extracts all thumbnails of the divided document data constituting the document selected by the viewer.

特定処理部４０７は、サムネイル抽出部４０６により抽出されたサムネイルのうち、閲覧者が指定したキーワードを含む分割テキストデータと関連付けられた分割ドキュメントデータのサムネイルを特定する。 The identification processing unit 407 identifies the thumbnails of the divided document data associated with the divided text data including the keyword specified by the viewer among the thumbnails extracted by the thumbnail extraction unit 406.

分割ドキュメントデータ抽出部４０８は、文書情報記憶部４Ａを参照して、特定処理部４０７により特定されたサムネイルによって示される分割ドキュメントデータを抽出する。 The divided document data extracting unit 408 refers to the document information storage unit 4A and extracts divided document data indicated by the thumbnail specified by the specifying processing unit 407.

The flow of processing for providing document information registered in the document information storage unit 4A in response to a search request from a viewer is described below with reference to FIG. 17.
The process of registering document information in the document information storage unit 4A is as described with reference to FIG. 11 in the second embodiment. The processing from when the viewer issues a search request until a list of document information containing pages that match the request is provided to the viewer is as described with reference to FIG. 5 in the first embodiment. The description here therefore begins with the processing from when the viewer selects a document until document data is provided to the document browsing terminal 2.

When the viewer selects a document from the document summary information list, a request to acquire the document data for that document is transmitted to the document information providing apparatus 4 together with the document ID (S301).

In response, the thumbnail extraction unit 406 of the document information providing apparatus 4 refers to the document information storage unit 4A and extracts all thumbnails of the divided document data associated with the document information for the received document ID (S302).

From the extracted thumbnails, the identification processing unit 407 identifies the thumbnails of the divided document data whose divided text data contains the viewer-specified keyword (S303).
A thumbnail list in which the identified thumbnails are visibly marked among the extracted thumbnails is then transmitted to the document browsing terminal 2 (S304).
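Steps S302 to S304 can be illustrated with a minimal Python sketch. It is not the implementation described in the specification: the `Page` class, the `highlight` flag, the file names, and the use of a plain substring test for the keyword match are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """One item of divided document data (a single page). Hypothetical model."""
    number: int
    text: str        # divided text data extracted from the page
    thumbnail: str   # file name standing in for the thumbnail image


def build_thumbnail_list(pages, keyword):
    """Return every thumbnail (S302), flagging those whose page text
    contains the keyword (S303). The flag is what the terminal would
    use to mark the thumbnail in the list (S304)."""
    return [
        {"page": p.number, "thumbnail": p.thumbnail, "highlight": keyword in p.text}
        for p in pages
    ]


pages = [
    Page(1, "Safety precautions", "p1.png"),
    Page(2, "Battery replacement procedure", "p2.png"),
    Page(3, "Battery disposal", "p3.png"),
]
thumbnail_list = build_thumbnail_list(pages, "Battery")
highlighted = [t["page"] for t in thumbnail_list if t["highlight"]]
print(highlighted)  # [2, 3]
```

All three thumbnails are returned, and only the flag distinguishes the keyword hits, which mirrors the distinction between the extraction step and the identification step.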

FIG. 18 shows an example of the thumbnail list as displayed on the document browsing terminal 2.
The thumbnail list 300 displays all of the thumbnails 301 and 302 of the divided document data that make up the document selected by the viewer.
The thumbnails 302, which belong to divided document data whose divided text data contains the viewer-specified keyword, are drawn with a thick border so that they can be distinguished from the other thumbnails 301.

The thumbnail list 300 also has a combined display button 303, which requests the document information providing apparatus 4 to display combined document data obtained by combining the divided document data indicated by the marked thumbnails 302.

In the thumbnail list 300 of this embodiment, the thumbnails 302 are marked by giving them a thick border. The marking is not limited to this: the thumbnails 301 and 302 may be made distinguishable by various other methods, for example by displaying the thumbnails 301 and the thumbnails 302 as separate groups.
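The grouping alternative mentioned above could look roughly like the following Python sketch. The dictionary layout and the `highlight` field are hypothetical; the specification does not define how the marked state is represented.

```python
def group_thumbnails(thumbnail_list):
    """Split a thumbnail list into keyword hits and the rest,
    preserving page order within each group."""
    hits = [t for t in thumbnail_list if t["highlight"]]
    others = [t for t in thumbnail_list if not t["highlight"]]
    return hits, others


thumbnail_list = [
    {"page": 1, "highlight": False},
    {"page": 2, "highlight": True},
    {"page": 3, "highlight": True},
    {"page": 4, "highlight": False},
]
hits, others = group_thumbnails(thumbnail_list)
print([t["page"] for t in hits])    # [2, 3]
print([t["page"] for t in others])  # [1, 4]
```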

As in the second embodiment, each of the thumbnails 301 and 302 is displayed so that it can accept a display request for the divided document data it indicates. For example, selecting and double-clicking one of the thumbnails 301 and 302 may display the divided document data indicated by that thumbnail individually.

The processing for providing combined document data composed of the divided document data indicated by the marked thumbnails in the displayed list is described next with reference to FIG. 19.
The viewer transmits to the document information providing apparatus 4 a request to provide combined document data obtained by combining the divided document data indicated by the thumbnails marked as containing the viewer-specified keyword (S311).

In response, the divided document data extraction unit 408 of the document information providing apparatus 4 refers to the document information storage unit 4A and extracts all of the divided document data associated with the thumbnails that were identified by the identification processing unit 407 and marked in the thumbnail list (S312).

The combination processing unit 409 combines the extracted divided document data to generate combined document data (S313). The combined document data is then transmitted to the document browsing terminal 2 (S314).
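The extraction and combination steps (S312 and S313) can be sketched as follows. This is an illustration under stated assumptions, not the method of the specification: pages are modeled as plain strings, the combined document is modeled as those strings joined by a form-feed character, and combining in ascending page order is an assumption, since the text does not state the order.

```python
def extract_divided_documents(store, doc_id, marked_pages):
    """S312: fetch the divided document data for each marked thumbnail.
    `store` maps a document ID to {page number: page data} (hypothetical layout)."""
    pages = store[doc_id]
    return [pages[n] for n in sorted(marked_pages)]  # page order is an assumption


def combine(divided_documents):
    """S313: join the extracted pages into one combined document.
    A form feed stands in for a real page boundary."""
    return "\f".join(divided_documents)


store = {
    "doc-1": {1: "cover", 2: "battery replacement", 3: "battery disposal", 4: "index"},
}
combined = combine(extract_divided_documents(store, "doc-1", {3, 2}))
print(combined.split("\f"))  # ['battery replacement', 'battery disposal']
```

Because only the marked pages are combined, the data sent to the terminal in S314 is smaller than the full document, which is the stated aim of reducing file size.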

According to this embodiment, the thumbnails of pages containing the viewer-specified keyword are marked in the thumbnail list, so the viewer can see at a glance which pages of the document contain the keyword.

Next, a document information providing apparatus according to a fourth embodiment of the present invention is described.
This embodiment is a modification of the second or third embodiment. When thumbnails of divided document data are transmitted to the viewer, thumbnails suited to the display of the document browsing terminal are provided.
The following description treats this embodiment as a modification of the third embodiment.

As shown in FIG. 20, the document information providing apparatus 5 according to this embodiment is, as in the third embodiment, configured to communicate with the document browsing terminal 2 via a network NW such as the Internet, and provides document data for viewing in response to requests from a viewer using the document browsing terminal 2.
The configuration of the document browsing terminal 2 is the same as that of the document browsing terminal 2 according to the first, second, or third embodiment described above.

The document information providing apparatus 5 comprises a CPU (Central Processing Unit), a computer program executed by the CPU, a RAM (Random Access Memory) and a ROM (Read Only Memory) that store the computer program and predetermined data, and an external storage device such as a hard disk drive. Together these constitute the following functional blocks: a document information storage unit 5A, a conversion processing unit 501, a text data extraction unit 502, a division processing unit 503, a generation processing unit 504, a document information extraction unit 505, a thumbnail extraction unit 506, an identification processing unit 507, a divided document data extraction unit 508, a combination processing unit 509, and a communication processing unit 510.

The conversion processing unit 501, text data extraction unit 502, division processing unit 503, document information extraction unit 505, thumbnail extraction unit 506, identification processing unit 507, divided document data extraction unit 508, combination processing unit 509, and communication processing unit 510 are configured in the same way as, respectively, the conversion processing unit 401, text data extraction unit 402, division processing unit 403, document information extraction unit 405, thumbnail extraction unit 406, identification processing unit 407, divided document data extraction unit 408, combination processing unit 409, and communication processing unit 410 of the third embodiment described above.

The document information storage unit 5A is a storage unit capable of storing document information about the document data that is made available for viewing by the viewer.
As shown in FIG. 21, for example, and as in the third embodiment, the document information storage unit 5A stores, for each item of document information, a document ID identifying the document information, document summary information, original data, full document data, full text data, divided document data, and divided text data, together with thumbnails for each item of divided document data, all in association with one another.

Thumbnails are prepared in a plurality of sizes. In this example, three kinds of thumbnail (size 1, size 2, and size 3) are prepared, and the image sizes of the three differ from one another.
Although three sizes are prepared in this example, the number is not limited to three: two sizes, or four or more sizes, may be prepared.
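One way to picture the record layout of the document information storage unit 5A is the Python sketch below. The class names, field names, and size labels are hypothetical; FIG. 21 is not reproduced here, so this only reflects the items listed in the text.

```python
from dataclasses import dataclass, field


@dataclass
class DividedDocument:
    """One page of a document, with its text and its thumbnails."""
    page_data: bytes                                  # divided document data
    page_text: str                                    # divided text data
    thumbnails: dict = field(default_factory=dict)    # size label -> image data


@dataclass
class DocumentInfo:
    """One item of document information, keyed by document ID."""
    doc_id: str
    summary: str
    original_data: bytes
    full_document_data: bytes
    full_text: str
    pages: list = field(default_factory=list)         # list of DividedDocument


page = DividedDocument(
    page_data=b"page-1",
    page_text="Battery replacement",
    thumbnails={"size1": b"small", "size2": b"medium", "size3": b"large"},
)
info = DocumentInfo("doc-1", "User manual", b"orig", b"full", "Battery replacement", [page])
print(sorted(info.pages[0].thumbnails))  # ['size1', 'size2', 'size3']
```

Keeping the thumbnails in a per-page mapping keyed by size label makes it straightforward to hold two, three, or more sizes without changing the record shape.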

The generation processing unit 504 generates thumbnails in a plurality of sizes for each item of divided document data.

In response to a search request from the document browsing terminal 2, the thumbnail extraction unit 506 refers to the document information storage unit 5A and, from the thumbnails of the divided document data associated with divided text data containing the viewer-specified keyword, extracts those that correspond to the size of the display implemented as the input/output unit 201 of the document browsing terminal 2.

The flow of processing for providing document information registered in the document information storage unit 5A in response to a search request from a viewer is described below with reference to FIG. 22.
The process of registering document information in the document information storage unit 5A is the same as in the second embodiment, except that when the generation processing unit 504 generates thumbnails, it generates a plurality of thumbnails of different sizes for each divided document, and thumbnails of a plurality of sizes are registered for each item of divided document data.
The processing from when the viewer issues a search request until a list of document information containing pages that match the request is provided to the viewer is as described with reference to FIG. 11 in the second embodiment.
The description here therefore covers the processing from when the viewer selects a document until document data is provided to the document browsing terminal 2.

When the viewer selects a document from the document summary information list, a request to acquire the document data for that document is transmitted to the document information providing apparatus 5 together with the document ID (S401).
At this time, size information for the display implemented as the input/output unit 201 of the document browsing terminal 2 (for example, its height and width) is transmitted to the document information providing apparatus 5 along with the document ID.

In response, the thumbnail extraction unit 506 of the document information providing apparatus 5 refers to the document information storage unit 5A and, from the thumbnails of the divided document data associated with the document information for the received document ID, extracts all thumbnails that correspond to the display size information of the document browsing terminal 2 (S402).
To select, from the thumbnails of a plurality of sizes stored in the document information storage unit 5A, the thumbnails that correspond to the display size information of the document browsing terminal 2, predetermined size correspondence information can be held in a table, for example a rule that thumbnails of a given size are extracted when the display size of the document browsing terminal 2 is at or below a predetermined value. Referring to this table allows thumbnails of an appropriate size to be selected.
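The table lookup described above might be sketched in Python as follows. The threshold values, the size labels, and the choice of display width as the deciding dimension are all assumptions; the specification only says that a table of predetermined correspondences is consulted.

```python
# Hypothetical correspondence table: (maximum display width in pixels, thumbnail size label).
# Entries must be listed in ascending order of width.
SIZE_TABLE = [
    (480, "size1"),
    (1024, "size2"),
]
DEFAULT_SIZE = "size3"  # used when the display is wider than every threshold


def select_thumbnail_size(display_width):
    """Return the label of the thumbnail size for a display of the given width."""
    for max_width, label in SIZE_TABLE:
        if display_width <= max_width:
            return label
    return DEFAULT_SIZE


print(select_thumbnail_size(320))   # size1
print(select_thumbnail_size(800))   # size2
print(select_thumbnail_size(1920))  # size3
```

Holding the thresholds in data rather than in branching code means a new thumbnail size can be supported by adding a table row.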

From the extracted thumbnails, the identification processing unit 507 identifies the thumbnails of the divided document data whose divided text data contains the viewer-specified keyword (S403).
A thumbnail list in which the identified thumbnails are visibly marked among the extracted thumbnails is then transmitted to the document browsing terminal 2 (S404).

According to this embodiment, thumbnails of the divided document data are displayed at a size appropriate to the display of the document browsing terminal 2.

Next, a document browsing terminal according to a fifth embodiment of the present invention is described.
In this embodiment, the functions of the document information providing apparatus according to the first to fourth embodiments are provided in the document browsing terminal itself and are realized as a viewer for the document browsing terminal.

As shown in FIG. 23, the document browsing terminal 6 according to this embodiment is implemented by a mobile phone such as a smartphone, a PDA (Personal Digital Assistant), or the like. A CPU (Central Processing Unit), a computer program executed by the CPU, and memory such as a RAM (Random Access Memory) that stores the computer program and predetermined data constitute the following functional blocks: a document information storage unit 6A, a conversion processing unit 601, a text data extraction unit 602, a division processing unit 603, a generation processing unit 604, a document information extraction unit 605, a thumbnail extraction unit 606, an identification processing unit 607, a divided document data extraction unit 608, a combination processing unit 609, and an input/output unit 610.

The document information storage unit 6A is a storage unit capable of storing document information about the document data that is made available for viewing by the viewer.
Like the document information storage unit 3A in the second embodiment, the document information storage unit 6A stores, for each item of document information, a document ID identifying the document information, document summary information, original data, full document data, full text data, divided document data, and divided text data, together with a thumbnail for each item of divided document data, all in association with one another.

The conversion processing unit 601 is a processing unit that converts the file format of the original data, and has the same function as the conversion processing unit 101 in the first embodiment.

The text data extraction unit 602 is a functional unit that extracts data in text format from the original data, the full document data, or the divided document data by removing layout information and formatting information. It has the same function as the text data extraction unit 102 in the first embodiment.

The division processing unit 603 is a processing unit that, when the full document data consists of a plurality of pages, divides the full document data into divided document data of one page each. It has the same function as the division processing unit 103 in the first embodiment.
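Page division and text extraction can be illustrated with a small Python sketch. It is a toy model under loud assumptions: the full document is a single string in which a form-feed character separates pages, and "layout and formatting information" is represented by simple angle-bracket tags. A real document format would need a format-specific parser.

```python
import re


def split_into_pages(full_document):
    """Divide the full document into one item of divided document data per page.
    Assumes pages are separated by a form-feed character."""
    return full_document.split("\f")


def extract_text(page):
    """Remove tag-style markup and collapse whitespace, leaving plain text."""
    without_tags = re.sub(r"<[^>]+>", " ", page)
    return " ".join(without_tags.split())


full_document = "<h1>Setup</h1>Insert the battery.\f<h1>Care</h1>Keep dry."
divided = split_into_pages(full_document)
divided_text = [extract_text(p) for p in divided]
print(divided_text)  # ['Setup Insert the battery.', 'Care Keep dry.']
```

Storing each page's text alongside the page itself is what later allows a keyword to be matched against individual pages rather than the whole document.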

The generation processing unit 604 is a processing unit that generates a thumbnail for each item of divided document data, and has the same function as the generation processing unit 304 in the second embodiment.
When generating thumbnails in this embodiment, it may generate thumbnails of a size suited to the display of the document browsing terminal 6, which is implemented as the input/output unit 610.

The document information extraction unit 605 is a functional unit that, based on a document ID, refers to the document information storage unit 6A and extracts the document information identified by that document ID. It has the same function as the document information extraction unit 104.

The thumbnail extraction unit 606 is a functional unit that, in response to a search request entered by the viewer via the input/output unit 610, refers to the document information storage unit 6A and extracts the thumbnails of the divided document data associated with divided text data containing the viewer-specified keyword. It has the same function as the thumbnail extraction unit 306 in the third embodiment.

The identification processing unit 607 is a processing unit that, from the thumbnails extracted by the thumbnail extraction unit 606, identifies the thumbnails of the divided document data associated with divided text data containing the keyword specified by the viewer. It has the same function as the identification processing unit 407 in the third embodiment.

The divided document data extraction unit 608 is a functional unit that refers to the document information storage unit 6A and extracts the divided document data indicated by the thumbnails identified by the identification processing unit 607. It has the same function as the divided document data extraction unit 408 in the third embodiment.

The combination processing unit 609 is a processing unit that, when a plurality of items of divided document data have been extracted by the divided document data extraction unit 608, combines them into a single item of data to generate combined document data. It has the same function as the combination processing unit 106 in the first embodiment.

The input/output processing unit 610 is a functional unit for inputting and outputting data. It comprises a display such as an LCD (Liquid Crystal Display) and a speaker for outputting data, and a mouse, keyboard, or the like for inputting data.

Next, the flow of a series of processes performed by the document browsing terminal 6 according to this embodiment is described with reference to FIGS. 24 and 25.
The process of registering document information in the document information storage unit 6A is the same as that described with reference to FIG. 11 in the second embodiment.

First, the viewer uses the input/output unit 610 to enter a document search request together with a desired keyword (S501).

In response, the document information extraction unit 605 refers to the document information storage unit 6A and extracts the document summary information of documents whose full text data contains the specified keyword (S502).
The extracted document summary information is listed and displayed on the document browsing terminal 6 as the search result, that is, the documents that contain the specified keyword (S503).
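The document-level search in these steps can be sketched in Python as below. The store layout, the field names, and the plain substring match are assumptions made for illustration only.

```python
def search_documents(store, keyword):
    """Return the summary information of every document whose full text
    contains the keyword. `store` maps document ID -> record (hypothetical layout)."""
    return [
        {"doc_id": doc_id, "summary": record["summary"]}
        for doc_id, record in store.items()
        if keyword in record["full_text"]
    ]


store = {
    "doc-1": {"summary": "Camera manual", "full_text": "Lens care. Battery replacement."},
    "doc-2": {"summary": "Product catalog", "full_text": "Tripods and bags."},
    "doc-3": {"summary": "Charger manual", "full_text": "Battery charging times."},
}
results = search_documents(store, "Battery")
print([r["doc_id"] for r in results])  # ['doc-1', 'doc-3']
```

This first search runs against the full text of each document; the page-level match against divided text data happens only after the viewer has picked one document from the list.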

When the viewer selects a document from the document summary information list (S504), the thumbnail extraction unit 606 refers to the document information storage unit 6A and extracts all thumbnails of the divided document data associated with the document information for the document ID of the selected document (S505).

From the extracted thumbnails, the identification processing unit 607 identifies the thumbnails of the divided document data whose divided text data contains the viewer-specified keyword (S506).
A thumbnail list in which the identified thumbnails are visibly marked among the extracted thumbnails is then displayed on the document browsing terminal 6 (S507).

The viewer issues a request to provide combined document data obtained by combining the divided document data indicated by the thumbnails marked as containing the viewer-specified keyword (S508).

In response, the divided document data extraction unit 608 refers to the document information storage unit 6A and extracts all of the divided document data associated with the thumbnails that were identified by the identification processing unit 607 and marked in the thumbnail list (S509).

The combination processing unit 609 combines the extracted divided document data to generate combined document data (S510). The combined document data is then displayed on the document browsing terminal 6 (S511).

The design changes described for the first to fourth embodiments can also be applied to this embodiment. For example, the divided document data may be made individually displayable.

According to this embodiment, documents can be registered in the document browsing terminal 6 without going through a network, and the divided document data or combined document data of a required document can be viewed whenever it is needed.

DESCRIPTION OF SYMBOLS
1 Document information providing apparatus
101 Conversion processing unit
102 Text data extraction unit
103 Division processing unit
104 Document information extraction unit
105 Divided document data extraction unit
106 Combination processing unit
107 Communication processing unit
1A Document information storage unit
2 Document browsing terminal
201 Input/output unit
202 Communication processing unit
3 Document information providing apparatus
301 Conversion processing unit
302 Text data extraction unit
303 Division processing unit
304 Generation processing unit
305 Document information extraction unit
306 Thumbnail extraction unit
307 Divided document data extraction unit
308 Combination processing unit
309 Communication processing unit
3A Document information storage unit
4 Document information providing apparatus
401 Conversion processing unit
402 Text data extraction unit
403 Division processing unit
404 Generation processing unit
405 Document information extraction unit
406 Thumbnail extraction unit
407 Identification processing unit
408 Divided document data extraction unit
409 Combination processing unit
410 Communication processing unit
4A Document information storage unit
5 Document information providing apparatus
6 Document browsing terminal
601 Conversion processing unit
602 Text data extraction unit
603 Division processing unit
604 Generation processing unit
605 Document information extraction unit
606 Thumbnail extraction unit
607 Identification processing unit
608 Divided document data extraction unit
609 Combination processing unit
610 Communication processing unit
6A Document information storage unit
NW Network

Claims

A device that is configured to communicate via a network with a document browsing terminal for browsing documents in data format, and that provides matching documents for viewing in response to a search request from the document browsing terminal, the device comprising:
Division processing means for dividing document data of a predetermined file format consisting of one or a plurality of pages into divided document data for each page;
Document information storage means for storing the text data included in the divided document data in association with each divided document data,
Search request receiving means for receiving a search request for the divided document data together with a keyword from the document browsing terminal;
In response to the search request, with reference to the document information storage means, divided document data extraction means for extracting the divided document data associated with the text data including the keyword,
A combination processing means for combining the extracted divided document data to generate combined document data;
Combined document data transmission means for transmitting the combined document data to the document browsing terminal,
A document information providing apparatus characterized by that.

Conversion processing means for converting original document data into document data of the predetermined file format,
The document information providing apparatus according to claim 1.

Generation processing means for generating thumbnails for each of the divided document data;
Thumbnail extracting means for extracting, in response to a search request from the document browsing terminal and with reference to the document information storage means, thumbnails of divided document data associated with text data including the keyword;
Search result transmission means for transmitting the extracted thumbnail list as a search result for the search request to the document browsing terminal;
A merge request receiving means for receiving a merge request for the divided document data indicated by the thumbnail from the document browsing terminal;
The document information storage means stores, for each divided document data, the text data included in the divided document data and a thumbnail of the divided document data in association with each other,
The divided document data extraction unit extracts the divided document data indicated by the thumbnail by referring to the document information storage unit in response to the combination request.
The document information providing apparatus according to claim 1 or 2.

Specifying processing means for specifying, among the thumbnails extracted by the thumbnail extraction means, the thumbnails of divided document data associated with text data including the keyword;
The thumbnail extraction means extracts all the thumbnails by referring to the document information storage means in response to a search request from the document browsing terminal,
The search result transmission means transmits, to the document browsing terminal, a list of the extracted thumbnails as a search result for the search request, in which the specified thumbnails are clearly distinguished from the other thumbnails,
The combination request receiving means receives, from the document browsing terminal, a combination request for the divided document data indicated by the specified thumbnails,
The divided document data extraction means refers to the document information storage means to extract the divided document data indicated by the specified thumbnails,
The document information providing apparatus according to claim 3.
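
As a rough illustration of this claim, the search result can list every page's thumbnail while flagging those whose page text contains the keyword, and a later combination request then covers only the flagged pages. The sketch below is an assumption-laden example: `ThumbnailEntry`, `build_thumbnail_list`, `pages_to_combine`, and the `thumb-N.png` file names are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThumbnailEntry:
    page_no: int
    thumbnail: str  # hypothetical thumbnail file name
    matched: bool   # True when the page text contains the keyword


def build_thumbnail_list(page_texts, keyword):
    """Return one entry per page, marking the pages that contain the keyword."""
    needle = keyword.casefold()
    return [
        ThumbnailEntry(page_no, f"thumb-{page_no}.png", needle in text.casefold())
        for page_no, text in enumerate(page_texts, start=1)
    ]


def pages_to_combine(entries):
    """Page numbers a combination request for the marked thumbnails would cover."""
    return [e.page_no for e in entries if e.matched]


entries = build_thumbnail_list(["Battery", "Screen", "Battery care"], "battery")
for e in entries:
    marker = "*" if e.matched else " "
    print(f"{marker} page {e.page_no}: {e.thumbnail}")
print(pages_to_combine(entries))
```

Keeping the match flag in the list, rather than filtering the list, lets the terminal show the hits in the context of the whole document.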

Size information receiving means for receiving, from the document browsing terminal, size information of a display that outputs data,
The generation processing means generates thumbnails of a plurality of sizes for each of the divided document data,
The thumbnail extraction means refers to the document information storage means in response to a search request from the document browsing terminal and extracts, from among the thumbnails, or the thumbnails associated with text data including the keyword, thumbnails of a size corresponding to the size information of the display of the document browsing terminal,
The document information providing apparatus according to claim 3 or 4.
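
One way the size selection in this claim could work is sketched below. The claim does not say how the display size maps to a thumbnail size, so the rule used here (largest pre-generated width that still fits a fixed number of columns, falling back to the smallest) is purely an assumption, as are the function name `choose_thumbnail_width` and the `columns` parameter.

```python
def choose_thumbnail_width(display_width, available_widths, columns=3):
    """Pick a pre-generated thumbnail width for a display.

    Assumed rule: the largest width that lets `columns` thumbnails fit side by
    side; if none fits, fall back to the smallest available width.
    """
    budget = display_width // columns
    fitting = [w for w in sorted(available_widths) if w <= budget]
    return fitting[-1] if fitting else min(available_widths)


sizes = [64, 128, 256]  # hypothetical widths generated for every page
print(choose_thumbnail_width(480, sizes))   # small phone-sized display
print(choose_thumbnail_width(1024, sizes))  # larger display
```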

A method by which a computer, configured to communicate via a network with a document browsing terminal for browsing a document in a data format, provides a corresponding document for browsing in response to a search request from the document browsing terminal,
A process of dividing document data of a predetermined file format consisting of one or a plurality of pages into divided document data for each page;
A process of storing, in document information storage means, the text data included in the divided document data in association with each divided document data;
A process of receiving a search request for the divided document data together with a keyword from the document browsing terminal;
A process of extracting, in response to the search request and with reference to the document information storage means, the divided document data associated with text data including the keyword;
A process of combining the extracted divided document data to generate combined document data;
A process of transmitting the combined document data to the document browsing terminal;
A document information providing method characterized by comprising the above processes.

A program for causing a computer, configured to communicate via a network with a document browsing terminal for browsing a document in a data format, to function as a document information providing apparatus that provides a corresponding document in response to a search request from the document browsing terminal,
The program causing the computer to execute:
A process of dividing document data of a predetermined file format consisting of one or a plurality of pages into divided document data for each page;
A process of storing, in document information storage means, the text data included in the divided document data in association with each divided document data;
A process of receiving a search request for the divided document data together with a keyword from the document browsing terminal;
A process of extracting, in response to the search request and with reference to the document information storage means, the divided document data associated with text data including the keyword;
A process of combining the extracted divided document data to generate combined document data;
A process of transmitting the combined document data to the document browsing terminal;
Computer program.

A terminal for viewing data format documents,
Division processing means for dividing document data of a predetermined file format consisting of one or a plurality of pages into divided document data for each page;
Document information storage means for storing the text data included in the divided document data in association with each divided document data,
Search request receiving means for receiving a search request for the divided document data together with the input of a keyword;
Divided document data extraction means for extracting, in response to the search request and with reference to the document information storage means, the divided document data associated with the text data including the keyword,
A combination processing means for combining the extracted divided document data to generate combined document data;
Display means for displaying the combined document data,
A document browsing terminal characterized by comprising the above means.

Conversion processing means for converting original document data into document data of the predetermined file format,
The document browsing terminal according to claim 8.

A method for browsing a document in a data format by a computer,
A process of dividing document data of a predetermined file format consisting of one or a plurality of pages into divided document data for each page;
A process of storing, in document information storage means, the text data included in the divided document data in association with each divided document data;
A process of receiving a search request for the divided document data together with the input of a keyword;
A process of extracting, in response to the search request and with reference to the document information storage means, the divided document data associated with text data including the keyword;
A process of combining the extracted divided document data to generate combined document data;
A process of displaying the combined document data,
A document browsing method characterized by comprising the above processes.

A program for causing a computer to function as a document browsing terminal for browsing a document in a data format,
The program causing the computer to execute:
A process of dividing document data of a predetermined file format consisting of one or a plurality of pages into divided document data for each page;
A process of storing, in document information storage means, the text data included in the divided document data in association with each divided document data;
A process of receiving a search request for the divided document data together with the input of a keyword;
A process of extracting, in response to the search request and with reference to the document information storage means, the divided document data associated with text data including the keyword;
A process of combining the extracted divided document data to generate combined document data;
A process of displaying the combined document data.
Computer program.