JP7095259B2 - Document processing apparatus and program

Info

Publication number: JP7095259B2
Application number: JP2017212349A
Authority: JP
Inventors: 淳大橋
Original assignee: Fuji Xerox Co Ltd; Fujifilm Business Innovation Corp
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2017-11-02
Filing date: 2017-11-02
Publication date: 2022-07-05
Anticipated expiration: 2037-11-02
Also published as: JP2019086860A

Description

The present invention relates to a document processing apparatus and a program.

Document management apparatuses and software exist that manage image data read by a scanner and document data created by applications on a personal computer (PC). Examples include document handling software such as DocuWorks (trademark), provided by the applicant, and Adobe Acrobat (trademark) from Adobe Systems. A document management apparatus of this type can import and manage image data files obtained by scanning paper documents (for example, in bitmap, TIFF, or JPEG format) as well as application files created by various applications such as word processors and spreadsheets.

A document management apparatus of this type handles files of differing data formats, created by various applications, as data described in a language that defines the appearance of a page, such as a page description language. It can also handle files in an image data format generated by a scanner, a digital camera, or the like.

Some document management apparatuses of this type, as exemplified in Patent Document 1, have functions for bundling a plurality of document files into a single document file and for breaking up ("unbundling") a single document file into a plurality of document files page by page. A document file in which an application document and an image document are bundled together is a file in which pages described in a language such as a page description language and pages in an image data format are mixed.

The apparatus disclosed in Patent Document 2 is an image processing apparatus having a function of displaying thumbnail images of elements of document data. It includes a storage device that stores document data, and a thumbnail control object addition function that adds, to the document data stored in the storage device, a thumbnail control object for controlling the thumbnail image of that document data and saves the added thumbnail control object in the storage device.

Patent Document 1: Japanese Unexamined Patent Publication No. H10-124489
Patent Document 2: Japanese Unexamined Patent Publication No. 2008-42359

Character recognition is a process that targets image data and cannot be executed on data that is not in an image data format. Suppose that, when character recognition is performed on a document in which pages in an image data format and pages not in an image data format are mixed, character recognition is performed only on the image-format pages, and neither character recognition nor any other processing is performed on the non-image-format pages. In that case, the user cannot even tell that the document contained pages on which character recognition could not be executed.

An object of the present invention is to provide, in character recognition processing, more information about pages in a document that are not in an image data format than in the case where no processing at all is performed on such pages.

The invention according to claim 1 is a document processing apparatus that, when execution of character recognition is instructed for a document in a data format in which type 1 pages in an image data format and type 2 pages not in an image data format can be mixed, executes the character recognition on the type 1 pages included in the document and executes, on the type 2 pages included in the document, type 2 page processing that is different from the character recognition, wherein the type 2 page processing is notification processing that, when the document includes a type 2 page, notifies a user that the document includes a page to which the character recognition cannot be applied.
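
The behavior described for claim 1 can be illustrated with a minimal sketch. It is not the applicant's implementation: the names `PageKind`, `Page`, and `recognize_document`, and the `ocr` and `notify` callbacks, are hypothetical and are introduced here only to show the dispatch between type 1 pages and type 2 pages.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, Iterable, List, Tuple


class PageKind(Enum):
    TYPE1_IMAGE = "image"          # page held in an image data format
    TYPE2_NON_IMAGE = "non_image"  # page held in any other format


@dataclass
class Page:
    number: int
    kind: PageKind
    data: str  # stand-in for the real page contents (assumption)


def recognize_document(
    pages: Iterable[Page],
    ocr: Callable[[Page], str],
    notify: Callable[[str], None],
) -> Tuple[Dict[int, str], List[int]]:
    """Run OCR on type 1 pages and report type 2 pages to the user."""
    recognized: Dict[int, str] = {}
    not_applicable: List[int] = []
    for page in pages:
        if page.kind is PageKind.TYPE1_IMAGE:
            recognized[page.number] = ocr(page)
        else:
            # Character recognition is not applied to type 2 pages;
            # remember them so the user can be told afterwards.
            not_applicable.append(page.number)
    if not_applicable:
        listed = ", ".join(str(n) for n in not_applicable)
        notify(f"Character recognition could not be applied to page(s): {listed}")
    return recognized, not_applicable
```

In this sketch the notification is issued once per document rather than once per page; the source does not say which granularity is used.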

The invention according to claim 2 is the document processing apparatus according to claim 1, wherein the notification processing further asks the user whether to acquire character information from the type 2 page, and the document processing apparatus controls whether to execute processing for acquiring character information from the type 2 page in accordance with the user's answer to the inquiry.

The invention according to claim 3 is the document processing apparatus according to claim 1, wherein the type 2 page processing is processing for acquiring character information from the type 2 page.

The invention according to claim 4 is the document processing apparatus according to claim 3, wherein the type 2 page processing is processing for extracting text data included in the type 2 page.

The invention according to claim 5 is a document processing apparatus that, when execution of character recognition is instructed for a document in a data format in which type 1 pages in an image data format and type 2 pages not in an image data format can be mixed, executes the character recognition on the type 1 pages included in the document and executes, on the type 2 pages included in the document, type 2 page processing that is different from the character recognition, wherein the type 2 page processing is processing that converts the type 2 page into image data and executes the character recognition on that image data.

The invention according to claim 6 is a document processing apparatus that, when execution of character recognition is instructed for a document in a data format in which type 1 pages in an image data format and type 2 pages not in an image data format can be mixed, executes the character recognition on the type 1 pages included in the document and executes, on the type 2 pages included in the document, type 2 page processing that is different from the character recognition, wherein the document processing apparatus executes, as the type 2 page processing, whichever of the following the user has selected: processing for extracting text data included in the type 2 page, and processing that converts the type 2 page into image data and executes the character recognition on that image data.
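
The user-selected processing of claim 6 can be sketched as a simple dispatch. The function name `process_type2_page`, the two method constants, and the three callbacks (`extract_text`, `rasterize`, `ocr`) are assumptions made for illustration; the source does not name any such interfaces.

```python
from typing import Callable

# Hypothetical identifiers for the two selectable methods.
EXTRACT_TEXT = "extract_text"
RASTERIZE_AND_OCR = "rasterize_and_ocr"


def process_type2_page(
    page: object,
    method: str,
    extract_text: Callable[[object], str],
    rasterize: Callable[[object], object],
    ocr: Callable[[object], str],
) -> str:
    """Obtain character information from a type 2 page by the chosen method."""
    if method == EXTRACT_TEXT:
        # Read the text data already contained in the page.
        return extract_text(page)
    if method == RASTERIZE_AND_OCR:
        # Convert the page into image data, then run character recognition on it.
        return ocr(rasterize(page))
    raise ValueError(f"unknown type 2 page processing method: {method!r}")
```

Passing the extraction, rasterization, and recognition steps in as callables keeps the sketch independent of any particular page format or OCR engine.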

The invention according to claim 7 is a document processing apparatus that, when execution of character recognition is instructed for a document in a data format in which type 1 pages in an image data format and type 2 pages not in an image data format can be mixed, executes the character recognition on the type 1 pages included in the document and executes, on the type 2 pages included in the document, type 2 page processing that is different from the character recognition, wherein the type 2 page processing is processing that obtains character information serving as the processing result for the type 2 page on the basis of text data extracted from the type 2 page and the result of the character recognition on image data into which the type 2 page has been converted.

The invention according to claim 8 is the document processing apparatus according to claim 7, wherein, in the type 2 page processing, when, for a given location in the type 2 page, both a first character corresponding to that location in the text data extracted from the type 2 page and a second character corresponding to that location in the result of the character recognition on the image data converted from the type 2 page exist, the first character is adopted as the character information for that location.

The invention according to claim 9 is the document processing apparatus according to claim 7 or 8, wherein, in the type 2 page processing, when, for a given location in the type 2 page, the text data extracted from the type 2 page contains no character corresponding to that location but the result of the character recognition on the image data converted from the type 2 page contains a character corresponding to that location, that character in the character recognition result is adopted as the character information for that location.

The invention according to claim 10 is the document processing apparatus according to any one of claims 7 to 9, wherein, in the type 2 page processing, when, for a given location in the type 2 page, the text data extracted from the type 2 page contains a character corresponding to that location but the result of the character recognition on the image data converted from the type 2 page contains no character corresponding to that location, whether to adopt that character in the text data as the character information for that location is controlled in accordance with an instruction from the user.
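
The three per-location rules of claims 8 to 10 can be written down compactly. The following is a minimal sketch under stated assumptions: a "location" is modeled as an (x, y) tuple, each source is a dictionary from location to character, and the function name `merge_characters` and the flag `keep_text_only` are hypothetical. How locations in the two sources are actually matched is not specified in this section.

```python
from typing import Dict, Tuple

Location = Tuple[int, int]  # assumed representation of a place on the page


def merge_characters(
    extracted: Dict[Location, str],
    recognized: Dict[Location, str],
    keep_text_only: bool,
) -> Dict[Location, str]:
    """Combine extracted text data with OCR results, location by location."""
    merged: Dict[Location, str] = {}
    for location in sorted(set(extracted) | set(recognized)):
        in_text = location in extracted
        in_ocr = location in recognized
        if in_text and in_ocr:
            # Claim 8: both exist, so the extracted text data is adopted.
            merged[location] = extracted[location]
        elif in_ocr:
            # Claim 9: only the OCR result has a character here.
            merged[location] = recognized[location]
        elif keep_text_only:
            # Claim 10: only the text data has a character here;
            # adoption follows the user's instruction.
            merged[location] = extracted[location]
    return merged
```

For example, with extracted text `{(0, 0): "A", (1, 0): "B"}` and OCR output `{(0, 0): "4", (2, 0): "C"}`, the location `(0, 0)` takes "A" from the text data, `(2, 0)` takes "C" from the OCR result, and `(1, 0)` is kept only when `keep_text_only` is true.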

The invention according to claim 11 is a document processing apparatus that, when execution of character recognition is instructed for a document in a data format in which type 1 pages in an image data format and type 2 pages not in an image data format can be mixed, executes the character recognition on the type 1 pages included in the document and executes, on the type 2 pages included in the document, type 2 page processing that is different from the character recognition, the apparatus comprising means for presenting a setting screen that accepts from a user a setting as to which of the following is to be executed as the type 2 page processing: notification processing that, when the document includes a type 2 page, notifies the user that the document includes a page to which the character recognition cannot be applied; and processing for acquiring character information from the type 2 page.

The invention according to claim 12 is a document processing apparatus that, when execution of character recognition is instructed for a document in a data format in which type 1 pages in an image data format and type 2 pages not in an image data format can be mixed, executes the character recognition on the type 1 pages included in the document and executes, on the type 2 pages included in the document, type 2 page processing that is different from the character recognition, the apparatus comprising means for presenting a setting screen that lets a user select from among two or more of the following as options for the type 2 page processing: notification processing that, when the document includes a type 2 page, notifies the user that the document includes a page to which the character recognition cannot be applied; processing for extracting text data included in the type 2 page; and processing that converts the type 2 page into image data and executes the character recognition on that image data.

The invention according to claim 13 is a program for causing a computer to function as means for, when execution of character recognition is instructed for a document in a data format in which type 1 pages in an image data format and type 2 pages not in an image data format can be mixed, executing the character recognition on the type 1 pages included in the document and executing, on the type 2 pages included in the document, type 2 page processing that is different from the character recognition, wherein the type 2 page processing is notification processing that, when the document includes a type 2 page, notifies a user that the document includes a page to which the character recognition cannot be applied.

The invention according to claim 14 is a program for causing a computer to function as means for, when execution of character recognition is instructed for a document in a data format in which type 1 pages in an image data format and type 2 pages not in an image data format can be mixed, executing the character recognition on the type 1 pages included in the document and executing, on the type 2 pages included in the document, type 2 page processing that is different from the character recognition, wherein the type 2 page processing is processing that converts the type 2 page into image data and executes the character recognition on that image data.

The invention according to claim 15 is a program for causing a computer to function as means for, when execution of character recognition is instructed for a document in a data format in which type 1 pages in an image data format and type 2 pages not in an image data format can be mixed, executing the character recognition on the type 1 pages included in the document and executing, on the type 2 pages included in the document, type 2 page processing that is different from the character recognition, wherein the type 2 page processing is processing that obtains character information serving as the processing result for the type 2 page on the basis of text data extracted from the type 2 page and the result of the character recognition on image data into which the type 2 page has been converted.

The invention according to claim 16 is a program for causing a computer to function as means for, when execution of character recognition is instructed for a document in a data format in which type 1 pages in an image data format and type 2 pages not in an image data format can be mixed, executing the character recognition on the type 1 pages included in the document and executing, on the type 2 pages included in the document, type 2 page processing that is different from the character recognition, the program causing the computer to function as means for presenting a setting screen that accepts from a user a setting as to which of the following is to be executed as the type 2 page processing: notification processing that, when the document includes a type 2 page, notifies the user that the document includes a page to which the character recognition cannot be applied; and processing for acquiring character information from the type 2 page.

The invention according to claim 17 is a program for causing a computer to function as means for, when execution of character recognition is instructed for a document in a data format in which type 1 pages in an image data format and type 2 pages not in an image data format can be mixed, executing the character recognition on the type 1 pages included in the document and executing, on the type 2 pages included in the document, type 2 page processing that is different from the character recognition, the program causing the computer to function as means for presenting a setting screen that lets a user select from among two or more of the following as options for the type 2 page processing: notification processing that, when the document includes a type 2 page, notifies the user that the document includes a page to which the character recognition cannot be applied; processing for extracting text data included in the type 2 page; and processing that converts the type 2 page into image data and executes the character recognition on that image data.

Configuration 1 of a reference example is a document processing apparatus comprising bundling means that, when executing bundling processing that bundles a first document including a page not in an image data format with one or more other documents into a single document, converts the page not in an image data format into an image data format so that the document resulting from the bundling processing contains no page that is not in an image data format.

Configuration 2 of the reference example is the document processing apparatus according to configuration 1 of the reference example, further comprising: second bundling means that, when executing the bundling processing that bundles the first document including a page not in an image data format with one or more other documents into a single document, leaves the page not in an image data format in the document resulting from the bundling processing; and means for accepting from a user, when the bundling processing is instructed, a designation of which of the bundling means and the second bundling means is to be used.

Configuration 3 of the reference example is a program for causing a computer to function as bundling means that, when executing bundling processing that bundles a first document including a page not in an image data format with one or more other documents into a single document, converts the page not in an image data format into an image data format so that the document resulting from the bundling processing contains no page that is not in an image data format.
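
The bundling behavior of the reference example can be sketched as follows. This is an illustration only: the page record layout (`{"kind": ..., "data": ...}`), the function name `bundle`, the `rasterize` callback, and the `keep_non_image_pages` flag are all hypothetical. With the flag left at its default the sketch behaves like the bundling means of configurations 1 and 3; setting it corresponds to the second bundling means of configuration 2.

```python
from typing import Callable, Dict, List

# Hypothetical page record: {"kind": "image" or "app", "data": <page contents>}
PageRecord = Dict[str, object]


def bundle(
    documents: List[List[PageRecord]],
    rasterize: Callable[[PageRecord], object],
    keep_non_image_pages: bool = False,
) -> List[PageRecord]:
    """Bundle several documents into one list of pages."""
    bundled: List[PageRecord] = []
    for document in documents:
        for page in document:
            if page["kind"] != "image" and not keep_non_image_pages:
                # Convert the page to an image data format so that the
                # bundled result contains image-format pages only.
                page = {"kind": "image", "data": rasterize(page)}
            bundled.append(page)
    return bundled
```

The input page records are not modified; a converted page is appended as a new record.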

According to the invention of claim 1 or 13, compared with merely refraining from executing character recognition on type 2 pages in a document that are not in an image data format, information obtained through the type 2 page processing can be provided to the user.

Furthermore, the user can learn that the document subjected to character recognition includes a page to which character recognition cannot be applied.

According to the invention of claim 2, the user can instruct that character information be acquired by another method from a page to which character recognition cannot be applied.

According to the invention of claim 3, character information acquired from a type 2 page by a method other than character recognition can be provided to the user.

According to the invention of claim 4, the text data contained in a type 2 page, which is character information more accurate than that obtained by character recognition, can be provided to the user as the character information of the type 2 page.

According to the invention of claim 5, character information can be obtained even when an image that appears to the human eye as characters when the type 2 page is displayed is represented within the data of the type 2 page in a form other than text data.

According to the invention of claim 6 or 14, the method desired by the user can be used: either extracting text data from the type 2 page, or converting the type 2 page into image data and performing character recognition.

According to the invention of claim 7 or 15, more accurate information can be provided than when only one of the text data extracted from the type 2 page and the result of character recognition on the image data converted from the type 2 page is adopted.

According to the invention of claim 8, when the text data and the character recognition result for the same location in a type 2 page differ, the character information from the text data, which is accurate, can be adopted.

According to the invention of claim 9, character information can be obtained for a location in a type 2 page where the text data has no character but a character is recognized by character recognition.

According to the invention of claim 10, for a location in a type 2 page where the text data has a character but no character is recognized by character recognition, the user's intention can be reflected in the decision as to whether to adopt the character information from the text data.

According to the invention of claim 11, 12, 16, or 17, the user can specify which processing is performed as the type 2 page processing.

According to configuration 1 or 3 of the reference example, a document in which character recognition is applicable to every page can be obtained as the result of the bundling processing.

According to configuration 2 of the reference example, whether to obtain, as the result of the bundling processing, a document in which character recognition is applicable to every page or a document that retains the non-image-format pages contained in the documents before bundling can be decided according to the user's wishes.

A diagram illustrating the schematic configuration of a document handling system to which the character recognition processing control of the embodiment is applied.
A diagram showing an example of a document list screen provided by the document handling system.
A diagram showing an example of a setting screen for a certain setting item of the document handling system.
A diagram showing an example of the control procedure executed by the processing control unit when the user instructs execution of character recognition (OCR).
A diagram showing an example of the screen displayed by the processing control unit after character recognition has been executed on a document containing an app page.
A diagram showing another example of the control procedure executed by the processing control unit when the user instructs execution of character recognition (OCR).
A diagram showing an example of the confirmation screen displayed by the processing control unit when character recognition is executed on a document containing an app page.
A diagram showing another example of the confirmation screen displayed by the processing control unit when character recognition is executed on a document containing an app page.
A diagram showing an example of the processing by which the processing control unit automatically selects the method of acquiring text from an app page.
A diagram showing another example of the processing by which the processing control unit automatically selects the method of acquiring text from an app page.
A diagram showing yet another example of the processing by which the processing control unit automatically selects the method of acquiring text from an app page.
A diagram showing yet another example of the processing by which the processing control unit automatically selects the method of acquiring text from an app page.
A diagram showing another example of a setting screen for a certain setting item of the document handling system.

FIG. 1 illustrates the schematic configuration of a document handling system 10 to which the character recognition processing control according to the present invention is applied. The document handling system 10 is software that, like DocuWorks (trademark) provided by the applicant or Adobe Acrobat (trademark) from Adobe Systems, provides processing functions such as viewing, editing, and annotating (adding colored markers, sticky notes, stamps, and the like) registered document files. This software implements the functions of the document handling system 10 by being executed on a computer such as a PC (personal computer) or a server. The document handling system 10 may be a personal system installed on a PC, or may be built as a server that provides a document handling service to remote users.

The document handling system 10 includes, as functional modules, a file import unit 12, a document processing unit 14, a UI processing unit 16, a setting management unit 18, and a character recognition processing unit 20.

The file import unit 12 is responsible for importing (that is, registering) files generated outside the document handling system 10 into the document handling system 10. The files to be imported include files in an application-specific data format generated by application software such as a word processor or a spreadsheet (called application files), and image files in an image data format such as bitmap, TIFF, or JPEG generated by a scanner, a digital camera, or the like.

When an application file is input as an import target, the file import unit 12 converts it into a document file described in a specific data description language used by the document handling system 10. The data description language is a PDL (page description language) or another kind of language that, like a page description language, can describe the appearance of a document (image); the PDF data format is one example. A document file described in this data description language can contain several kinds of objects, such as text, vector graphics (figures described in vector representation), and continuous-tone images (including those in data formats such as bitmap, TIFF, and JPEG). A document file described in this data description language is hereinafter called an "application document", and each page of an application document is called an "application page". An application page can contain objects such as text, vector graphics, and continuous-tone images. Although an application page may contain objects in an image data format, the data format of the application page itself is the one defined by the data description language, not an image data format.

When an image file is input as an import target, the file import unit 12 imports that file as an "image document". An image document is a document file in which each page is data in an image data format such as bitmap, TIFF, or JPEG. Each page of an image document is called an "image page".

In this way, the file import unit 12 plays a role similar to that of a printer driver or a virtual printer.

A document file imported by the file import unit 12 is stored in one of the folders managed by the document handling system 10 (for example, a folder that the user specifies or has set in advance).

The document processing unit 14 executes processing instructed by the user on document files stored in folders managed by the document handling system 10. For example, when the user instructs that a document file be viewed, the document processing unit 14 opens the document file and displays images of its pages on the screen. The document processing unit 14 also adds or deletes annotations on an opened document file in response to instructions from the user. In addition, the document processing unit 14 has functions for "splitting" and "bundling" documents. Splitting is a process that divides one document file into a first document file consisting of the pages up to and including a specified page, and a second document file consisting of the pages from the page following the specified page onward. Bundling is a process that combines multiple document files specified by the user into one document file. For example, when two document files are bundled, the resulting document file contains both the pages of the first document file and the pages of the second document file. When an application document and an image document are bundled, in one example, a document file in which application pages and image pages are mixed is generated. For each page it contains, a document file holds, as one item of that page's attribute information, type information indicating whether the page is an application page or an image page.
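The splitting and bundling operations and the per-page type information can be sketched as follows. This is only an illustrative model: the names `PageKind`, `Page`, `split_document`, and `bundle_documents` are hypothetical and do not appear in the description, and a document file is simplified to a list of pages.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class PageKind(Enum):
    """Per-page type information (hypothetical representation)."""
    APPLICATION = "application"  # page described in the data description language
    IMAGE = "image"              # page held as image data (bitmap, TIFF, JPEG, ...)


@dataclass(frozen=True)
class Page:
    kind: PageKind  # whether this is an application page or an image page
    content: str    # placeholder standing in for the actual page data


def split_document(pages: List[Page], specified_page: int) -> Tuple[List[Page], List[Page]]:
    """Split into pages up to and including the 1-based `specified_page`, and the rest."""
    if not 1 <= specified_page < len(pages):
        raise ValueError("the split must leave at least one page in each file")
    return pages[:specified_page], pages[specified_page:]


def bundle_documents(*documents: List[Page]) -> List[Page]:
    """Bundle documents by concatenating their pages in the given order."""
    bundled: List[Page] = []
    for document in documents:
        bundled.extend(document)
    return bundled
```

Bundling an application document with an image document in this model yields a list in which both page kinds are mixed, which is the situation the character recognition control described below has to handle.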

The UI processing unit 16 provides UI (user interface) screens for operating on document files and accepts the user's operations on those screens. For example, the UI processing unit 16 generates the document management screen 100 illustrated in FIG. 2 and displays it on the display device of the computer on which the document handling system 10 is installed. The illustrated document management screen 100 lists icons 102, 104, and 106 of the document files in the currently open folder. In this example, each of the icons 102, 104, and 106 is a thumbnail image of the first page of the corresponding document file. For example, the user can instruct bundling of two document files by dragging and dropping the icon of one document file onto the icon of another. The user can also select a process to perform on a document file from a menu on the document management screen 100 (not shown) or from a context menu invoked by an operation such as right-clicking an icon with the mouse.

The setting management unit 18 accepts input for the various setting items of the document handling system 10 and holds the values entered. Examples of setting items managed by the setting management unit 18 include how application pages are handled when documents are bundled, and how application pages are handled during OCR (optical character recognition, hereinafter also simply called character recognition) processing.

FIG. 3 illustrates a setting screen 200 that the UI processing unit 16 displays to accept settings for these items. In the setting screen 200, the option field 202 for the handling of application pages when bundling documents shows two options: a first option, "Bundle as is", and a second option, "Convert application pages to image pages and bundle". The first option bundles the application pages contained in the document files to be bundled as application pages. Application pages remain in the bundled document file obtained by this process. The second option converts the application pages contained in the document files to be bundled into image pages before bundling. The bundled document file obtained by this process consists of image pages and contains no application pages.

When the second option in the option field 202 is selected and the user performs an operation to bundle document files, the document processing unit 14 converts all application pages contained in those document files into image pages using the rasterizer 28 or the like. The resulting bundled document file consists of image pages and contains no application pages, so character recognition processing can be applied to every page.

The circular figure at the head (left end) of each option in the option field 202 is a radio button. The user selects the radio button of the desired option with a mouse click or a touch operation.

The setting screen 200 also shows, in the option field 204 for the handling of application pages at OCR time, three options: a first option, "Display a message that OCR could not be performed", a second option, "Convert to image pages and perform OCR", and a third option, "Extract the text in the application page and use it as the OCR result". Here, OCR (character recognition) is a process that recognizes characters contained in an image (image data) by pattern matching or the like, and it cannot be applied directly to data other than image data. Accordingly, OCR can be performed on the image pages in a document file but not on its application pages. The first option in the option field 204 performs OCR only on the image pages in the document file targeted for OCR and, after OCR finishes, displays a message on the screen indicating that the file contained pages to which OCR could not be applied. The second option converts the application pages in the document file into image pages and then performs OCR on all pages. The third option extracts the text data contained in the text objects in each application page and outputs the extracted text data as the character recognition result.

In the illustrated example, the option field 204 contains three options, but it need not contain all three, and it may contain options other than these three.

Returning to FIG. 1, the character recognition processing unit 20 is the functional module responsible for OCR (character recognition) processing. The character recognition processing unit 20 includes a processing control unit 22, a character recognition engine 24, a text extraction unit 26, and a rasterizer 28. The processing control unit 22 controls the character recognition engine 24, the text extraction unit 26, and the rasterizer 28 to realize the functions of the character recognition processing unit 20. The character recognition engine 24 performs character recognition on image data using a known OCR algorithm. The text extraction unit 26 extracts text data from application pages. The rasterizer 28 converts an application page described in the data description language into raster data (a bitmap image).

FIG. 4 shows an example of the control procedure that the processing control unit 22 executes when the user instructs execution of character recognition (OCR). In this procedure, the processing control unit 22 first examines the attributes of each page in the document file designated as the target of character recognition and acquires the type information of each page (S10). This reveals whether each page in the document file is an application page or an image page. Based on the information acquired in S10, the processing control unit 22 determines whether the document file contains one or more application pages (S12). If the result of this determination is No, that is, if all pages in the document file are image pages, known character recognition processing is performed on all of those pages (S22). The character recognition result data for each page is stored in association with the corresponding page.

If it is determined in S12 that the document file contains one or more application pages, the processing control unit 22 acquires from the setting management unit 18 the setting item for the handling of application pages during character recognition (OCR) processing (see reference numeral 204 in FIG. 3). It then determines whether the value of that setting item is the first option (message display), the second option (OCR after image conversion), or the third option (text extraction) described above (S14).

If the value of the setting item indicates the first option (message display), the processing control unit 22 has the character recognition engine 24 process each image page in the document file and stores the resulting character recognition data for each of those pages (S16). If the document file consists only of application pages, no page is recognized in S16. When the character recognition processing of S16 is complete, the processing control unit 22 displays an indication that the document file designated as the current target contained pages to which character recognition could not be applied (S18).

FIG. 5 shows an example of the screen 300 displayed at this time. From this screen, the user who instructed character recognition understands that the designated document file contains pages with no character recognition result. The screen 300 may also display further information about the pages to which character recognition could not be applied. Such information includes, for example, a list of the page numbers to which character recognition could not be applied, or the proportion of unrecognized pages to the total number of pages in the document. If the document file contains only application pages and no image pages, a screen indicating that the file contained no pages on which OCR is possible may be displayed in place of the screen illustrated in FIG. 5.

The screen 300 may also ask the user whether to attempt to acquire text information from the pages to which character recognition could not be applied (the application pages). Methods for acquiring text from an application page include converting the page into an image and performing character recognition on it, and acquiring the text data held by the text objects in the application page (see the other examples described later for details). On the screen 300, the user may be asked to choose how to handle the pages to which character recognition could not be applied from among the following options: do nothing, convert to an image and perform character recognition, or acquire the text data held by the text objects in the application page. If the user selects one of the text acquisition methods, the processing control unit 22 acquires text information from each application page by that method.

If S14 finds that the value of the setting item is the second option (OCR after image conversion), the processing control unit 22 has the rasterizer 28 process each application page in the document file, thereby converting each of those application pages into an image page (S20). As a result, every page of the target document file is an image page. The processing control unit 22 then has the character recognition engine 24 perform character recognition processing on all of those image pages (S22).

Converting all application pages in the document file into image pages and then performing character recognition on all pages of the file is only one example of the flow. Alternatively, the document file may be processed one page at a time in page order: an application page is converted into an image page and then recognized, whereas an image page is simply recognized.

Each application page in the document file may be replaced with the image page obtained in S20. In this case, every page of the document file stored in the document handling system 10 after the character recognition processing is an image page. Whether to perform this replacement during character recognition processing may be asked of the user at the point when S12 determines that the file contains application pages, or it may be made configurable in advance as one of the setting items of the setting management unit 18.

If S14 finds that the value of the setting item is the third option (text extraction), the processing control unit 22 has the text extraction unit 26 extract the text data in each application page of the document file (S24). The text data extracted from an application page is then stored as the character recognition result corresponding to that application page.
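The branching of FIG. 4 can be summarized in a short sketch. It follows the page-by-page variant mentioned above and makes several assumptions that are not in the description: pages are modeled as `(kind, data)` tuples, the names `AppPagePolicy` and `run_character_recognition` are hypothetical, the functions `ocr`, `rasterize`, and `extract_text` are caller-supplied stand-ins for the character recognition engine 24, the rasterizer 28, and the text extraction unit 26, and image pages are assumed to be recognized normally under the third option.

```python
from enum import Enum
from typing import Callable, Dict, List, Tuple


class AppPagePolicy(Enum):
    """Value of the setting item in option field 204 (hypothetical names)."""
    SHOW_MESSAGE = 1        # first option: recognize image pages only, then report
    RASTERIZE_THEN_OCR = 2  # second option: convert to image, then recognize
    EXTRACT_TEXT = 3        # third option: use embedded text as the result


Page = Tuple[str, str]  # (kind, data); kind is "app" or "image" (assumed model)


def run_character_recognition(
    pages: List[Page],
    policy: AppPagePolicy,
    ocr: Callable[[str], str],
    rasterize: Callable[[str], str],
    extract_text: Callable[[str], str],
) -> Tuple[Dict[int, str], List[int]]:
    """Return per-page results and the indices of pages left unrecognized."""
    results: Dict[int, str] = {}
    skipped: List[int] = []
    for index, (kind, data) in enumerate(pages):
        if kind == "image":
            results[index] = ocr(data)             # S16 / S22
        elif policy is AppPagePolicy.RASTERIZE_THEN_OCR:
            results[index] = ocr(rasterize(data))  # S20 followed by S22
        elif policy is AppPagePolicy.EXTRACT_TEXT:
            results[index] = extract_text(data)    # S24
        else:
            skipped.append(index)                  # reported to the user in S18
    return results, skipped
```

A non-empty `skipped` list corresponds to the situation in which the screen of FIG. 5 would be shown; the list itself could feed the optional display of unrecognized page numbers.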

The above description took as an example the case where a whole document file is selected as the target of character recognition, but this is only one example. The same processing may be performed when a subset of the pages in a document file is selected as the target, or when only a specific region within a page (for example, a region whose bounding rectangle is specified by the user through the positions of its four corners) is selected as the target, for all pages of the document file or for a selected subset of pages. The point that the target of character recognition may be either the entire document file or a part of it (that is, a subset of pages, or a region within a page) also applies to the other variations described below.

In the above example, the handling of application pages is determined according to settings made in advance. As another example, after character recognition processing has started, the user may be asked how to handle application pages when it is found that the target document file contains them. FIG. 6 shows a procedure along these lines. In the procedure of FIG. 6, the setting for the handling of application pages in character recognition does not need to be registered in the setting management unit 18 in advance.

In the procedure of FIG. 6, the processing control unit 22 first examines the attributes of each page in the document file designated as the target of character recognition and acquires the type information of each page (S30). Based on the information acquired in S30, the processing control unit 22 determines whether all pages in the document file are application pages (S32). If the result of S32 is Yes, the document file contains no image page on which character recognition is possible, so the processing control unit 22 displays, via the UI processing unit 16, a screen indicating that the document file specified by the user contains no pages on which character recognition is possible (S34).

If the result of S32 is No, the processing control unit 22 determines whether the document file contains any application pages (S36). If the result of this determination is No, all pages in the document file are image pages. In this case, the processing control unit 22 has the character recognition engine 24 perform character recognition processing on all pages of the document file (S38).

If the result of S36 is Yes, the processing control unit 22 displays, via the UI processing unit 16, the confirmation screen 320 illustrated in FIG. 7 (S40). The confirmation screen 320 displays a message that the character recognition target specified by the user contains pages on which character recognition is not possible, a query asking whether character recognition may be performed only on the pages where it is possible, and two buttons ("OK" and "Cancel") for answering the query. The user selects the "OK" button if recognizing only the recognizable pages is acceptable, and otherwise selects the "Cancel" button. The processing control unit 22 determines which button the user selected (S42). If "OK" was selected, it has the character recognition engine 24 process only the image pages in the document file (S44). If "Cancel" was selected, it skips S44 and ends the processing. In that case, no character recognition is performed on the document file at all.
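The decision structure of FIG. 6 can be sketched as below. The sketch is an assumption-laden simplification: page kinds are modeled as the strings `"app"` and `"image"`, the function name `plan_recognition` and the returned status strings are hypothetical, and the `confirm` callback stands in for the user's choice between "OK" and "Cancel" on the confirmation screen 320.

```python
from typing import Callable, List, Tuple


def plan_recognition(
    page_kinds: List[str],
    confirm: Callable[[], bool],
) -> Tuple[str, List[int]]:
    """Decide which page indices to recognize, following the FIG. 6 branches."""
    image_pages = [i for i, kind in enumerate(page_kinds) if kind == "image"]

    # S32 Yes: every page is an application page, so nothing can be recognized (S34).
    if not image_pages:
        return "no_recognizable_pages", []

    # S36 No: there are no application pages, so recognize every page (S38).
    if len(image_pages) == len(page_kinds):
        return "recognize_all", image_pages

    # S40 / S42: mixed document, so ask the user before recognizing image pages only.
    if confirm():
        return "recognize_image_pages_only", image_pages  # S44
    return "cancelled", []                                # S44 is skipped
```

Separating the decision from the actual recognition, as done here, keeps the user interaction in one place; the caller would pass the returned indices to the recognition engine.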

In the procedure of FIG. 6 described above, a document file containing application pages either undergoes no character recognition at all or undergoes character recognition only on its image pages. This procedure is only one example. Alternatively, as in the procedure of FIG. 4, application pages may also be subjected to character recognition or to processing that yields an equivalent result.

For example, if it is determined from the information acquired in S30 that the character recognition target contains application pages, the processing control unit 22 may display the confirmation screen 340 illustrated in FIG. 8. The confirmation screen 340 displays a message that the character recognition target specified by the user contains pages on which character recognition is not possible, a message requesting that the user select how to handle those pages, an option field 342 showing the available choices, an "OK" button, and a "Cancel" button. The options shown in the option field 342 are similar to the three options shown in the option field 204 of the setting screen 200 illustrated in FIG. 3. That is, the first option performs no character recognition on pages where character recognition (OCR) is not possible (application pages) and performs character recognition only on image pages. The second option converts the pages where character recognition is not possible into image pages and performs character recognition on them. The third option extracts the text data contained in the pages where character recognition is not possible and uses the extracted text data as the character recognition result. The user selects one desired process from the option field 342 of the confirmation screen 340 and presses the "OK" button to instruct the processing control unit 22 to execute it. For example, if the first option is selected, the processing control unit 22 has the character recognition engine 24 process only the image pages in the document file. If the second option is selected, the processing control unit 22 first has the rasterizer 28 process the application pages in the document file and then has the character recognition engine 24 process the resulting image data. If the third option is selected, the processing control unit 22 has the text extraction unit 26 extract text data from each application page and uses the extracted text data as the character recognition result for that page.

Next, further variations are described. In the examples described above, the user selects which method to use for acquiring text information from an application page: converting the page into an image and performing character recognition, or extracting text from the text objects in the application page. In the examples described below, by contrast, the processing control unit 22 makes this selection automatically.

FIG. 9 shows an example of the process by which the processing control unit 22 selects the text acquisition method. The processing control unit 22 performs the process of FIG. 9 for each application page within the designated character recognition target. That is, the processing control unit 22 uses the text extraction unit 26 to extract text data from the text objects in the application page (S50) and determines whether the number of characters in the extracted text data is greater than or equal to a preset threshold (S52). If the number of characters in the extracted text is greater than or equal to the threshold, the processing control unit 22 stores the extracted text as the character recognition result corresponding to that application page (S54). If the number of characters in the extracted text is less than the threshold, the processing control unit 22 uses the rasterizer 28 to convert the application page into image data and has the character recognition engine 24 perform character recognition on that image data (S56).

The process of FIG. 9 addresses the fact that the parts of an application page that look like characters when the page is displayed (rendered) are not necessarily represented as text in the data of the application page. That is, when an application passes an application file it created to the file import unit 12 of the document handling system 10 (or to a virtual printer with equivalent functions), it may decompose characters in the text of that file into multiple vector graphics before input. This is done, for example, to ensure that the appearance of the characters is reproduced faithfully. When characters are decomposed into vector graphics and input to the file import unit 12 in this way, they are also represented as vector graphics in the application pages of the application document that the file import unit 12 generates. Such characters therefore do not exist as text data in the application page, yet when the application page is displayed (rendered), images that look like those characters appear. If many of the "characters" in an application page (image objects that look like characters to the eye) are represented as vector graphics, extracting text from that application page yields text with very few characters.

Therefore, in the procedure of FIG. 9, when the number of characters extracted from an application page is too small (that is, less than the threshold), the application page is converted into an image and character recognition is then performed on it. If the application page contains more "characters" than can be extracted as text (because many of them are represented as vector graphics), converting the page into an image and performing character recognition makes it possible to detect many of those "characters".

After S56, the number of characters m recognized by the character recognition process may be compared with the number of characters n extracted in S50, to confirm that character recognition yielded sufficiently more characters than the extracted text. In this example, if m is sufficiently larger than n (that is, the difference between them is equal to or greater than a certain threshold), the character recognition result is adopted as the final character recognition result; otherwise, the extracted text is adopted as the final character recognition result.
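The selection logic of FIG. 9, together with the optional confirmation after S56, can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual implementation: the function name `acquire_text_fig9`, the callables `extract_text`, `rasterize`, and `recognize` (standing in for the text extraction unit 26, the rasterizer 28, and the character recognition engine 24), and the values of `min_chars` and `confirm_margin` are all hypothetical names and defaults introduced here.

```python
from typing import Any, Callable, Optional


def acquire_text_fig9(
    page: Any,
    extract_text: Callable[[Any], str],    # hypothetical stand-in for text extraction unit 26
    rasterize: Callable[[Any], Any],       # hypothetical stand-in for rasterizer 28
    recognize: Callable[[Any], str],       # hypothetical stand-in for character recognition engine 24
    min_chars: int = 20,                   # assumed value of the preset threshold tested in S52
    confirm_margin: Optional[int] = None,  # assumed threshold for the optional check after S56
) -> str:
    """Return the text adopted for one application page, following FIG. 9."""
    extracted = extract_text(page)                # S50
    if len(extracted) >= min_chars:               # S52
        return extracted                          # S54
    recognized = recognize(rasterize(page))       # S56
    if confirm_margin is None:
        return recognized
    # Optional confirmation: adopt the OCR result only if m exceeds n by the margin.
    m, n = len(recognized), len(extracted)
    return recognized if m - n >= confirm_margin else extracted
```

Passing the three units as callables keeps the sketch independent of any particular extraction or OCR library; leaving `confirm_margin` as `None` gives the basic S50 to S56 flow without the confirmation step.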

FIG. 10 shows the processing procedure of another example of automatic selection of the method for acquiring text from an application page.

In the procedure of FIG. 10, the processing control unit 22 uses the text extraction unit 26 to extract text data from the text objects in the application page (S60). The processing control unit 22 also converts the application page into image data using the rasterizer 28 and has the character recognition engine 24 perform character recognition on that image data (S62). (S60 and S62 need not be performed in the order illustrated.) The processing control unit 22 then compares the number of characters in the text extracted in S60 with the number of characters in the text obtained by the character recognition of S62, and adopts whichever has more characters as the final character recognition result (S64). To allow for the possibility of character recognition errors, S64 may instead adopt the character recognition result of S62 only when the number of characters obtained by the character recognition of S62 is sufficiently larger than the number of characters in the text extracted in S60 (that is, when the difference between the two is equal to or greater than a certain threshold), and otherwise adopt the text extraction result of S60.
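The comparison in S64 can be sketched as follows. This is an illustrative sketch only: the names `choose_by_length` and `acquire_text_fig10`, the callables standing in for units 26, 28, and 24, and the `margin` parameter are assumptions. The text does not say what happens when both results have the same length; this sketch assumes the extracted text is kept in that case.

```python
from typing import Any, Callable


def choose_by_length(extracted: str, recognized: str, margin: int = 1) -> str:
    """S64: adopt the OCR text only if it is longer than the extracted text by at least `margin`.

    margin=1 corresponds to "whichever has more characters" (ties keep the
    extracted text, which is an assumption); a larger margin corresponds to the
    stricter variant that allows for recognition errors.
    """
    return recognized if len(recognized) - len(extracted) >= margin else extracted


def acquire_text_fig10(
    page: Any,
    extract_text: Callable[[Any], str],  # hypothetical stand-in for text extraction unit 26
    rasterize: Callable[[Any], Any],     # hypothetical stand-in for rasterizer 28
    recognize: Callable[[Any], str],     # hypothetical stand-in for character recognition engine 24
    margin: int = 1,
) -> str:
    extracted = extract_text(page)              # S60
    recognized = recognize(rasterize(page))     # S62 (order of S60 and S62 is interchangeable)
    return choose_by_length(extracted, recognized, margin)  # S64
```

Unlike the FIG. 9 flow, both acquisition methods always run here, so the cost of rasterization and character recognition is paid for every application page.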

FIG. 11 shows the processing procedure of yet another example of automatic selection of the method for acquiring text from an application page.

In the procedure of FIG. 11, the processing control unit 22 uses the text extraction unit 26 to extract text data from the text objects in the application page, and also identifies the position and dimensions of the text region, within the whole area of the application page, in which each text object lies (S70). The position and dimensions of a text region can be obtained from parameters contained in the text object (for example, those defining the range into which the text is flowed). The processing control unit 22 also converts the application page into image data using the rasterizer 28, and performs character recognition on the part of that image data lying outside the text regions described above (S72). The processing control unit 22 then adopts both the text extracted in S70 and the text obtained by the character recognition of S72 as the final character recognition result for the application page (S74).
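One way to restrict character recognition to the area outside the text regions is to blank those regions in the rasterized page before recognition, as sketched below. Blanking is an assumed technique, since the text does not say how the restriction is achieved. The `Rect` type, the list-of-rows image representation, the background value 255, and the callables standing in for units 26, 28, and 24 are likewise assumptions.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple

Image = List[List[int]]  # rows of grayscale pixels; 255 is assumed to be blank background


@dataclass(frozen=True)
class Rect:
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive


def blank_out(image: Image, regions: Sequence[Rect], background: int = 255) -> Image:
    """Return a copy of `image` with every text region painted as background."""
    masked = [row[:] for row in image]
    for r in regions:
        for y in range(max(r.top, 0), min(r.bottom, len(masked))):
            row = masked[y]
            for x in range(max(r.left, 0), min(r.right, len(row))):
                row[x] = background
    return masked


def acquire_text_fig11(
    page: Any,
    extract_with_regions: Callable[[Any], Tuple[str, Sequence[Rect]]],  # stand-in for unit 26
    rasterize: Callable[[Any], Image],                                  # stand-in for rasterizer 28
    recognize: Callable[[Image], str],                                  # stand-in for engine 24
) -> Tuple[str, str]:
    extracted, regions = extract_with_regions(page)       # S70: text plus its regions
    image = rasterize(page)                               # S72: render the page
    recognized = recognize(blank_out(image, regions))     # S72: OCR outside the text regions
    return extracted, recognized                          # S74: both are adopted
```

Returning the two texts separately leaves it to the caller to decide how they are stored or merged, which the text does not specify.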

FIG. 12 shows the processing procedure of still another example of automatic selection of the method for acquiring text from an application page.

This example takes into account the fact that text contained in an application page is not necessarily visible to the human eye when the page is displayed (rendered).

For example, if an application sets text in a document it has created to be completely transparent, or sets the color of that text to the same color as the background, the text becomes invisible to the human eye. Likewise, if a text object is covered by another, opaque object, the text object can no longer be seen (what is visible when the page is displayed is the other object). The author of an application file may use these methods to leave text information in the file in a form that third parties cannot see. In addition, some applications leave character strings that were deleted by a user's editing operation in the page as transparent characters, as a kind of history.

This kind of "invisible text" cannot be recognized by converting the application page into an image (that is, rendering it) and performing character recognition, but it can be extracted from the application page by the text extraction unit 26. Therefore, when an application page containing "invisible text" is included in the target of character recognition processing, the question arises of which should take priority in the result of that processing: appearance (no characters are displayed) or content (text data exists). If appearance is given priority, the text extracted from the application page is discarded; if content is given priority, the extracted text is adopted as the character recognition result.

In the example of FIG. 12, the choice between the two is left to the judgment of the user who instructs the character recognition processing. That is, the user registers in the setting management unit 18 the value of a setting item indicating which of the two is to be given priority. Alternatively, the user selects whether to give priority to appearance or to content on a confirmation screen that the UI processing unit 16 displays when "invisible text" is detected in an application page.

In the procedure of FIG. 12, the processing control unit 22 uses the text extraction unit 26 to extract text data from the text objects in the application page (S80). The processing control unit 22 also converts the application page into image data using the rasterizer 28 and has the character recognition engine 24 perform character recognition on that image data (S82). Then, for each region of the application page in which at least one of a character in the text extracted in S80 or a character recognized in S82 is present, the processing of S84 to S92 is performed.

Because the OCR algorithm detects the position within the page of each recognized character, the region occupied by each character in the character recognition result can be identified from that position information. An application page, for its part, holds information indicating the region of each text object within the page (defined, for example, as a rectangular region) together with information such as the font size, character pitch, and line spacing of the characters in the text object. The position of each character in a text object can be calculated from this information. From these pieces of information, the regions in which at least one of a character extracted from a text object or a recognized character is present can be identified. A region in which neither a character extracted from a text object nor a recognized character is present contains no characters at all and is outside the scope of the processing of FIG. 12.
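As a rough illustration of how character positions might be calculated from a text object's parameters, the sketch below lays characters out on a fixed grid. It is deliberately simplified and rests on assumptions not found in the text: horizontal left-to-right writing, a constant character pitch (`advance`), a constant line pitch, wrapping only at the right edge of the rectangle, and no explicit line breaks. The function name and parameter names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom) in page units


def char_boxes(
    text: str,
    left: float,        # left edge of the text object's rectangle
    top: float,         # top edge of the text object's rectangle
    width: float,       # width of the rectangle the text is flowed into
    font_size: float,   # used here as the height of each character box
    advance: float,     # character pitch: horizontal step per character
    line_pitch: float,  # vertical step from one line to the next
) -> List[Tuple[str, Box]]:
    """Estimate a bounding box for each character of a text object."""
    per_line = max(1, int(width // advance))  # characters that fit on one line
    boxes: List[Tuple[str, Box]] = []
    for index, ch in enumerate(text):
        row, col = divmod(index, per_line)
        x = left + col * advance
        y = top + row * line_pitch
        boxes.append((ch, (x, y, x + advance, y + font_size)))
    return boxes
```

Boxes obtained this way could be compared with the character positions reported by the OCR algorithm to decide which regions contain an extracted character, a recognized character, or both; proportional fonts or vertical writing would need a more detailed layout model.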

In S84, the processing control unit 22 determines whether a character extracted by the text extraction unit 26 is present in the region currently under consideration. If the result of this determination is No, the region contains a character recognized by the character recognition process. In this case, the processing control unit 22 adopts the recognized character as the character recognition result for that region (S86).

If it is determined in S84 that a character extracted by the text extraction unit 26 is present in the region currently under consideration, the processing control unit 22 determines whether a recognized character is present in that region (S88). If the result of S88 is Yes, the region contains both a character extracted by the text extraction unit 26 and a character recognized by the character recognition engine 24. Because the result of character recognition may be erroneous, in this case the character extracted by the text extraction unit 26 is adopted as the character recognition result for that region (S90).

If the result of S88 is No, the region contains a character of the extracted text but no recognized character. This is the "invisible text" case described above. In this case, the processing control unit 22 determines whether to give priority to appearance or to content, either by asking the user or on the basis of the setting registered in the setting management unit 18. If appearance is given priority, the region is treated as containing no characters (that is, the character recognition result is "no characters"); if content is given priority, the text that the text extraction unit 26 extracted for that region is adopted as the character recognition result for the region (S92).

In the processing illustrated in FIG. 12, when a character extracted from a text object in the application page conflicts with a character recognized by the character recognition process, the character extracted from the text object, which is accurate, is in principle adopted. However, when the character recognition process recognizes no character at a location from which a character was extracted from a text object, the character is "invisible text", and giving priority to the extracted text across the board is not necessarily appropriate, so the user is asked to decide.
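The per-region decision of S84 to S92 can be summarized in a short sketch. The names `Priority` and `decide_region` are hypothetical, an empty string is assumed to mean that no character is present in the region, and the user's choice (whether taken from the setting management unit 18 or from the confirmation screen) is assumed to arrive as a `Priority` value.

```python
from enum import Enum


class Priority(Enum):
    APPEARANCE = "appearance"  # trust what is visible when the page is rendered
    CONTENT = "content"        # trust the text data stored in the page


def decide_region(extracted: str, recognized: str, priority: Priority) -> str:
    """Return the text adopted for one region, following S84 to S92 of FIG. 12."""
    if not extracted:                 # S84: No, only a recognized character exists
        return recognized             # S86
    if recognized:                    # S88: Yes, both exist
        return extracted              # S90: extracted text is treated as accurate
    # S88: No, this is the "invisible text" case
    if priority is Priority.CONTENT:
        return extracted              # S92: content takes priority
    return ""                         # S92: appearance takes priority ("no characters")
```

For example, a region where the text object holds "B" but the character recognition read "8" yields "B" regardless of the priority setting, while a region with hidden text yields that text only under `Priority.CONTENT`.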

The character recognition processing unit 20 may be programmed to always apply one of the automatic text acquisition method selection processes described with reference to FIGS. 9 to 12 to application pages.

Alternatively, automatic selection of the text acquisition method may be presented to the user as one of the options for processing application pages, so that the user can choose it. For example, in the setting screen 200a shown in FIG. 13, the option field 204a for processing of application pages during OCR lists automatic selection of the text acquisition method as a fourth option, in addition to the first to third options shown in FIG. 3. Because the method of the fourth option takes noticeably longer to process than the other options, the description of the fourth option in the option field 204a carries a note to that effect. A user who wants to obtain text information of the best possible quality and in the largest possible quantity, even at the cost of some processing time, can select this fourth option. When the fourth option of the setting screen 200a is registered in the setting management unit 18 as the method selected by the user, the processing control unit 22 processes the application pages among the character recognition targets using one of the methods shown in FIGS. 9 to 12.

The document handling system 10 or the character recognition processing unit 20 exemplified above is realized, for example, by causing a computer to execute a program representing the functions of each of these devices. Here, the computer has, as hardware, a circuit configuration in which, for example, a microprocessor such as a CPU, memory (primary storage) such as random access memory (RAM) and read-only memory (ROM), an HDD controller that controls an HDD (hard disk drive), various I/O (input/output) interfaces, and a network interface that performs control for connection to a network such as a local area network are connected via, for example, a bus. A disk drive for reading from and/or writing to portable disc recording media such as CDs and DVDs, and a memory reader/writer for reading from and/or writing to portable non-volatile recording media of various standards such as flash memory, may also be connected to the bus via, for example, an I/O interface. A program describing the processing of each functional module exemplified above is saved to a fixed storage device such as a hard disk drive, via a recording medium such as a CD or DVD or via communication means such as a network, and is installed on the computer. The functional modules exemplified above are realized when the program stored in the fixed storage device is read into the RAM and executed by a microprocessor such as a CPU.

10 document handling system, 12 file import unit, 14 document processing unit, 16 UI processing unit, 18 setting management unit, 20 character recognition processing unit, 22 processing control unit, 24 character recognition engine, 26 text extraction unit, 28 rasterizer.

Claims

A document processing device that, when instructed to execute character recognition on a document in a data format in which type 1 pages in an image data format and type 2 pages in a non-image data format can be mixed, executes the character recognition on the type 1 pages included in the document and executes type 2 page processing, which is processing different from the character recognition, on the type 2 pages included in the document,
wherein the type 2 page processing is notification processing that, when the document includes a type 2 page, notifies a user that the document includes a page to which the character recognition cannot be applied.

The document processing device according to claim 1, wherein the notification processing further inquires of the user whether to acquire character information from the type 2 page, and
the document processing device controls whether to execute processing for acquiring character information from the type 2 page in accordance with the user's response to the inquiry.

The document processing device according to claim 1, wherein the type 2 page processing is processing for acquiring character information from the type 2 page.

The document processing device according to claim 3, wherein the type 2 page processing is processing for extracting text data included in the type 2 page.

A document processing device that, when instructed to execute character recognition on a document in a data format in which type 1 pages in an image data format and type 2 pages in a non-image data format can be mixed, executes the character recognition on the type 1 pages included in the document and executes type 2 page processing, which is processing different from the character recognition, on the type 2 pages included in the document,
wherein the type 2 page processing is processing that converts the type 2 page into image data and executes the character recognition on the image data.

A document processing device that, when instructed to execute character recognition on a document in a data format in which type 1 pages in an image data format and type 2 pages in a non-image data format can be mixed, executes the character recognition on the type 1 pages included in the document and executes type 2 page processing, which is processing different from the character recognition, on the type 2 pages included in the document,
wherein the document processing device executes, as the type 2 page processing, whichever the user has selected of processing for extracting text data included in the type 2 page and processing for converting the type 2 page into image data and executing the character recognition on the image data.

A document processing device that, when instructed to execute character recognition on a document in a data format in which type 1 pages in an image data format and type 2 pages in a non-image data format can be mixed, executes the character recognition on the type 1 pages included in the document and executes type 2 page processing, which is processing different from the character recognition, on the type 2 pages included in the document,
wherein the type 2 page processing is processing that obtains character information as the processing result for the type 2 page, based on text data extracted from the type 2 page and on the result of the character recognition on image data obtained by converting the type 2 page.

The document processing device according to claim 7, wherein, in the type 2 page processing, when, for a given location in the type 2 page, both a first character corresponding to that location in the text data extracted from the type 2 page and a second character corresponding to that location in the result of the character recognition on the image data obtained by converting the type 2 page are present, the first character is adopted as the character information for that location.

The document processing device according to claim 7 or 8, wherein, in the type 2 page processing, when, for a given location in the type 2 page, no character corresponding to that location is present in the text data extracted from the type 2 page and a character corresponding to that location is present in the result of the character recognition on the image data obtained by converting the type 2 page, the character in the result of the character recognition is adopted as the character information for that location.

The document processing device according to any one of claims 7 to 9, wherein, in the type 2 page processing, when, for a given location in the type 2 page, a character corresponding to that location is present in the text data extracted from the type 2 page and no character corresponding to that location is present in the result of the character recognition on the image data obtained by converting the type 2 page, whether to adopt the character in the text data as the character information for that location is controlled in accordance with a user's instruction.

A document processing device that, when instructed to execute character recognition on a document in a data format in which type 1 pages in an image data format and type 2 pages in a non-image data format can be mixed, executes the character recognition on the type 1 pages included in the document and executes type 2 page processing, which is processing different from the character recognition, on the type 2 pages included in the document,
the document processing device comprising means for presenting a setting screen that accepts from a user a setting of which of the following is to be executed as the type 2 page processing:
notification processing that, when the document includes a type 2 page, notifies the user that the document includes a page to which the character recognition cannot be applied; and
processing for acquiring character information from the type 2 page.

A document processing device that, when instructed to execute character recognition on a document in a data format in which type 1 pages in an image data format and type 2 pages in a non-image data format can be mixed, executes the character recognition on the type 1 pages included in the document and executes type 2 page processing, which is processing different from the character recognition, on the type 2 pages included in the document,
the document processing device comprising means for presenting a setting screen that lets a user select the type 2 page processing from options comprising two or more of the following:
notification processing that, when the document includes a type 2 page, notifies the user that the document includes a page to which the character recognition cannot be applied;
processing for extracting text data included in the type 2 page; and
processing for converting the type 2 page into image data and executing the character recognition on the image data.

A program for causing a computer to function as means for, when instructed to execute character recognition on a document in a data format in which type 1 pages in an image data format and type 2 pages in a non-image data format can be mixed, executing the character recognition on the type 1 pages included in the document and executing type 2 page processing, which is processing different from the character recognition, on the type 2 pages included in the document,
wherein the type 2 page processing is notification processing that, when the document includes a type 2 page, notifies a user that the document includes a page to which the character recognition cannot be applied.

A program for causing a computer to function as means for, when instructed to execute character recognition on a document in a data format in which type 1 pages in an image data format and type 2 pages in a non-image data format can be mixed, executing the character recognition on the type 1 pages included in the document and executing type 2 page processing, which is processing different from the character recognition, on the type 2 pages included in the document,
wherein the type 2 page processing is processing that converts the type 2 page into image data and executes the character recognition on the image data.

A program for causing a computer to function as means for, when instructed to execute character recognition on a document in a data format in which type 1 pages in an image data format and type 2 pages in a non-image data format can be mixed, executing the character recognition on the type 1 pages included in the document and executing type 2 page processing, which is processing different from the character recognition, on the type 2 pages included in the document,
wherein the type 2 page processing is processing that obtains character information as the processing result for the type 2 page, based on text data extracted from the type 2 page and on the result of the character recognition on image data obtained by converting the type 2 page.

A program for causing a computer to function as means for, when instructed to execute character recognition on a document in a data format in which type 1 pages in an image data format and type 2 pages in a non-image data format can be mixed, executing the character recognition on the type 1 pages included in the document and executing type 2 page processing, which is processing different from the character recognition, on the type 2 pages included in the document,
the program further causing the computer to function as means for presenting a setting screen that accepts from a user a setting of which of the following is to be executed as the type 2 page processing:
notification processing that, when the document includes a type 2 page, notifies the user that the document includes a page to which the character recognition cannot be applied; and
processing for acquiring character information from the type 2 page.

A program for causing a computer to function as means for, when instructed to execute character recognition on a document in a data format in which type 1 pages in an image data format and type 2 pages in a non-image data format can be mixed, executing the character recognition on the type 1 pages included in the document and executing type 2 page processing, which is processing different from the character recognition, on the type 2 pages included in the document,
the program further causing the computer to function as means for presenting a setting screen that lets a user select the type 2 page processing from options comprising two or more of the following:
notification processing that, when the document includes a type 2 page, notifies the user that the document includes a page to which the character recognition cannot be applied;
processing for extracting text data included in the type 2 page; and
processing for converting the type 2 page into image data and executing the character recognition on the image data.