JP6061806B2 - Image forming system

Info

Publication number: JP6061806B2
Application number: JP2013159376A
Authority: JP
Inventors: 健一桂
Original assignee: Kyocera Document Solutions Inc
Current assignee: Kyocera Document Solutions Inc
Priority date: 2013-07-31
Filing date: 2013-07-31
Publication date: 2017-01-18
Anticipated expiration: 2033-07-31
Also published as: JP2015032017A

Description

The present invention relates to an image forming apparatus that converts an image read by an optical character recognition (hereinafter, "OCR") function into text data.

With recent technology, image forming apparatuses such as printers, multifunction printers, multifunction peripheral devices, and MFPs (Multifunction Peripherals) can read a paper document with a scanner and convert it into text data. Digitizing a document in this way makes it easy to reuse its text when creating another document, and digitized documents are easy to store. However, depending on the typeface of the scanned document, the OCR function may fail to recognize characters or words correctly. In other words, because the recognition accuracy of the OCR function (hereinafter, "recognition degree") is not 100%, characters are misrecognized. The user therefore checks the text data produced by the OCR function item by item for misrecognized characters or words.

For example, the scanner device of Patent Document 1 applies the OCR function to image data obtained by reading and digitizing a document (hereinafter, "scanning") to generate metadata, which is text data, and distributes the metadata to the display unit of the scanner device or to a PC (personal computer) for display, so that the user can proofread the metadata on the scanner device or the PC. However, in documents created with the same typeface, a character or word that the OCR function misrecognizes may be misrecognized many times. The user must then manually correct the same misrecognized character or word over and over in the text data. As a countermeasure, characters and words that are likely to be misrecognized are registered in advance in a list (hereinafter, "OCR dictionary"), and the OCR dictionary is consulted when the recognition degree of the OCR function is judged to be low, which lowers the misrecognition rate of the OCR function.

Patent Document 1: JP 2009-21861 A

However, the user must register characters and words that may be misrecognized in the OCR dictionary, which is laborious, and corrected characters can be left unregistered by oversight.

The present invention has been made in view of this situation, and its object is to provide an image forming apparatus that can solve the above problems.

The image forming system of the present invention is an image forming system in which an image forming apparatus having an OCR function and an external apparatus are connected via a network. The image forming apparatus includes: an OCR processing unit that creates document text data from document image data; an OCR dictionary registration processing unit that registers characters or words before and after proofreading in an OCR dictionary storage table, based on pre-proofread document text data and post-proofread document text data of the document text data; and a storage unit that stores the OCR dictionary storage table for each user and a user name and user ID for user authentication.

The image forming apparatus (1) performs user authentication based on the user name and user ID stored in the storage unit and the user name and user ID input to the image forming apparatus; (2) when the user authentication succeeds, the OCR processing unit extracts a character or word from the document image data, executes the OCR function, and calculates a recognition degree; (3) when the recognition degree is equal to or greater than a predetermined threshold, the character or word recognized by the OCR function is stored in the document text data; (4) when the recognition degree is lower than the predetermined threshold, it is determined whether the character or word whose recognition degree is lower than the predetermined threshold is registered in the OCR dictionary storage table corresponding to the authenticated user name; (5) when it is registered, the character or word before proofreading is replaced with the character or word after proofreading and stored in the document text data; (6) when it is not registered, the character or word recognized by the OCR function is stored in the document text data; and (7) the document text data is transmitted to the external apparatus via the network.

When the external apparatus has received the document text data and a character or word misrecognized by the OCR processing unit is corrected to the correct character or word, the character or word before proofreading is decorated and stored in the pre-proofread document text data, and the character or word after proofreading is decorated and stored in the post-proofread document text data. The pre-proofread document text data and the post-proofread document text data, prefixed with the user name, the user ID, and a command indicating that the data is to be registered in the OCR dictionary storage table, are then transmitted to the image forming apparatus via the network.

The image forming apparatus then (1) performs user authentication based on the user name and user ID stored in the storage unit and the user name and user ID received from the external apparatus; (2) when the user authentication succeeds, stores the pre-proofread document text data and the post-proofread document text data received from the external apparatus; (3) by means of the OCR dictionary registration processing unit, sets a pointer to the position of a decorated character or word in the pre-proofread or post-proofread document text data and extracts the character or word indicated by the pointer; (4) determines whether the extracted character or word is registered in the OCR dictionary storage table corresponding to the authenticated user name; and (5) when it is not registered, registers the characters or words extracted from the pre-proofread and post-proofread document text data in the OCR dictionary storage table corresponding to the user name.

The decorated character or word is a character or word that has been changed to a designated color.

The image forming apparatus displays a screen through a Web browser and receives the pre-proofread document text data and the post-proofread document text data when the file name of the pre-proofread document text data and the file name of the post-proofread document text data are set on the screen.

With the image forming apparatus of the present invention, when the user corrects characters or words misrecognized by the OCR function on a PC, the characters or words that the OCR function may misrecognize are registered in the OCR dictionary easily and without omission. Because the OCR function then replaces misrecognized characters or words with the correct ones, the time the user spends proofreading is reduced.

FIG. 1 is a diagram showing an outline of the OCR dictionary registration procedure according to an embodiment of the present invention.
FIG. 2 is a diagram showing the functional configuration of the image forming apparatus and the PC according to the embodiment of the present invention.
FIG. 3 is a configuration diagram of the OCR dictionary storage table shown in FIG. 2.
FIG. 4 is a flowchart showing the flow of OCR processing according to the embodiment of the present invention.
FIG. 5 is a flowchart showing the flow of OCR dictionary registration processing according to the embodiment of the present invention.

Hereinafter, modes for carrying out the present invention (hereinafter, "embodiments") will be described with reference to the drawings. The image forming apparatus of the present invention can recognize both characters and words with the OCR function, but the embodiment is described using characters as an example.

First, an outline of the OCR dictionary registration procedure will be described with reference to FIG. 1. The procedure shown in FIG. 1(a) is described below in order from (1) to (5).

First, in (1), the user sets a document on the document table of the image forming apparatus 100, issues a scan request from the operation panel, and enters a "user name" and "user ID", whereupon the scanner scans the document. The image forming apparatus 100 performs user authentication with the "user name" and "user ID", and when authentication succeeds, document text data is created from the characters recognized by the OCR function. As shown in FIG. 1(b), the OCR function determines whether the recognition degree of a character is lower than a predetermined threshold. When it is, the OCR function retrieves the OCR dictionary corresponding to the user name, looks up the low-recognition character in that dictionary, and replaces it with the correct character registered there.
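
As an illustration only, the decision described in (1) can be sketched in Python as follows. The function name `resolve_character`, the 0-to-1 recognition scale, and the threshold value of 0.8 are assumptions made for this sketch; the embodiment does not specify how the recognition degree is scaled or what the threshold is.

```python
from typing import Dict


def resolve_character(recognized: str,
                      recognition_degree: float,
                      user_dictionary: Dict[str, str],
                      threshold: float = 0.8) -> str:
    """Return the text to store in the document text data for one OCR result.

    `threshold` and the 0-to-1 scale are hypothetical values for this sketch.
    """
    # Recognition degree at or above the threshold: keep the OCR result as is.
    if recognition_degree >= threshold:
        return recognized
    # Below the threshold: use the registered correction if one exists,
    # otherwise fall back to the OCR result.
    return user_dictionary.get(recognized, recognized)


# Hypothetical per-user dictionary holding the pair shown in the FIG. 3 example.
user_a_dictionary = {"せラミックス": "セラミックス"}
print(resolve_character("せラミックス", 0.42, user_a_dictionary))
```

Keying the dictionary by the text before proofreading keeps the lookup to a single operation per low-recognition character, which matches the "search, then replace" order described above.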

Next, in (2), the image forming apparatus 100 transmits the document text data to the PC 200.

Next, in (3), when the PC 200 receives the document text data from the image forming apparatus 100, it displays the data on its display unit. The user checks the characters of the displayed document text data and corrects any misrecognized characters. In doing so, the user creates pre-proofread document text data, in which the misrecognized characters are changed to a predetermined color (hereinafter, "designated color"), and post-proofread document text data, in which the corrected characters are changed to the designated color.

Next, in (4), the PC 200 transmits the "user name", the "user ID", the pre-proofread document text data, and the post-proofread document text data to the image forming apparatus 100. On receiving them, the image forming apparatus 100 performs user authentication with the "user name" and "user ID". When authentication succeeds, it registers entries in the OCR dictionary from the pre-proofread and post-proofread document text data. When the PC 200 transmits the pre-proofread and post-proofread document text data to the image forming apparatus 100, for example, as shown in FIG. 1(c), an OCR dictionary update screen provided by the image forming apparatus 100 is displayed on the PC 200 through a Web browser. The file name of the pre-proofread document text data is set in the "file name of pre-proofread document text data" area, and the file name of the post-proofread document text data is set in the "file name of post-proofread document text data" area. With these settings, the image forming apparatus 100 receives the pre-proofread and post-proofread document text data through the Web browser.

Next, in (5), the image forming apparatus 100 extracts the misrecognized characters in the designated color from the pre-proofread document text data received from the PC 200, extracts the characters in the designated color that the user corrected from the post-proofread document text data, and registers them in the OCR dictionary.
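
The registration in (5) can be sketched as follows, purely as an illustration. The embodiment does not state how colored text is encoded, so this sketch assumes each document is a list of `(text, color)` runs, that the designated color is `"red"`, and that marked runs in the two documents correspond by order. All names here are hypothetical.

```python
from typing import Dict, List, Tuple

Run = Tuple[str, str]          # (text, color); a hypothetical encoding
DESIGNATED_COLOR = "red"       # assumption: the embodiment names no color


def marked_texts(runs: List[Run]) -> List[str]:
    """Collect the text of every run that uses the designated color."""
    return [text for text, color in runs if color == DESIGNATED_COLOR]


def register_corrections(before_runs: List[Run],
                         after_runs: List[Run],
                         dictionary: Dict[str, str]) -> int:
    """Add (before, after) pairs that are not yet registered; return the count added."""
    before = marked_texts(before_runs)
    after = marked_texts(after_runs)
    if len(before) != len(after):
        # Pairing by order only works when both documents mark the same spots.
        raise ValueError("marked runs do not line up between the two documents")
    added = 0
    for wrong, right in zip(before, after):
        if wrong not in dictionary:      # skip entries that are already registered
            dictionary[wrong] = right
            added += 1
    return added


before_doc = [("新素材の", "black"), ("せラミックス", "red"), ("を使用", "black")]
after_doc = [("新素材の", "black"), ("セラミックス", "red"), ("を使用", "black")]
user_dictionary: Dict[str, str] = {}
print(register_corrections(before_doc, after_doc, user_dictionary))
```

Checking for an existing entry before writing mirrors the "register only if not yet registered" behavior stated in the summary of the invention.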

Next, the functional configurations of the image forming apparatus 100 and the PC 200 in the embodiment will be described with reference to FIG. 2.

First, the functional configuration of the image forming apparatus 100 will be described. As shown in FIG. 2, the image forming apparatus 100 includes a control unit 101, an auxiliary storage unit 102, a storage unit 103, an operation panel 104, an operation panel processing unit 105, a scanner unit 106, an image processing unit 107, an image printing unit 108, and a network communication unit 109. These units are connected by a bus or the like. The control unit 101 includes an OCR processing unit 101a and an OCR dictionary registration processing unit 101b. The auxiliary storage unit 102 includes a pre-proofread document text data storage area 102a and a post-proofread document text data storage area 102b. The storage unit 103 includes an OCR dictionary storage table 103a and a user authentication data storage area 103b.

The control unit 101 includes main storage means such as RAM and ROM, and control means such as an MPU (Micro Processing Unit) or CPU (Central Processing Unit). The control unit 101 also performs overall control of the image forming apparatus 100, including various I/O, interfaces such as USB (Universal Serial Bus), and a bus controller. The control unit 101 is provided with the OCR processing unit 101a, which recognizes characters with the OCR function, and the OCR dictionary registration processing unit 101b, which registers characters in the OCR dictionary. Details of the OCR processing unit 101a and the OCR dictionary registration processing unit 101b are described later.

The auxiliary storage unit 102 is a flash memory that stores the programs and data for the processing executed by the control unit 101. The pre-proofread document text data storage area 102a temporarily stores the pre-proofread text data received from the PC 200. The post-proofread document text data storage area 102b temporarily stores the post-proofread text data received from the PC 200.

The storage unit 103 is a hard disk drive that stores data and programs. The OCR dictionary storage table 103a stores the per-user OCR dictionaries registered by the OCR dictionary registration processing unit 101b. The detailed configuration of the OCR dictionary storage table 103a is described later. The user authentication data storage area 103b stores the "user name" and "user ID" used for user authentication.

The operation panel 104 accepts operations and settings from the user. The operation panel 104 can also display screens of operation items for the functions of the image forming apparatus 100, as well as warning messages and error messages for the user.

The operation panel processing unit 105 displays screens of operation items for the functions of the image forming apparatus 100 on the operation panel 104, accepts the user operations and settings entered on the operation panel 104, and displays warning messages and error messages on the operation panel 104 to inform the user of the state of the image forming apparatus 100.

The scanner unit 106 digitizes a document by scanning the document set on the document table of the image forming apparatus 100. When the user issues a scan request from the operation panel 104, the scanner unit 106 scans the document and outputs the scanned document data to the image processing unit 107.

On receiving document data from the scanner unit 106, the image processing unit 107 adjusts the image quality, resolution, size, rotation direction, color, and the like of the document data, converts the character information of the document data into bitmap graphics (raster-format data) to create rasterized document image data, and outputs it to the control unit 101.

In response to a print request from the user, the image printing unit 108 prints, on copy paper, the document data of a document scanned by the scanner unit 106, or the pre-proofread document text data, post-proofread document text data, and the like stored in the image forming apparatus 100.

The network communication unit 109 includes a detachable LAN interface for connecting to the network 300. The LAN interface includes a network unit that performs intelligent transmission and reception over various network protocols such as TCP/IP, AppleTalk, and SMB.

Next, the functional configuration of the PC 200 will be described. As shown in FIG. 2, the PC 200 includes a control unit 201, an auxiliary storage unit 202, a network communication unit 203, and a display unit 204, and these units are connected by a bus or the like. The control unit 201 includes a document text data transmission processing unit 201a. The auxiliary storage unit 202 includes a pre-proofread document text data storage area 202a and a post-proofread document text data storage area 202b.

The control unit 201 includes main storage means such as RAM and ROM, and control means such as a CPU (Central Processing Unit). The control unit 201 also performs overall control of the PC 200, including various I/O, interfaces such as USB (Universal Serial Bus), and a bus controller. The control unit 201 is provided with the document text data transmission processing unit 201a, which transmits the pre-proofread document text data and the post-proofread document text data to the image forming apparatus 100.

The auxiliary storage unit 202 is an auxiliary storage device such as a flash memory, and stores the programs and data for the processing executed by the control unit 201. The pre-proofread document text data storage area 202a stores the document text data received from the image forming apparatus 100. When the user then proofreads characters, the characters before proofreading are changed to the designated color and saved in the pre-proofread document text data. The post-proofread document text data storage area 202b stores the post-proofread document text data, which contains the characters corrected by the user, changed to the designated color.

The network communication unit 203 includes a detachable LAN interface for connecting to the network 300. The LAN interface includes a network unit that performs intelligent transmission and reception over various network protocols such as TCP/IP, AppleTalk, and SMB.

The display unit 204 includes a monitor such as a liquid crystal display, and displays the document text data that the user has not yet proofread.

Next, the configuration of the OCR dictionary storage table 103a will be described with reference to FIG. 3. The OCR dictionary storage table 103a holds the per-user OCR dictionaries that the OCR function uses when recognizing characters. As shown in FIG. 3, the OCR dictionary storage table 103a consists of a plurality of user OCR dictionary storage areas 103a-u. Each user OCR dictionary storage area 103a-u has the data items "user name" u1, "user ID" u2, "registration count" u3, "language" u4, "character code" u5, "character before proofreading" u6, and "character after proofreading" u7.

"User name" u1 is a unique name assigned to each user to identify an individual or group user. "User ID" u2 is a private identification number assigned to the user name and is used when the image forming apparatus 100 performs user authentication. "Registration count" u3 is the total number of characters registered in the user OCR dictionary storage area 103a-u. "Language" u4 is the language in which the document is written. "Character code" u5 is the character code used in the document text data. "Character before proofreading" u6 is a character that was misrecognized by the OCR processing unit 101a, as it stood before the user corrected it. "Character after proofreading" u7 is the character as corrected by the user after the OCR processing unit 101a misrecognized it. The user OCR dictionary storage area 103a-u for user A (user A's OCR dictionary storage area) shown in FIG. 3 is an example in which the user name is "AAAAA", the user ID is "A001", the registration count is "10", the language is "Japanese", the character code is "SJIS", the character before proofreading is "せラミックス", and the character after proofreading is "セラミックス" ("ceramics").
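
One possible in-memory shape for this table is sketched below in Python, as an illustration rather than the disclosed layout. The class name `UserOcrDictionary` and its field names are hypothetical. The sketch also derives the registration count u3 from the stored pairs instead of storing it as a separate item, which is a simplification.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class UserOcrDictionary:
    """One user OCR dictionary storage area (items u1 to u7), hypothetical layout."""
    user_name: str                      # u1
    user_id: str                        # u2
    language: str                       # u4
    character_code: str                 # u5
    # Maps "character before proofreading" (u6) to "character after proofreading" (u7).
    corrections: Dict[str, str] = field(default_factory=dict)

    @property
    def registration_count(self) -> int:
        """u3, derived from the stored pairs in this sketch."""
        return len(self.corrections)


# The table itself, keyed by user name so that a dictionary can be
# retrieved directly from the authenticated user name.
ocr_dictionary_table: Dict[str, UserOcrDictionary] = {}

user_a = UserOcrDictionary("AAAAA", "A001", "Japanese", "SJIS")
user_a.corrections["せラミックス"] = "セラミックス"
ocr_dictionary_table[user_a.user_name] = user_a

print(ocr_dictionary_table["AAAAA"].registration_count)
```

Only the single pair named in the FIG. 3 example is entered here, so the derived count differs from the "10" shown in the figure.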

Next, the OCR processing executed by the OCR processing unit 101a of the image forming apparatus 100 according to the embodiment of the present invention will be described with reference to FIG. 4.

When the control unit 101 of the image forming apparatus 100 receives document image data from the image processing unit 107, the control unit 101 activates the OCR processing unit 101a. Once activated, the OCR processing unit 101a starts the OCR processing. The steps of the OCR processing shown in FIG. 4 are described below in order.

(Step S101)
First, the OCR processing unit 101a receives, from the operation panel processing unit 105, the "user name" and "user ID" entered by the user.

(Step S102)
Next, the OCR processing unit 101a retrieves, from the user authentication data storage area 103b, the "user ID" corresponding to the "user name" received in step S101.

(Step S103)
Next, the OCR processing unit 101a performs user authentication by determining whether the "user ID" received from the operation panel processing unit 105 matches the "user ID" retrieved from the user authentication data storage area 103b. If authentication succeeds (Yes in step S103), the process proceeds to step S104. If authentication fails (No in step S103), the OCR processing ends.
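
Steps S101 to S103 amount to a lookup followed by a comparison, which can be sketched as follows. The function name `authenticate` and the use of a plain dictionary for the user authentication data are assumptions for this sketch. The constant-time comparison via `hmac.compare_digest` is an addition of the sketch and is not something the embodiment specifies.

```python
import hmac
from typing import Dict


def authenticate(user_name: str, entered_id: str,
                 stored_ids: Dict[str, str]) -> bool:
    """Return True when the entered user ID matches the one stored for user_name."""
    # S102: look up the stored ID for the entered user name.
    stored_id = stored_ids.get(user_name)
    if stored_id is None:
        return False
    # S103: compare the two IDs. compare_digest avoids leaking, through timing,
    # how many leading characters matched.
    return hmac.compare_digest(stored_id.encode("utf-8"),
                               entered_id.encode("utf-8"))


# Hypothetical contents of the user authentication data storage area.
stored = {"AAAAA": "A001"}
print(authenticate("AAAAA", "A001", stored))
```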

(Step S104)
Next, the OCR processing unit 101a receives the document image data that was scanned by the scanner unit 106 and processed by the image processing unit 107.

(Step S105)
Next, the OCR processing unit 101a retrieves the user OCR dictionary storage area 103a-u corresponding to the "user name" received in step S101. For example, if "AAAAA" is set as the "user name" and the user OCR dictionary storage area 103a-u corresponding to "AAAAA" is "user A's OCR dictionary storage area", then "user A's OCR dictionary storage area" is retrieved.

(Step S106)
Next, the OCR processing unit 101a sets the pointer to the position of the first character in order to extract the first character of the document image data.

(Step S107)
Next, the OCR processing unit 101a extracts the character indicated by the pointer from the document image data, executes the OCR function, and calculates the recognition degree.

(Step S108)
Next, the OCR processing unit 101a determines whether the recognition degree is lower than a predetermined threshold. If it is lower than the threshold (Yes in step S108), the process proceeds to step S109. If it is equal to or greater than the threshold (No in step S108), the process proceeds to step S111.

(Step S109)
Following Yes in step S108, the OCR processing unit 101a determines whether the character with the low recognition degree is registered in "character before proofreading" u6 of the user OCR dictionary storage area 103a-u retrieved in step S105. If the character is registered in "character before proofreading" u6 (Yes in step S109), the process proceeds to step S110. If it is not registered (No in step S109), the process proceeds to step S111.

(Step S110)
Following Yes in step S109, the OCR processing unit 101a replaces the OCR-processed character indicated by the pointer with the "character after proofreading" u7 that corresponds to the "character before proofreading" u6 in the user OCR dictionary storage area 103a-u, and stores the result in the document text data.

(Step S111)
Following No in step S108 or No in step S109, the OCR processing unit 101a stores the character recognized by the OCR function in the document text data.

(Step S112)
Next, the OCR processing unit 101a determines whether all characters of the document image data have been extracted. If all characters have been extracted (Yes in step S112), the process proceeds to step S113. If not (No in step S112), the process proceeds to step S114.

(Step S113)
Following Yes in step S112, the OCR processing unit 101a transmits the document text data to the PC 200 through the network communication unit 109 and ends the OCR processing.

(Step S114)
Following No in step S112, in order to extract the next character of the document image data, the OCR processing unit 101a sets the pointer to the position of the next character and returns to step S107.
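The per-character loop of steps S106 to S114 can be sketched as follows. This is only an illustrative sketch, not the implementation of the embodiment: the function names, the representation of the user OCR dictionary as a mapping from "character before proofreading" to "character after proofreading", and the 0-to-1 scale of the recognition degree are all assumptions made for the example.

```python
from typing import Callable, Dict, Iterable, Tuple


def ocr_with_user_dictionary(
    glyphs: Iterable[object],
    recognize: Callable[[object], Tuple[str, float]],
    user_dictionary: Dict[str, str],
    threshold: float,
) -> str:
    """Hypothetical sketch of steps S106-S114.

    `recognize` stands in for the OCR function and returns
    (recognized character, recognition degree) for one glyph.
    """
    document_text = []
    for glyph in glyphs:                      # S106 / S114: move the pointer
        char, degree = recognize(glyph)       # S107: OCR and recognition degree
        # S108: is the degree below the threshold?  S109: is it registered?
        if degree < threshold and char in user_dictionary:
            document_text.append(user_dictionary[char])   # S110: replace
        else:
            document_text.append(char)                    # S111: keep as is
    return "".join(document_text)             # S112: all characters handled


def fake_recognize(glyph):
    """Stand-in recognizer: each sample glyph already carries its result."""
    return glyph


# Assumed sample data: (recognized character, recognition degree in 0..1).
sample = [("H", 0.99), ("0", 0.40), ("0", 0.95), ("l", 0.30)]
result = ocr_with_user_dictionary(sample, fake_recognize, {"0": "O"}, threshold=0.8)
```

In this sketch only the low-degree "0" is replaced; the high-degree "0" and the unregistered "l" are stored as recognized, which mirrors the two No branches leading to step S111.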

When the network communication unit 109 of the image forming apparatus 100 receives the "user name" and "user ID", together with the pre-proofread text data and the post-proofread text data, from the PC 200, the control unit 101 activates the OCR dictionary registration processing unit 101b, which then starts the OCR dictionary registration process. The steps of the OCR dictionary registration process shown in FIG. 5 are described below in order.

(Step S201)
First, the OCR dictionary registration processing unit 101b receives the "user name" and "user ID" from the network communication unit 109.

(Step S202)
Next, the OCR dictionary registration processing unit 101b retrieves, from the user authentication data storage area 103b, the "user ID" corresponding to the "user name" input in step S201.

(Step S203)
Next, the OCR dictionary registration processing unit 101b performs user authentication by determining whether the "user ID" input in step S201 matches the "user ID" retrieved from the user authentication data storage area 103b. If the user authentication is normal (Yes in step S203), the process proceeds to step S204. If the user authentication is not normal (No in step S203), the OCR dictionary registration process ends.
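The authentication check of steps S201 to S203 can be sketched as follows. The dictionary `auth_store` is a hypothetical stand-in for the user authentication data storage area 103b, and the function name is invented for the example. The embodiment only states that the two user IDs are compared for equality; the constant-time comparison with `hmac.compare_digest` is an addition of this sketch, not something the description specifies.

```python
import hmac
from typing import Dict


def authenticate(user_name: str, user_id: str, auth_store: Dict[str, str]) -> bool:
    """Hypothetical sketch of steps S201-S203."""
    # S202: look up the stored user ID for the given user name.
    stored_id = auth_store.get(user_name)
    if stored_id is None:
        return False  # unknown user name: authentication is not normal
    # S203: compare the received ID with the stored ID.
    # compare_digest is an assumption of this sketch (constant-time equality).
    return hmac.compare_digest(stored_id.encode("utf-8"), user_id.encode("utf-8"))


# Assumed sample contents of the authentication store.
auth_store = {"AAAAA": "id-0001"}
```

A caller would continue to step S204 only when `authenticate` returns `True`, and otherwise end the registration process.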

(Step S204)
Next, upon receiving the pre-proofread document text data from the network communication unit 109, the OCR dictionary registration processing unit 101b stores it in the pre-proofread document text data storage area 102a. Similarly, upon receiving the post-proofread document text data from the network communication unit 109, the OCR dictionary registration processing unit 101b stores it in the post-proofread document text data storage area 102b.

(Step S205)
Next, the OCR dictionary registration processing unit 101b retrieves the user OCR dictionary storage area 103a-u corresponding to the "user name" input in step S201.

(Step S206)
Next, in order to extract the first character that has been changed to the designated color from the pre-proofread document text data, the OCR dictionary registration processing unit 101b sets a pointer to the position of that character in the pre-proofread document text data.

(Step S207)
Next, the OCR dictionary registration processing unit 101b extracts the character indicated by the pointer from the pre-proofread document text data.

(Step S208)
Next, the OCR dictionary registration processing unit 101b determines whether the character extracted in step S207 is registered in "character before proofreading" u6 of the user OCR dictionary storage area 103a-u. If it is registered in "character before proofreading" u6 (Yes in step S208), the process proceeds to step S210. If it is not registered (No in step S208), the process proceeds to step S209.

(Step S209)
Following No in step S208, the OCR dictionary registration processing unit 101b sets information in the data items "user name" u1, "user ID" u2, "number of registrations" u3, "language" u4, "character code" u5, "character before proofreading" u6, and "character after proofreading" u7 of the user OCR dictionary storage area 103a-u corresponding to the "user name". The "character after proofreading" u7 is taken from the post-proofread text data, because the character corresponding to the one in the pre-proofread text data is stored there.

(Step S210)
Following Yes in step S208, or after step S209, the OCR dictionary registration processing unit 101b determines whether all characters that have been changed to the designated color in the pre-proofread text data have been extracted. If all such characters have been extracted (Yes in step S210), the process proceeds to step S211. If not (No in step S210), the process proceeds to step S212.

(Step S211)
Following Yes in step S210, the OCR dictionary registration processing unit 101b stores the updated user OCR dictionary storage area 103a-u in the user OCR dictionary storage area 103a-u corresponding to the "user name" in the OCR dictionary storage table 103a, and ends the OCR dictionary registration process.

(Step S212)
Following No in step S210, in order to extract the next character that has been changed to the designated color from the pre-proofread document text data, the OCR dictionary registration processing unit 101b sets the pointer to the position of that character in the pre-proofread document text data and returns to step S207.
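The registration loop of steps S206 to S212 can be sketched as follows. This is a sketch under stated assumptions rather than the embodiment's implementation: each text is modelled as a list of (character or word, decorated?) pairs, the dictionary is a plain mapping, and the k-th decorated item of the pre-proofread text is assumed to correspond to the k-th decorated item of the post-proofread text. The description says only that the corresponding character is stored in the post-proofread text data and does not specify how the correspondence is found.

```python
from typing import Dict, List, Tuple

# (character or word, True if it carries the designated color / decoration)
MarkedText = List[Tuple[str, bool]]


def register_corrections(
    pre_text: MarkedText,
    post_text: MarkedText,
    user_dictionary: Dict[str, str],
) -> int:
    """Hypothetical sketch of steps S206-S212; returns the number of new entries."""
    before = [token for token, decorated in pre_text if decorated]
    after = [token for token, decorated in post_text if decorated]
    if len(before) != len(after):
        # Assumption of this sketch: decorated items pair up one to one.
        raise ValueError("decorated items in the two texts do not pair up")

    added = 0
    for old, new in zip(before, after):       # S206 / S212: move the pointer
        if old not in user_dictionary:        # S208: already registered?
            user_dictionary[old] = new        # S209: register the pair
            added += 1
    return added                              # S210 Yes: all items handled


# Assumed sample: "1" misread for "l", "rn" misread for "m".
pre = [("c", False), ("1", True), ("ear", False), ("rn", True), ("1", True)]
post = [("c", False), ("l", True), ("ear", False), ("m", True), ("I", True)]
dictionary: Dict[str, str] = {}
added = register_corrections(pre, post, dictionary)
```

Because step S208 skips entries that are already registered, the second decorated "1" in the sample does not overwrite the pair registered for the first one.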

As described above, the PC 200 transmits to the image forming apparatus 100 both the pre-proofread document text data, in which each pre-proofreading character corresponding to a character proofread by the user has been changed to the designated color, and the post-proofread document text data, in which each character proofread by the user has been changed to the designated color. This allows the image forming apparatus 100 to easily distinguish the characters erroneously recognized by the OCR function from the characters proofread by the user. In addition, the OCR dictionary update screen displayed by the Web browser of the image forming apparatus 100 allows the image forming apparatus 100 to easily receive the pre-proofread document text data and the post-proofread document text data from the PC 200. Furthermore, because the misrecognitions of the OCR function vary with language, word, font, and so on, creating an OCR dictionary suited to the languages, words, and fonts that the user uses frequently improves the accuracy with which the OCR function, using the OCR dictionary, replaces misrecognized characters with the correct ones. Finally, because whether user authentication is normal is determined in the OCR processing by the OCR processing unit 101a and in the OCR dictionary registration processing by the OCR dictionary registration processing unit 101b, unauthorized users can be prevented from executing the OCR processing and the OCR dictionary registration processing.

In the embodiment, the characters proofread by the user and the corresponding pre-proofreading characters are changed to a designated color. However, the decoration is not limited to a designated color; any decorated characters that a computer can identify may be used, such as shaded, bold, italic, inverted, or underlined characters. Using decorated characters that a computer can identify makes it easier for the OCR dictionary registration process to distinguish pre-proofreading characters from post-proofreading characters, so that they can be registered in the OCR dictionary automatically.

The OCR dictionary storage table 103a is composed of a plurality of user OCR dictionary storage areas 103a-u, and these areas can also be copied to one another. For example, User A's OCR dictionary storage area shown in FIG. 3 can be copied to User E's OCR dictionary storage area.

The recognition degree threshold used to determine whether the OCR function looks up a character in the OCR dictionary can be changed according to the speed of the OCR function, the accuracy of the OCR function, or user settings. Furthermore, setting the threshold to "0" makes it possible to configure the OCR function so that it does not use the OCR dictionary.
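The effect of a threshold of "0" follows from the comparison in step S108, as the short sketch below illustrates. It assumes that the recognition degree is never negative, which the description does not state explicitly; the function name is hypothetical.

```python
def needs_dictionary_lookup(recognition_degree: float, threshold: float) -> bool:
    """Step S108 as a predicate: look up the dictionary only below the threshold."""
    return recognition_degree < threshold


# Assumed sample recognition degrees on a non-negative scale.
degrees = [0.0, 0.25, 0.5, 1.0]
lookups_with_zero_threshold = [needs_dictionary_lookup(d, 0.0) for d in degrees]
lookups_with_half_threshold = [needs_dictionary_lookup(d, 0.5) for d in degrees]
```

With a threshold of 0 no non-negative recognition degree is "lower than the threshold", so step S109 is never reached and the OCR dictionary is never consulted.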

When the pre-proofread document text data and the post-proofread document text data are transmitted to the image forming apparatus 100, a command indicating that the data is to be registered in the OCR dictionary may be added at the head of the pre-proofread and post-proofread document text data, so that the image forming apparatus 100 can easily identify the data.
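One way such a leading command could work is sketched below. The description does not define the command's format, so the marker bytes `@OCRDICT-REGISTER` and both function names are purely hypothetical and serve only to illustrate the idea of tagging the data at its head.

```python
from typing import Optional

# Hypothetical marker; the actual command format is not given in the description.
REGISTER_COMMAND = b"@OCRDICT-REGISTER\n"


def wrap_for_registration(text_data: bytes) -> bytes:
    """Sender side (PC): prepend the command to the text data."""
    return REGISTER_COMMAND + text_data


def unwrap_if_registration(payload: bytes) -> Optional[bytes]:
    """Receiver side: return the text data if the command is present, else None."""
    if payload.startswith(REGISTER_COMMAND):
        return payload[len(REGISTER_COMMAND):]
    return None


payload = wrap_for_registration("c1ear".encode("utf-8"))
```

The receiver can then route any payload for which `unwrap_if_registration` returns data to the OCR dictionary registration process and handle everything else as ordinary input.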

In the OCR dictionary registration process of the OCR dictionary registration processing unit 101b, whether a character is already registered in the OCR dictionary is determined using the characters changed to the designated color in the pre-proofread document text data. However, the determination may instead use the characters changed to the designated color in the post-proofread document text data.

In the OCR processing of the OCR processing unit 101a, user authentication is performed at the start of the OCR processing. However, user authentication may instead be performed by the operation panel processing unit 105 when the "user name" and "user ID" are entered on the operation panel 104.

In the embodiment, the image forming apparatus 100 transmits the document text data to the PC 200, and the PC 200 transmits the pre-proofread document text data and the post-proofread document text data to the image forming apparatus 100. However, when the image forming apparatus 100 has a display (display unit) on which the document text data can be proofread, the image forming apparatus 100 itself can create the pre-proofread document text data and the post-proofread document text data.

With the image forming apparatus of the present invention, when the user proofreads on the PC the characters or words erroneously recognized by the OCR function, the characters or words that the OCR function is likely to misrecognize can be registered in the OCR dictionary easily and without omission. The OCR function then replaces misrecognized characters or words with the correct ones, which shortens the time the user spends proofreading.

The present invention has been described above by way of a specific embodiment. Needless to say, the embodiment is merely an illustration of the present invention, and the invention is not limited to it.

The present invention is suitable for an image forming apparatus, but is not limited to image forming apparatuses and can be applied to any apparatus having an OCR function.

DESCRIPTION OF SYMBOLS
100 ... Image forming apparatus
101 ... Control unit
101a ... OCR processing unit
101b ... OCR dictionary registration processing unit
102 ... Auxiliary storage unit
102a ... Pre-proofread document text data storage area
102b ... Post-proofread document text data storage area
103 ... Storage unit
103a ... OCR dictionary storage table
103b ... User authentication data storage area
104 ... Operation panel
105 ... Operation panel processing unit
106 ... Scanner unit
107 ... Image processing unit
108 ... Image printing unit
109 ... Network communication unit
200 ... PC
201 ... Control unit
201a ... Document text data transmission processing unit
202 ... Auxiliary storage unit
202a ... Pre-proofread document text data storage area
202b ... Post-proofread document text data storage area
203 ... Network communication unit
204 ... Display unit
300 ... Network

Claims

An image forming system in which an image forming apparatus having an OCR function and an external device are connected via a network, wherein
the image forming apparatus includes:
an OCR processing unit that creates document text data from document image data;
an OCR dictionary registration processing unit that registers pre-proofreading and post-proofreading characters or words in an OCR dictionary storage table based on pre-proofread document text data and post-proofread document text data of the document text data; and
a storage unit that stores the OCR dictionary storage table for each user, together with a user name and a user ID for user authentication;
the image forming apparatus (1) performs user authentication based on the user name and user ID stored in the storage unit and a user name and user ID input to the image forming apparatus; (2) if the user authentication is normal, causes the OCR processing unit to extract a character or word from the document image data and execute the OCR function to calculate a recognition degree; (3) if the recognition degree is equal to or greater than a predetermined threshold, stores the character or word recognized by the OCR function in the document text data; (4) if the recognition degree is lower than the predetermined threshold, determines whether the character or word whose recognition degree is lower than the predetermined threshold is registered in the OCR dictionary storage table corresponding to the user name for which the user authentication was normal; (5) if it is registered, replaces the pre-proofreading character or word with the post-proofreading character or word and stores it in the document text data; (6) if it is not registered, stores the character or word recognized by the OCR function in the document text data; and (7) transmits the document text data to the external device via the network;
the external device, having received the document text data, when a character or word erroneously recognized by the OCR processing unit is proofread into the correct character or word, decorates the pre-proofreading character or word and stores it in the pre-proofread document text data, decorates the post-proofreading character or word and stores it in the post-proofread document text data, and transmits to the image forming apparatus via the network the user name and user ID together with the pre-proofread document text data and the post-proofread document text data, to which a command indicating that the data is to be registered in the OCR dictionary storage table has been added; and
the image forming apparatus (1) performs user authentication based on the user name and user ID stored in the storage unit and the user name and user ID received from the external device; (2) if the user authentication is normal, stores the pre-proofread document text data and the post-proofread document text data received from the external device; (3) causes the OCR dictionary registration processing unit to set a pointer to the position of a decorated character or word in the pre-proofread document text data or the post-proofread document text data and to extract the character or word indicated by the pointer from the pre-proofread document text data or the post-proofread document text data; (4) determines whether the character or word is registered in the OCR dictionary storage table corresponding to the user name for which the user authentication was normal; and (5) if it is not registered, registers in the OCR dictionary storage table corresponding to the user name the characters or words extracted from the pre-proofread document text data and the post-proofread document text data.

The image forming system according to claim 1, wherein the decorated character or word is a character or word that has been changed to a designated color.

The image forming system according to claim 1, wherein the image forming apparatus displays a screen by means of a Web browser and receives the pre-proofread document text data and the post-proofread document text data when the file name of the pre-proofread document text data and the file name of the post-proofread document text data are set on the screen.