JP2008084186A - Image processing system and image processing program

Info

Publication number: JP2008084186A
Application number: JP2006265667A
Authority: JP
Inventors: Masahiro Kato; 雅弘加藤
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2006-09-28
Filing date: 2006-09-28
Publication date: 2008-04-10

Abstract

PROBLEM TO BE SOLVED: To provide an image processing system and an image processing program that can suppress growth in the amount of conversion rule information.

SOLUTION: The system has an input means that inputs a document image having a plurality of regions containing character strings; a dividing means that divides the input document image into regions; a recognition means that recognizes character strings from the input document image; a determination means that determines, for each region produced by the dividing means, whether the recognized character string is included in the character strings stored in advance and defined for each region; and a conversion means that, when the recognized character string is determined to be included, converts it into the specific character string corresponding to the stored character strings.

COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to an image processing system and an image processing program.

In recent years, paper documents have increasingly been digitized, and documents digitized from paper with a scanner or the like (hereinafter, image data) are subjected to various kinds of processing. For example, the format (such as differences in the position and size of tables) may be recognized from a wide variety of image data by image processing, and the characters by optical character recognition (hereinafter, OCR processing); using the recognition results, the relationship to other image data is held as conversion rule information, and conversion is performed according to that information. Specifically, this is used for image data of business forms: form formats that differ between Company A and Company B, and information that refers to the same content but is expressed differently (such as 価格 and 値段, both meaning "price"), are held in advance as conversion rule information and used when converting Company A's form image data into Company B's form image data.

As related art, Patent Document 1, for example, discloses a document management apparatus that can effectively manage dependencies between documents. Patent Document 2 discloses a method and system for creating, implementing, and using a computer-generated "smart" document to which functionality is attached for providing context-sensitive tools, controls, and help content to the user of the document.
Patent Document 3 discloses a method and apparatus that can insert a new element into an XML document based on an insertion position specified on the displayed form of that XML document.
Patent Document 1: JP 2003-281118 A
Patent Document 2: JP 2004-46828 A
Patent Document 3: JP 2004-272684 A

The present invention was made against this background. It addresses the problem that growth in conversion rule information could not be suppressed, and its object is to provide an image processing system and an image processing program that can suppress that growth.

To achieve the above object, the invention of claim 1 comprises: input means for inputting a document image having a plurality of regions containing character strings; dividing means for dividing the document image input by the input means into regions; recognition means for recognizing character strings from the document image input by the input means; determination means for determining, for each region produced by the dividing means, whether the character string recognized by the recognition means is included in the character strings stored in advance and defined for each region; and conversion means for converting, when the determination means determines that the recognized character string is included in the stored character strings defined for each region, the recognized character string into a specific character string corresponding to those stored character strings.

As in the invention of claim 2, the character strings stored in advance and defined for each region may belong to groups, one group being provided for each set of character strings that have the meaning indicated by a specific character string, and a specific character string may be associated with each group.

As in the invention of claim 3, the character strings stored in advance and defined for each region may include character strings that result when the recognition means misrecognizes a string.

As in the invention of claim 4, the system may further comprise changing means for changing the specific character string according to an output document image, in order to output the output document image, which is a document image containing the content indicated by the specific character string.

As in the invention of claim 5, each region may further contain value information that corresponds to the character string and indicates a given value; the recognition means recognizes the value information, and the changing means changes the value recognized by the recognition means according to the output document image.

To achieve the above object, the invention of claim 6 causes a computer to execute processing comprising: an input step of inputting a document image having a plurality of regions containing character strings; a dividing step of dividing the document image input in the input step into regions; a recognition step of recognizing character strings from the document image input in the input step; a determination step of determining, for each region produced in the dividing step, whether the character string recognized in the recognition step is included in the character strings stored in advance and defined for each region; and a conversion step of converting, when the determination step determines that the recognized character string is included in the stored character strings defined for each region, the recognized character string into a specific character string corresponding to those stored character strings.

According to the invention of claim 1, conversion rule information can be held per region rather than per whole document image, unlike a configuration without this feature. When another document image contains a similar region, no new conversion rule information needs to be created, so an image processing system that can suppress growth in conversion rule information can be provided.

According to the invention of claim 2, compared with a configuration without this feature, character strings having the meaning indicated by the specific character string can be unified into that specific character string.

According to the invention of claim 3, compared with a configuration without this feature, a character string can be converted into the specific character string even when it has been misrecognized.

According to the invention of claim 4, compared with a configuration without this feature, the specific character string can be changed according to the output document image.

According to the invention of claim 5, compared with a configuration without this feature, the value indicated by the value information, which corresponds to the character string and indicates a given value, can be changed according to the output document image.

According to the invention of claim 6, conversion rule information can be held per region rather than per whole document image, unlike a configuration without this feature. When another document image contains a similar region, no new conversion rule information needs to be created, so an image processing program that can suppress growth in conversion rule information can be provided.

An embodiment of the present invention is described in detail below with reference to the drawings. FIG. 1 shows a conceptual module configuration diagram of an embodiment of the present invention.

A module generally refers to a logically separable component such as software or hardware. A module in this embodiment therefore means not only a module in a program but also a module in a hardware configuration, so this embodiment also serves as a description of a program, a system, and a method. Modules correspond almost one-to-one to functions, but in an implementation one module may consist of one program, several modules may consist of one program, or conversely one module may consist of several programs. Several modules may be executed by one computer, or one module may be executed by several computers in a distributed or parallel environment. In what follows, "connection" includes logical connection as well as physical connection.

A system includes a configuration in which multiple computers, pieces of hardware, devices, and the like are connected by a network or similar, as well as one realized by a single computer, piece of hardware, or device.

Next, an overview of the processing in this embodiment is described with reference to FIG. 1. As the figure shows, this embodiment holds conversion rule information per region (table regions a, b, c, and d) rather than per document image.

As the figure also shows, for two different document images A and B, table regions a, b, c, and d are extracted from document image A, converted with the conversion rule information to suit document image B, and rearranged to match document image B. The types of the table regions a, b, c, and d here are, for example, organization and product.

In this way, as the figure shows, growth in conversion rule information can be suppressed even when two kinds of document image, A and B, exist.
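The effect of keying rules by region type can be sketched as follows. This is only an illustration under stated assumptions: the region types "customer" and "payment", the rule-set labels, and the two layouts are hypothetical (the text names only organization and product as example types), and the patent does not prescribe any data structure.

```python
from typing import Dict, List, Set

# Hypothetical: one rule set per region type, shared by every document layout.
RULES_BY_REGION_TYPE: Dict[str, str] = {
    "organization": "organization-rules",
    "product": "product-rules",
    "customer": "customer-rules",   # assumed type, not named in the text
    "payment": "payment-rules",     # assumed type, not named in the text
}

# Hypothetical layouts: each lists its region types in display order.
LAYOUT_A: List[str] = ["organization", "product", "customer", "payment"]
LAYOUT_B: List[str] = ["product", "payment", "organization", "customer"]


def rules_needed(layouts: List[List[str]]) -> Set[str]:
    """Collect the rule sets required to handle every layout given."""
    return {RULES_BY_REGION_TYPE[t] for layout in layouts for t in layout}


def rearrange(regions: Dict[str, str], target_layout: List[str]) -> List[str]:
    """Order already-converted regions the way the target layout expects."""
    return [regions[t] for t in target_layout]


if __name__ == "__main__":
    # Adding layout B does not add rule sets, because its regions are reused.
    print(len(rules_needed([LAYOUT_A])), len(rules_needed([LAYOUT_A, LAYOUT_B])))
```

Because the rule sets are indexed by region type, a second layout that reuses the same region types adds no new entries; only the ordering differs.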

Next, the module configuration of this embodiment is described with reference to FIG. 2. As shown in FIG. 2, the modules of this embodiment comprise an input module 18, a division module 20, a recognition module 22, a determination module 24, a conversion module 26, a change module 28, and an output module 30.

The input module 18 is connected to the determination module 24 and the recognition module 22, and inputs a document image having a plurality of regions containing character strings.

Here, a document image is an electronic document; more specifically, it includes a document that is an image input by a scanner, a document image generated by a document-creation application (for example, a word processor), and so on. Inputting includes acquiring with a scanner, acquiring a document from a document database, acquiring from an external system over a communication line, and the like. The acquired document may be a single page or multiple pages.

The input module 18 is connected to the recognition module 22; their relationship is described later.

The division module 20 is connected to the input module 18 and the recognition module 22, and divides the document image input by the input module 18 into regions.

The recognition module 22 is connected to the input module 18, the division module 20, the determination module 24, the conversion module 26, and the change module 28. It recognizes character strings from the document image input by the input module 18. It may recognize character strings from the document image after the division module 20 has divided it, or the character strings may first be recognized from the document image input by the input module 18, with the division module 20 dividing the document image afterwards.

The determination module 24 is connected to the recognition module 22 and determines whether the character string recognized by the recognition module 22 is included in the character strings stored in advance and defined for each region.

The conversion module 26 is connected to the recognition module 22 and the change module 28. When the determination module 24 determines that the character string recognized by the recognition module 22 is included in the stored character strings defined for each region, the conversion module 26 converts the recognized character string into the unique attribute name (specific character string) corresponding to those stored character strings.

The change module 28 is connected to the conversion module 26 and the output module 30. So that the output module 30 can output an output document image, which is a document image containing the content indicated by the unique attribute name, the change module 28 changes the unique attribute name according to the output document image. The change module 28 also changes the value recognized by the recognition module 22 according to the output document image.

The output module 30 is connected to the change module 28 and outputs an output document image containing the unique attribute name and value as changed by the change module 28.
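The flow through the modules can be sketched as a small pipeline. This is a minimal sketch, not the patented implementation: the `Region` class and all function names are hypothetical, OCR is replaced by a stub that returns pre-stored text, and passing an unmatched string through unchanged is an assumption, since the text does not say what happens when the determination fails.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Region:
    kind: str  # region type, e.g. "product"
    text: str  # stand-in for the region's pixels; real OCR is out of scope


def split(document: List[Region]) -> List[Region]:
    """Division module 20 (stub): the 'image' is already a list of regions."""
    return list(document)


def recognize(region: Region) -> str:
    """Recognition module 22 (stub): return stored text instead of running OCR."""
    return region.text.strip()


def determine(kind: str, recognized: str, stored: Dict[str, Dict[str, str]]) -> bool:
    """Determination module 24: is the string among those stored for this region type?"""
    return recognized in stored.get(kind, {})


def convert(kind: str, recognized: str, stored: Dict[str, Dict[str, str]]) -> str:
    """Conversion module 26: map the recognized string to its unique attribute name."""
    return stored[kind][recognized]


def process(document: List[Region],
            stored: Dict[str, Dict[str, str]],
            change: Callable[[str], str]) -> List[str]:
    """Run split -> recognize -> determine -> convert -> change for each region."""
    results = []
    for region in split(document):
        recognized = recognize(region)
        if determine(region.kind, recognized, stored):
            recognized = change(convert(region.kind, recognized, stored))
        # Assumption: strings that fail the determination pass through unchanged.
        results.append(recognized)
    return results
```

A caller would supply the stored strings per region type and a `change` function standing in for the change module 28, for example `process(doc, {"product": {"商品ナンバー": "商品番号"}}, lambda s: s)`.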

Next, an example hardware configuration of the image processing system of the embodiment is described with reference to FIG. 3. The configuration shown in FIG. 3 is an image processing system built from, for example, a personal computer (PC), and is an example hardware configuration that includes a data reading unit 1217 such as a scanner and a data output unit 1218 such as a printer. This hardware configuration also applies to the other embodiments.

As shown in FIG. 3, the image processing system of this embodiment has a CPU (Central Processing Unit) 1201, a ROM (Read Only Memory) 1202, a RAM (Random Access Memory) 1203, a host bus 1204, a bridge 1205, an external bus 1206, an interface 1207, a keyboard 1208, a pointing device 1209, a display 1210, an HDD (Hard Disk Drive) 1211, a drive 1212, a connection port 1214, an externally connected device 1215, a communication unit 1216, a data reading unit 1217, and a data output unit 1218.

The CPU 1201 is a control unit that executes processing according to a computer program describing the execution sequence of the modules described in the embodiment above.

The ROM 1202 stores programs, operation parameters, and the like used by the CPU 1201. The RAM 1203 stores programs used in execution by the CPU 1201 and parameters that change as appropriate during that execution. These are connected to one another by the host bus 1204, which consists of a CPU bus or the like.

The host bus 1204 is connected through the bridge 1205 to the external bus 1206, such as a PCI (Peripheral Component Interconnect/Interface) bus.

The keyboard 1208 and the pointing device 1209, such as a mouse, are input devices operated by an operator. The display 1210 consists of a liquid crystal display, a CRT (Cathode Ray Tube), or the like, and displays various information as text and images.

The HDD 1211 contains a hard disk, drives it, and records or plays back programs executed by the CPU 1201 and other information. The hard disk stores various computer programs, such as data processing programs.

The drive 1212 reads data or programs recorded on a mounted removable recording medium 1213, such as a magnetic disk, optical disk, magneto-optical disk, or semiconductor memory, and supplies them to the RAM 1203, which is connected through the interface 1207, the external bus 1206, the bridge 1205, and the host bus 1204. The removable recording medium 1213 can also be used as a data recording area in the same way as the hard disk.

The connection port 1214 is a port for connecting the externally connected device 1215 and has a connector such as USB or IEEE 1394. The connection port 1214 is connected to the CPU 1201 and other components through the interface 1207, the external bus 1206, the bridge 1205, the host bus 1204, and so on. The communication unit 1216 is connected to a network and performs data communication with external parties. The data reading unit 1217 is, for example, a scanner and reads documents. The data output unit 1218 is, for example, a printer and outputs document data.

The hardware configuration of the image processing system shown in FIG. 3 is only one example. The image processing system of this embodiment is not limited to the configuration shown in FIG. 2; any configuration that can execute the modules described in this embodiment will do. For example, some modules may be implemented in dedicated hardware (such as an ASIC); some modules may reside in an external system and be connected over a communication line; and several of the systems shown in FIG. 3 may be connected to one another by communication lines and operate cooperatively. The system may also be built into a copier, fax machine, scanner, printer, multifunction device (also called a multifunction copier, with scanner, printer, copier, fax, and similar functions), or the like.

The program described above can be stored on a recording medium, and it can also be provided by communication means. In that case, the program described above can also be regarded as an invention of a "computer-readable recording medium on which a program is recorded".

A "computer-readable recording medium on which a program is recorded" is a recording medium, readable by a computer, on which a program is recorded and which is used for installing, executing, or distributing the program.

Recording media include, for example: digital versatile discs (DVD), namely "DVD-R, DVD-RW, DVD-RAM, etc.", which are standards established by the DVD Forum, and "DVD+R, DVD+RW, etc.", which are standards established by DVD+RW; compact discs (CD), such as read-only memory (CD-ROM), CD recordable (CD-R), and CD rewritable (CD-RW); magneto-optical disks (MO); flexible disks (FD); magnetic tape; hard disks; read-only memory (ROM); electrically erasable programmable read-only memory (EEPROM); flash memory; and random access memory (RAM).

The program, or part of it, can be recorded on such a recording medium for storage or distribution. It can also be transmitted by communication over a transmission medium such as a wired network used for a local area network (LAN), metropolitan area network (MAN), wide area network (WAN), the Internet, an intranet, or an extranet, a wireless communication network, or a combination of these, and it can be carried on a carrier wave.

Furthermore, the program may be part of another program, or may be recorded on a recording medium together with a separate program.

Next, conversion rule information, the information used to convert image data, is described. Conversion rule information indicates the relationship between image data A and image data B. It is used when the format (such as differences in the position and size of tables) is recognized by image processing, and the characters by OCR processing, from a wide variety of image data A, and the recognition results are used to convert to other image data B.

In this embodiment there are two kinds of conversion rule information: attribute conversion rule information and attribute name conversion rule information.

Attribute conversion rule information is the set of character strings stored in advance and defined for each region, and it is provided per type. A type is a classification by the kind of content represented by the character strings that appear in a region of the image data (a region considered to form a table structure, obtained by the division module 20 of FIG. 2).

In such a region, if the document image is a business form, for example, character strings about product information such as product name and price appear; if the document image is an internal company document, character strings about organization information appear. Character strings in the same region thus share related characteristics, and a type represents the category obtained by classifying such related strings by field. Concrete examples of types are the product number and organization name mentioned above.

Attribute conversion rule information also serves to absorb notation variation and convert variant wordings into a single character string. Notation variation means that several different notations (character strings) exist for the same thing. Furthermore, because this embodiment reads character strings by running OCR processing on the image data, as described later, and OCR processing can misrecognize characters, the attribute conversion rule information also serves to absorb such OCR errors and convert them into the character string that should have been recognized.

まず、図４、図５を用いて、属性変換規則情報の詳細について説明する。図４は、属性変換規則情報の一例として商品に関する属性変換規則情報を示している。同図に示される属性名とは、所定の種別に属する文字列を示している。上記属性名は、更に「表記」と「ＯＣＲ」に分類される。「表記」とは、表記揺れしている文字列を示す。この表記揺れとは、同一の対象に対して異なる複数の表記（文字列）が存在することを示す。 First, the details of the attribute conversion rule information will be described with reference to FIGS. FIG. 4 shows attribute conversion rule information related to a product as an example of attribute conversion rule information. The attribute name shown in the figure indicates a character string belonging to a predetermined type. The attribute names are further classified into “notation” and “OCR”. “Notation” indicates a character string that is shaking. This notation fluctuation indicates that there are a plurality of different notations (character strings) for the same object.

また、「ＯＣＲ」とは、ＯＣＲ処理により正しく認識された文字列と誤認識された文字列を示す。 “OCR” indicates a character string that is erroneously recognized as a character string that is correctly recognized by the OCR process.

また、同図に示される一意の属性名とは、１つ以上の属性名を１つの文字列により示すもので、上記表記揺れ及びＯＣＲ誤りを吸収した文字列を示している。 Also, the unique attribute name shown in the figure indicates one or more attribute names by one character string, and indicates a character string that absorbs the above-mentioned notation fluctuation and OCR error.

同図に示す例では、表記揺れとして商品番号、商品ナンバー、商品コードなどが示されている。これらの文字列に対応するＯＣＲ誤りとして商品香号などが示され、これらを吸収した文字列が商品番号であることが示されている。 In the example shown in the figure, a product number, a product number, a product code, etc. are shown as notation fluctuations. A product incense number or the like is shown as an OCR error corresponding to these character strings, and a character string in which these are absorbed is a product number.

Thus, in the attribute conversion rule information, the character strings defined for each region and stored in advance (notations such as 商品番号 and 商品ナンバー) belong to groups, one group being provided for each set of character strings that share the meaning indicated by a unique attribute name (商品番号). A unique attribute name is associated with each group (here, the group of character strings that includes 商品番号, 商品ナンバー, and so on).
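The grouping described above can be pictured as a lookup from any stored variant to the unique attribute name of its group. The following is a minimal sketch only: the dictionary layout and the names `PRODUCT_RULES` and `to_unique_name` are assumptions made for illustration, and the strings are the ones given for FIG. 4.

```python
# Hypothetical in-memory form of the product attribute conversion rules of FIG. 4.
# Each unique attribute name owns one group of stored strings, split into
# "notation" (notation variations) and "ocr" (OCR errors).
PRODUCT_RULES = {
    "商品番号": {
        "notation": ["商品番号", "商品ナンバー", "商品コード"],
        "ocr": ["商品香号"],
    },
}


def to_unique_name(recognized, rules):
    """Return the unique attribute name whose group contains `recognized`.

    Returns None when the string is not stored in any group.
    """
    for unique_name, group in rules.items():
        if recognized in group["notation"] or recognized in group["ocr"]:
            return unique_name
    return None


if __name__ == "__main__":
    print(to_unique_name("商品香号", PRODUCT_RULES))
```

Keeping the OCR errors in the same group as the notation variations means a single lookup absorbs both kinds of deviation.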

Next, attribute conversion rule information for organization information, another example of attribute conversion rule information, is described with reference to FIG. 5. The attribute names, notation variations, OCR errors, and unique attribute names in the figure have the meanings explained for FIG. 4.

The figure lists ○○○○技開部 (an abbreviated department name) and others as notation variations. ○○○○技開部 and ○○○○技開剖 are listed as the OCR entries corresponding to these character strings, and ○○○○技術開発部 ("○○○○ Technology Development Department") is shown as the character string that absorbs them.

Attribute conversion rule information is not limited to the product-information and organization-information types described above; many other types exist.

Next, the attribute name conversion rule information is described with reference to FIG. 6. The attribute name conversion rule information associates each unique attribute name with an attribute character string suited to the output. Using this information, the change module 28 (see FIG. 2) changes the unique attribute name, which the conversion module 26 (see FIG. 2) obtained using the attribute conversion rule information, into the attribute character string suited to the output.

An output attribute name is a character string suited to the output, for example an attribute name suited to an output destination such as Company A. Specifically, the figure shows that 商品番号 ("product number") is changed to 商品コード ("product code") when Company A is the output destination.

With the attribute conversion rule information and the attribute name conversion rule information described above, a character string read by OCR processing is first converted into a unique attribute name using the attribute conversion rule information, and the unique attribute name is then changed into an output-destination attribute name using the attribute name conversion DB.
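The two-stage flow just described (recognized string to unique attribute name, then unique attribute name to output-destination attribute name) might be sketched as two chained lookups. The table layouts, the function name, and the fallback to the unique name when a destination has no entry are assumptions for illustration; only the 商品番号 to 商品コード mapping for Company A (A社) comes from the description of FIG. 6.

```python
# Stage 1 (hypothetical flat form of the attribute conversion rules):
# every stored variant maps to its unique attribute name.
UNIQUE_NAME_BY_VARIANT = {
    "商品番号": "商品番号",
    "商品ナンバー": "商品番号",
    "商品コード": "商品番号",
    "商品香号": "商品番号",  # OCR error
}

# Stage 2 (hypothetical form of the attribute name conversion DB):
# per output destination, unique attribute name -> output attribute name.
OUTPUT_NAME_BY_DESTINATION = {
    "A社": {"商品番号": "商品コード"},
}


def convert_attribute_name(recognized, destination):
    """Map an OCR-recognized string to the attribute name used by `destination`."""
    unique_name = UNIQUE_NAME_BY_VARIANT.get(recognized)
    if unique_name is None:
        return None  # the string is not covered by the rules
    per_destination = OUTPUT_NAME_BY_DESTINATION.get(destination, {})
    # Assumption: keep the unique name when the destination defines no override.
    return per_destination.get(unique_name, unique_name)


if __name__ == "__main__":
    print(convert_attribute_name("商品香号", "A社"))
```

Because every variant first collapses to one unique name, each destination needs only one entry per meaning rather than one entry per variant, which is how the number of rules is kept down.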

If conversion rule information is set in advance according to type, the conversion need not stop at the attribute name: the notation of the corresponding attribute value can be converted as well. For example, the currency unit of an amount can be changed, or a date field can be converted between the Western and Japanese calendars. Moreover, even for the same kind of date field, specifying per region type whether the conversion should or should not be applied allows more flexible operation.
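As a rough illustration of type-dependent value conversion, the sketch below registers a converter per (region type, attribute) pair and leaves a value untouched when no converter is registered. Everything here is an assumption for illustration: the region type name `order_form`, the attribute name 日付, the exchange rate of 100 yen per dollar, and the restriction of the calendar conversion to years that fall wholly within the Heisei era (1990 to 2018).

```python
import re


def yen_to_dollars(text, yen_per_dollar):
    """Convert a string such as '￥500' to a dollar string such as '$5.00'."""
    match = re.fullmatch(r"[¥￥]\s*(\d[\d,]*)", text)
    if match is None:
        raise ValueError(f"not a yen amount: {text!r}")
    yen = int(match.group(1).replace(",", ""))
    return f"${yen / yen_per_dollar:.2f}"


def western_to_heisei(text):
    """Convert a Western year such as '2006年' to its Heisei era year."""
    match = re.fullmatch(r"(\d{4})年", text)
    if match is None:
        raise ValueError(f"not a Western year: {text!r}")
    year = int(match.group(1))
    if not 1990 <= year <= 2018:
        raise ValueError("this sketch only handles years wholly within Heisei")
    return f"平成{year - 1988}年"  # Heisei 1 corresponds to 1989


# Hypothetical rule table: which conversion applies to which region type.
VALUE_RULES = {
    ("order_form", "価格"): lambda text: yen_to_dollars(text, yen_per_dollar=100),
    ("order_form", "日付"): western_to_heisei,
}


def convert_value(region_type, attribute, text, rules=VALUE_RULES):
    """Apply the registered converter, or return the value unchanged."""
    converter = rules.get((region_type, attribute))
    return text if converter is None else converter(text)


if __name__ == "__main__":
    print(convert_value("order_form", "価格", "￥500"))
    print(convert_value("receipt", "日付", "2006年"))
```

Keying the table on the region type is what lets the same date field be converted in one kind of region and left as-is in another.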

Each of the DBs described above can also be created using XML. In this case, the attribute names, which are character strings read by OCR processing, are made XML tags, the output-destination attribute names are likewise made XML tags, and an XML schema is used to define them so that the tags can be converted into one another.

The processing according to the present embodiment is described below with reference to flowcharts. Each flowchart shows processing that the CPU 1201 executes on image data representing a form or the like read in by a scanner or similar device.

First, the region division processing is described with reference to the flowchart of FIG. 7. Details of this region division processing are disclosed in Japanese Patent Laid-Open No. 2002-203249.

With that in mind, the flowchart of FIG. 7 is now described. First, in step 101, the division module 20 creates the circumscribed rectangle of a connected portion. A connected portion is a region whose pixels are connected, such as the ruled lines of a table. The rectangle that encloses the connected portion is its circumscribed rectangle.

In the next step 102, the division module 20 compares the vertical side length H and the horizontal side length W of the created circumscribed rectangle with a predetermined threshold on the side length of a circumscribed rectangle, which is used to judge table candidates. If both H and W fall below the threshold, the division module 20 excludes the pair of the connected portion and its circumscribed rectangle from the table candidates in step 106.

If, on the other hand, at least one of H and W is found in step 102 to be longer than the threshold for judging table candidates, the division module 20 calculates, in step 103, the pixel density within the circumscribed rectangle for that pair of connected portion and circumscribed rectangle.

The pixel density within the circumscribed rectangle can be obtained as the ratio between the area given by the product of the vertical side length H and the horizontal side length W and the total number of pixels constituting the connected portion.

In the next step 104, the division module 20 determines whether the calculated pixel density is at or below a predetermined pixel-density threshold used to judge table candidates. If it is, the division module 20 judges the pair of the connected portion and its circumscribed rectangle to be a table candidate in step 105. If the calculated pixel density exceeds the threshold, the division module 20 excludes the pair from the table candidates in step 106, as described above.
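Steps 102 through 106 amount to a size test followed by a density test, which can be sketched as below. The function and parameter names are hypothetical. Two points are assumptions rather than statements of the source: the density is taken as pixel count divided by area (the direction of the ratio is not spelled out, but a low value then corresponds to a sparse, table-like component), and a side exactly equal to the threshold is allowed to proceed.

```python
def is_table_candidate(height, width, pixel_count, side_threshold, density_threshold):
    """Decide whether a connected portion and its circumscribed rectangle
    remain a table candidate (sketch of steps 102 to 106 of FIG. 7).

    height, width -- side lengths H and W of the circumscribed rectangle
    pixel_count   -- number of pixels making up the connected portion
    """
    # Step 102: both sides below the threshold -> excluded (step 106).
    if height < side_threshold and width < side_threshold:
        return False

    # Step 103: pixel density inside the circumscribed rectangle.
    density = pixel_count / (height * width)

    # Step 104: at or below the threshold -> candidate (step 105);
    # above the threshold -> excluded (step 106).
    return density <= density_threshold


if __name__ == "__main__":
    # A large, mostly empty frame such as table ruled lines.
    print(is_table_candidate(100, 200, 2000, side_threshold=50, density_threshold=0.2))
```

A table's ruled lines span a large rectangle but fill little of it, whereas a photograph or solid figure of the same size fills most of it; the density test separates the two.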

The above processing yields table candidates. The processing that determines whether a table candidate is actually a table is described using the flowchart of FIG. 8, with reference to graphs (a) and (b) of FIG. 9.

The division module 20 takes the circumscribed rectangle of a connected pixel component that is a table candidate and calculates its center coordinates (X_CENTER, Y_CENTER) in step S1111. In the next step S1112, the division module 20 examines the horizontal projection distribution and detects each portion whose frequency exceeds the frequency threshold TH_H as a vertical ruled line candidate, judging that a vertical ruled line extending in the vertical direction may exist there. If no vertical ruled line candidate is detected at all, the division module 20 determines in step S1113 that the table candidate is not a table.

If vertical ruled line candidates are detected at one or more locations in step S1112, the division module 20 examines the vertical projection distribution. In step S1114, the division module 20 detects each portion of that projection distribution whose frequency exceeds the frequency threshold TH_V as a horizontal ruled line candidate, judging that a horizontal ruled line extending in the horizontal direction may exist there. If no horizontal ruled line candidate is detected at all, the division module 20 determines in step S1113 that the table candidate is not a table.

If both horizontal and vertical ruled line candidates are detected in the above processing, the division module 20 determines whether horizontal ruled line candidates exist at two or more locations. If there is only one horizontal ruled line candidate, the division module 20 further determines in step S1116 whether vertical ruled line candidates exist at two or more locations. Conversely, if horizontal ruled line candidates exist at two or more locations, the division module 20 takes the center coordinate Y_CENTER of the circumscribed rectangle calculated in step S1111 as a boundary and splits the vertical projection distribution shown in FIG. 9(a) at that boundary into two projection intervals, Y_START to Y_CENTER and Y_CENTER to Y_END. In step S1117 it determines whether each of the two intervals contains at least one horizontal ruled line candidate exceeding the frequency threshold TH_V.

If this condition is satisfied, the division module 20 decides in step S1119 that the closed region enclosed by the circumscribed rectangle of the connected pixel component that is the table candidate is a table region. If the condition is not satisfied, or if vertical ruled line candidates were found at two or more locations in step S1116, the division module 20 takes the center coordinate X_CENTER of the circumscribed rectangle calculated in step S1111 as a boundary and splits the horizontal projection distribution shown in FIG. 9(a) at that boundary into two projection intervals, X_START to X_CENTER and X_CENTER to X_END. In step S1118 it determines whether each of the two intervals contains at least one vertical ruled line candidate exceeding the frequency threshold TH_H. If this condition is satisfied, the division module 20 decides in step S1119 that the closed region enclosed by the circumscribed rectangle of the connected pixel component that is the table candidate is a table region. If it is not satisfied, the division module 20 determines in step S1113 that the table candidate is not a table.
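The decision flow of FIG. 8 can be sketched with two helper functions, assuming that each projection distribution is given as a list of per-position black-pixel counts (the horizontal projection indexed by x, the vertical projection indexed by y) and that the rectangle center is the midpoint of each list. These representations, the function names, and the use of run midpoints as candidate positions are all assumptions for illustration. The text does not say what happens when step S1116 finds fewer than two vertical candidates; the sketch treats that case as "not a table" and marks it as an assumption.

```python
def ruled_line_candidates(projection, threshold):
    """Return the midpoint of every run of positions whose count exceeds threshold."""
    positions, start = [], None
    for i, count in enumerate(projection):
        if count > threshold:
            if start is None:
                start = i
        elif start is not None:
            positions.append((start + i - 1) / 2)
            start = None
    if start is not None:
        positions.append((start + len(projection) - 1) / 2)
    return positions


def on_both_sides(positions, center):
    """True if at least one candidate lies in each of the two intervals split at center."""
    return any(p < center for p in positions) and any(p > center for p in positions)


def is_table(horizontal_projection, vertical_projection, th_h, th_v):
    # S1111: center of the circumscribed rectangle.
    x_center = (len(horizontal_projection) - 1) / 2
    y_center = (len(vertical_projection) - 1) / 2

    # S1112: vertical ruled line candidates from the horizontal projection.
    vertical_lines = ruled_line_candidates(horizontal_projection, th_h)
    if not vertical_lines:
        return False  # S1113

    # S1114: horizontal ruled line candidates from the vertical projection.
    horizontal_lines = ruled_line_candidates(vertical_projection, th_v)
    if not horizontal_lines:
        return False  # S1113

    if len(horizontal_lines) >= 2:
        # S1117: a horizontal candidate on each side of Y_CENTER.
        if on_both_sides(horizontal_lines, y_center):
            return True  # S1119
    elif len(vertical_lines) < 2:
        # S1116 answered "no"; this outcome is an assumption, not stated in the text.
        return False

    # S1118: a vertical candidate on each side of X_CENTER.
    return on_both_sides(vertical_lines, x_center)  # S1119 or S1113


if __name__ == "__main__":
    frame = [10] + [2] * 8 + [10]  # a 10x10 box outline, projected on either axis
    print(is_table(frame, frame, th_h=5, th_v=5))
```

Requiring candidates on both sides of the center rejects components that merely contain one long stroke, such as an underline or a single separator.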

Through the above processing, the division module 20 divides the document image into regions. Besides the processing described with reference to FIG. 8, the regions may be divided by the processing shown in FIG. 10. First, in step 201, the division module 20 performs labeling. Labeling is processing that assigns the same label value to all pixels constituting a connected component. In the next step 202, the division module 20 creates the circumscribed rectangle of each connected component, and in step 203 it calculates the lengths of the vertical and horizontal sides of the circumscribed rectangle. In the next step 204, the division module 20 classifies each circumscribed rectangle as a character, figure, field separator, or noise candidate. Details of this classification are disclosed in paragraphs 0032 to 0037 of Japanese Patent Laid-Open No. 2000-90194, so further description is omitted here.

Next, the type identification processing is described with reference to the flowchart of FIG. 11. First, in step 301, the recognition module 22 extracts a cell of the table present in the table region obtained by the region division processing. In the next step 302, the recognition module 22 recognizes, by OCR processing, the character string in each cell present in the table region.

Next, in step 303, the determination module 24 determines whether the character string recognized by the recognition module 22 is included in the attribute conversion rule information. If the determination in step 303 is affirmative, the recognition module 22 retains the character string in step 304. If the determination in step 303 is negative, the processing proceeds to step 305.

In step 305, the recognition module 22 determines whether all character strings present in the table region have been extracted. If not, the processing returns to step 301. If so, in step 306 the recognition module 22 sets the type of the table region to the type indicated by the attribute DB that contains the largest number of the retained character strings.

To describe step 306 concretely: as noted above, attribute conversion rule information exists for many types, such as product information. Step 306 judges the type of the table region to be the type indicated by the attribute conversion rule information that contains the largest number of the character strings present in the table. Because the attribute conversion rule information also contains character strings affected by notation variation and OCR errors, as described above, those character strings are taken into account in step 306 as well.
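Step 306 is essentially a vote: each rule set counts how many of the table's recognized strings it contains, and the type with the highest count wins. The sketch below assumes each rule set has been flattened into a set of all its stored strings (notation variations and OCR errors alike). The type labels, the function name, and the behavior on ties or zero matches are assumptions.

```python
def identify_table_type(cell_strings, rule_sets):
    """Return the type whose rule set contains the most recognized strings.

    cell_strings -- strings recognized by OCR in the cells of one table region
    rule_sets    -- {type_name: set of every string stored for that type}
    Returns None when no string matches any rule set (assumption); on a tie
    the type listed first in `rule_sets` is returned (assumption).
    """
    counts = {
        type_name: sum(1 for text in cell_strings if text in known)
        for type_name, known in rule_sets.items()
    }
    best = max(counts, key=counts.get, default=None)
    if best is None or counts[best] == 0:
        return None
    return best


if __name__ == "__main__":
    rule_sets = {
        "product": {"商品番号", "商品ナンバー", "商品コード", "商品香号"},
        "organization": {"○○○○技開部", "○○○○技開剖", "○○○○技術開発部"},
    }
    cells = ["商品香号", "商品コード", "○○○○技開部", "500円"]
    print(identify_table_type(cells, rule_sets))
```

Including the OCR errors in each set matters here: a table whose header was misread as 商品香号 still votes for the product type.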

The attribute value extraction processing shown in FIG. 12 is then executed on the table region whose type has been determined in this way. For example, if one cell of the table contains 価格 ("price") and the cell to its right contains 500円 (500 yen), this processing extracts 価格 and 500円 as a pair. In this example, 500円 is the attribute value.

First, in step 401, the recognition module 22 determines whether the character string in a cell is an attribute name. If not, the processing proceeds to step 406; if so, the recognition module 22 identifies an adjacent cell in step 402. Examples of the adjacent cell are the cell to the right and the cell below, which is where the attribute value is usually written in ordinary documents.

In the next step 403, the recognition module 22 determines whether the character string in the identified cell is an attribute name. If it is, no attribute value can be obtained, so the processing proceeds to step 406. If it is not, the recognition module 22 determines in step 404 whether the character string in the identified cell (the attribute value) corresponds to the attribute name. For example, when the attribute name is 価格 ("price"), this step checks that the character string is not something other than a price, such as 2006年 (the year 2006). This processing is performed using, for example, natural language analysis, in which case whether the character string in the cell corresponds can be judged against character strings such as those in the named-entity examples shown in FIG. 13.

If the determination in step 404 is negative, the processing proceeds to step 406. If it is affirmative, the recognition module 22 extracts the pair of attribute name and attribute value as an attribute in step 405. In the next step 406, the recognition module 22 determines whether steps 401 through 405 have been executed for all cells. If so, the processing ends; if not, the processing from step 401 onward is executed again for the cells not yet processed.
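The loop of steps 401 to 406 might look like the sketch below, which assumes the table is already available as a grid (list of rows) of recognized strings. Two simplifications are swapped in and should be read as assumptions: the sketch tries the right-hand cell first and then the cell below, and the correspondence check of step 404 is a plain predicate (here a regular expression for prices) rather than the natural language analysis mentioned in the text. All names are hypothetical.

```python
import re


def extract_attributes(grid, attribute_names, value_checks):
    """Extract (attribute name, attribute value) pairs from a grid of cell strings.

    grid            -- list of rows, each a list of recognized cell strings
    attribute_names -- set of strings regarded as attribute names
    value_checks    -- {attribute name: predicate(value) -> bool}; attribute
                       names without a predicate accept any value (assumption)
    """
    pairs = []
    for r, row in enumerate(grid):
        for c, text in enumerate(row):
            if text not in attribute_names:           # step 401
                continue
            for nr, nc in ((r, c + 1), (r + 1, c)):   # step 402: right, then below
                if nr >= len(grid) or nc >= len(grid[nr]):
                    continue
                candidate = grid[nr][nc]
                if candidate in attribute_names:      # step 403
                    continue
                check = value_checks.get(text, lambda value: True)
                if check(candidate):                  # step 404
                    pairs.append((text, candidate))   # step 405
                    break
    return pairs                                      # step 406: all cells visited


def looks_like_price(value):
    """Stand-in check for step 404: digits with an optional yen sign or 円 suffix."""
    return re.fullmatch(r"[¥￥]?\d[\d,]*円?", value) is not None


if __name__ == "__main__":
    grid = [["価格", "500円"], ["商品番号", "A-100"]]
    print(extract_attributes(grid, {"価格", "商品番号"}, {"価格": looks_like_price}))
```

With this check, a cell containing 2006年 next to 価格 is rejected, matching the example in the text of a string that is not a price.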

Next, the output document image generation processing is described with reference to FIG. 14. This image data generation processing converts each character string into the corresponding unique character string and outputs an output document image, which is a document image containing the content indicated by the unique character strings. To output the output document image, the unique character strings and the values (attribute values) are changed to suit the output document image.

First, in step 501, the conversion module 26 converts the attribute name of each attribute extracted by the attribute value extraction processing (see FIG. 12) into a unique attribute name based on the attribute conversion rule information. In the next step 502, the change module 28 converts the unique attribute name into an output attribute name based on the attribute name conversion DB. For example, if the output destination is Company A, 商品番号 ("product number") is converted into 商品コード ("product code"). In step 503, the change module 28 further changes the attribute value to suit the output document image.

In the next step 504, the output module 30 generates an output document image in which the changed attribute names and attribute values are laid out according to the format of the output destination, and the processing ends.

An example of regions converted by the above processing is described with reference to FIG. 15, which shows a region before conversion and the region after conversion. As shown in the figure, 商品コード ("product code") is converted into the output attribute name プロダクトナンバー ("product number"), メーカー ("maker") into 製造者 ("manufacturer"), and 値段 into 価格 (both meaning "price"). The attribute value ￥500 is converted into $5.00. The output module 30 combines the regions converted in this way to generate the output document image.

As shown for example in FIG. 16, the image processing system according to the present embodiment can also output a document image by providing dedicated servers A, B, C, and D, one per type, having each execute its processing in parallel, and having an integration server combine the results. In this case, a single computer executes the processing up to and including the type identification processing (see FIG. 11) and transfers the information of each table region to whichever of the servers A, B, C, and D handles that type. Each server executes the attribute value extraction processing (see FIG. 12), and the integration server executes the image data generation processing (see FIG. 14) on the information obtained.

The processing flow of each flowchart described above is only an example. Needless to say, the processing order may be rearranged, new steps may be added, and unnecessary steps may be deleted without departing from the gist of the present invention.

FIG. 1 is a schematic diagram outlining the processing according to the present embodiment.
FIG. 2 is a diagram showing the module configuration according to the present embodiment.
FIG. 3 is a diagram showing the configuration of the personal computer according to the present embodiment.
FIG. 4 is a diagram showing an example of attribute conversion rule information for product information according to the present embodiment.
FIG. 5 is a diagram showing an example of attribute conversion rule information for organization information according to the present embodiment.
FIG. 6 is a diagram showing an example of attribute name conversion rule information according to the present embodiment.
FIG. 7 is a flowchart (part 1) showing the region division processing according to the present embodiment.
FIG. 8 is a flowchart showing the table candidate determination processing according to the present embodiment.
FIG. 9 is a diagram showing the projection distributions according to the present embodiment.
FIG. 10 is a flowchart (part 2) showing the region division processing according to the present embodiment.
FIG. 11 is a flowchart showing the type identification processing according to the present embodiment.
FIG. 12 is a flowchart showing the attribute value extraction processing according to the present embodiment.
FIG. 13 is a diagram showing named-entity examples according to the present embodiment.
FIG. 14 is a flowchart showing the image data generation processing according to the present embodiment.
FIG. 15 is a diagram showing an example of region conversion according to the present embodiment.
FIG. 16 is a diagram showing an example configuration of the image processing system according to the present embodiment.

Explanation of symbols

18 Input module
20 Division module
22 Recognition module
24 Determination module
26 Conversion module
28 Change module
30 Output module
1201 CPU
1211 HDD

Claims

An image processing system comprising:
input means for inputting a document image having a plurality of regions containing character strings;
dividing means for dividing the document image input by the input means into regions;
recognition means for recognizing a character string from the document image input by the input means;
determining means for determining, for each region divided by the dividing means, whether the character string recognized by the recognition means is included in the character strings defined for each region and stored in advance; and
conversion means for converting, when the determining means determines that the character string recognized by the recognition means is included in the character strings defined for each region and stored in advance, the character string recognized by the recognition means into a specific character string corresponding to the stored character string.

The image processing system according to claim 1, wherein the character strings defined for each region and stored in advance belong to groups, one group being provided for each set of character strings having the meaning indicated by a specific character string, and the specific character string is associated with each group.

The image processing system according to claim 1, wherein the character strings defined for each region and stored in advance include a character string that results when the recognition means misrecognizes a character string.

The image processing system according to claim 1, further comprising changing means for changing the specific character string to suit an output document image in order to output the output document image, which is a document image including the content indicated by the specific character string.

The image processing system according to claim 4, wherein the region further includes value information that corresponds to the character string and indicates a predetermined value,
the recognition means recognizes a value from the value information, and
the changing means changes the value recognized by the recognition means to suit the output document image.

An image processing program for causing a computer to execute processing comprising:
an input step of inputting a document image having a plurality of regions containing character strings;
a division step of dividing the document image input in the input step into regions;
a recognition step of recognizing a character string from the document image input in the input step;
a determination step of determining, for each region divided in the division step, whether the character string recognized in the recognition step is included in the character strings defined for each region and stored in advance; and
a conversion step of converting, when the determination step determines that the character string recognized in the recognition step is included in the character strings defined for each region and stored in advance, the character string recognized in the recognition step into a specific character string corresponding to the stored character string.