JP2020135022A - Image processor, image processing method, and program

Info

Publication number: JP2020135022A
Application number: JP2019023572A
Authority: JP
Inventors: 陽子井戸; Yoko Ido
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2019-02-13
Filing date: 2019-02-13
Publication date: 2020-08-31

Abstract

PROBLEM TO BE SOLVED: To accurately detect a normal direction of a manuscript even when the manuscript has a small number of characters.

SOLUTION: An image processor includes: a first dictionary holding a feature amount of each single character; a second dictionary holding weighting magnification of each prescribed character; extraction means extracting a feature amount of character image data included in manuscript image data; reliability acquisition means comparing a feature amount extracted by the extraction means with a feature amount of the first dictionary with respect to a plurality of prescribed directions of the manuscript image data to acquire reliability about a character associated with the feature amount of the compared first dictionary; weighting means weighting the reliability of the matched character by the weighting magnification when the character acquiring the reliability is in the second dictionary; and normal direction determining means determining a normal direction of the manuscript image data on the basis of the weighted reliability from the plurality of prescribed directions.

SELECTED DRAWING: Figure 12

Description

The present invention relates to a technique for reading documents placed on the platen of a scanner.

Multi-crop processing is conventionally known, in which a plurality of documents such as forms, receipts of non-standard sizes, business cards, and cards are arranged on the platen of a scanner and read together, and the image area corresponding to each document is detected in the generated scanned image data and cut out.

When multi-crop processing is used, the user arranges the documents on the platen by hand, so the documents may not all face the same direction (one of up, down, left, or right). In multi-crop processing, therefore, the boundary between the background and the image area corresponding to each document is first detected in the scanned image data, the four vertices of the rectangular area formed by the detected boundary are identified, and that rectangular area is cut out. If the rectangular area corresponding to a document is tilted with respect to the scanned image data, the cut-out document image data is rotated to correct the tilt. After document image data has been generated in this way for each document, further rotation is applied as necessary so that each piece of document image data faces the normal direction (for example, so that its characters are upright). Image processing that corrects document image data so that it faces the normal direction is referred to as "direction detection (or direction correction) processing".
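The final rotation step described above can be illustrated with a minimal Python sketch. It assumes the image is a plain list of pixel rows and that the detected direction is given as the side toward which the top of the document currently points; the names `rotate_cw`, `correct_direction`, and `TURNS_TO_UPRIGHT` are hypothetical and do not come from the specification.

```python
# Number of clockwise quarter turns needed to bring the document upright,
# keyed by the side toward which its top currently points (assumed encoding).
TURNS_TO_UPRIGHT = {"up": 0, "right": 3, "down": 2, "left": 1}


def rotate_cw(image, quarter_turns):
    """Rotate a 2D list of pixel rows clockwise by quarter_turns * 90 degrees."""
    image = [list(row) for row in image]  # work on a copy
    for _ in range(quarter_turns % 4):
        # Reversing the row order and transposing gives one clockwise turn.
        image = [list(row) for row in zip(*image[::-1])]
    return image


def correct_direction(image, detected_top):
    """Rotate the image so that the detected top side ends up facing up."""
    return rotate_cw(image, TURNS_TO_UPRIGHT[detected_top])


upright = [[1, 2],
           [3, 4]]
# Simulate a document whose top points to the right (one clockwise turn).
sideways = rotate_cw(upright, 1)
restored = correct_direction(sideways, "right")
```

Only multiples of 90 degrees are handled here, because the tilt correction is assumed to have been applied already when the document area was cut out.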

One approach to direction detection processing is to perform OCR (Optical Character Recognition) processing or the like on the document image data and use the result. Patent Document 1 studies a method in which OCR processing is performed on document image data generated by reading a single document, and direction detection processing is performed based on the character orientation recognized as a result. Patent Document 2 proposes a method that improves the direction detection accuracy for document image data by excluding characters that are unsuitable for direction detection.

Patent Document 1: Japanese Unexamined Patent Publication No. H10-181137
Patent Document 2: Japanese Unexamined Patent Publication No. 2011-338263

However, when the target is document image data of a receipt or similar document with relatively few characters, such as the receipt a customer is given as proof of payment in the retail and service industries, it is difficult to achieve sufficient direction detection accuracy with direction detection processing such as that of Patent Document 1. A method such as that of Patent Document 2 can be used to raise the accuracy, but it does not yield a sufficient improvement when the number of characters is small, as on a receipt.

Accordingly, an object of the present invention is to accurately detect the normal direction of a document even when the document has few characters.

To solve the above problem, the present invention provides an image processing device having image acquisition means that scans a document and generates document image data, the device comprising: a first dictionary that holds a feature amount for each single character; a second dictionary that holds a weighting magnification for each predetermined character; extraction means that extracts a feature amount of character image data included in the document image data; reliability acquisition means that, for each of a plurality of predetermined orientations of the document image data, compares the feature amount extracted by the extraction means with the feature amounts in the first dictionary and acquires a reliability for the character associated with the compared feature amount in the first dictionary; weighting means that, when the character for which the reliability was acquired exists in the second dictionary, weights the reliability of the matched character by the weighting magnification; and normal direction determining means that determines the normal direction of the document image data from among the plurality of predetermined orientations based on the weighted reliability.
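The interplay of the weighting means and the normal direction determining means can be sketched in Python as follows. This is only an illustration under stated assumptions: the OCR step is taken as already done, the weighted reliabilities are assumed to be summed per orientation, and the function name, the sample characters, the reliability values, and the weighting magnification of 3.0 are all hypothetical.

```python
# The four candidate orientations, expressed as rotation angles in degrees.
DIRECTIONS = (0, 90, 180, 270)


def determine_normal_direction(ocr_results, weight_dictionary):
    """Pick the orientation with the largest cumulative weighted reliability.

    ocr_results: {direction: [(character, reliability), ...]} per orientation.
    weight_dictionary: {character: weighting magnification} (second dictionary).
    """
    totals = {}
    for direction in DIRECTIONS:
        total = 0.0
        for character, reliability in ocr_results.get(direction, []):
            # Characters absent from the second dictionary keep weight 1.0.
            total += reliability * weight_dictionary.get(character, 1.0)
        totals[direction] = total
    best = max(DIRECTIONS, key=lambda d: totals[d])
    return best, totals


# Hypothetical OCR output: the upside-down reading yields symmetric-looking
# characters with slightly higher raw reliability than the true reading.
results = {
    0: [("R", 0.6), ("A", 0.5)],
    180: [("H", 0.7), ("O", 0.6)],
}
unweighted_best, _ = determine_normal_direction(results, {})
weighted_best, weighted_totals = determine_normal_direction(results, {"R": 3.0})
```

With no weighting the 180-degree reading wins on raw reliability, whereas weighting the hypothetical character "R" shifts the decision to 0 degrees, which is the effect the second dictionary is meant to achieve when few characters are available.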

According to the present invention, the normal direction of a document can be detected accurately even when the document has few characters.

FIG. 1 is a diagram showing the overall configuration of a system to which an embodiment of the present invention is applicable.
FIG. 2 is a diagram showing a hardware configuration example of an image reading device according to an embodiment of the present invention.
FIG. 3 is a diagram showing the sequence of the overall image processing function according to an embodiment of the present invention.
FIG. 4 is a diagram showing UI displays according to an embodiment of the present invention.
FIG. 5 is an explanatory diagram, including a sample image, according to an embodiment of the present invention.
FIG. 6 is a diagram showing document orientations and the definition of character areas according to an embodiment of the present invention.
FIG. 7 is a diagram explaining the orientation of character areas according to an embodiment of the present invention.
FIG. 8 is a diagram showing examples of character-area orientations and OCR results according to an embodiment of the present invention.
FIG. 9 is a diagram showing an outline example of the contents of the OCR dictionary according to an embodiment of the present invention.
FIG. 10 is an outline example of the contents of the image data type character string dictionary according to an embodiment of the present invention.
FIG. 11 is a diagram showing an example of cumulative reliability based on OCR results according to an embodiment of the present invention.
FIG. 12 is a flowchart showing the details of the direction detection processing according to an embodiment of the present invention.

Embodiments of the present invention are described in detail below with reference to the drawings. The components described in these embodiments are merely examples and are not intended to limit the scope of the invention.

<System configuration>
FIG. 1 is a diagram showing the overall configuration of an example system to which an image processing device according to an embodiment of the present invention can be applied.

In a typical system to which the present invention can be applied, as shown in FIG. 1, the image processing device 100 and the PC/server terminal 101 are connected to a LAN 104 consisting of Ethernet (registered trademark), a wireless LAN, or the like, and the LAN 104 is in turn connected to the Internet 105. The mobile terminal 103 is connected to the Internet 105 via a public wireless communication network 102 or the like. The image processing device 100, the PC/server terminal 101, and the mobile terminal 103 are thus connected to the Internet 105 through the LAN 104 or the public wireless communication network 102 and can communicate with one another. Only one of the PC/server terminal 101 and the mobile terminal 103 need be present, or the image processing device 100 alone may perform the processing otherwise carried out by the PC/server terminal 101 or the mobile terminal 103.

The image processing device 100 is a multifunction copier having an operation unit, a scanner unit, and a printer unit. In the system of this embodiment, the image processing device 100 is used as a scan terminal that reads one or more documents such as business cards, driver's licenses, and postcards. The image processing device 100 also performs multi-crop processing, which extracts document image data for each document from scanned image data obtained by reading a plurality of documents arranged on the platen. In addition, the image processing device 100 has a display unit and an operation unit such as a touch panel and hard buttons, which display error notifications, instruction notifications, and the like, and through which the user performs scan operations, setting operations, and so on.

The PC/server terminal 101 can display document images generated by the image processing device 100. It also stores the document images generated by the image processing device 100 and performs OCR processing and the like on them to generate reusable content data. The multi-crop processing performed by the image processing device 100 may instead be performed by the PC/server terminal 101. The PC/server terminal 101 can further communicate with external storage such as a cloud service or a server, and can transmit stored document image data and metadata to the external storage. This embodiment describes a flow in which the image processing device 100 stores the document image data, generates the metadata, and transmits them to the external storage, but the same functions may be implemented on the PC/server terminal 101.

The mobile terminal 103 is a smartphone or tablet terminal having an operation unit, a wireless communication unit, and an application unit that runs a web browser. In the system of this embodiment, the mobile terminal 103 is used, like the PC/server terminal 101, as a display terminal, an operation terminal, and a terminal for generating and storing content data. The functions of display, operation, metadata generation, and content data generation and storage may be provided by only one of the PC/server terminal 101 and the mobile terminal 103.

The above components are merely examples, and not all of them are required.

<Hardware configuration of image processing device 100>
FIG. 2 is a block diagram showing the configuration of the image processing device 100. The control unit 110 includes a CPU 111, a storage device 112, a network I/F unit 113, a scanner I/F unit 114, and a display/operation unit I/F unit 115, which are connected via a system bus 116 so that they can communicate with one another. The control unit 110 controls the operation of the entire image processing device 100.

The CPU 111 reads a control program stored in the storage device 112 and performs various kinds of control such as reading control and transmission control.

The storage device 112 stores and holds the above program, image data, metadata, setting data, processing result data, and the like. The storage device 112 includes a ROM 117, which is a non-volatile memory, a RAM 118, which is a volatile memory, and an HDD 119, which is a large-capacity storage area.

The ROM 117 holds the control program and the like. The CPU 111 reads the control program stored in the ROM 117 and controls the image processing device 100.

The RAM 118 is used as a temporary storage area, such as the main memory and work area of the CPU 111.

The HDD 119 is a large-capacity storage area used to store image data, metadata, and the like.

The network I/F unit 113 is an interface that connects the control unit 110 (image processing device 100) to the LAN 104. The network I/F unit 113 transmits image data to external devices on the LAN 104, such as the PC/server terminal 101 and the mobile terminal 103, and receives various information from external devices on the LAN 104.

The scanner I/F unit 114 is an interface that connects the scanner unit 120 and the control unit 110. The scanner unit 120 is image acquisition means that reads the documents on the platen, generates scanned image data, and inputs it to the control unit 110 via the scanner I/F unit 114.

The display/operation unit I/F unit 115 is an interface that connects the display/operation unit 121 and the control unit 110. The display/operation unit 121 includes a liquid crystal display with a touch panel function and hard keys such as a numeric keypad, a start button, and a cancel button. The start button starts copy or scan processing. The cancel button pauses or cancels processing that the image processing device 100 is executing.

The image processing device 100 may also include a printer unit and the like, but these are not used in this embodiment and their description is omitted.

As described above, the image processing device 100 according to this embodiment can provide the image processing function by means of the above hardware configuration.

<Execution flow of the "scan and send" function>
The processing sequence by which the user executes multi-crop processing with the "scan and send" function is described with reference to FIG. 3. The processing described in this embodiment is realized by the CPU 111 of the image processing device 100 reading the control program stored in the storage device 112 and executing it.

The "scan and send" function scans a document with the image processing device 100, which is connected to a network such as a LAN, and transmits the resulting scanned image data to an external device. Specifically, it performs image processing and format conversion on the scanned image data generated by the scanner and sends the result to a destination specified by the user, such as a folder on a server, an e-mail address, or the HDD 119 in the copier.

In function use instruction S400, the user operates the display/operation unit 121 and selects the "scan and send" function button, thereby instructing the image processing device 100 to use the scan function. The image processing device 100 accepts the selection of the "scan and send" function button via the display/operation unit 121. FIG. 4(a) shows the main menu UI 500 displayed on the display/operation unit 121. The main menu UI 500 is a screen on which the functions that the image processing device 100 can perform are displayed as buttons. For example, a "copy" function button 501, a "scan and send" function button 502, a "scan and save" function button 503, a "use saved file" function button 504, and a "print" function button 505 are displayed. Through the main menu UI 500, the image processing device 100 accepts the user's selection of the function to be performed. In this embodiment, it is assumed that the user taps the "scan and send" function button 502 to select it.

In setting UI display S401, the image processing device 100 displays the setting screen of the "scan and send" function on the display/operation unit 121. FIG. 4(b) shows an example of the "scan and send" setting UI 510, the setting screen displayed on the display/operation unit 121. The "scan and send" setting UI 510 shows the state of the various settings of the "scan and send" function. For example, the "destination" block 511 displays the address of the destination to which the scanned image data generated by scanning is to be sent. When the user taps the "destination" block 511, a destination setting screen (not shown) is displayed, on which the user can enter the destination of the scanned image data. In this embodiment, the image processing device 100 sends the document image data generated by performing multi-crop processing on the scanned image data to the PC/server terminal 101. The URL (Uniform Resource Locator), IP address, or the like of the PC/server terminal 101 is therefore set in the "destination" block 511. The "scan/send settings" button 512 displays the color setting of the scanned image data to be generated, the format of the image file to be generated, and the document type. The "other functions" button 513 is a button for setting advanced functions that are not displayed on the "scan and send" setting UI 510.

In basic setting instruction S402, the user operates the display/operation unit 121 and selects buttons on the "scan and send" setting UI 510, thereby instructing the image processing device 100 to set the basic functions. The image processing device 100 accepts from the user setting instructions for the items that can be set on the "scan and send" setting UI 510. The settings accepted in basic setting instruction S402 include, for example, the color setting of the scanned image data to be generated and the format of the image file to be generated. The image processing device 100 accepts a tap on either the "destination" block 511 or the "scan/send settings" button 512 and accepts input for the corresponding setting items.

In basic setting S403, the image processing device 100 stores, in the RAM 118, the basic setting values of the "scan and send" function that the user specified in S402.

Next, in advanced setting instruction S404, the user taps the "other functions" button 513 to select it, thereby instructing the image processing device 100 to set the advanced functions.

In detailed setting UI display S405, on accepting the user's selection of the "other functions" button 513, the image processing device 100 displays a screen for setting the advanced functions on the display/operation unit 121. FIG. 4(c) shows an example of the "other functions" setting UI 520 for setting the advanced functions. The "other functions" setting UI 520 displays buttons for setting the various advanced functions of the "scan and send" function that the image processing device 100 can execute. These include, for example, a "page aggregation" button 521, a "color type" button 522, a "document type" button 523, a "color adjustment" button 524, a "file name" button 525, and a "multi-crop" button 526. The "multi-crop" button 526 instructs execution of the processing that extracts the document image data corresponding to each document from the scanned image data generated by reading the documents. In this embodiment, the "other functions" setting UI 520 also displays setting items that overlap with those that can be set from the "scan and send" setting UI 510. Alternatively, the "other functions" setting UI 520 may display only the setting items that cannot be set from the "scan and send" setting UI 510.

In multi-crop setting instruction S406, the user taps the "multi-crop" button 526 on the "other functions" setting UI 520 to select it, thereby instructing the image processing device 100 to enable multi-crop processing.

When the user taps the "multi-crop" button 526 to select it, in S407 the image processing device 100 sets to ON the multi-crop processing flag, which indicates that multi-crop processing is to be executed. The multi-crop processing flag is stored in the RAM 118. Also in S407, the image processing device 100 causes the display/operation unit 121 to show, on the "other functions" setting UI 520, that the "multi-crop" button 526 has been selected, as shown in FIG. 4(d). For example, when the user selects the "multi-crop" button 526, the color of the button on the "other functions" setting UI 520 is inverted, and a screen indicating that multi-crop processing is set to ON is displayed.

When the user taps the "multi-crop" button 526, a screen 530 for setting the orientation of the character strings in the documents is also displayed, as shown in FIG. 4(e). This screen contains a "horizontal writing" button 531, which applies to receipts; a "vertical writing" button 532, which applies to business cards and the like; and a "mixed" button 533, which is selected when documents with vertical writing and documents with horizontal writing are placed on the platen together. The buttons may be labeled with document type names such as "receipt" and "business card" instead of character string orientations.

In document character string orientation setting instruction S408, the user taps one of the "horizontal writing" button 531, the "vertical writing" button 532, and the "mixed" button 533 to select it, thereby indicating the character string orientation of the documents to the image processing device 100. Because the target documents in this embodiment are receipts, it is assumed that the user taps the "horizontal writing" button 531.

In response to this user operation, in S409 the image processing device 100 sets the character string orientation setting, which indicates the orientation of the character strings in the documents, to "horizontal writing". The character string orientation setting is stored in the RAM 118. Also in S409, the image processing device 100 inverts the color of the "horizontal writing" button 531 shown in FIG. 4(e) on the display/operation unit 121 to indicate that the horizontal writing setting is ON. When the user then taps the "close" button 534, the display/operation unit 121 displays the "other functions" setting UI 520. When the user further taps the "close" button 527, the display/operation unit 121 displays the "scan and send" setting UI 510.
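The state that S407 and S409 record in the RAM 118 can be pictured with a short Python sketch. The class and function names below are hypothetical; the specification only states that a multi-crop processing flag and a character string orientation setting are stored.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TextOrientation(Enum):
    """Choices offered on screen 530 (buttons 531, 532, and 533)."""
    HORIZONTAL = "horizontal"
    VERTICAL = "vertical"
    MIXED = "mixed"


@dataclass
class MultiCropSettings:
    """Illustrative stand-in for the settings kept in the RAM 118."""
    multi_crop_enabled: bool = False
    text_orientation: Optional[TextOrientation] = None


def select_multi_crop(settings):
    # Corresponds to S407: turn the multi-crop processing flag ON.
    settings.multi_crop_enabled = True
    return settings


def select_text_orientation(settings, orientation):
    # Corresponds to S409: record the character string orientation.
    settings.text_orientation = orientation
    return settings


settings = MultiCropSettings()
select_multi_crop(settings)
select_text_orientation(settings, TextOrientation.HORIZONTAL)
```

For the receipt example of this embodiment, the resulting state has the flag ON and the orientation set to horizontal writing.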

In scan instruction S410, the user taps the start key 506 to instruct the image processing device 100 to start reading the documents. When the start key 506 is tapped, the image processing device 100 writes the various setting information for scanning the documents to the RAM 118 and begins preparing for the document reading process.

When the document reading process starts, first, in scan S411, the image processing device 100 causes the scanner unit 120 to drive the scanner and read the documents placed on the platen of the scanner unit 120.

In image formation S412, the image processing device 100 converts the signal values obtained from the scanner unit 120 by reading the documents in scan S411 into bitmap image data (scanned image data) that can be handled by image processing. For example, in S412 the image processing device 100 converts the luminance signal values input from the scanner unit 120 into 8-bit digital signals and stores them in the HDD 119 as scanned image data.
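As a rough illustration of the 8-bit conversion in S412, the following Python sketch scales raw luminance readings to the range 0 to 255. The raw value range (here a 10-bit maximum of 1023) and the function names are assumptions; the specification does not state the format of the signal from the scanner unit 120.

```python
def to_8bit(luminance, max_value):
    """Scale a raw reading in 0..max_value to an integer in 0..255."""
    # Integer arithmetic with rounding to the nearest value.
    scaled = (luminance * 255 + max_value // 2) // max_value
    # Clamp in case a reading falls outside the expected range.
    return max(0, min(255, scaled))


def to_bitmap(raw_rows, max_value=1023):
    """Convert rows of raw readings to rows of 8-bit pixel values."""
    return [[to_8bit(value, max_value) for value in row] for row in raw_rows]


bitmap = to_bitmap([[0, 512, 1023],
                    [256, 768, 2000]])
```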

FIG. 5 is a schematic view showing a state in which a plurality of receipt documents are placed on the platen. Here, it is assumed that seven documents are arranged on the scanner unit 120, which can read an A3 size area. Receipts are relatively small documents and come in various sizes and aspect ratios. Their visual designs also vary, and because the user places them face down on the platen, they are generally placed haphazardly as shown in the figure. Of course, the number and arrangement of documents placed on the platen are not limited to this example.

In the multi-crop process S413, the image processing device 100 acquires the scanned image data stored in the HDD 119 in S412. Using the CPU 111, the image processing device 100 identifies the boundary (edge) between each document area and the background in the acquired scanned image data, detects the coordinates of the four vertices of the rectangular document area corresponding to each document, and stores the four vertex coordinates of each detected document area in the RAM 118. The image processing device 100 then acquires the four vertex coordinates of each document area detected by the multi-crop coordinate detection process S413 and cuts out the image data corresponding to each document area (original image data) from the scanned image data generated in S412. When the sides of a document area are not parallel or perpendicular to the sides of the scanned image, the image processing device 100 uses projective transformation, trapezoidal transformation, affine transformation, or the like, based on the four vertex coordinates of the document area, so that tilt correction is performed together with the cutting out. The image processing device 100 stores each cut-out original image data in the HDD 119. At this time, the image processing device 100 may compress the original image data and store the compressed original image data in the HDD 119.
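The tilt correction in S413 depends on the geometry of the four detected vertices. The following is a minimal sketch, not the method of the embodiment, of how the tilt angle and the size of the cut-out rectangle could be derived from those vertices. The function name `skew_and_size` and the vertex ordering (top-left, top-right, bottom-right, bottom-left) are assumptions introduced for illustration.

```python
import math


def skew_and_size(vertices):
    """Estimate tilt angle and crop size from four document vertices.

    `vertices` is assumed to hold four (x, y) points ordered
    top-left, top-right, bottom-right, bottom-left, in image
    coordinates (y grows downward).
    """
    (x0, y0), (x1, y1), _, (x3, y3) = vertices
    # Angle of the top edge relative to the horizontal axis of the scan.
    angle_deg = math.degrees(math.atan2(y1 - y0, x1 - x0))
    # Edge lengths give the size of the deskewed document image.
    width = math.hypot(x1 - x0, y1 - y0)
    height = math.hypot(x3 - x0, y3 - y0)
    return angle_deg, width, height


# An axis-aligned region needs no tilt correction.
print(skew_and_size([(0, 0), (4, 0), (4, 2), (0, 2)]))
```

The returned angle could then be passed to an affine rotation, while a projective transformation would use all four vertices directly.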

文字認識前処理Ｓ４１４において、画像処理装置１００は、ＨＤＤ１１９から原稿画像データを取得する。画像処理装置１００は、ＣＰＵ１１１により、取得した原稿画像データに対してＯＣＲ処理に必要な画像前処理（例えば、二値化、線分除去、ノイズ除去、レイアウト分析など）を行い、処理済の原稿画像データをＲＡＭ１１８に記憶する。原稿画像データが圧縮されている場合は、画像前処理前に原稿画像データに復号を行う。 In the character recognition preprocessing S414, the image processing device 100 acquires the original image data from the HDD 119. The image processing device 100 performs image preprocessing (for example, binarization, line segment removal, noise removal, layout analysis, etc.) necessary for OCR processing on the acquired original image data by the CPU 111, and the processed original. The image data is stored in the RAM 118. If the original image data is compressed, the original image data is decoded before the image preprocessing.

In the direction detection process S415, the image processing device 100 uses the CPU 111 to narrow down the candidate document orientations for the original image data stored in the RAM 118. The image processing device 100 then executes OCR processing and determines the direction of the original image data from the feature vector obtained for each character in the character images. The details of this process will be described later.

In the direction correction image processing S416, the image processing device 100 rotates the original image data cut out in S413, based on the rotation angle detected in S415, so that it faces the normal direction. In general, when the document area is rectangular, the normal direction of the original image data is one of four directions, 0°, 90°, 180°, or 270°, each of which takes one side of the document area as the horizontal top side. The rotation process here is therefore a direction correction to one of these four directions. If the normal direction is determined to be 0° in S415, the rotation process of S416 may be skipped.
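Because the correction in S416 is limited to multiples of 90°, it can be done without interpolation. The sketch below illustrates this on a row-major list of pixel rows. The function names and the convention that the detected angle is the clockwise rotation still to be applied are assumptions made for illustration, not details taken from the embodiment.

```python
def rotate90_cw(image):
    """Rotate a row-major 2D list of pixels 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def correct_orientation(image, detected_angle):
    """Rotate `image` clockwise by `detected_angle` (0, 90, 180 or 270).

    When the angle is 0 the rotation is skipped and the input is
    returned unchanged, mirroring the optional skip in S416.
    """
    if detected_angle % 90 != 0:
        raise ValueError("angle must be a multiple of 90 degrees")
    for _ in range((detected_angle // 90) % 4):
        image = rotate90_cw(image)
    return image


page = [[1, 2],
        [3, 4]]
print(correct_orientation(page, 90))
print(correct_orientation(page, 180))
```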

マルチクロップ処理結果ＵＩ表示Ｓ４１７において、画像処理装置１００は、マルチクロップ処理および原稿方向補正処理後の各原稿画像データを表示・操作部１２１に表示する。図４（ｆ）は表示・操作部１２１に表示されるマルチクロップ及び原稿方向補正の処理結果を示す画面の一例である。図４（ｆ）に示すように「原稿検出結果表示」ＵＩ５４０は、マルチクロップ及び原稿方向補正処理で得られた各原稿画像を並べて表示する。 In the multi-crop processing result UI display S417, the image processing device 100 displays each document image data after the multi-crop processing and the document direction correction processing on the display / operation unit 121. FIG. 4F is an example of a screen showing the processing result of the multi-crop and the document direction correction displayed on the display / operation unit 121. As shown in FIG. 4 (f), the "manuscript detection result display" UI540 displays each manuscript image obtained by the multi-crop and the manuscript direction correction processing side by side.

送信指示Ｓ４１８において、ユーザは、表示・操作部１２１に表示された原稿画像から原稿画像データの切り出し・方向補正処理結果を確認する。そして、ユーザは切り出されて方向補正された原稿画像データの保存、送信指示をする。ユーザは、スタートキー５０６を押下することで、切り出し・方向補正処理後の各原稿画像データの保存、送信を画像処理装置１００に指示する。画像処理装置１００は、Ｓ４１８において、原稿画像データをＰＣ／サーバ端末１０１へ送信を行うための送信指示をユーザから受け付ける。 In the transmission instruction S418, the user confirms the result of cutting out / direction correction processing of the original image data from the original image displayed on the display / operation unit 121. Then, the user gives an instruction to save and transmit the cut-out and direction-corrected original image data. By pressing the start key 506, the user instructs the image processing device 100 to save and transmit each original image data after the cutting / direction correction processing. In S418, the image processing device 100 receives a transmission instruction from the user for transmitting the original image data to the PC / server terminal 101.

画像送信Ｓ４１９において、画像処理装置１００は、クロップ処理後の原稿画像データをＰＣ/サーバ端末１０１に送信する。Ｓ４１９においてＰＣ/サーバ端末１０１に送信される原稿画像データは、「スキャンして送信」設定ＵＩ５１０や「その他の機能」設定ＵＩ５２０を介して設定された設定値を反映した画像データである。 In the image transmission S419, the image processing device 100 transmits the original image data after the crop processing to the PC / server terminal 101. The original image data transmitted to the PC / server terminal 101 in S419 is image data that reflects the setting values set via the "scan and transmit" setting UI 510 and the "other function" setting UI 520.

In the storage S420, the PC / server terminal 101 stores each original image data transmitted from the image processing device 100. The PC / server terminal 101 may simply save the original image data, or it may, for example, perform character recognition processing (OCR processing) on the original image data and store the character recognition result with the original image data as metadata. Doing so improves the searchability of the original image data and allows information extracted from the original image data to be registered in a system.

格納データの閲覧指示Ｓ４２１において、ユーザは、ＰＣ／サーバ端末１０１を操作して原稿画像データの表示を指示することができる。 In the stored data viewing instruction S421, the user can instruct the display of the original image data by operating the PC / server terminal 101.

In the stored data display provision S422, the PC / server terminal 101 displays the original image data that the user has instructed it to display on the display unit of the PC / server terminal 101. When processing such as OCR processing has been performed on the original image data stored in the PC / server terminal 101, the PC / server terminal 101 can also display the result of the OCR processing or the like in S420. In this embodiment, the saved original image data is displayed on the display unit of the PC / server terminal 101. Alternatively, the PC / server terminal 101 may transmit the original image data to a client terminal such as the user's PC or tablet terminal in accordance with an instruction from the user.

<Calculation of the OCR reliability used for direction detection>
In the direction detection process S415 of this embodiment, the image processing device 100 uses the CPU 111 to execute character recognition by "OCR using feature vectors", which is an existing technology. The processing described here corresponds to steps S603 and S604 described later.

The present invention focuses on the fact that characters most accurately indicate the direction of a document, and performs character recognition on several types of character areas in the document from the directions of 0°, 90°, 180°, and 270°. That is, the original image data is rotated by 90°, 180°, and 270° by image rotation processing, and OCR is performed in each orientation, including the unrotated 0°. The normal direction of the document is then determined to be the direction in which the resulting character recognition reliability is highest. The reliability may be read as the reliability of the character recognition result, or as the similarity between the character recognition result and the feature amount of each character contained in the character recognition dictionary. The details of the character recognition dictionary will be described later.

In this embodiment, the processing of step S601 described later narrows the candidate document orientations to either 0° and 180° or 90° and 270°, so the following description and drawings cover only the two narrowed-down directions.

一般的にＯＣＲの前処理として、原稿画像データから文字領域の矩形情報を抽出する。ここで、文字領域とは、文章部、タイトル部、表中の文字部などである。例えば、図６（ａ）、（ｃ）に示す原稿の場合は、それぞれ図６（ｂ）、（ｄ）に示すような文字領域の矩形情報が抽出される。抽出された文字領域に対して、さらに文字領域内文字ブロック（各文字単位のブロック）に分割して、すべての文字領域内の各文字ブロックに対してＯＣＲによる文字認識を行う。文字領域内文字ブロックとは、図６（ｅ）に示す文字領域に対して図６（ｆ）に示すような１文字単位での矩形情報を指す。 Generally, as a preprocessing of OCR, rectangular information of a character area is extracted from original image data. Here, the character area is a sentence part, a title part, a character part in a table, and the like. For example, in the case of the manuscripts shown in FIGS. 6 (a) and 6 (c), the rectangular information of the character area as shown in FIGS. 6 (b) and 6 (d) is extracted. The extracted character area is further divided into character blocks within the character area (blocks for each character unit), and character recognition by OCR is performed for each character block in all the character areas. The character block in the character area refers to rectangular information in units of characters as shown in FIG. 6 (f) with respect to the character area shown in FIG. 6 (e).

FIGS. 7 (a) and 7 (b) show an example in which a string of character blocks consisting of 「合」 and 「計」 is extracted from a character area. FIG. 7 (a) shows the case where this character string is in the normal direction, and FIG. 7 (b) shows the same character string rotated by 180°. Consider the first character, 「合」, of this character block string. When the character direction is determined from 「合」, character recognition is performed on the single character image 「合」 from the two directions 0° and 180°, as shown in FIG. 8 (a). One way to perform character recognition in two directions is, for example, to rotate the cut-out original image data (or each character image data) to 0° and 180°, extract the feature vector of each character image data, and compare it with the character recognition dictionary. Alternatively, a feature vector may be extracted from each character image data and the extracted feature vector rotated to 0° and 180° before comparison with the character recognition dictionary.

各回転角度における文字認識結果は、図８（ｂ）に示すように、互いに異なっている。さらに、図８（ｃ）には図８（ｂ）に示された各回転角度における文字認識処理結果の信頼度が示されている。なお、図８（ｂ）、（ｃ）の文字認識処理結果および信頼度は一例であり、ＯＣＲのアルゴリズムやスキャン環境（ノイズ等）に依存するので、現実にこのとおりになるとは限らない。 The character recognition results at each rotation angle are different from each other as shown in FIG. 8 (b). Further, FIG. 8C shows the reliability of the character recognition processing result at each rotation angle shown in FIG. 8B. The character recognition processing results and reliability in FIGS. 8 (b) and 8 (c) are examples, and depend on the OCR algorithm and the scanning environment (noise, etc.), so that the actual results are not always the same.

As shown in FIG. 8 (b), when character recognition is performed from the normal direction (0°), the character is correctly recognized as 「合」 and the reliability is a high 0.90. When character recognition is performed from the direction rotated by 180°, the character is misrecognized as 「号」 and the reliability drops to 0.30. The misrecognition and the drop in reliability occur because character recognition is performed on the feature vector as seen from the rotated direction. The more complex the character, the more pronounced the difference in reliability between directions becomes.
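The text does not specify how the reliability is computed from the feature vectors. As one possible illustration, the sketch below treats the reliability as the cosine similarity between a character block's feature vector and the closest reference vector in the dictionary. The similarity measure, the function names, and the three-dimensional toy vectors are all assumptions and are not taken from FIG. 8 or FIG. 9.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recognize(feature, ocr_dict):
    """Return (character, reliability) for the best dictionary match."""
    best = max(ocr_dict, key=lambda ch: cosine(feature, ocr_dict[ch]))
    return best, cosine(feature, ocr_dict[best])


# Hypothetical reference vectors; real dictionaries are n-dimensional.
toy_dict = {"合": [1.0, 0.0, 0.0], "号": [0.0, 1.0, 0.0]}

upright = [0.9, 0.1, 0.0]   # feature extracted at 0 degrees (invented)
rotated = [0.2, 0.3, 0.9]   # feature extracted at 180 degrees (invented)
print(recognize(upright, toy_dict))
print(recognize(rotated, toy_dict))
```

In this toy setup the upright feature matches 「合」 with a high score, while the rotated feature matches 「号」 with a much lower one, which is the qualitative behaviour the paragraph above describes.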

<Structure of the OCR dictionary (character recognition dictionary)>
FIG. 9 is an example of the OCR dictionary used in this embodiment. In OCR using feature vectors, which is an existing technology, the OCR dictionary holds a reference feature vector (feature amount) for each single character to be recognized, and it is stored in the ROM 117 or the like. This OCR dictionary is used in step S604 described later. There is no limit on the number of characters stored in the OCR dictionary. However, characters stored in the document-type character string dictionary described later, such as the characters that frequently appear on receipts and the digits (0 to 9) that always appear on receipts as shown in FIG. 9, must always be stored. Characters that frequently appear on receipts include 「合」, 「計」, 「金」, 「額」, 「点」, 「数」, 「品」, and 「目」. Note that the OCR dictionary example in FIG. 9 shows only some of these characters. The frequently appearing characters described later are likewise only a partial list and should of course be extended and adjusted based on actual use.

辞書内では、これらの文字がｎ次元の特徴ベクトルに変換されて保存される。この特徴ベクトルは、ＯＣＲ処理の際に取り出され、原稿画像データから抽出された各文字ブロック内の文字画像データと比較される。なお、図６に示す特徴ベクトルの値は一例であり、実装する特徴ベクトルが現実にこの通りである必要は無い。 In the dictionary, these characters are converted into n-dimensional feature vectors and stored. This feature vector is taken out during the OCR process and compared with the character image data in each character block extracted from the original image data. The value of the feature vector shown in FIG. 6 is an example, and it is not necessary that the feature vector to be implemented is actually the same.

<Calculation of cumulative reliability considering OCR reliability and weighting>
FIGS. 11 (a) to 11 (d) show one receipt document rotated to 0°, 90°, 180°, and 270° as described above, and FIGS. 11 (a) and 11 (c) also show an example of the cumulative reliability obtained when OCR is performed in the respective orientations. In this embodiment, the processing of step S601 described later narrows the candidate document orientations to 0° and 180°, so the cumulative reliability is shown, as graph 1100 and graph 1105, only for these two directions.

Here, using the OCR dictionary described above, the reliability obtained for each character detected by OCR is summed over the entire document for each document orientation, and this cumulative value, the "cumulative reliability", is shown as a black bar. That is, the cumulative reliability of the character 「合」 in the document orientation of FIG. 11 (a) is the black bar 1101, and the cumulative reliability of the character 「合」 in the document orientation of FIG. 11 (c) is the black bar 1103. In general, when OCR recognizes one character, it can extract a plurality of candidates and calculate a reliability for each candidate. The cumulative reliability can be regarded as a histogram in which the reliability values of each candidate are totaled over the document.

ここまでは一般的な累積信頼度の求め方の一例であるが、ここから本発明における特徴である、特定文字の信頼度に対する重み付けに関して説明する。本説明は、後述するステップＳ６０５、ステップＳ６０６、ステップＳ６０７に相当する処理に該当する。 Up to this point, it is an example of a general method for obtaining the cumulative reliability, but from here, the weighting for the reliability of a specific character, which is a feature of the present invention, will be described. This description corresponds to the processing corresponding to step S605, step S606, and step S607 described later.

First, the document type dictionary, or character string dictionary, of FIG. 10, which is created separately for each type of document corresponding to the original image data, will be described. A single character is treated here as a character string as well, and the dictionary is hereinafter called the document type character string dictionary. The document type character string dictionary in this embodiment lists character strings that frequently appear in the type of document read when the original image data was generated (a receipt in this embodiment), together with a predetermined weighting magnification applied to the direction detection reliability for each character string. Character strings that frequently appear on receipts include, for example, 「合計」 (total), 「金額」 (amount), 「点数」 (number of items), and 「品目」 (item). When one of these is detected by OCR processing, the reliability of each character contained in the character string (for 「合計」, the characters 「合」 and 「計」) is multiplied by a weighting magnification of 2.0. The weighting magnification can be changed for each character string; for example, for a three-character string such as 「領収書」 (receipt), the weighting magnification of each character can be set to 3.0. Conversely, when 「合」, a character that appears frequently even when it is not part of a registered character string, is detected, its reliability can be multiplied by 1.5. Note that the dictionary example in FIG. 10 shows only some of the character strings that frequently appear on receipts. The frequent character examples given here are likewise only partial and should of course be extended and adjusted based on actual use.
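The dictionary lookup described above can be sketched as follows. The magnifications 2.0, 3.0, and 1.5 come from the text. The dictionary layout, the function name `weights_for_line`, and the rule that the largest applicable magnification wins when entries overlap are assumptions made for illustration.

```python
# Hypothetical receipt dictionary using the magnifications from the text.
RECEIPT_WEIGHTS = {
    "合計": 2.0, "金額": 2.0, "点数": 2.0, "品目": 2.0,
    "領収書": 3.0,
    "合": 1.5,  # single character that is frequent on its own
}


def weights_for_line(chars, string_weights):
    """Return one weighting magnification per recognized character.

    `chars` is the list of characters recognized in one character
    area, in reading order. Characters not covered by any dictionary
    entry keep a magnification of 1.0.
    """
    text = "".join(chars)
    weights = [1.0] * len(chars)
    for entry, magnification in string_weights.items():
        start = text.find(entry)
        while start != -1:
            for i in range(start, start + len(entry)):
                # Assumption: keep the largest magnification on overlap.
                weights[i] = max(weights[i], magnification)
            start = text.find(entry, start + 1)
    return weights


print(weights_for_line(list("合計金額"), RECEIPT_WEIGHTS))
print(weights_for_line(list("合格"), RECEIPT_WEIGHTS))
```

A separate dictionary of the same shape could be prepared for each document type and selected according to the scan settings.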

画像データ種別の文字列辞書は、「レシート」、「名刺」といった原稿画像データ種別毎に複数作成することができ、ＲＯＭ１１７等に格納できる。これらはスキャンする原稿に応じて切り替えて使用することができる。 A plurality of character string dictionaries for image data types can be created for each manuscript image data type such as "receipt" and "business card", and can be stored in ROM 117 or the like. These can be switched and used according to the document to be scanned.

Next, for each document orientation, the weighted reliabilities of all characters in the document are summed to calculate the cumulative reliability. The white bar 1102 in the cumulative reliability graph 1100 for the document orientation of FIG. 11 (a) and the white bar 1104 in the cumulative reliability graph 1105 for the document orientation of FIG. 11 (c) are the increments in reliability due to weighting. The value of black bar 1101 + white bar 1102, or of black bar 1103 + white bar 1104, is therefore the cumulative reliability of 「合」 in the respective document orientation in this embodiment. FIG. 11 (e) shows the white bars 1102 and 1104 of FIGS. 11 (a) and 11 (c) as numerical values. The cumulative reliability values in the figure are examples, and the feature vectors in an actual implementation need not yield exactly these values.
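The weighted accumulation can be written compactly as follows. The symbols are introduced here for illustration and do not appear in the original: $r_{i,\theta}$ is the reliability of the character recognized in character block $i$ at document orientation $\theta$, $c_{i,\theta}$ is that recognized character, $w_{i,\theta}$ is the weighting magnification applied to it (1 when no dictionary entry applies), and $\Theta$ is the set of candidate orientations.

```latex
R_\theta(c) = \sum_{i \,:\, c_{i,\theta} = c} w_{i,\theta}\, r_{i,\theta},
\qquad
S_\theta = \sum_{c} R_\theta(c),
\qquad
\hat{\theta} = \arg\max_{\theta \in \Theta} S_\theta
```

In this notation the black bar for a character corresponds to $\sum r_{i,\theta}$ and the white bar to the increment $\sum (w_{i,\theta} - 1)\, r_{i,\theta}$. For instance, a single 「合」 recognized with reliability 0.90 inside 「合計」 (magnification 2.0) would contribute 0.90 to the black bar and 0.90 to the white bar.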

<Flowchart of the "direction detection process S415">
FIG. 12 is a flowchart showing the processing of the image processing device 100 that performs direction detection by combining the configuration and calculation method described above. It describes the contents of S415 of FIG. 3 in detail. The following processing is realized by the CPU 111 executing a control program stored in the ROM 117, the HDD 119, or the like.

In step S601, the image processing device 100 uses the CPU 111 to narrow down the document orientation. Specifically, it first reads the original image data from the HDD 119 and reads the document's character string orientation setting, "horizontal writing", stored in the RAM 118. Next, the original image data is rotated to 0°, 90°, 180°, and 270° (FIGS. 11 (a) to 11 (d)), and histograms of the number of black pixels are taken along the vertical and horizontal directions of the original image data in each orientation to determine the direction in which the character strings run. A specific example of this method is the technique of JP 2012-83500. Of the four directions, the two in which the character strings run horizontally, which correspond to horizontal writing, are then selected. In this embodiment these are the orientations of FIGS. 11 (a) and 11 (c), and these two directions are extracted as candidates for the normal direction of the document.
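The narrowing in step S601 can be illustrated with a simple projection-profile heuristic. This is a sketch under stated assumptions and is not the technique of JP 2012-83500, whose details are not given here. It assumes that text lines separated by blank gaps make the black-pixel profile fluctuate more strongly across the lines than along them. The function names are hypothetical.

```python
def profile_variance(profile):
    """Population variance of a black-pixel projection profile."""
    mean = sum(profile) / len(profile)
    return sum((v - mean) ** 2 for v in profile) / len(profile)


def text_line_direction(binary):
    """Guess whether text lines run horizontally or vertically.

    `binary` is a non-empty row-major 2D list with 1 for black pixels.
    """
    row_profile = [sum(row) for row in binary]
    col_profile = [sum(col) for col in zip(*binary)]
    if profile_variance(row_profile) >= profile_variance(col_profile):
        return "horizontal"
    return "vertical"


def candidate_orientations(binary, setting="horizontal"):
    """Return the two rotation angles that match the writing setting."""
    if text_line_direction(binary) == setting:
        return (0, 180)
    return (90, 270)


# Two horizontal "text lines" separated by blank rows.
page = [[1, 1, 1, 1],
        [0, 0, 0, 0],
        [1, 1, 1, 1],
        [0, 0, 0, 0]]
print(candidate_orientations(page))
```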

ステップＳ６０２において、画像処理装置１００は、ＣＰＵ１１１により辞書の内容をＲＯＭ１１７等から読み出してＲＡＭ１１８に格納する。ここで展開する辞書は、図９に示したＯＣＲ辞書と、図１０に例示した画像データ種別の文字列辞書である。 In step S602, the image processing apparatus 100 reads the contents of the dictionary from the ROM 117 or the like by the CPU 111 and stores them in the RAM 118. The dictionaries developed here are the OCR dictionary shown in FIG. 9 and the character string dictionary of the image data type illustrated in FIG.

ステップＳ６０３において、画像処理装置１００は、原稿画像データから文字領域を１つ選択し、その文字領域内の各文字ブロックの特徴ベクトルを抽出する。 In step S603, the image processing apparatus 100 selects one character area from the original image data and extracts the feature vector of each character block in the character area.

ステップＳ６０４において、画像処理装置１００は、Ｓ６０２のＯＣＲ辞書内の特徴ベクトルと、Ｓ６０３で抽出した同一文字領域内の各文字ブロックの特徴ベクトルとを比較し、認識された各文字のＯＣＲ処理の信頼度を算出し、信頼度取得を行う。 In step S604, the image processing apparatus 100 compares the feature vector in the OCR dictionary of S602 with the feature vector of each character block in the same character area extracted in S603, and trusts the OCR processing of each recognized character. Calculate the degree and acquire the reliability.

In step S605, the image processing device 100 selects one of the character blocks and determines whether the character recognized in that character block, together with the characters recognized in the adjacent preceding and following character blocks in the same character area, forms a character string registered in the character string dictionary. If the character recognized in the selected character block forms a character string that matches one registered in the character string dictionary, the process proceeds to step S606. If there is no matching character string, the process proceeds to step S607.

ステップＳ６０６において、画像処理装置１００は、Ｓ６０５で選択した文字ブロックで認識された文字の信頼度に対して、文字列辞書に設定された重み付け倍率に従って重み付けを行う。 In step S606, the image processing device 100 weights the reliability of the characters recognized by the character block selected in S605 according to the weighting ratio set in the character string dictionary.

ステップＳ６０７において、画像処理装置１００は、Ｓ６０５で選択した文字ブロックで算出した信頼度を、その文字ブロックで認識された文字の累積信頼度に加算する。なお、ステップＳ６０３からステップＳ６０７の処理の詳細については、前述したとおりである。 In step S607, the image processing apparatus 100 adds the reliability calculated in the character block selected in S605 to the cumulative reliability of the characters recognized in the character block. The details of the processes from steps S603 to S607 are as described above.

ステップＳ６０８において、画像処理装置１００は、Ｓ６０３で選択した現在の文字領域内の全文字ブロックに対して各文字ブロックで算出した信頼度を文字別の累積信頼度に加算したか否かを判定する。累積信頼度への加算が終了していない文字ブロックが同一文字領域内に存在すれば、ステップＳ６０５に戻る。同一文字領域内の全文字ブロックで累積信頼度への加算が終了していれば、ステップＳ６０９へ進む。 In step S608, the image processing apparatus 100 determines whether or not the reliability calculated for each character block is added to the cumulative reliability for each character for all the character blocks in the current character area selected in S603. .. If a character block for which addition to the cumulative reliability has not been completed exists in the same character area, the process returns to step S605. If the addition to the cumulative reliability is completed in all the character blocks in the same character area, the process proceeds to step S609.

ステップＳ６０９にて、画像処理装置１００は、原稿画像データ内の全文字領域に対して特徴ベクトルの抽出が終了したか否かを判定する。特徴ベクトルの抽出が終了していない文字領域が存在すればステップＳ６１１へ進んで、次の文字領域を選択してステップＳ６０３に戻る。 In step S609, the image processing device 100 determines whether or not the extraction of the feature vector has been completed for the entire character area in the original image data. If there is a character area for which the extraction of the feature vector has not been completed, the process proceeds to step S611, the next character area is selected, and the process returns to step S603.

ステップＳ６１０において、画像処理装置１００は、図１１（ａ）に示す原稿向き０°と、図１１（ｃ）に示す原稿向き１８０°とに対してこれまでの処理を行ったか否かを判定する。終了していない原稿向きがあればステップＳ６１２に進み、次に処理するべき角度へ原稿画像データを回転させ、ステップＳ６０３に戻る。すべての原稿向きに関して処理を終了していれば、ステップＳ６１３へ進む。 In step S610, the image processing apparatus 100 determines whether or not the previous processing has been performed on the document orientation 0 ° shown in FIG. 11A and the document orientation 180 ° shown in FIG. 11C. .. If there is an unfinished document orientation, the process proceeds to step S612, the document image data is rotated to an angle to be processed next, and the process returns to step S603. If the processing has been completed for all the document orientations, the process proceeds to step S613.

In step S613, the image processing device 100 determines the normal direction of the document by the method described below. This processing is explained with reference to FIG. 11. FIG. 11 is a calculation example in which, according to this embodiment, the direction of the document is detected from the OCR reliability results shown in FIG. 10. In FIG. 11 (e), the cumulative reliability of each character is totaled for each of the document orientations of FIGS. 11 (a) and 11 (c). The total for document orientation 0° (FIG. 11 (a)) is 95.68, which is higher than the total of 52.84 for document orientation 180° (FIG. 11 (c)). The image processing device 100 therefore determines that the normal direction of the original image data is the 0° orientation shown in FIG. 11 (a).
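The accumulation of steps S606 and S607 and the final decision in S613 can be sketched as follows. The input format, a list of (character, reliability, magnification) triples per candidate orientation, and the function name are assumptions made for illustration. The sample values are invented and are not those of FIG. 11.

```python
from collections import defaultdict


def decide_normal_direction(results_by_angle):
    """Pick the orientation with the highest total weighted reliability.

    `results_by_angle` maps an angle to a list of
    (character, reliability, magnification) triples, one per
    recognized character block.
    """
    totals = {}
    for angle, results in results_by_angle.items():
        per_char = defaultdict(float)
        for ch, reliability, magnification in results:
            # S606: apply the magnification; S607: add to the
            # cumulative reliability of that character.
            per_char[ch] += reliability * magnification
        totals[angle] = sum(per_char.values())
    # S613: the orientation with the largest total is the normal direction.
    best = max(totals, key=totals.get)
    return best, totals


sample = {
    0: [("合", 0.9, 2.0), ("計", 0.8, 2.0), ("円", 0.7, 1.0)],
    180: [("号", 0.3, 1.0), ("下", 0.2, 1.0), ("日", 0.4, 1.0)],
}
best, totals = decide_normal_direction(sample)
print(best, totals)
```

Because the dictionary strings are rarely recognized in the wrong orientation, the magnification mostly raises the total of the correct orientation, which widens the gap between the candidates even when the document contains few characters.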

以上により、本発明では、文字数が比較的少ない帳票であるレシートであっても、特定の文字列を構成する文字の信頼度に重み付けすることにより、原稿向き毎の累積信頼度の差を大きくすることが可能である。 As described above, in the present invention, even if the receipt is a form having a relatively small number of characters, the difference in cumulative reliability for each orientation of the document is increased by weighting the reliability of the characters constituting a specific character string. It is possible.

<When "mixed" is specified in the document character string orientation setting instruction S408>
The above embodiment described the case where the document is limited to a horizontally written receipt and "horizontal writing" is specified as the character string orientation setting for multi-crop in step S408 of FIG. 3. This section describes the case where "mixed" is specified as the character string orientation setting.

A typical case in which the user specifies "mixed" is scanning multiple business cards, since both vertically written and horizontally written business cards are common. In this case, one approach is to add a process for detecting the character string orientation before the process of narrowing down the document orientation in step S601 of FIG. 12. One method for detecting the character string orientation is the method disclosed in Japanese Patent Application Laid-Open No. 2012-83500.

Alternatively, if no character string orientation detection process is added, the document orientation narrowing process in step S601 may be skipped, and the cumulative reliability may be calculated for all four document orientations: 0°, 90°, 180°, and 270°.
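The variant that skips the narrowing of step S601 can be sketched as a loop over all four orientations. This is a hedged illustration only: the toy raster type, the clockwise rotation convention, and the `score` callback (standing in for OCR followed by reliability accumulation) are assumptions introduced here and are not defined in the description.

```python
from typing import Callable, Dict, List, Sequence

Image = List[List[int]]  # toy raster: a list of pixel rows


def rotate90_clockwise(image: Image) -> Image:
    """Rotate a row-major raster by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def cumulative_reliability_by_orientation(
    image: Image,
    score: Callable[[Image], float],
    angles: Sequence[int] = (0, 90, 180, 270),
) -> Dict[int, float]:
    """Score the image at each candidate orientation.

    `score` is a hypothetical stand-in for character recognition plus
    accumulation of the (weighted) per-character reliabilities.
    """
    totals: Dict[int, float] = {}
    for angle in angles:
        if angle % 90 != 0 or not 0 <= angle < 360:
            raise ValueError(f"unsupported orientation: {angle}")
        rotated = image
        # Apply one quarter turn per 90 degrees of the requested angle.
        for _ in range(angle // 90):
            rotated = rotate90_clockwise(rotated)
        totals[angle] = score(rotated)
    return totals


# Toy usage: the "score" is just the top-left pixel value.
toy_image = [[1, 2], [3, 4]]
toy_totals = cumulative_reliability_by_orientation(
    toy_image, lambda img: float(img[0][0])
)
```

The orientation with the largest entry in the returned dictionary would then be taken as the normal direction, in the same way as in the two-orientation case.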

When "vertical writing" is specified as the document character string orientation setting, the document orientations in which the character strings run vertically are adopted when narrowing down the document orientation in step S601.

This embodiment has been described on the assumption that the character string orientation and the document type are always set. If they are not set, the same processing as when "mixed" is specified may be performed.

(Other Embodiments)
The present invention can also be realized by a process in which a program that implements one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

Image processing apparatus 100
Control unit 110
Scanner unit 120

Claims

An image processing apparatus having an image acquisition means for scanning a document and generating document image data, the apparatus comprising:
a first dictionary that holds a feature amount for each single character;
a second dictionary that holds a weighting factor for each predetermined character;
an extraction means for extracting a feature amount of character image data included in the document image data;
a reliability acquisition means for comparing, for each of a plurality of predetermined orientations of the document image data, the feature amount extracted by the extraction means with the feature amounts of the first dictionary, and acquiring a reliability for the character associated with the compared feature amount of the first dictionary;
a weighting means for weighting, when the character for which the reliability has been acquired exists in the second dictionary, the reliability of the matching character by the weighting factor; and
a normal direction determining means for determining, from among the plurality of predetermined orientations, a normal direction of the document image data based on the weighted reliability.

The image processing apparatus according to claim 1, wherein the normal direction determining means acquires, for each of the predetermined orientations, a cumulative value of all the reliabilities, and sets the predetermined orientation having the largest cumulative value as the normal direction.

The image processing apparatus according to claim 1 or 2, wherein the second dictionary further holds a weighting factor for each predetermined character string composed of a plurality of characters, and
when the characters for which the reliability has been acquired constitute a predetermined character string of the second dictionary, the weighting means weights the reliability of each character constituting that predetermined character string by the weighting factor of the predetermined character string.

The image processing apparatus according to claim 3, wherein the predetermined character strings held by the second dictionary are character strings that appear frequently for each type of document.

The image processing apparatus according to any one of claims 1 to 4, wherein the second dictionary includes a plurality of document type dictionaries created separately for each type of document, and
the weighting means performs the weighting based on the document type dictionary corresponding to the type of the document to which the document image data corresponds.

The image processing apparatus according to any one of claims 1 to 5, further comprising a rotation processing means for rotating the document image data, when the orientation of the document image data does not match the normal direction determined by the normal direction determining means, so that the orientation of the document image data becomes the normal direction.

The image processing apparatus according to any one of claims 1 to 6, wherein the image acquisition means generates scanned image data including image areas corresponding to a plurality of documents placed on a platen,
the apparatus further comprising an image processing means for cutting out the image areas from the scanned image data and generating a plurality of pieces of document image data, each corresponding to one of the plurality of documents.

The image processing apparatus according to any one of claims 1 to 7, further comprising a character string orientation acquisition means for acquiring the character string orientation of the document image data.

The image processing apparatus according to claim 8, wherein the character string orientation acquisition means acquires, based on the type of the document corresponding to the document image data, the character string orientation associated with that type of document.

The image processing apparatus according to any one of claims 1 to 9, further comprising a character string orientation detecting means for detecting character string orientation candidates in the document image data,
wherein the reliability acquisition means and the normal direction determining means use the detected character string orientation candidates as the plurality of predetermined orientations.

The image processing apparatus according to any one of claims 1 to 9, wherein the reliability acquisition means and the normal direction determining means use, as the plurality of predetermined orientations, the orientations in which each side of the document image data is positioned as the horizontal upper side.

An image processing method comprising:
a step of extracting a feature amount of character image data included in document image data generated by scanning a document with an image acquisition means;
a step of comparing, for each of a plurality of predetermined orientations of the document image data, the extracted feature amount with the feature amounts of a first dictionary that holds a feature amount for each single character, and acquiring a reliability for the character associated with the compared feature amount of the first dictionary;
a step of weighting, when the character for which the reliability has been acquired exists in a second dictionary that holds a weighting factor for each predetermined character, the reliability of the matching character by the weighting factor; and
a step of determining, from among the plurality of predetermined orientations, a normal direction of the document image data based on the weighted reliability.

A program for causing a computer to execute the image processing method according to claim 12.