JP2021093225A

JP2021093225A - Information processing device, program, and information processing method

Info

Publication number: JP2021093225A
Application number: JP2021045907A
Authority: JP
Inventors: 満夫木村; Mitsuo Kimura
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2021-03-19
Filing date: 2021-03-19
Publication date: 2021-06-17

Abstract

To make character recognition processing applicable to an image in which accurate detection of character outlines is difficult. SOLUTION: A search area is set for a recognition target image, a plurality of cutout areas are set within the search area, and the image corresponding to each cutout area is extracted. Each extracted image is compared with dictionary data to detect candidate character information and the position information of the cutout area corresponding to each candidate character. The candidate character information with the highest evaluation value among the detected candidates is output as the recognition result. Further, the search area for the next character is set on the basis of the position information of the cutout area corresponding to the recognition result. SELECTED DRAWING: Figure 12

Description

The present invention relates to character recognition processing.

Conventionally, character recognition processing for a document image obtained by scanning a paper document detects the outlines (contours) of characters in the document image, cuts out a character image for each character, and then identifies which character each cut-out image represents. Because a character cannot be recognized correctly if its cutout position is wrong, techniques for correcting the cutout position according to a user's instruction are also available. For example, when a single character image has been cut out as multiple characters (for instance, when one kanji has been split into its left-hand radical and right-hand component), there is a technique for correcting them into one character. Patent Document 1 discloses a technique in which, when a user corrects a character recognition result, the uncorrected portions are searched for places where the same misrecognition occurred and the same correction is applied to them.

In recent years, with the spread of smartphones and digital cameras, it has become easy to capture image information that includes text. This is opening up a large market for extracting character information by character recognition in a much wider variety of observation environments. For example, at quarrying sites such as mines, there is a use case in which the serial numbers stamped on tires are used to manage the tires fitted to dump trucks. One conceivable approach is to photograph the serial number stamped on a tire with a smartphone or digital camera, perform character recognition on the captured image, and manage the tire using the recognized serial number. However, in captured images such as those of a serial number stamped on a tire, the contrast between the characters and the background may be low, or the surface may be dirty and produce a lot of noise. In such cases, accurately detecting the character outlines, as the conventional technique requires, is itself difficult.

Patent Document 1: Japanese Unexamined Patent Publication No. H11-143983

If the conventional technique of cutting out characters based on their outlines is applied to an image in which the outlines cannot be detected accurately, character cutout positions are wrong more often, and the burden on the user of correcting the recognition results increases.

An object of the present invention is to make character recognition processing applicable to an image in which accurate detection of character outlines is difficult.

To solve the above problem, the information processing device of the present invention includes: first setting means for setting a search area for a recognition target image; second setting means for setting cutout areas at a plurality of locations in the search area; and character detecting means for extracting the image corresponding to each of the plurality of cutout areas set by the second setting means, comparing each extracted image with dictionary data to detect candidate character information and the position information of the cutout area corresponding to each candidate character, and outputting, as a recognition result, the candidate character information with the highest evaluation value among the detected candidate character information. The first setting means further sets a search area for the next character based on the position information of the cutout area corresponding to the recognition result output by the character detecting means, so that the processing by the second setting means and the character detecting means is executed repeatedly.

For an image in which accurate detection of character outlines is difficult, a plurality of areas are cut out while shifting their position within a search area in which a character is presumed to exist, and character recognition processing is applied based on the cut-out areas. This improves the accuracy of character recognition processing.

FIG. 1: An example of the appearance of a mobile terminal
FIG. 2: An example of the hardware configuration
FIG. 3: An example of the software configuration of the mobile terminal 100
FIG. 4: An example of character image information (dictionary data)
FIG. 5: Conceptual diagram of character recognition processing
FIG. 6: Example of a display screen showing the recognition result
FIG. 7: Example of a display screen when correction of the recognition result is instructed
FIG. 8: Example of a display screen after the recognition result has been corrected
FIG. 9: An example showing how the cutout area is reset after the correction process
FIG. 10: An example of the data structure of character image information (dictionary data)
FIG. 11: An example of the data structure of the character recognition result
FIG. 12: Flowchart showing details of character recognition processing
FIG. 13: Flowchart of the processing executed after the character recognition result is corrected
FIG. 14: Flowchart showing details of character detection processing
FIG. 15: Flowchart showing details of character recognition processing

(Example 1)
A mobile terminal (portable terminal) is described as an example of the information processing device according to the present embodiment. A mobile terminal is a terminal that can communicate with the outside using a wireless communication function or the like.

FIG. 1 shows the appearance of the mobile terminal 100 (the front 101 and back 103 of the mobile terminal) and a tire serving as the subject 105. The front 101 of the mobile terminal is provided with a touch panel display 102, which serves two functions: display and touch operation input. The back 103 is provided with a camera unit 104 for photographing a subject and capturing an image. In the present embodiment, the user of the mobile terminal 100 can photograph the subject 105 and have character recognition processing executed by using a mobile application (described in detail later) that runs on the CPU of the mobile terminal. The subject 105 is, for example, a tire. A captured image 106 is obtained by photographing, with the camera unit of the mobile terminal, the part of the tire on which the serial ID (also called the serial number) is marked. The serial ID 107 is the serial number stamped on the tire and is an ID that uniquely identifies the tire.

Although a tire is used as the example of the subject 105 in the present embodiment, the subject is not limited to tires. The mobile application described later can capture an image of the subject 105 and output that image to the touch panel 102.

FIG. 2 shows an example of the hardware configuration of the mobile terminal 100. The CPU (Central Processing Unit) 201 is a processing unit that realizes various functions by executing various programs. The RAM (Random Access Memory) 202 is a unit that stores various kinds of information and also serves as a temporary working storage area for the CPU 201. The non-volatile memory (for example, ROM) 203 is a unit that stores various programs, data, and the like. The CPU 201 loads a program stored in the non-volatile memory 203 into the RAM 202 and executes it. In other words, by executing the program, the CPU (computer) of the mobile terminal functions as each of the processing units described with reference to FIG. 3 and executes each step of the sequences described later. The non-volatile memory 203 may be a flash memory, an HDD (Hard Disk Drive), an SSD (Solid State Drive), or the like. All or part of the functions of the mobile terminal 100 and of the processing in the sequences described later may instead be realized with dedicated hardware. The input/output interface 204 exchanges data with the touch panel 102. The NIC (Network Interface Card) 205 is a unit for connecting the mobile terminal 100 to a network (not shown). The camera interface 206 is connected to the camera unit 104 and brings images of the subject 105 into the mobile terminal 100. These units can exchange data with one another via the bus 207.

Next, the software configuration of the mobile terminal 100 is described. FIG. 3 is a conceptual diagram showing an example of the software configuration of the mobile terminal 100. By executing the mobile application (an application program for mobile terminals) 302, the CPU of the mobile terminal functions as the processing units (processing modules) 303 to 308. In addition, the OS (Operating System) (not shown) of the mobile terminal 100 functions as the image management unit 301.

The image management unit 301 (also referred to in this description as the data management unit 301) manages images and application data. The OS provides a control API (Application Programming Interface) for using the data management unit 301. Each application uses this control API to acquire images and application data from, and save them to, the data management unit 301.

The mobile application 302 is an application that becomes executable once it has been downloaded and installed using the installation function of the OS of the mobile terminal 100. The mobile application 302 performs various kinds of data processing on the image of the subject 105 captured via the camera interface 206.

The main control unit 303 issues instructions to and manages each of the module units (303 to 308) described below.

In accordance with instructions from the main control unit 303, the information display unit 304 performs control so that the user interface (UI) of the mobile application 302, as shown in FIGS. 6 to 8, is displayed on the touch panel.

FIGS. 6 to 8 show examples of a screen (mobile terminal screen 600) of the UI of the mobile application 302 (a UI for a mobile terminal). The mobile terminal screen 600 is displayed on the touch panel 102 of the mobile terminal 100. It displays the image captured with the camera 104 in an area 601 and accepts user operations on the image, the UI, and so on. The shutter button 602 saves the image input from the camera unit to the RAM 202 or the data management unit 301; the saved image is hereinafter called a captured image. The zoom button 603 enlarges or reduces the displayed image. Reference numerals 604 to 607 denote guides that indicate where the recognition target should be positioned when it is photographed. The user adjusts the shooting position so that the serial ID 107 to be recognized fits within the rectangular area enclosed by the four guides, and then photographs the tire. Reference numeral 608 denotes a display area that shows the character recognition result for the serial ID 107. If the recognition result is wrong, the user touches the character to be corrected in the recognition result display area 608 and corrects the result.

When the user touches a character to be corrected in the recognition result display area 608 on the screen of FIG. 6, correction candidates for the touched character are displayed in candidate character areas 701 to 703, as on the screen of FIG. 7. When one of the candidate character areas 701 to 703 is touched on the screen of FIG. 7, the character in the recognition result display area 608 is updated to the selected candidate (the screen of FIG. 8 shows an example after the candidate character area 702 has been touched and the correction made).

The form of the UI of the mobile application 302 (position, size, range, arrangement, displayed content, and so on) is not limited to that shown in the figures; any configuration that can realize the functions of the mobile terminal 100 may be adopted.

Returning to FIG. 3, the remaining modules are described. The operation information acquisition unit 305 acquires information about user operations performed on the UI of the mobile application and notifies the main control unit 303 of the acquired information. For example, when the user touches the area 601 by hand, the operation information acquisition unit 305 detects the touched position on the screen and sends the detected position information to the main control unit 303.

The image processing unit 306 performs, on the captured image of the subject 105 obtained via the camera interface 206, the image processing needed for character recognition, such as grayscale conversion, edge extraction, and feature extraction.

The character recognition unit 307 cuts out, from the image processed by the image processing unit 306, a plurality of areas in which a character is presumed to be present, compares the image of each area with the character image information (dictionary data) used for comparison, and determines the most similar character.

The character image management unit 308 manages the character image information (information used as the dictionary data of a so-called character recognition dictionary) that the character recognition unit 307 uses for comparison when recognizing characters. FIG. 4 shows an example of the character image information that the character recognition unit 307 uses for comparison when recognizing characters in an image. Character image information is prepared for each kind of character used on the tires to be recognized. The character image information 401 to 410 are examples of images of digits; in this example, however, the tire serial ID 107 to be recognized is assumed to contain uppercase letters in addition to digits, and the character images for the uppercase letters are not shown.

The character image information (dictionary data) managed by the character image management unit may be feature information that represents the characteristics of each character and is created from the font of the characters stamped on the tire, or it may be the image of each character itself. Which kind of dictionary data to use depends on the algorithm used to match the recognition target image against the dictionary data.

FIG. 5 illustrates the character recognition processing in this example. The recognition target image 501 is an image obtained by cutting out part of the image of the subject 105 captured via the camera unit 104 and the camera interface 206. As described with reference to FIG. 6, the user adjusts the shooting position so that the serial ID 107 fits exactly within the guides presented in the UI of the mobile application 302 (604 to 607 in FIG. 6) and photographs the tire. The mobile application 302 cuts out the part of the captured image enclosed by the guides and uses it as the recognition target image 501.

The format of the tire serial ID 107 is fixed by each manufacturer. In this example, the serial ID is assumed to be nine characters long and to consist of digits and uppercase letters.

In an image of a serial ID or the like stamped on a tire, the contrast between the characters and the background is low and the surface of the tire (the subject) may be dirty, so it is difficult to detect the character outlines accurately. If the conventional technique of cutting out characters based on their outlines is applied, the cutout positions are therefore likely to be wrong, and the accuracy of character recognition suffers as a result. For this reason, in the present invention, an area in which a character is thought to exist is first set as a search area, and a plurality of cutout areas are set within that search area while varying their position and size, so that a plurality of area images are cut out. Each cut-out area image is then compared with the dictionary data (the character image information for comparison managed by the character image management unit) to obtain a character recognition result and its similarity for each area image. Among these results, the character recognition result with the highest similarity, together with the cutout area from which it was obtained, is taken as the recognition result for that search area. The search area for the next character is then set based on the position of the cutout area of that recognition result, and the same processing is repeated. In this example, the nine-character serial ID 107 contained in the recognition target image 501 is recognized in order, starting from the first character (the leftmost character).

The search area 502 for the first character is set at a position a predetermined distance from the left edge of the recognition target image 501, which was cut out based on the guides 604 to 607. The position of this first search area 502 is set in advance as the area in which the leftmost character is likely to be present when the photograph is taken so that the serial ID fits within the guides. A cutout area 505 is then set within the search area 502, the image of the cutout area 505 is extracted and compared with the dictionary data for the characters that can appear in the first position, and the similarity to each character in the dictionary data is evaluated. The cutout area 505 is set at a plurality of positions within the search area 502, shifted in the horizontal direction (x-axis direction) and the vertical direction (y-axis direction), and the image of the cutout area at each position is compared with the dictionary data to evaluate the similarity. In other words, cutout areas of a given size are set at a plurality of locations so as to cover the whole of the search area 502, and the image of the cutout area at each location is compared with the dictionary data.

After that, the width and height of the cutout area 505 are changed, cutout areas are again set at a plurality of locations so as to cover the whole of the search area 502, and the image data is extracted and compared with the dictionary data. For example, if the width of the cutout area 505 is varied over three patterns and the height over two patterns, the cutout area 505 takes 3 × 2 = 6 sizes in total. If the cutout area 505 is set at each position obtained by sliding it four times horizontally and four times vertically, it is set at (4 + 1) × (4 + 1) = 25 locations in the search area 502. With 6 sizes and 25 positions, images of cutout areas are cut out of the search area 502 a total of 6 × 25 = 150 times. Each time an image is cut out, it is compared with the dictionary data (the character image information for comparison) for the characters that can appear in the first position, and the similarity to each character is evaluated.
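The enumeration of cutout areas described above can be sketched as follows. This is a minimal illustration, not the implementation in the patent: the function name `cutout_areas`, the rectangle layout `(x, y, w, h)`, the even spacing of the slide positions, and all pixel values are assumptions chosen only to reproduce the 6 × 25 = 150 count.

```python
from itertools import product

def cutout_areas(search, sizes, steps_x=4, steps_y=4):
    """Yield (x, y, w, h) cutout rectangles covering a search area.

    search: (x, y, w, h) of the search area.
    sizes: iterable of (w, h) cutout sizes to try.
    steps_x, steps_y: number of slides per axis; each axis therefore
    has steps + 1 positions, including the starting position.
    """
    sx, sy, sw, sh = search
    for w, h in sizes:
        # Distance the cutout can travel while staying inside the search area.
        span_x, span_y = sw - w, sh - h
        for i, j in product(range(steps_x + 1), range(steps_y + 1)):
            x = sx + (span_x * i) // steps_x if steps_x else sx
            y = sy + (span_y * j) // steps_y if steps_y else sy
            yield (x, y, w, h)

# Hypothetical pixel values: 3 widths x 2 heights = 6 sizes.
widths, heights = (20, 24, 28), (40, 44)
sizes = [(w, h) for w in widths for h in heights]
areas = list(cutout_areas((10, 5, 40, 50), sizes))
print(len(areas))  # 6 sizes x 25 positions = 150
```

Each yielded rectangle would then be cropped from the image and compared with the dictionary data; that comparison is left out here.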

Of the results of evaluating the images of all the cutout areas, the character with the highest similarity is fixed as the recognition result for the first character, and the position of the cutout area that gave the highest similarity is taken as the position of the first character. Reference numeral 504 indicates the cutout position at which "B", the character with the highest similarity, was fixed as the recognition result for the first character.

Next, the search area 503 for the adjacent character (the second character from the left) is set. The search area 503 is set at a position relative to the position 504 of the recognition result for the first character. For the second character, as for the first, a plurality of cutout areas 506 are set within the search area 503, each is evaluated, and the character with the highest similarity is determined. For the third and subsequent characters, the setting of the search area, the setting of the cutout areas, and the similarity comparison with the dictionary data are likewise performed in turn to fix the recognized characters.

Because the photograph may be taken with the subject shifted to the left or right, it is desirable to make the search area 502 for the first character somewhat wide. On the other hand, the spacing between characters is determined in advance by the character string on the subject, so the search area 503 for the second and subsequent characters may be set narrower than the search area 502.
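One way to derive the next search area from the previous result can be sketched as below. The text only states that the area is set relative to the previous character's position, so the specific formula, the function name `next_search_area`, and the `char_gap` and `margin` parameters are assumptions for illustration.

```python
def next_search_area(prev_rect, char_gap, margin):
    """Return (x, y, w, h) of the search area for the next character.

    prev_rect: (x, y, w, h) of the cutout area chosen for the previous character.
    char_gap: assumed spacing between adjacent characters, in pixels.
    margin: slack added on every side to absorb position error.
    """
    x, y, w, h = prev_rect
    # Expected left edge of the next character, to the right of the previous one.
    nx = x + w + char_gap
    return (nx - margin, y - margin, w + 2 * margin, h + 2 * margin)

first_rect = (12, 8, 24, 44)  # hypothetical cutout area of the first character
print(next_search_area(first_rect, char_gap=6, margin=4))  # (38, 4, 32, 52)
```

A small `margin` corresponds to the narrower search area suggested for the second and subsequent characters; the first search area would use a larger, preset rectangle instead.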

FIG. 10 shows an example of the data structure of the character image information (dictionary data) managed by the character image management unit 308. The character image information list contains a plurality of pieces of character image information. Each piece of character image information (dictionary data of the character recognition dictionary) contains the character information (character code) of a character and feature information extracted from the image of that character. HOG (Histograms of Oriented Gradients) features, for example, may be used as the feature information of each character, but other features may be used instead.
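The dictionary structure described above could be represented as in the following sketch. The class name `CharacterImageInfo`, the field names, and the helper `build_dictionary` are hypothetical, and the feature extractor passed in here is a placeholder that stands in for real HOG extraction from a character image.

```python
import string
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple

@dataclass(frozen=True)
class CharacterImageInfo:
    char_code: str                 # character information (character code)
    features: Tuple[float, ...]    # feature information, e.g. a HOG vector

def build_dictionary(
    extract_features: Callable[[str], Sequence[float]],
) -> Dict[str, CharacterImageInfo]:
    """Build one dictionary entry per digit and uppercase letter."""
    charset = string.digits + string.ascii_uppercase
    return {
        ch: CharacterImageInfo(ch, tuple(extract_features(ch)))
        for ch in charset
    }

# Placeholder extractor: a real one would compute features from a glyph image.
dictionary = build_dictionary(lambda ch: [float(ord(ch)), 0.0])
print(len(dictionary))  # 36 entries: 10 digits + 26 uppercase letters
```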

FIG. 11 shows an example of the data structure of the result information of the character recognition processing executed by the character recognition unit 307. The recognition result information contains a plurality of pieces of recognition result character information. Each piece of recognition result character information corresponds to the recognition result for one character and contains a plurality of pieces of candidate character information. Because the serial ID 107 in this example has nine characters, the recognition result information contains nine pieces of recognition result character information, one for each position. Each piece of candidate character information contains rectangle information (the position and size of the cutout area corresponding to the candidate character), character information (character code), and an evaluation value. The evaluation value is the correlation coefficient (similarity) obtained by comparing the feature information of the character image information in FIG. 10 with the feature information extracted from the image cut out at the cutout area.
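The result structure and the evaluation value might be sketched as follows. The class and field names are hypothetical, and the Pearson correlation coefficient is used here as one plausible reading of "correlation coefficient"; the text does not specify the exact formula.

```python
from dataclasses import dataclass, field
from math import sqrt
from typing import List, Sequence, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height) of a cutout area

@dataclass
class CandidateCharacter:
    rect: Rect        # rectangle information
    char_code: str    # character information (character code)
    score: float      # evaluation value (similarity)

@dataclass
class RecognizedCharacter:
    candidates: List[CandidateCharacter] = field(default_factory=list)

    def best(self) -> CandidateCharacter:
        """Return the candidate with the highest evaluation value."""
        return max(self.candidates, key=lambda c: c.score)

def correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length feature vectors (an assumption)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant vector has no defined correlation
    return cov / sqrt(var_a * var_b)

first = RecognizedCharacter([
    CandidateCharacter((12, 8, 24, 44), "B", 0.91),
    CandidateCharacter((14, 8, 24, 44), "8", 0.87),
])
print(first.best().char_code)  # B
```

The full recognition result for the serial ID in this example would be a list of nine `RecognizedCharacter` objects, one per position.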

FIG. 12 is a flowchart showing the details of the character recognition processing that the character recognition unit 307 of the mobile application 302 executes after the tire has been photographed.

In step S1201, the character recognition unit 307 sets the search area for the first character (502 in FIG. 5) in the recognition target image 501, which was cut out of the captured image based on the guides.

ステップＳ１２０２で、文字認識部３０７は、探索領域の画像を切り出す。 In step S1202, the character recognition unit 307 cuts out an image in the search area.

ステップＳ１２０３で、文字認識部３０７は、切り出した探索領域の画像に対して、切り出し領域の設定と、辞書データとの類似比較とを順次行って、切り出し領域の位置とそれぞれの位置における候補文字とを検出する（文字検出処理）。なお、ステップＳ１２０３の処理の詳細は、図１４を用いて後述する。 In step S1203, the character recognition unit 307 sequentially sets the cutout area and compares the similarity with the dictionary data for the image of the cutout search area, and sets the position of the cutout area and the candidate characters at each position. (Character detection processing). The details of the process in step S1203 will be described later with reference to FIG.

ステップＳ１２０４で、最後の桁（９桁目）の文字かどうかを判断し、最後の桁の文字と判断した場合は、ステップＳ１２０７に進む。最後の桁の文字でないと判断した場合、ステップＳ１２０５に進む。 In step S1204, it is determined whether or not the character is the last digit (9th digit), and if it is determined to be the character of the last digit, the process proceeds to step S1207. If it is determined that the character is not the last digit, the process proceeds to step S1205.

テップＳ１２０５で、文字認識部３０７は、図１１で示した認識結果文字情報から、評価値（類似度）の最も高い候補文字情報を検索し、矩形情報（その候補文字情報に対応する切り出し領域の位置情報）を取得する。 In TEP S1205, the character recognition unit 307 searches for the candidate character information having the highest evaluation value (similarity) from the recognition result character information shown in FIG. 11, and the rectangular information (the cutout area corresponding to the candidate character information). Position information) is acquired.

ステップＳ１２０６で、ステップＳ１２０５で取得した矩形情報に基づいて、次の桁の探索領域を設定し、ステップＳ１２０２に進む。 In step S1206, a search area for the next digit is set based on the rectangular information acquired in step S1205, and the process proceeds to step S1202.

ステップＳ１２０７で、文字認識部３０７は、情報表示部３０４を介して、認識結果を画面の認識結果表示領域６０８に表示して終了する。 In step S1207, the character recognition unit 307 displays the recognition result in the recognition result display area 608 of the screen via the information display unit 304, and ends.
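The control flow of FIG. 12 can be sketched in Python as follows. This is a sketch under stated assumptions, not the implementation described in the specification: `detect` stands in for the character detection process of step S1203, and `next_search_area` is a hypothetical placement rule (the passage above does not specify how the next search area is derived from the rectangle, only that it is based on it).

```python
from typing import Callable, List, Tuple

Rect = Tuple[int, int, int, int]     # (x, y, width, height)
Candidate = Tuple[Rect, str, float]  # (cutout rectangle, character, evaluation value)


def next_search_area(rect: Rect, margin: int = 4) -> Rect:
    """Hypothetical rule: start the next search area at the right edge of the
    chosen cutout and make it slightly wider than one character."""
    x, y, w, h = rect
    return (x + w, y, w + 2 * margin, h)


def recognize_sequence(first_area: Rect,
                       detect: Callable[[Rect], List[Candidate]],
                       num_digits: int = 9) -> List[List[Candidate]]:
    """Recognize num_digits characters, chaining each search area to the
    best cutout of the previous digit (steps S1201 to S1206)."""
    area = first_area                                 # S1201
    results: List[List[Candidate]] = []
    for digit in range(num_digits):
        candidates = detect(area)                     # S1202-S1203 (must be non-empty)
        results.append(candidates)
        if digit == num_digits - 1:                   # S1204: last digit reached
            break
        best = max(candidates, key=lambda c: c[2])    # S1205: highest evaluation value
        area = next_search_area(best[0])              # S1206
    return results                                    # S1207 would display these
```

Because each search area depends on the previous digit's chosen rectangle, one wrongly placed cutout shifts every later search area, which is the error propagation that the correction flow of FIG. 13 addresses.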

FIG. 13 is a flowchart of the process executed by the mobile application 302 after a character recognition result has been corrected by a user instruction, as described with reference to FIGS. 6 to 8.

In step S1301, the character recognition unit 307 calculates the sum of the evaluation values (for each character, the highest evaluation value among its candidate character information) of the characters from the digit following the corrected character through the last digit.

In step S1302, the character recognition unit 307 searches the recognition result character information corresponding to the character being corrected for the candidate character information whose character information (character code) matches the corrected character.

In step S1303, the character recognition unit 307 acquires the rectangle information included in the candidate character information found in S1302.

In step S1304, the character recognition unit 307 resets the search area for the next digit based on the rectangle information acquired in S1303 (the position information of the cutout area corresponding to the corrected character).

In step S1305, the character recognition unit 307 cuts out the image of the search area.

In step S1306, the character recognition unit 307 sequentially sets cutout areas in the cut-out image of the search area and compares each with the dictionary data for similarity, thereby detecting the positions of the cutout areas and the candidate character at each position (character detection process). The details of step S1306 are the same as those of S1203 and are described later with reference to FIG. 14.

In step S1307, the character recognition unit 307 determines whether the character is that of the last (ninth) digit. If it is, the process proceeds to step S1310. If it is not, the process proceeds to step S1308.

In step S1308, the character recognition unit 307 searches the recognition result character information for the candidate character information with the highest evaluation value and acquires its rectangle information. In step S1309, the search area for the next digit is set from the rectangle information acquired in step S1308, and the process returns to step S1305.

As described above, steps S1305 to S1309 are re-executed based on the search area reset in S1304, so the recognition results for the digits following the corrected character may differ from the results obtained in FIG. 12.

In step S1310, the character recognition unit 307 calculates, based on the results of steps S1305 to S1309, the sum of the evaluation values (for each character, the highest evaluation value among its candidate character information) of the characters from the digit following the corrected character through the last digit.

In step S1311, the character recognition unit 307 determines whether the sum of the evaluation values calculated in step S1310 (after the recognition process S1305 to S1309 has been re-executed following the correction) is higher than the sum of the evaluation values calculated in step S1301 (before the correction). If the sum calculated in step S1310 is determined to be higher, the process proceeds to step S1312. Otherwise, the process ends without further action.

In step S1312, the character recognition unit 307 updates, in the recognition result display area 608 of the screen, the recognition results from the digit following the corrected character through the last digit, using the candidate characters obtained in steps S1305 to S1309.
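The correction flow of FIG. 13 can be sketched in Python as follows. The function name, the tuple layout of a candidate, and the injected `detect` and `next_search_area` callables are illustrative assumptions; the step numbers in the comments map back to the description. The early return when no candidate matches the corrected character is also an assumption, since that case is not covered in the passage above.

```python
from typing import Callable, List, Tuple

Rect = Tuple[int, int, int, int]     # (x, y, width, height)
Candidate = Tuple[Rect, str, float]  # (cutout rectangle, character, evaluation value)


def best_score_sum(digits: List[List[Candidate]]) -> float:
    """Sum, over the given digits, of each digit's highest evaluation value."""
    return sum(max(c[2] for c in cands) for cands in digits)


def rerecognize_after_correction(results: List[List[Candidate]],
                                 corrected_index: int,
                                 corrected_char: str,
                                 detect: Callable[[Rect], List[Candidate]],
                                 next_search_area: Callable[[Rect], Rect]
                                 ) -> List[List[Candidate]]:
    old_sum = best_score_sum(results[corrected_index + 1:])        # S1301
    match = next((c for c in results[corrected_index]
                  if c[1] == corrected_char), None)                # S1302
    if match is None:
        return results  # assumption: keep the old results when no candidate matches
    area = next_search_area(match[0])                              # S1303-S1304
    new_tail: List[List[Candidate]] = []
    for i in range(corrected_index + 1, len(results)):
        cands = detect(area)                                       # S1305-S1306
        new_tail.append(cands)
        if i == len(results) - 1:                                  # S1307
            break
        best = max(cands, key=lambda c: c[2])                      # S1308
        area = next_search_area(best[0])                           # S1309
    new_sum = best_score_sum(new_tail)                             # S1310
    if new_sum > old_sum:                                          # S1311
        return results[:corrected_index + 1] + new_tail            # S1312
    return results
```

Comparing the two sums before updating means a re-run that scores worse than the original recognition never replaces it.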

FIG. 14 is a flowchart showing the details of the character detection process of step S1203 in FIG. 12 and step S1306 in FIG. 13. In particular, it shows how the recognition process is performed by setting cutout areas at a plurality of positions within the search area while varying the size of the cutout area.

In step S1401, the character recognition unit 307 sets the width of the cutout area (505, 506 in FIG. 5) to its minimum value, and the process proceeds to step S1402.

In step S1402, it is determined whether the width of the cutout area exceeds a predetermined maximum value. If it does, the process ends. If it does not, the height of the cutout area is set to its minimum value in step S1403, and the process proceeds to step S1404.

In step S1404, it is determined whether the height of the cutout area exceeds a predetermined maximum value. If it does, the width of the cutout area is increased by a predetermined amount in step S1413, and the process returns to step S1402. If it does not, the process proceeds to step S1405.

In step S1405, the character recognition unit 307 sets the x-coordinate of the left edge of the cutout area to its initial value (the x-coordinate of the left edge of the search area), and the process proceeds to step S1406. In step S1406, it is determined whether the x-coordinate of the right edge of the cutout area exceeds the x-coordinate of the right edge of the search area. If it does, the height of the cutout area is increased by a predetermined amount in step S1412, and the process returns to step S1404. If it does not, the y-coordinate of the top edge of the cutout area is set to its initial value (the y-coordinate of the top edge of the search area) in step S1407, and the process proceeds to step S1408.

In step S1408, it is determined whether the y-coordinate of the bottom edge of the cutout area exceeds the y-coordinate of the bottom edge of the search area. If it does, the cutout area is slid in the x-axis direction (its x-coordinate is increased) in step S1411, and the process returns to step S1406. If it does not, the image of the cutout area is compared with the character image information (dictionary data) in step S1409 (character recognition process). The details of step S1409 are described with reference to FIG. 15. In step S1410, the cutout area is slid in the y-axis direction (its y-coordinate is increased), and the process returns to step S1408.
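The four nested loops of FIG. 14 amount to a multi-scale sliding-window enumeration. A minimal Python sketch follows, assuming a generator named `iter_cutouts` and step parameters `size_step` and `slide_step`; the description only says "a predetermined amount", so the names and default values are hypothetical.

```python
from typing import Iterator, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def iter_cutouts(search: Rect,
                 min_w: int, max_w: int,
                 min_h: int, max_h: int,
                 size_step: int = 2,
                 slide_step: int = 1) -> Iterator[Rect]:
    """Yield every cutout rectangle that fits inside the search area,
    in the order of the FIG. 14 flowchart."""
    sx, sy, sw, sh = search
    w = min_w                                   # S1401
    while w <= max_w:                           # S1402
        h = min_h                               # S1403
        while h <= max_h:                       # S1404
            x = sx                              # S1405
            while x + w <= sx + sw:             # S1406: right edge within search area
                y = sy                          # S1407
                while y + h <= sy + sh:         # S1408: bottom edge within search area
                    yield (x, y, w, h)          # S1409: compare this cutout with the dictionary
                    y += slide_step             # S1410
                x += slide_step                 # S1411
            h += size_step                      # S1412
        w += size_step                          # S1413
```

For example, `list(iter_cutouts((0, 0, 4, 4), 2, 3, 2, 3, size_step=1))` enumerates 25 rectangles, starting with `(0, 0, 2, 2)`. The number of cutouts grows with the product of the four loop ranges, so the step sizes trade recognition accuracy against processing time.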

FIG. 15 is a detailed flowchart of the character recognition process of step S1409 in FIG. 14.

In step S1501, the character recognition unit 307 cuts out the image of the cutout area (505, 506 in FIG. 5), and in step S1502, the image processing unit 306 extracts feature information (HOG features) from the cut-out image.

In step S1503, the character recognition unit 307 acquires the first character image information entry (dictionary data) in the character image information list shown in FIG. 10. In step S1504, the feature information included in the acquired character image information is compared with the feature information extracted in step S1502, and the correlation coefficient (similarity) is obtained as the evaluation value.

In step S1505, the character recognition unit 307 creates the candidate character information shown in FIG. 11 and sets the correlation coefficient obtained in step S1504 as its evaluation value. The character information (character code) of the candidate character information is set to the character information of the character image information, and the rectangle information is set to the position and size of the cutout area.

In step S1506, the character recognition unit 307 searches the candidate character information in the recognition result character information (shown in FIG. 11) for the digit being processed, and determines whether candidate character information whose character information matches that of the candidate character information created in step S1505 already exists. If no matching candidate character information exists, the process proceeds to step S1509. If matching candidate character information already exists, the process proceeds to step S1507.

In step S1507, the character recognition unit 307 determines whether the evaluation value of the candidate character information created in step S1505 is higher than the evaluation value of the existing candidate character information. If it is not, the process proceeds to step S1510. If it is, the process proceeds to step S1508, where the existing candidate character information is deleted from the recognition result character information. Then, in step S1509, the candidate character information created in step S1505 is stored in the recognition result character information, and the process proceeds to step S1510.

In step S1510, it is determined whether the end of the character image information list has been reached. If it has not, the next character image information entry in the list is acquired in step S1511. If it has, the process ends.
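The comparison and candidate bookkeeping of FIG. 15 can be sketched in Python as follows. The description speaks only of a "correlation coefficient (similarity)"; this sketch assumes the Pearson correlation coefficient. Feature extraction (step S1502) is outside the sketch and the feature vector is passed in directly. The function names and the dictionary-of-candidates representation are illustrative assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length feature vectors (assumed metric)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    return cov / denom if denom else 0.0  # constant vectors: treat as no similarity


def update_candidates(candidates: Dict[str, Tuple[Rect, float]],
                      rect: Rect,
                      features: Sequence[float],
                      dictionary: List[Tuple[str, Sequence[float]]]
                      ) -> Dict[str, Tuple[Rect, float]]:
    """Compare one cutout against every dictionary entry and keep, per
    character code, only the cutout with the highest evaluation value."""
    for char_code, dict_features in dictionary:          # S1503, S1510-S1511
        score = correlation(features, dict_features)     # S1504
        existing = candidates.get(char_code)             # S1506
        if existing is None or score > existing[1]:      # S1507
            candidates[char_code] = (rect, score)        # S1508-S1509
    return candidates
```

Calling `update_candidates` once per cutout rectangle leaves, for each character code, the single position where that character matched best, which is the per-digit candidate list that FIG. 11 describes.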

FIG. 9 shows an example of how the cutout areas are reset after a correction is applied to the character recognition result in this embodiment.

Reference numeral 901 denotes an image of the serial ID 107 of the photographed tire. Assume that executing the process of FIG. 12 on this image yields, as the initial character recognition result, the result shown at 608 in FIG. 6, and that the cutout areas corresponding to this initial character recognition result 902 are at the positions shown at 903. Then, as described with reference to FIG. 7, when the third character from the left is corrected by a user instruction, the rectangular area corresponding to the corrected character is looked up, and for the digits after the corrected character the search area is reset and the cutout area setting and recognition process are re-executed. Reference numerals 904 and 905 denote, respectively, the character recognition result obtained by executing the process described with reference to FIG. 13 and the corresponding cutout areas. In the initial cutout areas 903, the cutout area of the third-digit character was determined incorrectly, so the cutout areas of the fourth and subsequent digits are also wrong, and the character recognition result is wrong as well. When the user corrects the third digit, the process of FIG. 13 is executed, and as a result the fourth and subsequent digits are also corrected in the cutout areas 905 after correction.

As described above, when the user corrects a recognition result, the search area is reset for the characters following the corrected character, cutout areas are set again within the reset search area, and the recognition results are corrected. Consequently, when one misrecognized character is corrected, the misrecognition of subsequent characters that was caused by it is corrected as well. This reduces the burden on the user of correcting errors in the recognition result. In addition, the evaluation values before and after the correction are compared for all characters following the corrected character, and the re-recognized result is reflected on the screen only when the evaluation value after the correction is higher. This prevents the screen from being updated with a recognition result worse than the one before the correction.

(Other Embodiments)
The present invention can also be realized by a process in which a program that implements one or more functions of the above-described embodiment is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

To solve the above problem, the information processing apparatus of the present invention comprises: first setting means for setting, within a search area, a plurality of cutout areas including a first cutout area provided at a predetermined position in the search area and a second cutout area that is provided at a position different from the predetermined position and differs in size from the first cutout area; and character detecting means for extracting the image corresponding to each of the plurality of cutout areas set by the first setting means, acquiring candidate character information for each image by comparing each extracted image with dictionary data, and outputting, as a recognition result, the candidate character information with the highest evaluation value among the acquired candidate character information.

Claims

An information processing apparatus comprising:
first setting means for setting a search area in an image to be recognized;
second setting means for setting cutout areas at a plurality of locations in the search area; and
character detecting means for extracting an image corresponding to each of the plurality of cutout areas set by the second setting means, comparing each extracted image with dictionary data to detect candidate character information and position information of the cutout area corresponding to each candidate character, and outputting, as a recognition result, the candidate character information having the highest evaluation value among the detected candidate character information,
wherein the first setting means further sets a search area for the next character based on the position information of the cutout area corresponding to the recognition result output by the character detecting means, whereby the processing by the second setting means and the character detecting means is repeatedly executed.

The information processing apparatus according to claim 1, further comprising:
display means for displaying the recognition result output by the character detecting means;
correction means for correcting, based on a user's instruction, the recognition result displayed by the display means; and
acquisition means for acquiring the position information of the cutout area corresponding to the character after the correction by the correction means,
wherein the first setting means resets the search area for the character next to the corrected character based on the position information acquired by the acquisition means, and the processing by the second setting means and the character detecting means is executed based on the reset search area.

The information processing apparatus according to claim 2, wherein the evaluation value of the result obtained by resetting the search area for the character next to the corrected character and executing the processing by the second setting means and the character detecting means based on the reset search area is compared with the evaluation value before the correction, to determine whether or not to correct the recognition result of the character next to the corrected character.

A program for causing a computer to function as each means of the information processing apparatus according to any one of claims 1 to 3.

An information processing method comprising:
a first setting step of setting a search area in an image to be recognized;
a second setting step of setting cutout areas at a plurality of locations in the search area; and
a character detection step of extracting an image corresponding to each of the plurality of cutout areas set in the second setting step, comparing each extracted image with dictionary data to detect candidate character information and position information of the cutout area corresponding to each candidate character, and outputting, as a recognition result, the candidate character information having the highest evaluation value among the detected candidate character information,
wherein, in the first setting step, a search area for the next character is further set based on the position information of the cutout area corresponding to the recognition result output by the character detecting means, whereby the processing in the second setting step and the character detection step is repeatedly executed.