JP2005352705A

JP2005352705A - Device and method for pattern recognition, and character recognizing method

Info

Publication number: JP2005352705A
Application number: JP2004171878A
Authority: JP
Inventors: Takashi Nose; 隆能勢; Taizo Umezaki; 太造梅崎; Naoki Okamoto; 直樹岡本
Original assignee: Omron Corp; Omron Tateisi Electronics Co
Current assignee: Omron Corp
Priority date: 2004-06-09
Filing date: 2004-06-09
Publication date: 2005-12-22

Abstract

PROBLEM TO BE SOLVED: To accurately recognize a character string even when a frame line whose position is indefinite is present around the pattern (e.g., a character string) to be recognized. SOLUTION: A readout range conforming to a specified shape is cut out of time-series images of the same subject, an image of the target pattern recognition line is taken out of the cut-out image, pattern recognition candidates included in the image of the target pattern recognition line are extracted, non-pattern-recognition candidates are removed from those candidates, and the remaining pattern recognition candidates are output as the determined pattern. When a character string enclosed by frame lines is recognized and a part of a frame line lies close to the character string and is misrecognized as a character (for example, a vertical bar of a frame line is misrecognized as "I" or "1"), the misrecognized character is removed as a non-pattern-recognition candidate. COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to a pattern recognition apparatus, a pattern recognition method, and a character recognition method, and in particular to ones suitable for recognizing, for example, the registration number of a vehicle such as an automobile (the so-called vehicle number) or a character string posted on an advertising signboard or the like (e.g., a telephone number, homepage address, or e-mail address).

Pattern recognition means having a computer perform, in place of a human, the act by which humans recognize the outside world through the visual system: that is, the series of actions of evaluating the similarity between an unknown input pattern and standard patterns stored in advance, and recognizing the input as the standard pattern with the highest similarity.

Here, a type of standard pattern is called a “class”. For example, digits have ten classes, 0 through 9. Typical targets of pattern recognition are digits, letters, symbols, and the like (hereinafter “characters and the like”), because the number of classes of characters and the like is bounded and therefore easy to handle. For example, digits and alphabetic letters are covered by the 7-bit code called ASCII (American Standard Code for Information Interchange), which has only 128 classes in total (more precisely, 94 once the blank character and control codes are excluded).
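The class count quoted above can be checked with a few lines of Python. This is only an illustrative calculation; the variable names are not taken from the patent.

```python
# Count the ASCII classes mentioned in the text.
all_codes = range(128)  # 7-bit code space: 128 values

# Control codes are 0x00-0x1F plus DEL (0x7F); the blank character is 0x20.
excluded = set(range(0x20)) | {0x7F, 0x20}
classes = [chr(c) for c in all_codes if c not in excluded]

print(len(classes))       # number of visible character classes
print("".join(classes))   # runs from '!' through '~'
```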

In addition, extended ASCII character sets (8-bit ASCII characters) have been defined that can also represent accented characters used in other European languages, Cyrillic characters, Arabic characters, and so on. Furthermore, a standard for 63 one-byte kana characters (in two schemes, JIS7 and JIS8) and a multilingual character set standard that handles the world's major scripts (Japanese, Chinese, Korean, etc.) together (so-called Unicode) have also been defined.

Thus, pattern recognition that targets characters and the like is practical: when the targets are digits, alphabetic letters, and some symbols, for example, at most 94 classes need to be evaluated, which neither strains the computer's storage capacity nor imposes a heavy processing load. However, the idea of the present invention is not limited to pattern recognition for such restricted uses. The target is not limited to characters and the like; anything that can be defined as a class will do, including arbitrarily designed figures. In the following, the pattern to be recognized is described as “characters and the like”, but this is merely for simplicity of explanation.

Prior art for pattern recognition apparatuses will now be described. For example, Patent Document 1 describes a technique relating to “a method and apparatus for reading a wide range of documents from time-series document images at low computational cost and with a small amount of memory”.

According to this prior art, a “time-series document image” is an image acquired by time-series image acquisition means, that is, by a device such as a video camera that can acquire and output images from moment to moment, or by a video device or the like that can play back already recorded moving images of a document.

In general, a video camera generates and outputs still images at several tens of frames per second (for example, 30 frames/second), and a video device records the output images of a video camera and plays them back as needed. The “time-series document image” above can therefore be regarded as a moving image obtained by shooting a document (paper on which character strings are written or printed) with a video camera for a predetermined time. “Time series” in this case is understood to mean that still images follow one another at a constant period, for example at a rate of about 30 frames/second.

This prior art first sets out the basic principle that character recognition results extracted from images acquired with a minute time difference (that is, time-series document images) overlap to a large extent, so that two recognition results can be combined by putting the overlapping portions into correspondence. It then identifies the technical problem that character segmentation tends to go wrong when the number of characters in a time-series document image is small. It states that this problem is solved by performing a correspondence that takes segmentation errors into account, using nonlinear matching based on dynamic programming (DP matching), and by further using the cumulative distance over partial sections of the dynamic programming, so that character codes prone to misrecognition (for example, “ば” (ba) and “ぱ” (pa)) can be correctly distinguished.

DP matching can be outlined as follows. Even for figures of the same pattern, the standard pattern and the unknown input pattern often differ in length (the number of feature-vector data items). Also, even when the lengths are the same, the patterns may match very well once they are locally stretched or compressed. In such cases it is effective to “use dynamic programming to put the elements of the two patterns into correspondence (alignment) and compute the similarity from that correspondence”. This processing is called DP matching, from the initials of Dynamic Programming. Dynamic programming is a problem-solving technique that, to solve a given problem, uses the solutions of “a group of problems of the same type but smaller size”, so that the solution is obtained with little computation and by repeating the same procedure; it is particularly well suited to computers.
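The alignment idea outlined above can be illustrated with a minimal sketch. It assumes one-dimensional numeric features and an absolute-difference local distance; neither is specified in the text, and the function name `dp_matching` is hypothetical. The table `cost` is filled from smaller sub-problems, as in any dynamic programming solution.

```python
def dp_matching(reference, unknown):
    """Cumulative distance of the best monotonic alignment of two sequences."""
    inf = float("inf")
    n, m = len(reference), len(unknown)
    # cost[i][j]: best cumulative distance aligning reference[:i] with unknown[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(reference[i - 1] - unknown[j - 1])
            # Reuse the three smaller sub-problems (stretch, compress, or step both).
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

standard = [0, 1, 3, 1, 0]
stretched = [0, 1, 1, 3, 3, 1, 0]   # same shape, locally stretched
other = [0, 3, 0, 3, 0]             # different shape

print(dp_matching(standard, stretched))
print(dp_matching(standard, other))
```

A locally stretched copy of the standard pattern aligns at zero cumulative distance, whereas a pattern of a different shape does not, even though it has the same length.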

Patent Document 1: Japanese Patent No. 2858560

Although the above prior art is useful in that it applies DP matching so that character codes prone to misrecognition can be correctly distinguished, it has the problem that its recognition accuracy is insufficient for, for example, a character string surrounded by a frame line whose position is indefinite.

FIG. 20 shows an example of a character string image surrounded by a frame line. In this figure, a character string 1, shown as “123456” for convenience, is surrounded by a horizontally long rectangular frame line 2.

When the character string 1 is cut out of an image 3 containing the character string 1 and the frame line 2 and then recognized, a part of the frame line 2 may be misrecognized as a character. For example, when the distance L between the character string 1 and the frame line 2 is very small, the parts of the frame line 2 located before and after the character string 1 may be misrecognized as “I”, “1”, or the like. This misrecognition cannot be avoided even by applying DP matching, because the frame line 2 appears in every one of the images acquired with a minute time difference. If the position of the frame line 2 were fixed (for example, if the distance L from the character string 1 were large and known), the misrecognition could presumably be avoided by, for instance, setting a cut-out window covering only the character string 1 so that the unwanted frame line 2 is not captured. When the position of the frame line 2 is indefinite, however, it is difficult to set an appropriate cut-out window, and misrecognition of the character string 1 cannot be ruled out.

Such misrecognition can occur, for example, when recognizing the license plate of a vehicle fitted with a decorative frame. It is especially likely when the decorative frame protrudes inward from the edge of the license plate and, moreover, comes close to the head or tail of the character string (vehicle number) or overlaps part of the character string. In addition, decorative frames for license plates come in a variety of designs, so the position of the frame line they form is also indefinite.

A “decorative frame” is a frame body that is optionally attached so as to surround a license plate in order to reinforce the plate or improve its appearance. It is also called a license plate frame. A wide variety are in use, from simple designs to ones with elaborate colors and shapes or with lettering or graphic decoration.

Alternatively, when a character string 4 is surrounded by a horizontally long elliptical frame line 5, a part of the frame line 5 may likewise be misrecognized as a character when the character string 4 is cut out of an image 6 containing the character string 4 and the frame line 5 and then recognized (the parts of the frame line 5 located before and after the character string 4 may be misrecognized as “(”, “)”, or the like).

Such misrecognition can occur, for example, when recognizing a character string on an advertising signboard or the like. This is the case when the character string is enclosed in an elliptical frame line to make it stand out but, for design or other reasons, ample free space cannot be secured between the frame line and the beginning and end of the character string.

Accordingly, an object of the present invention is to provide a pattern recognition apparatus, a pattern recognition method, and a character recognition method that can accurately recognize a pattern to be recognized (for example, a character string) even when a frame line whose position is indefinite is present around it.

The present invention is characterized by comprising: imaging means for capturing images of the same subject at a constant or variable period and outputting them in time series; cut-out means for cutting out, from each image, a reading range that matches a predetermined shape; pattern recognition line extraction means for taking out the image of the target pattern recognition line from the cut-out image produced by the cut-out means; extraction means for extracting pattern recognition candidates included in the image of the pattern recognition line; removal means for removing non-pattern-recognition candidates from the pattern recognition candidates; and output means for outputting, as a determined pattern, the pattern recognition candidates from which the non-pattern-recognition candidates have been removed.
Preferably, the removal means removes the non-pattern-recognition candidates from the pattern recognition candidates by DP matching.
Preferably, the predetermined shape is a shape corresponding to the outer shape of a vehicle license plate, and the pattern recognition line is a character string line on that license plate.
Here, the imaging means may capture the image of the subject in real time, but is not limited to this; for example, like a video camera or video device, it may play back and output images of the subject that were captured or recorded in advance.

In the present invention, a reading range that matches a predetermined shape is cut out of time-series images of the same subject, the image of the target pattern recognition line is taken out of the cut-out image, the pattern recognition candidates included in that image are extracted, non-pattern-recognition candidates are removed from them, and the remaining pattern recognition candidates are output as a determined pattern. Consequently, when a character string surrounded by a frame line is recognized and a part of the frame line lies close to the character string and is misrecognized as a character (for example, a vertical bar of the frame line is misrecognized as “I” or “1”), the misrecognized character is removed as a non-pattern-recognition candidate, so that correct character recognition is achieved. It is thus possible to provide a pattern recognition apparatus, a pattern recognition method, and a character recognition method that can accurately recognize a character string even when a frame line whose position is indefinite is present around it.
Further, a frame-surrounding image, containing the image around a frame-shaped object or frame-shaped figure, may be cut out of each of a plurality of images obtained by capturing in time series the same subject containing that frame-shaped object or figure; character recognition may be performed on each of the frame-surrounding images corresponding to the plurality of images to obtain a plurality of candidate character strings; and, among those candidate character strings, the character string whose subsets agree to a high degree may be output as the character recognition result. With this arrangement, a character image is not erroneously recognized as a non-character, even in a continuous moving image in which the position of the frame cannot be specified and even when there are many possible numbers of digits and character layouts. This is because the result of recognizing an image other than a character is either judged to be an indeterminate character or is not stably recognized as the same character.
Further, the frame-surrounding image may be searched vertically to detect portions that contain no character region, those portions may be deleted from the frame-surrounding image, and character recognition may be performed on the frame-surrounding image after the deletion. This suppresses segmentation errors in which characters are erroneously excluded because of interference with the frame.
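The idea of keeping only the part of the per-frame results that agrees across frames can be sketched as follows. This is a deliberately simplified stand-in, not the DP matching procedure of the embodiment: it votes on plain substrings, and the function name `stable_substring` and the parameter `min_ratio` are assumptions made for illustration.

```python
from collections import Counter

def stable_substring(frame_results, min_ratio=0.75):
    """Longest substring found in at least min_ratio of the per-frame results."""
    needed = min_ratio * len(frame_results)
    votes = Counter()
    for text in frame_results:
        # Each frame votes once for every distinct substring it contains.
        substrings = {text[i:j]
                      for i in range(len(text))
                      for j in range(i + 1, len(text) + 1)}
        votes.update(substrings)
    supported = [s for s, count in votes.items() if count >= needed]
    # Prefer the longest supported substring; ties fall back to votes, then text.
    return max(supported, key=lambda s: (len(s), votes[s], s), default="")

# Hypothetical per-frame results: the frame is read as "I", "1" or "(" in some frames.
frames = ["I123456", "123456", "1234561", "(123456"]
print(stable_substring(frames))
```

Because the spurious characters caused by the frame differ from frame to frame, they do not gather enough votes, while the true character string is supported by every frame.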

Hereinafter, an embodiment of the present invention will be described with reference to the drawings, taking application to a “vehicle license plate reading device” as an example, although the invention is not limited to this. The specific details, examples, numerical values, character strings, and other symbols given in the following description are for reference only, to clarify the idea of the present invention, and it is obvious that the idea of the invention is not limited by all or any of them. In addition, well-known techniques, procedures, architectures, circuit configurations, and the like (hereinafter “well-known matters”) are not described in detail; this is likewise to keep the description concise and is not intended to exclude all or any of these well-known matters. Such well-known matters were known to those skilled in the art at the time of filing of the present invention and are naturally included in the following description.

FIG. 1 is a system configuration diagram of the vehicle license plate reading device. In this figure, a vehicle license plate reading device 11 (pattern recognition apparatus) mounted on a vehicle such as an automobile (hereinafter the “host vehicle”) 10 comprises a television camera 12 (imaging means), a control unit 13, and a display unit 14.

The television camera 12 is attached to the vehicle body with its photographing lens 15, which has a predetermined angle of view, facing an arbitrary direction (here, the front of the host vehicle 10). It captures images of the area in front of the host vehicle 10 at a predetermined short period (for example, about 30 frames per second), generates a moving image consisting of still images that are continuous in time series, and outputs it to the control unit 13. The television camera 12 may be any camera equipped with an imaging device capable of generating such a moving image. In principle it could even use a vacuum-tube image pickup tube, but in terms of power consumption and weight a solid-state imaging device such as a CCD (Charge Coupled Device) camera or a CMOS (Complementary Metal Oxide Semiconductor) camera is preferable. Furthermore, considering outdoor shooting under every level of brightness, a camera with a particularly wide dynamic range (for example, a CMOS camera) is desirable. In terms of the idea of the invention, however, the specific configuration of the television camera 12 is not limited: any camera will do that can capture images in front of the host vehicle 10 at a predetermined short period (preferably constant, although a variable period is acceptable as long as it does not vary extremely) and generate and output a moving image consisting of still images that are continuous in time series.

The control unit 13 uses the moving image from the television camera 12 to recognize the character string (vehicle number) on the license plate 17 of the vehicle to be recognized (here, a “preceding vehicle 16” located in front of the host vehicle 10). The display unit 14 displays the result of the character recognition to inform the occupants (the driver and others) of the host vehicle 10.

FIG. 2 is a conceptual functional block diagram of the vehicle license plate reading device 11. By function, the vehicle license plate reading device 11 can be divided into an image input unit 18 (imaging means), a position detection unit 19 (cut-out means), a vertical background removal unit 20 (pattern recognition line extraction means), a character recognition unit 21 (extraction means), a correction processing unit 22 (removal means), a recognition result output unit 23 (output means), and a constraint condition storage unit 24.

The image input unit 18 captures images in front of the host vehicle 10 at a predetermined short period and generates and outputs a moving image consisting of still images that are continuous in time series; it corresponds to the television camera 12 described above. It is called an image “input unit” to indicate that the source is not limited to the output of the television camera 12 and may be, for example, images played back by a video device.

The units other than the image input unit 18, namely the position detection unit 19, the vertical background removal unit 20, the character recognition unit 21, the correction processing unit 22, the recognition result output unit 23, and the constraint condition storage unit 24, are internal functional blocks of the control unit 13. If, for example, the control unit 13 consists of a computer and its peripheral circuits, these functional blocks are realized by the organic combination of hardware resources, such as the computer and peripheral circuits, with software resources, such as the computer's basic program and various application programs. Needless to say, all or part of these functions may instead be implemented in hardwired logic.

FIG. 3 shows the operation flow of the entire system of the vehicle license plate reading device 11. In this figure, step S1 is the operation of the image input unit 18 (imaging step), step S2 is the operation of the position detection unit 19 (cut-out step), step S3 is the operation of the vertical background removal unit 20 (pattern recognition line extraction step), step S4 is the operation of the character recognition unit 21 (extraction step), step S5 is the operation of the correction processing unit 22 (removal step), and step S6 is the operation of the recognition result output unit 23 (output step). While the host vehicle 10 is traveling, the vehicle license plate reading device 11 repeatedly executes steps S1 to S6 in sequence, recognizing the character string (vehicle number) on the license plate 17 of the preceding vehicle 16 and displaying the recognition result on the display unit 14.

Here, the recognition result for the vehicle number of the preceding vehicle 16 is simply displayed on the display unit 14, but the invention is not limited to this. For example, a database of registered vehicle numbers may be provided separately, the recognized vehicle number of the preceding vehicle 16 may be collated with the information registered in the database, and the match or mismatch may be displayed on the display unit 14. This makes it possible, for example, to automate the checking of stolen vehicles.

First, in step S1, moving images including the preceding vehicle 16 are captured continuously in time series, and in the subsequent steps S2 to S6 the required processing is performed on each still image constituting the moving image (called “one image” in the flowchart). That is, in step S2 an image of the license plate 17 (an image containing at least the plate frame and the character string) is cut out of the still image, and in step S3 the image of the line in which the character string lies (an image containing the character string and the parts of the plate frame located before and after it) is taken out of the cut-out image of the license plate 17. Next, in step S4, the background above and below the line is removed from the image of the line in which the character string lies, the characters contained in the background-removed image are recognized, and the recognized character string is output as a provisionally recognized character string. Next, in step S5, DP matching is performed, and a character string for which subsets of the provisionally recognized character strings agree to a high degree is determined and output as the corrected recognized character string. Finally, the recognition result is determined on the basis of the corrected recognized character string, and the determination result is output to the display unit 14. The constraint condition storage unit 24 in FIG. 2 will be described later.
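The removal of the background above and below the character line can be illustrated with a horizontal projection profile. The sketch below is an assumption about one possible realization, not the method of the embodiment: it works on a binarized image, and the function name `crop_text_rows` and the threshold `min_ink` (which must be at least 1) are hypothetical.

```python
def crop_text_rows(binary_image, min_ink=1):
    """Keep the tallest run of consecutive rows with at least min_ink set pixels."""
    profile = [sum(row) for row in binary_image]   # ink count per row
    best_start, best_len = 0, 0
    start = None
    for y, ink in enumerate(profile + [0]):        # sentinel closes a run at the last row
        if ink >= min_ink and start is None:
            start = y
        elif ink < min_ink and start is not None:
            if y - start > best_len:
                best_start, best_len = start, y - start
            start = None
    return binary_image[best_start:best_start + best_len]

# Toy plate: a rectangular frame around three rows of "character" pixels.
image = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 0, 0, 1],
    [1, 0, 1, 1, 0, 1, 0, 1],
    [1, 0, 1, 0, 0, 1, 0, 1],
    [1, 0, 1, 1, 0, 1, 0, 1],
    [1, 0, 0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
]
line = crop_text_rows(image, min_ink=3)  # rows holding only the two side bars fall below 3
print(len(line))
```

The vertical side bars of the frame remain in the cropped line image, which matches the description above: they are recognized as provisional characters and have to be removed in the later correction step.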

The operation of each unit will now be described in detail.
FIG. 4 shows the specific operation flow of the position detection unit 19.
First, in step S2a, one frame (a still image) is extracted from the moving image input through the image input unit 18.

FIG. 5 shows a moving image 25 input through the image input unit 18. The moving image 25 is a collection of still images that are continuous in time series, and the individual still images are denoted by reference signs F1, F2, F3, ... in the figure.

Next, in step S2b, edge extraction is performed on the one-frame image (still image).
FIG. 6 shows an edge-extracted image 26. Edge extraction is processing that makes the contour lines in an image stand out. In the illustrated edge-extracted image 26, the contour lines in the rear view of the preceding vehicle 16, such as those of the body outline, rear window, tail lamps and other lights, rear bumper, and tailgate opening knob, are emphasized (shown at white level in the figure), as are the outer frame of the license plate and the outlines of the characters on the plate.
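The text does not say which edge operator is used. As one common possibility, the sketch below computes the Sobel gradient magnitude on a grayscale image held as a list of rows; the choice of Sobel and the function name `sobel_edges` are assumptions.

```python
def sobel_edges(gray):
    """Sobel gradient magnitude of a grayscale image; border pixels stay 0."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            # Vertical gradient: lower row minus upper row.
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out

# Dark left half, bright right half: only the boundary columns respond.
image = [[0, 0, 0, 10, 10, 10] for _ in range(5)]
edges = sobel_edges(image)
print(edges[2])
```

Flat regions give a zero response and only the brightness boundary is emphasized, which is the effect described for the contour lines of the vehicle and the plate.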

次に、ステップＳ２ｃで、エッジ抽出画像２６のサイズを小さくする。ここでは、便宜的にエッジ抽出画像２６のサイズを１９３×９６画素（ピクセル）とし、それを３２×１６画素にサイズに圧縮するものとする。 Next, in step S2c, the size of the edge extraction image 26 is reduced. Here, for the sake of convenience, it is assumed that the size of the edge extraction image 26 is 193 × 96 pixels (pixels) and is compressed to 32 × 16 pixels.
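Steps S2b and S2c can be sketched as follows. This is only a minimal illustration: the description names neither the edge operator nor the resizing method, so the simple difference-based edge measure and the cell-averaging reduction used here are assumptions, as are the function names `edge_magnitude` and `shrink`.

```python
import numpy as np

def edge_magnitude(gray):
    """Edge strength as the sum of absolute horizontal and vertical differences
    (an assumed, simple stand-in for the edge extraction of step S2b)."""
    g = gray.astype(np.float64)
    dx = np.zeros_like(g)
    dy = np.zeros_like(g)
    dx[:, 1:] = np.abs(g[:, 1:] - g[:, :-1])
    dy[1:, :] = np.abs(g[1:, :] - g[:-1, :])
    return dx + dy

def shrink(image, out_h, out_w):
    """Reduce an image to out_h x out_w by averaging the pixels of each cell
    (an assumed resizing method for step S2c)."""
    h, w = image.shape
    rows = np.linspace(0, h, out_h + 1).astype(int)
    cols = np.linspace(0, w, out_w + 1).astype(int)
    out = np.empty((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = image[rows[r]:rows[r + 1], cols[c]:cols[c + 1]].mean()
    return out
```

With the sizes given in the text, a 193 × 96 pixel edge image would be stored as an array of shape `(96, 193)` and reduced with `shrink(edges, 16, 32)`.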

次いで、ステップＳ２ｄで、サイズ圧縮したエッジ抽出画像２６の中から注目領域（ここではナンバープレート１７）の重心を求める。 Next, in step S2d, the center of gravity of the attention area (here, the license plate 17) is obtained from the edge-extracted image 26 that has been subjected to size compression.

図７は、注目領域（ここではナンバープレート１７）の重心を求める際に用いられる概念図であり、ここでは、ニューラルネットワーク（以下「ＮＮ」と略す。）を例にしている。図示のＮＮは、入力層２７、中間層２８及び出力層２９の３層構造を有しており、中間層２８にはナンバープレートの重心座標が学習されている。また、ナンバープレート以外の、たとえば、リアウィンドウ、テールランプ等の灯火具、リアバンパー、テールゲート開閉ノブなどについては抑制学習を行う。入力層２７にサイズ圧縮したエッジ抽出画像２６を与えると、出力層２９からナンバープレートの重心の位置が取り出される。 FIG. 7 is a conceptual diagram used when obtaining the center of gravity of the region of interest (here, the license plate 17), taking a neural network (hereinafter abbreviated as “NN”) as an example. The illustrated NN has a three-layer structure consisting of an input layer 27, an intermediate layer 28, and an output layer 29, and the center-of-gravity coordinates of license plates have been learned in the intermediate layer 28. For objects other than the license plate, such as the rear window, lamps such as tail lamps, the rear bumper, and the tailgate opening/closing knob, suppression learning is performed. When the size-reduced edge extraction image 26 is given to the input layer 27, the position of the center of gravity of the license plate is obtained from the output layer 29.

最後に、ステップＳ２ｅで、ナンバープレートの重心を元画像サイズに投影加算し、部分画像（ナンバープレート領域を含む画像）の切り出し位置を決定し、ステップＳ２ｆで、ナンバープレート画像とその位置を出力する。なお、投影加算の際に、ガウスフィルタを掛けることによって近似的に連続的な分布を作ることができ、位置検出の精度を高めることができる。 Finally, in step S2e, the center of gravity of the license plate is projected and added onto the original image size to determine the cut-out position of the partial image (the image including the license plate area), and in step S2f, the license plate image and its position are output. Applying a Gaussian filter during the projection addition produces an approximately continuous distribution, which improves the accuracy of position detection.

図８は、ナンバープレート画像の例を示す図である。図（ａ）において、ナンバープレート画像３１（切り出し画像）は、車両ナンバーの文字列（図では一例として米国ナンバーの“４ＵＭＶ８４４”を示している。）と、その文字列の周囲を取り囲む横長矩形状の枠線とを含み、いずれも白抜きのエッジ線で強調されている。図（ｂ）は、画素数を削減して圧縮したナンバープレート画像３２を示す図である。 FIG. 8 is a diagram illustrating an example of a license plate image. In panel (a), the license plate image 31 (cut-out image) includes the character string of the vehicle number (the figure shows the US number “4UMV844” as an example) and a horizontally long rectangular frame line surrounding that character string, both emphasized with white edge lines. Panel (b) shows a license plate image 32 compressed by reducing the number of pixels.

図９は、上下方向背景除去部２０の具体的な動作フローを示す図である。
まず、ステップＳ３ａで、ナンバープレート画像３２を入力し、次いで、ステップＳ３ｂで、垂直エッジ画像を水平方向に投影してヒストグラムを作成する。 FIG. 9 is a diagram showing a specific operation flow of the vertical direction background removal unit 20.
First, in step S3a, a license plate image 32 is input, and then in step S3b, a vertical edge image is projected in the horizontal direction to create a histogram.

図１０は、ナンバープレート画像３２とそのヒストグラム分布図である。図（ａ）において、ナンバープレート画像３２の周囲の黒い部分は、ナンバープレートの装飾枠であり、装飾枠に囲まれた白色部分の文字列（“４ＷＰＤ６０２”）は車両ナンバーである。また、車両ナンバーの若干上部には、判読不能な文字列が認められ、また、装飾枠の内部にも、二箇所程度の判読不能な文字列が認められる。 FIG. 10 shows the license plate image 32 and its histogram distribution. In panel (a), the black part around the license plate image 32 is the decorative frame of the license plate, and the character string (“4WPD602”) in the white part surrounded by the decorative frame is the vehicle number. An illegible character string can be seen slightly above the vehicle number, and about two further illegible character strings can be seen inside the decorative frame.

図（ｂ）は、垂直方向のエッジだけを強調した画像３４である。この垂直エッジ強調画像３４には、読み取り対象の文字列（車両ナンバー）３４ａの他に、上記の判読不能文字列に相当するいくつかの文字列３４ｂ、３４ｃ、３４ｄや、ナンバープレートの装飾枠の垂直エッジ部３４ｅ、３４ｆなどの不要部分が含まれている。 Panel (b) is an image 34 in which only the edges in the vertical direction are emphasized. In addition to the character string (vehicle number) 34a to be read, this vertical-edge-enhanced image 34 contains unnecessary portions such as the character strings 34b, 34c, and 34d corresponding to the illegible character strings mentioned above and the vertical edge portions 34e and 34f of the decorative frame of the license plate.

さて、このようなナンバープレート画像３２の垂直エッジ画像を水平方向に投影してヒストグラムを作成すると、図（ｂ）に示すように、中央付近の大きな山３５ａと、その上下に位置する小さな二つの山３５ｂ、３５ｃとを有するヒストグラム３５が得られる。大きな山３５ａは、読み取り対象の文字列（車両ナンバー）３４ａの位置を表し、小さな二つの山３５ｂ、３５ｃは、それぞれ読み取り不要文字列３４ｂ、３４ｃ、３４ｄの位置を表している。大きな山３５ａと、小さな二つの山３５ｂ、３５ｃは、明らかに異なっており、特に、大きな山３５ａの裾幅Ａは、小さな二つの山３５ｂ、３５ｃのそれよりも遙かに大きいから、読み取り対象の文字列（車両ナンバー）３４ａの行位置を確実に特定することができる。 Now, when the vertical edge image of such a license plate image 32 is projected in the horizontal direction to create a histogram, a histogram 35 is obtained that has, as shown in panel (b), a large peak 35a near the center and two small peaks 35b and 35c above and below it. The large peak 35a represents the position of the character string (vehicle number) 34a to be read, and the two small peaks 35b and 35c represent the positions of the character strings 34b, 34c, and 34d that need not be read. The large peak 35a clearly differs from the two small peaks 35b and 35c; in particular, its base width A is far larger than theirs, so the line position of the character string (vehicle number) 34a to be read can be identified reliably.
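The projection histogram and the comparison of peak widths might look roughly like the sketch below. It assumes that "vertical edges" can be approximated by the absolute difference between horizontally adjacent pixels; the helper names `row_profile`, `peak_runs`, and `widest_run` are hypothetical, and the cut-off ratio of 0.085 is borrowed from the evaluation described later in this document.

```python
import numpy as np

def row_profile(gray):
    """Sum of vertical-edge strength in each row (projection along the horizontal axis)."""
    g = gray.astype(np.float64)
    vertical_edges = np.abs(np.diff(g, axis=1))  # responds to vertical strokes
    return vertical_edges.sum(axis=1)

def peak_runs(profile, ratio=0.085):
    """Return (first_row, last_row) for every run of rows above ratio * max(profile)."""
    above = profile > ratio * profile.max()
    runs, start = [], None
    for i, flag in enumerate(above):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(profile) - 1))
    return runs

def widest_run(runs):
    """Pick the run with the largest base width, i.e. the candidate text line."""
    return max(runs, key=lambda r: r[1] - r[0])
```

Choosing the widest run mirrors the observation that the base width A of the large peak is much greater than that of the small peaks.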

したがって、ステップＳ３ｃで、ヒストグラム３５の分布に基づき、画像３４の中心から上下方向に文字列行の検索を行い、ステップＳ３ｄで、文字列行の幅と上下端情報を算出し、ステップＳ３ｅで、プレート領域を含む画像３４に判別分析法による二値化を行い、ステップＳ３ｆで、上下端情報を元に、二値化された画像から文字を含まない上下部分の背景除去を行う。これにより、読み取り対象の文字列（車両ナンバー）３４ａの行のみの画像を取り出すことができる。 Therefore, in step S3c, a character string row is searched in the vertical direction from the center of the image 34 based on the distribution of the histogram 35. In step S3d, the width and upper / lower end information of the character string row are calculated. In step S3e, The binarization is performed on the image 34 including the plate area by the discriminant analysis method. In step S3f, the background of upper and lower portions not including characters is removed from the binarized image based on the upper and lower end information. Thereby, it is possible to extract an image of only the row of the character string (vehicle number) 34a to be read.
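Steps S3c to S3f can be illustrated with the following sketch. The binarization by the discriminant analysis method is implemented here as the threshold that maximizes the between-class variance of an 8-bit histogram; the function names and the convention that pixels brighter than the threshold map to 1 are assumptions made for the example.

```python
import numpy as np

def search_from_center(profile, threshold):
    """S3c/S3d: grow a band of rows up and down from the central row
    while the projection profile stays above the threshold."""
    center = len(profile) // 2
    if profile[center] <= threshold:
        return None
    top = bottom = center
    while top > 0 and profile[top - 1] > threshold:
        top -= 1
    while bottom < len(profile) - 1 and profile[bottom + 1] > threshold:
        bottom += 1
    return top, bottom

def discriminant_threshold(gray):
    """S3e: threshold of an 8-bit image that maximizes the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    levels = np.arange(256)
    w0 = np.cumsum(hist)            # pixel count at or below each level
    w1 = hist.sum() - w0            # pixel count above each level
    sum0 = np.cumsum(hist * levels)
    mu0 = np.divide(sum0, w0, out=np.zeros(256), where=w0 > 0)
    mu1 = np.divide(sum0[-1] - sum0, w1, out=np.zeros(256), where=w1 > 0)
    return int(np.argmax(w0 * w1 * (mu0 - mu1) ** 2))

def extract_text_row(gray, top, bottom):
    """S3e/S3f: binarize, then keep only the rows between top and bottom."""
    binary = (gray > discriminant_threshold(gray)).astype(np.uint8)
    return binary[top:bottom + 1, :]
```

The input `gray` is expected to be a `uint8` array; `top` and `bottom` are the row limits returned by `search_from_center`.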

図１１は、そのようにして取り出された読み取り対象の文字列（車両ナンバー）３４ａの行のみの画像３６（パターン認識行の画像）を示す図である。図１０の元画像３４と比較すると、読み取り対象の文字列（車両ナンバー）３４ａの行の上下の余分な部分が除去されている。但し、この画像３６は、読み取り対象の文字列（車両ナンバー）３４ａの前後（図面の左右方向）に、ナンバープレートの装飾枠の垂直エッジ部３４ｅ、３４ｆが残っているため、この不要部分を文字として誤認（図示の例の場合は縦棒であるから“Ｉ”や“１”などと誤認）するおそれがある。この不要部分は、次の文字認識部２１と補正処理部２２で取り除かれる。 FIG. 11 is a diagram showing an image 36 (image of a pattern recognition line) of only the line of the character string (vehicle number) 34a to be read extracted in this way. Compared with the original image 34 of FIG. 10, the upper and lower excess portions of the line of the character string (vehicle number) 34a to be read are removed. However, in this image 36, since the vertical edge portions 34e and 34f of the decorative frame of the license plate remain before and after the character string (vehicle number) 34a to be read (the left-right direction in the drawing), this unnecessary portion is represented by characters. May be misidentified (in the example shown, it is a vertical bar, so it is misidentified as “I” or “1”). This unnecessary portion is removed by the next character recognition unit 21 and correction processing unit 22.

図１２は、文字認識部２１の具体的な動作フローを示す図である。
この文字認識部２１では、まず、ステップＳ４ａで、ラベリングされた文字の大きさを１２×２４画素に正規化し、次いで、ステップＳ４ｂで、３層英数字認識用ＮＮに入力する。 FIG. 12 is a diagram illustrating a specific operation flow of the character recognition unit 21.
In the character recognition unit 21, first, in step S4a, the size of the labeled character is normalized to 12 × 24 pixels, and then in step S4b, the character is input to the three-layer alphanumeric character recognition NN.

図１３は、３層英数字認識用ＮＮの概念構造図である。この図において、識別する英数字は“０〜９”、“Ａ〜Ｚ”及び“その他”の３７文字とするが、これは説明の便宜である。３層英数字認識用ＮＮは、入力層３７、中間層３８及び出力層３９からなり、ラベリングされた文字画像４０を入力層３７に与えると、各桁３７文字の中で、もっとも出力値が高いものを当該文字画像４０の文字認識結果として出力する。 FIG. 13 is a conceptual structural diagram of the three-layer NN for alphanumeric recognition. In this figure, the characters to be identified are the 37 classes “0 to 9”, “A to Z”, and “other”, chosen for convenience of explanation. The NN consists of an input layer 37, an intermediate layer 38, and an output layer 39; when a labeled character image 40 is given to the input layer 37, the class with the highest output value among the 37 classes for each digit is output as the character recognition result for that character image 40.

図１４は、補正処理部２２の具体的な動作フローを示す図である。
この補正処理部２２では、まず、ステップＳ５ａで、文字認識部２１が出力した文字認識候補を入力し、ステップＳ５ｂで、ＤＰマッチングにより、文字認識候補間の距離が最短で且つ部分集合の一致度合いが最も高い位置を算出する。そして、一致度合いがもっとも高い位置からの文字列を最終的な補正文字認識結果として出力する。 FIG. 14 is a diagram illustrating a specific operation flow of the correction processing unit 22.
In the correction processing unit 22, first, in step S5a, the character recognition candidates output by the character recognition unit 21 are input, and in step S5b, DP matching is used to calculate the position at which the distance between the character recognition candidates is shortest and the degree of subset matching is highest. The character string starting from the position with the highest degree of matching is then output as the final corrected character recognition result.

ここで、動画像におけるフレーム毎のナンバープレート文字切り出し精度は、環境変化や車体色などに影響を受けるために安定しない。さらに、車両走行中は画像ブレが生じるため、頻繁に切り出しミスが起き、文字列の桁数を誤る場合も多い。たとえば、バンパーやプレートの境目などを切り出してしまい、文字と誤判定してしまう現象はその典型的な例である。 Here, the license plate character extraction accuracy for each frame in the moving image is not stable because it is affected by environmental changes, vehicle body colors, and the like. In addition, image blurring occurs while the vehicle is running, so that frequent clipping errors often occur and the number of digits in the character string is often incorrect. For example, a phenomenon in which a bumper or a plate boundary is cut out and erroneously determined as a character is a typical example.

我が国のナンバープレートは桁数が既知であり、陸支コードや車種コードなども、文字の小ささを除けば数字のみで構成されている。したがって、我が国のナンバープレートを対象とする限り、車両ナンバーや陸支コード及び車種コードの既知の位置情報を用いて補正を行うことも可能であるが、海外のナンバープレートでは、桁数が一定でないものや、また、英数字の組み合わせでだけでなく、記号やマークなどの使用も認められているものもあり、こうした外国のナンバープレートを対象とする場合に、とりわけ誤判別を起こしやすい。 Japanese license plates have a known number of digits, and the land transport office code and the vehicle class code are also composed only of numerals, apart from their smaller character size. As long as only Japanese license plates are targeted, correction can therefore also be performed using the known positions of the vehicle number, the land transport office code, and the vehicle class code. Overseas license plates, however, may have a variable number of digits and may permit symbols and marks in addition to combinations of alphanumeric characters, so misclassification is particularly likely when such foreign plates are targeted.

そこで、本実施形態においては、前フレームの認識結果を用いて、現フレームの文字切り出し結果を補正する手法を用いる。この手法では、認識対象とする文字列の桁数は未知であっても構わない。前フレームで切り出された文字列と現フレームの文字列の対応点を求めて、切り出し位置のずれを吸収することにより、誤りを補正する。 Therefore, in the present embodiment, a method of correcting the character cutout result of the current frame using the recognition result of the previous frame is used. In this method, the number of digits of the character string to be recognized may be unknown. Corresponding points between the character string clipped in the previous frame and the character string of the current frame are obtained, and the error is corrected by absorbing the shift of the cutout position.

対応点の算出には、ＤＰマッチングを用いる。
図１５は、補正処理部２２におけるＤＰマッチングの概念図である。また、次式（１）は、ＤＰマッチングの計算式であり、この式（１）を用いて、前フレームと現フレームとの認識文字列間の最短距離を算出し、その位置を現フレームの補正結果とする。また、位置補正処理後の文字列を時系列データとして文字列スタックに投票し、桁位置ごとに最も出現頻度の高い文字を最終認識結果とする。 DP matching is used to calculate the corresponding points.
FIG. 15 is a conceptual diagram of DP matching in the correction processing unit 22. The following expression (1) is the DP matching formula; using it, the shortest distance between the recognized character strings of the previous frame and the current frame is calculated, and the position giving that distance is taken as the correction result for the current frame. The character strings after position correction are then voted into a character string stack as time-series data, and the character that appears most frequently at each digit position is taken as the final recognition result.

ここで、ｉ、ｊは、それぞれ現フレームと前フレーム内の認識文字の桁位置。ｇ（ｉ、ｊ）は認識文字列の始端から文字（ｉ、ｊ）までの累積距離である。この距離が最短となる位置を現フレームの補正位置とする。 Here, i and j are digit positions of recognized characters in the current frame and the previous frame, respectively. g (i, j) is the cumulative distance from the beginning of the recognized character string to the character (i, j). The position where this distance is the shortest is set as the correction position of the current frame.
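Because expression (1) itself is not reproduced in this text, the sketch below substitutes an ordinary edit-distance recurrence for the cumulative distance g(i, j); the unit costs, the backtracking rule, the `?` filler, and the use of the previous frame's string as a fixed-length reference are all assumptions made for illustration, not the formula of the patent. The second part shows the per-digit vote over the stack of position-corrected strings.

```python
from collections import Counter

def dp_align(cur, prev):
    """Return (distance, pairs); pairs maps digit i of cur to digit j of prev."""
    n, m = len(cur), len(prev)
    g = [[0] * (m + 1) for _ in range(n + 1)]   # g[i][j]: cumulative distance
    for i in range(1, n + 1):
        g[i][0] = i
    for j in range(1, m + 1):
        g[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if cur[i - 1] == prev[j - 1] else 1
            g[i][j] = min(g[i - 1][j - 1] + cost, g[i - 1][j] + 1, g[i][j - 1] + 1)
    pairs, i, j = [], n, m
    while i > 0 and j > 0:                      # backtrack along the shortest path
        cost = 0 if cur[i - 1] == prev[j - 1] else 1
        if g[i][j] == g[i - 1][j - 1] + cost:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif g[i][j] == g[i - 1][j] + 1:
            i -= 1                              # extra character in cur (e.g. a frame edge)
        else:
            j -= 1                              # character missing from cur
    pairs.reverse()
    return g[n][m], pairs

def corrected(cur, prev, filler="?"):
    """Place the characters of cur on the digit positions of prev."""
    out = [filler] * len(prev)
    for i, j in dp_align(cur, prev)[1]:
        out[j] = cur[i]
    return "".join(out)

def vote(stack, filler="?"):
    """Most frequent character at each digit position, ignoring the filler."""
    result = []
    for column in zip(*stack):
        counts = Counter(ch for ch in column if ch != filler)
        result.append(counts.most_common(1)[0][0] if counts else filler)
    return "".join(result)
```

For instance, a current-frame result of `14UMV844`, in which a frame edge was read as a leading `1`, is mapped back onto the seven digit positions of the previous result `4UMV844`.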

図１６は、認識結果出力部２３の具体的な動作フローを示す図である。
認識結果出力部２３では、ステップＳ６ａで、補正文字認識結果の入力があるか否かを判定し、補正文字認識結果の入力があれば、次に、ステップＳ６ｂで、所定の制約条件を満たしているか否かを判定する。そして、所定の制約条件を満たしていれば、ステップＳ６ｃで、補正文字認識結果が連続して同一であるか否かを判定し、補正文字認識結果が連続して同一であれば、ステップＳ６ｄで、その補正文字認識結果をディスプレイユニット１４に出力する。 FIG. 16 is a diagram illustrating a specific operation flow of the recognition result output unit 23.
In step S6a, the recognition result output unit 23 determines whether a corrected character recognition result has been input; if so, it then determines in step S6b whether the predetermined constraint conditions are satisfied. If they are satisfied, it determines in step S6c whether the same corrected character recognition result has been obtained consecutively, and if so, it outputs that corrected character recognition result to the display unit 14 in step S6d.
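The flow of steps S6a to S6d can be sketched as a small stateful gate. The class name `ResultGate`, the parameter `required_repeats` (the text only says the result must be "consecutively identical", not how many times), and the choice to reset the streak when a constraint fails are assumptions for illustration.

```python
class ResultGate:
    """Releases a corrected string only after it passed the constraints
    and was seen in consecutive frames."""

    def __init__(self, satisfies_constraints, required_repeats=2):
        self.satisfies_constraints = satisfies_constraints  # callable: str -> bool
        self.required_repeats = required_repeats            # assumed value
        self.last = None
        self.count = 0

    def push(self, corrected):
        """Return the string to display, or None if nothing should be output."""
        if not corrected:                                   # S6a: no input
            return None
        if not self.satisfies_constraints(corrected):       # S6b: constraints
            self.last, self.count = None, 0
            return None
        if corrected == self.last:                          # S6c: same as before?
            self.count += 1
        else:
            self.last, self.count = corrected, 1
        if self.count >= self.required_repeats:             # S6d: output
            return corrected
        return None
```

A caller would create one gate per scene and call `push` once per frame with the output of the correction processing.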

なお、所定の制約条件とは、図２の制約条件記憶部２４に記憶されている情報のことである。
図１７は、制約条件記憶部２４に記憶されている情報の一例を示す図である。図示の例では、４つの制約条件が記憶されている。第一の制約条件４１は、「桁数が最小文字数以下のときあるいは最大文字数以上のときには認識結果として出力しない。」というものであり、第二の制約条件４２は、「認識候補文字列が開始文字列に登録されている文字列で始まり、最小文字数以下又は最大文字数以上のときには認識結果として出力しない。」というものであり、第三制約条件４３は、「認識候補文字列の中に中間文字列に登録されている文字列が存在し、最小文字数以下又は最大文字数以上のときには認識結果として出力しない。」というものである。 Note that the predetermined constraint condition is information stored in the constraint condition storage unit 24 of FIG.
FIG. 17 is a diagram illustrating an example of the information stored in the constraint condition storage unit 24. In the illustrated example, four constraint conditions are stored. The first constraint condition 41 is “do not output as a recognition result when the number of digits is at or below the minimum number of characters or at or above the maximum number of characters.” The second constraint condition 42 is “do not output as a recognition result when the recognition candidate character string starts with a character string registered as a start string and its length is at or below the minimum number of characters or at or above the maximum number of characters.” The third constraint condition 43 is “do not output as a recognition result when a character string registered as an intermediate string is present in the recognition candidate character string and its length is at or below the minimum number of characters or at or above the maximum number of characters.”
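One possible reading of these constraint conditions is sketched below. The actual values stored in FIG. 17 are not given in the text, so the `Constraint` record, its field names, and every number and string used with it are hypothetical; the sketch also assumes that each condition carries its own minimum and maximum length.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Constraint:
    min_chars: int
    max_chars: int
    start: Optional[str] = None    # registered start string, if any
    middle: Optional[str] = None   # registered intermediate string, if any

    def rejects(self, candidate: str) -> bool:
        """True when this condition says the candidate must not be output."""
        if self.start is not None and not candidate.startswith(self.start):
            return False           # condition does not apply to this candidate
        if self.middle is not None and self.middle not in candidate:
            return False
        n = len(candidate)
        return n <= self.min_chars or n >= self.max_chars

def satisfies_constraints(candidate, constraints):
    """A candidate is output only if no stored condition rejects it."""
    return not any(c.rejects(candidate) for c in constraints)
```

A condition without `start` or `middle` corresponds to the first constraint condition, which applies to every candidate.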

次に、本実施形態の車両ナンバープレート読み取り装置１１の評価実験について説明する。
評価実験で使用した画像データは、カリフォルニア州の高速道路走行中に、自車１０の前方の視野内をＣＣＤカメラで撮影した多数のモノクロ画像（有用範囲８ビットで切り出した画像）を使用した。画像サイズは６４０×４８０画素である。また、大きな輝度変化に対応するため、ダイナミックレンジの広いＣＣＤカメラを用いた。画像データベースにはナンバープレートが写し込まれている４６種のシーンを格納した。なお、目視でも文字列が読み取れないシーンはデータベースから除去してある。表１に画像データの数を示す。 Next, an evaluation experiment of the vehicle license plate reader 11 according to this embodiment will be described.
As the image data used in the evaluation experiment, a large number of monochrome images (images cut out with a useful range of 8 bits) taken with a CCD camera in the field of view in front of the vehicle 10 while driving on a highway in California were used. The image size is 640 × 480 pixels. Also, a CCD camera with a wide dynamic range was used in order to cope with a large luminance change. The image database stored 46 types of scenes with license plates. Note that scenes whose character strings cannot be read visually are removed from the database. Table 1 shows the number of image data.

画像データは、大きく分けて装飾枠（ナンバープレートの装飾枠）が取り付けてあるものと、装飾枠がないものとに分類される。また、７桁の英数字列のものを標準レイアウトとし、それ以外のものを特殊レイアウトとする。 The image data is roughly classified into those having a decorative frame (number plate decorative frame) and those having no decorative frame. A 7-digit alphanumeric string is used as a standard layout, and other layouts are used as a special layout.

まず、位置検出処理について説明する。評価実験では、位置検出ＮＮの学習データとして、画像データベースに収められている全１６３９２枚の画像の中から、ナンバープレートが全て異なるように無作為に取り出した５３枚の画像を用いた。また、車のテールランプ部分やエンブレムなど、ＮＮが誤反応しやすい部分が含まれる１７枚の画像を抑制学習に用いた。 First, the position detection process will be described. In the evaluation experiment, as the learning data for the position detection NN, 53 images that were randomly extracted from all 16392 images stored in the image database so that the license plates were all different were used. In addition, 17 images including portions where the NN is likely to react erroneously, such as a tail lamp portion and an emblem of a car, were used for the suppression learning.

位置検出処理において、前フレームでナンバープレートの重心位置が判明している場合には、現フレームを探索するときに画像全体を検索する必要はない。また、画像ブレなどによりエッジが不鮮明な場合は、位置検出ＮＮが背景などに誤反応することもある。そこで、ナンバープレートの位置が前フレームの処理で判明している場合には、次フレームでその座標近傍のみを探索することにした。これにより、計算量を最小限に抑え、処理速度を向上させることができた。 In the position detection process, when the center of gravity position of the license plate is known in the previous frame, it is not necessary to search the entire image when searching for the current frame. In addition, when the edge is unclear due to image blurring or the like, the position detection NN may erroneously react to the background or the like. Therefore, when the position of the license plate is known by the processing of the previous frame, we decided to search only the vicinity of the coordinates in the next frame. As a result, the calculation amount can be minimized and the processing speed can be improved.

図１８は、フレーム間移動距離の分布図、図１９（ａ）は、探索領域限定による効果を示す図である。評価に用いた動画像データでのナンバープレートフレーム間移動距離は、図１８のフレーム間移動距離の分布図４５ａ、４５ｂに示すように、ｘ軸方向に最大３５画素、ｙ軸方向に２５画素である。図１９（ａ）の探索領域限定による効果図４６に示すように、各フレームを全検索する方法（棒グラフ４６ａ）に比べて、現フレームの位置検出座標から次フレームの検索領域を制限する方法（棒グラフ４６ｂ）は、位置検出精度が８７．１％（１４２８３／１６３９２）から９４．２％（１５４４６／１６３９２）となり、平均フレームレートは毎秒７．８フレームから毎秒１３．８フレームと大幅に向上した。さらに、文字認識処理においては、英数字が１文字も認識できない場合にも、非存在フレーム、もしくはナンバープレート重心位置の誤検出と判断する（棒グラフ４６ｃ）ことにより、認識率は９５．１％（１５５８３／１６３９２）となり、平均フレームレートも毎秒１４．３フレームに向上した。 FIG. 18 is a distribution diagram of the inter-frame movement distance, and FIG. 19(a) is a diagram showing the effect of limiting the search area. As shown in the distribution diagrams 45a and 45b of FIG. 18, the inter-frame movement distance of the license plate in the moving image data used for evaluation is at most 35 pixels in the x-axis direction and 25 pixels in the y-axis direction. As shown in the effect diagram 46 of FIG. 19(a), compared with searching each frame in full (bar graph 46a), limiting the search area of the next frame based on the position detected in the current frame (bar graph 46b) raised the position detection accuracy from 87.1% (14283/16392) to 94.2% (15446/16392) and the average frame rate from 7.8 to 13.8 frames per second. Furthermore, by treating frames in which the character recognition process cannot recognize even one alphanumeric character as frames without a plate or as false detections of the plate's center of gravity (bar graph 46c), the recognition rate became 95.1% (15583/16392) and the average frame rate rose to 14.3 frames per second.
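Limiting the search area around the previously detected center of gravity could be sketched as follows. The image size of 640 × 480 pixels and the maximum inter-frame movement of 35 and 25 pixels are taken from the evaluation; the extra margin `half_plate` for the plate's own extent and the function name `search_window` are assumptions.

```python
def search_window(prev_center, image_size=(640, 480),
                  max_shift=(35, 25), half_plate=(60, 30)):
    """Return (x0, y0, x1, y1), the region to search in the next frame.

    prev_center is the (x, y) center of gravity found in the previous frame,
    or None when no plate position is known yet. half_plate is a hypothetical
    margin covering half the plate width and height.
    """
    width, height = image_size
    if prev_center is None:
        return (0, 0, width, height)      # fall back to a full search
    cx, cy = prev_center
    rx = max_shift[0] + half_plate[0]
    ry = max_shift[1] + half_plate[1]
    return (max(0, cx - rx), max(0, cy - ry),
            min(width, cx + rx), min(height, cy + ry))
```

The window is clamped to the image borders, so a plate near an edge still yields a valid region.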

位置検出処理により切り出された装飾枠なしのナンバープレート画像（１０６７４枚）と装飾枠ありの画像（４９０９枚）を用いて英数字認識を行った。垂直方向エッジ情報を用いて、文字列の上下端を推定することで、文字列下端に接触している装飾枠などの背景領域を効果的に除去することができた。文字の上下端を決定する閾値は予備実験により、ヒストグラムの最大となる「頻度×０．０８５」とした。この手法により、装飾枠なしのナンバープレート英数字の８９．１％（９５１１／１０６７４）に加え、さらに、装飾枠ありの英数字の８３．５％（４０９９／４９０９）も切り出すことができた。 Alphanumeric recognition was performed using the license plate images without a decorative frame (10674 images) and with a decorative frame (4909 images) cut out by the position detection process. Estimating the upper and lower ends of the character string from vertical edge information made it possible to effectively remove background regions such as a decorative frame touching the lower end of the character string. Based on preliminary experiments, the threshold for determining the upper and lower ends of the characters was set to the maximum frequency of the histogram × 0.085. With this method, 89.1% (9511/10674) of the alphanumeric characters on plates without a decorative frame and 83.5% (4099/4909) of those on plates with one could be cut out.

文字認識処理で得られた文字列情報を用いて、ＤＰマッチングにより前フレームの文字列との対応点を求め、桁位置補正処理を行った。英数字認識ＮＮのみの場合の認識率は７２．５％であるが、ＤＰマッチングを用いた文字列補正処理を加えたことで、桁位置ずれや文字の脱落による誤認識が補正され、認識率９７．８％を得た。 Using the character string information obtained by the character recognition process, the corresponding point with the character string of the previous frame was obtained by DP matching, and the digit position correction process was performed. The recognition rate in the case of only the alphanumeric recognition NN is 72.5%, but by adding character string correction processing using DP matching, misrecognition due to digit position shifts or missing characters is corrected, and the recognition rate 97.8% was obtained.

図１９（ｂ）は、ＤＰマッチングによる認識改善例を示す図である。この図において、補正処理なしの場合は、シーンの最後まで誤認識フレームが断続的に存在するのに対し、ＤＰマッチングを用いた補正処理を加えた場合は、始めの２５フレーム程度で認識結果が確定した。これにより、誤認識フレームが補正されていることが確認できた。文字列先頭のセグメンテーション誤りによる誤認識や輝度変化による脱落誤りが生じた場合も同様に認識結果が補正された。 FIG. 19B is a diagram illustrating an example of recognition improvement by DP matching. In this figure, when there is no correction processing, erroneous recognition frames exist intermittently until the end of the scene, whereas when correction processing using DP matching is added, the recognition result is obtained in the first 25 frames. Confirmed. This confirmed that the misrecognized frame was corrected. The recognition result was corrected in the same way when there was a misrecognition due to a segmentation error at the beginning of the character string or a drop error due to a luminance change.

以上のとおり、車両搭載型のナンバープレート認識システムとその評価実験について述べたが、従来のＮＮ法による位置検出処理と文字認識処理に、動画像としての時系列情報を加えたことにより、撮影環境がリアルタイムに変化していくシーンにおいても、車両ナンバーを高精度に認識できることを確認できた。また、従来法では困難であった文字列にナンバープレートの装飾枠が隣接しているような場合においても、適応的に文字列を分離することが確認できた。 As described above, the vehicle-mounted license plate recognition system and its evaluation experiment have been described. By adding time-series information as moving images to the conventional position detection processing and character recognition processing by the NN method, It was confirmed that the vehicle number can be recognized with high accuracy even in a scene where the vehicle changes in real time. In addition, it was confirmed that the character strings were adaptively separated even when the decorative frame of the license plate was adjacent to the character strings, which was difficult with the conventional method.

以上のとおりであるから、本実施形態によれば、時系列的に同一対象（先行車１６のナンバープレート１７を含む被写体）を撮影して得られた動画を構成する複数の静止画の中で文字を含む枠の位置が特定できない複数の画像から、文字の桁数と文字配置の種類が多数あった場合であっても、装飾の制約条件が少ない枠内にある文字部分の画像にある文字の文字認識を正しく行うことができるという優れた効果が得られる。 As described above, according to the present embodiment, among a plurality of still images constituting a moving image obtained by photographing the same object (subject including the license plate 17 of the preceding vehicle 16) in time series. Even if there are many types of character digits and character arrangements from multiple images in which the position of the frame containing the characters cannot be specified, the characters in the image of the character part in the frame with few decoration restrictions It is possible to obtain an excellent effect that character recognition can be performed correctly.

また、枠内に表記されたすべての文字と絵を分離して文字認識をしなくても、英数字のみや英数字とカタカナなどの限られた文字のみを正確に認識することができれば、以下に示すような様々な用途で利用できる。 Also, if you can accurately recognize only alphanumeric characters or only limited characters such as alphanumeric characters and katakana without separating all characters and pictures written in the frame and recognizing them, the following It can be used for various purposes as shown in

たとえば、車載カメラを使って先行車のナンバープレートや、道路標識、あるいは、看板などに表記された文字を認識して利用することができる。利用の仕方としては、車両の追跡や盗難車等の特定、環境情報や公国情報の取得など様々なものが考えられる。なお、ナンバープレートでは、登録番号の部分が読み取れればよい。 For example, an in-vehicle camera can be used to recognize and use characters written on the license plate of a preceding vehicle, on road signs, or on signboards. Possible uses include tracking vehicles, identifying stolen vehicles, and acquiring environmental or advertising information. For license plates, it is sufficient that the registration number portion can be read.

また、動画カメラ付携帯電話に応用してもよい。ポスターやメニュー又は看板などに表記された文字を携帯電話機のカメラで撮影して認識し利用することができる。利用例としては、表記された電話番号やメールアドレスを認識してメールを送信したり電話を掛けたり、または、看板に表記された店舗名を認識して電話番号を検索したり電話を掛けたり、様々なものが考えられる。 Moreover, you may apply to the mobile phone with a moving image camera. Characters written on posters, menus, signboards, etc. can be recognized and used by photographing with a camera of a mobile phone. Examples of usage include recognizing the phone number or email address shown to send an email or making a call, or recognizing the store name shown on the signboard to find a phone number or making a phone call. Various things can be considered.

また、ビデオカメラで撮影した画像をパソコンなどの画像処理装置に入力して、入力した画像のなかで撮影されていた枠で囲まれた表示板、ポスター、看板、標識、値札、番号札、名札、ナンバープレートなどに表記された文字を認識して利用してもよい。利用例は、画像編集のタイトルとして使ったり、シーン検索のキーワードとして使ったり、または、撮影対象を特定するために使ったりすることができる。 Also, images taken with a video camera are input to an image processing device such as a personal computer, and a display board, poster, signboard, sign, price tag, number tag, name tag surrounded by a frame that was captured in the input image. The characters written on the license plate may be recognized and used. The usage example can be used as a title for image editing, as a keyword for scene search, or used to specify a shooting target.

これらの用途で共通する画像の特徴は、時系列で撮影された画像に同じ枠の画像が繰り返して現れる、画像のなかの枠の位置をあらかじめ知ることができない、画像ごとに文字画像の認識しやすさが変動することがある（詳しくは、カメラの動きや対象の動きによって画像中の枠の位置が移動することがある、自然光や照明、影などの環境変化によって枠や文字の照度が変動することがある、カメラと枠を構成する物体との間にある物体の影響で枠や文字が一時的に隠れることがある、枠内に文字以外で大きさと形状が文字と類似する画像が含まれることがある）、などである。 The image characteristics common to these applications are as follows: the image of the same frame appears repeatedly in images taken in time series; the position of the frame within the image cannot be known in advance; and the ease of recognizing the character image may vary from image to image (specifically, the position of the frame in the image may move with camera or subject motion; the illuminance of the frame and characters may vary with environmental changes such as natural light, lighting, and shadows; the frame and characters may be temporarily hidden by an object between the camera and the object forming the frame; and the frame may contain non-character images similar to characters in size and shape).

また、ナンバープレートや、道路標識、表示板など特定目的に使われる枠に含まれる画像には、さらに、枠内に表記される文字の種類と大きさや位置には制約がある、文字以外の画像の表記に制約がある、などの特徴がある。これらの特徴を活用することで、さらに認識率を向上させることができる。 In addition, images included in frames used for specific purposes, such as license plates, road signs, and display boards, are also non-character images that have restrictions on the type, size, and position of characters displayed in the frames. There are features such as restrictions on notation. By utilizing these features, the recognition rate can be further improved.

本発明では、前記の応用を可能とするため繰り返して撮影した画像（同一対象を複数回撮影して得られる時系列の画像）に含まれる枠の画像に関する特徴を最大限に活用して、枠内に表記された文字のうち限定された文字に関して正確に認識できるようにする。 In the present invention, the feature relating to the image of the frame included in the image repeatedly photographed (time-series image obtained by photographing the same object a plurality of times) in order to enable the application described above is utilized to the maximum extent. To make it possible to accurately recognize limited characters among the characters written in the box.

同一対象を複数回撮影して得られる時系列の画像には、枠の位置はカメラと枠の関係において制約された条件で画像内を移動することがある、時系列画像の中で枠内の同じ位置にある文字において、認識しやすい画像と認識しにくい画像があり、繰り返して撮影して認識すると正しい認識結果が含まれることがある、などの特徴がある。 For time-series images obtained by shooting the same object multiple times, the frame position may move within the image under conditions constrained by the relationship between the camera and the frame. Characters at the same position include an image that is easy to recognize and an image that is difficult to recognize, and there are features such that a correct recognition result may be included when repeatedly taken and recognized.

前後の画像の文字に対応関係があるが、誤認識の可能性があるので完全に一致するとは限らない。このため、複数回撮影した画像の特徴を使って認識精度を向上させようとすると、１枚の画像から文字認識を行う場合には発生しない以下の課題についても解決する必要がある。 There is a correspondence relationship between the characters in the preceding and following images, but there is a possibility of misrecognition, so they do not always match completely. For this reason, if it is attempted to improve the recognition accuracy by using the characteristics of an image taken a plurality of times, it is necessary to solve the following problems that do not occur when character recognition is performed from one image.

つまり、枠の画像内の位置が変動しても対応できるようにして、同じ枠の画像を使って認識精度を向上させる必要がある。また、表記された文字列の文字数は既知ではなく、枠内で認識される文字は全て正しい場合もあるが、誤認識された文字を含むことがあるので、そのような文字認識候補の対応関係をもとに一致度合いを算出して補正するためには、枠内の文字列の前後の部分で抜けが無いように認識する必要がある。また、複数回撮影しても、枠内の文字が枠あるいは枠の外側にあると誤認識されると、認識文字候補として出力されないので、補正できなくなる。また、一致度合いを計算できるためには、認識できない文字が含まれている場合にも、文字認識の際に文字抜けが発生しないようにして文字列の対応関係に関する情報が失われないように、認識できている文字の認識結果を出力する必要がある。 That is, it is necessary to improve the recognition accuracy by using the image of the same frame so that it can cope with the fluctuation of the position in the image of the frame. In addition, the number of characters in the written character string is not known, and all characters recognized in the frame may be correct, but may include misrecognized characters. In order to calculate and correct the degree of coincidence based on the above, it is necessary to recognize that there is no omission in the part before and after the character string in the frame. Even if the image is taken a plurality of times, if a character in the frame is erroneously recognized as being outside the frame or the frame, the character is not output as a recognized character candidate and cannot be corrected. In addition, in order to be able to calculate the degree of match, even when unrecognizable characters are included, so that character omission is not lost during character recognition so that information on the correspondence between character strings is not lost, It is necessary to output the recognition result of recognized characters.

The problems described above, which arise when images captured multiple times are used to improve recognition accuracy, are solved as follows.
(a) When a character string has been recognized inside the frame by character recognition on a previous image, the search range for the next frame is limited on the basis of that position information, to avoid cutting out an image of a frame different from the previous one.
(b) Recognition is performed, with the maximum expected number of characters, on an image that includes the area around the frame along the length direction of the character string. Image portions other than characters are included in the recognition, so that no part of the character string is missed.
(c) Character recognition is performed so that other characters and non-character images are not misrecognized as one of the limited set of characters. Unrecognizable characters are treated as indefinite characters, and the degree of matching is calculated so that the correspondence between characters in the recognition candidate strings does not shift.
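The first measure above (limiting the search range for the next frame from the previously recognized position) can be sketched as follows. This is a minimal illustration, not the implementation described in this document: the function name `next_search_window`, the box layout, and the `margin` value are all hypothetical.

```python
from typing import Optional, Tuple

# Hypothetical box layout: (left, top, right, bottom) in pixels.
Box = Tuple[int, int, int, int]


def next_search_window(prev_box: Optional[Box],
                       image_size: Tuple[int, int],
                       margin: int = 40) -> Box:
    """Return the region to search for the frame in the next image.

    prev_box is the frame position from the previous image, or None when
    no character string was recognized there. margin is an assumed
    allowance for movement between consecutive images.
    """
    width, height = image_size
    if prev_box is None:
        # Nothing recognized previously: search the whole image.
        return (0, 0, width, height)
    left, top, right, bottom = prev_box
    # Grow the previous position by the margin and clamp to the image.
    return (max(0, left - margin), max(0, top - margin),
            min(width, right + margin), min(height, bottom + margin))
```

For example, a previous frame position of `(100, 200, 300, 260)` in a 640x480 image with the default margin gives the window `(60, 160, 340, 300)`, so a different frame elsewhere in the image is not picked up.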

In character recognition, a scheme that simply takes the character with the highest similarity as the result will misrecognize non-alphanumeric characters as alphanumeric ones. To avoid this, it is desirable to output a character as a recognition candidate only when its similarity to one of the alphanumeric characters is high, and to output an indefinite character as the recognition candidate when the similarity to every alphanumeric character is low.
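The rejection rule in the preceding paragraph can be illustrated with a short sketch. It assumes the recognizer yields one similarity score per alphanumeric character; the `"?"` symbol for an indefinite character, the function name `recognize`, and the `threshold` value are hypothetical choices made only for this example.

```python
from typing import Dict

INDEFINITE = "?"  # hypothetical symbol for an indefinite character


def recognize(similarities: Dict[str, float], threshold: float = 0.8) -> str:
    """Pick the most similar character, or INDEFINITE if none is similar enough.

    similarities maps each alphanumeric character to its similarity score.
    threshold is an assumed cut-off, not a value from this document.
    """
    if not similarities:
        return INDEFINITE
    best_char = max(similarities, key=similarities.get)
    # Accept the best match only when its similarity is high.
    if similarities[best_char] >= threshold:
        return best_char
    return INDEFINITE
```

With these assumed values, `recognize({"1": 0.95, "I": 0.6})` returns `"1"`, while a weak match such as `recognize({"1": 0.4, "I": 0.35})` returns the indefinite character instead of forcing an alphanumeric result.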

With the above arrangement, for license plates that have many variations in the number of digits and in character layout and few constraints on decoration, and that appear in continuous moving images in which the position of the license plate cannot be specified, misrecognition of a character image as a non-character is suppressed.

In addition, the result of recognizing a non-character image is either determined to be an indefinite character by the character recognition unit or is not stably recognized as the same character, and is therefore excluded by the correction process. Segmentation errors in which characters are wrongly discarded because of interference with a frame or decoration are also suppressed. Further, when only part of the license plate is contained in the image, so that few digits are visible, the apparatus does not output a recognition result in which only the characters within the shooting range were recognized and the characters outside it were dropped. Likewise, no misrecognition with some characters dropped occurs when the license plate moves from outside the shooting range into it, or when it moves from inside the shooting range out of it.
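The way unstable results drop out during correction can be illustrated with a simplified sketch. Note the substitution: the correction processing unit in this document uses DP matching, whereas the example below assumes the candidate strings from successive images are already aligned and of equal length, and applies a plain per-position vote. The names `stable_characters`, `min_votes`, and the `"?"` indefinite symbol are hypothetical.

```python
from collections import Counter
from typing import List

INDEFINITE = "?"  # hypothetical symbol for an indefinite character


def stable_characters(candidates: List[str], min_votes: int) -> str:
    """Keep only positions where one character is recognized stably.

    candidates holds one recognition string per captured image, assumed
    to be aligned position by position (a simplification of DP matching).
    """
    result = []
    for column in zip(*candidates):
        # Indefinite characters never vote for anything.
        votes = Counter(ch for ch in column if ch != INDEFINITE)
        if not votes:
            continue
        char, count = votes.most_common(1)[0]
        # A position is kept only if enough images agree on it.
        if count >= min_votes:
            result.append(char)
    return "".join(result)
```

For instance, if a vertical frame line is read as `"I"`, `"?"`, and `"1"` in three images while the real digits are read consistently, `stable_characters(["I1234", "?1234", "11234"], 2)` yields `"1234"`: the first position never gathers two matching votes and is excluded.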

A system configuration diagram of the vehicle license plate reading device.
A conceptual functional block diagram of the vehicle license plate reading device 11.
A diagram showing the operation flow of the entire system of the vehicle license plate reading device 11.
A diagram showing the specific operation flow of the position detection unit 19.
A diagram showing the moving image 25 input by the image input unit 18.
A diagram showing the edge extraction image 26.
A conceptual diagram used when obtaining the center of gravity of the region of interest (here, the license plate 17).
A diagram showing an example of a license plate image.
A diagram showing the specific operation flow of the vertical background removal unit 20.
A license plate image 32 and its histogram distribution.
A diagram showing the image 36 containing only the line of the character string to be read (vehicle number) 34a.
A diagram showing the specific operation flow of the character recognition unit 21.
A conceptual structure diagram of the three-layer neural network (NN) for alphanumeric recognition.
A diagram showing the specific operation flow of the correction processing unit 22.
A conceptual diagram of DP matching in the correction processing unit 22.
A diagram showing the specific operation flow of the recognition result output unit 23.
A diagram showing an example of the information stored in the constraint condition storage unit 24.
A distribution chart of the inter-frame movement distance.
A diagram showing the effect of limiting the search area and a diagram showing an example of recognition improvement by DP matching.
A diagram showing an example of a character string image enclosed by a frame line.

Explanation of symbols

S1 Step (imaging step)
S2 Step (cutout step)
S3 Step (pattern recognition line extraction step)
S4 Step (extraction step)
S5 Step (removal step)
S6 Step (output step)
11 Vehicle license plate reading device (pattern recognition apparatus)
12 Television camera (imaging means)
18 Image input unit (imaging means)
19 Position detection unit (cutout means)
20 Vertical background removal unit (pattern recognition line extraction means)
21 Character recognition unit (extraction means)
22 Correction processing unit (removing means)
23 Recognition result output unit (output means)
31 License plate image (cutout image)
36 Image (image of the pattern recognition line)

Claims

A pattern recognition apparatus comprising:
imaging means for capturing images of the same subject at a constant or indefinite period and outputting them in time series;
cutout means for cutting out, from the image, a reading range that matches a predetermined shape;
pattern recognition line extraction means for extracting an image of a target pattern recognition line from the cutout image cut out by the cutout means;
extraction means for extracting pattern recognition candidates included in the image of the pattern recognition line;
removing means for removing non-pattern recognition candidates from the pattern recognition candidates; and
output means for outputting, as a definite pattern, the pattern recognition candidates from which the non-pattern recognition candidates have been removed.

The pattern recognition apparatus according to claim 1, wherein the removing means removes non-pattern recognition candidates from the pattern recognition candidates by DP matching.

The pattern recognition apparatus according to claim 1, wherein the predetermined shape is a shape corresponding to the outer shape of a license plate of a vehicle, and the pattern recognition line is a character string line on the license plate.

A pattern recognition method comprising:
an imaging step of capturing images of the same subject at a constant or indefinite period and outputting them in time series;
a cutout step of cutting out, from the image, a reading range that matches a predetermined shape;
a pattern recognition line extraction step of extracting an image of a target pattern recognition line from the cutout image cut out in the cutout step;
an extraction step of extracting pattern recognition candidates included in the image of the pattern recognition line;
a removal step of removing non-pattern recognition candidates from the pattern recognition candidates; and
an output step of outputting, as a definite pattern, the pattern recognition candidates from which the non-pattern recognition candidates have been removed.

The pattern recognition method according to claim 4, wherein the removal step removes non-pattern recognition candidates from the pattern recognition candidates by DP matching.

The pattern recognition method according to claim 4, wherein the predetermined shape is a shape corresponding to the outer shape of a license plate of a vehicle, and the pattern recognition line is a character string line on the license plate.

A character recognition method in which a frame-periphery image, which includes the image around a frame-shaped object or frame-shaped figure, is cut out from each of a plurality of images obtained by capturing, in time series, the same subject including the frame-shaped object or frame-shaped figure; character recognition is performed on each of the frame-periphery images corresponding to the plurality of images to calculate a plurality of character recognition candidate character strings; and a character string for which the degree of matching among subsets of the character recognition candidate character strings is high is output as the character recognition result.

The character recognition method according to claim 7, wherein the frame-periphery image is searched in the vertical direction to detect a portion that does not include a character region, the portion that does not include the character region is deleted from the frame-periphery image, and character recognition is performed on the frame-periphery image from which that portion has been deleted.