JP4492258B2 - Character and figure recognition and inspection methods

Info

Publication number: JP4492258B2
Application number: JP2004247342A
Authority: JP
Inventors: 良介三高
Original assignee: Panasonic Corp; Matsushita Electric Works Ltd
Current assignee: Panasonic Corp; Panasonic Electric Works Co Ltd
Priority date: 2004-08-26
Filing date: 2004-08-26
Publication date: 2010-06-30
Anticipated expiration: 2024-08-26
Also published as: JP2006065581A

Description

The present invention relates to a character/figure recognition method and inspection method in which dot characters and figures printed or engraved on an object, mainly an industrial product (for example, processed food), for purposes such as product management are read with an image input device such as a TV camera or a scanner, and are recognized by applying image processing to the image obtained by the image input device.

Various techniques have been proposed for recognizing characters and figures by reading them with an image input device such as a TV camera or a scanner and applying image processing to the resulting image. (Here "characters and figures" means characters or figures of the kind a character generator can produce; they are referred to below simply as "characters".) Because character recognition is a pattern recognition technique that maps an input to one of a set of predefined categories, it requires techniques such as a description of character structure that makes the mapping to categories easy, correction for deformation of characters, and segmentation that cuts out the characters written on the object one at a time.

For example, to recognize characters that deform in complex ways, such as handwritten characters, it has been proposed to describe the character structure as follows (see, for example, Patent Documents 1 and 2). In the techniques of Patent Documents 1 and 2, the character image is thinned and branch points and intersections are extracted as singular points. The structure around each singular point is classified into one of three types: a crossing as in "X", a contact as in "K", or a branch as in "T". The lines connected at singular points are thereby divided into a plurality of strokes. Each stroke is then represented by a primitive sequence, which is a connected structure of monotone curves called primitives, and the character structure is described by the combination of primitive sequences and singular points. A stroke corresponds to a continuous part of a handwritten character (roughly, a part written in a single stroke), and a primitive is a monotone curve along which the x and y coordinates change monotonically in the two-dimensional image. The connection structure of primitives is expressed by the direction of the convexity formed by connecting two primitives and by the order in which the primitives are arranged around the connection point in a right-handed system; a primitive sequence is the set of connection structures of the primitives that make up one stroke.

Patent Documents 1 and 2 also disclose a technique for correcting character deformation: using the bounding box (the smallest upright rectangle enclosing the character), the scale is normalized so that the longer side becomes unit length without changing the aspect ratio of the bounding box, and a further correction is applied by affine transformation.

That is, in the techniques of Patent Documents 1 and 2, the characters in the input image are first segmented, and each is then thinned and analyzed to obtain its structure expressed by primitive sequences and singular points. Category candidates are next extracted by structural matching between a partial image cut out from the input image and the models. The partial image is then normalized, its distance to each model is computed, and the similarity is evaluated, so that the partial image is mapped to the category of a model character. Using both structural and quantitative information about the character in this way enables robust pattern recognition.

Patent Documents 1 and 2 do not specifically describe a segmentation technique. One proposed segmentation technique divides the skeleton line figure obtained by thinning an input character string image into partial curves using simple arcs and simple closed curves, and then estimates the possibility that characters touch one another from the presence or absence of branch points or intersections in each partial curve, the width of the partial curve, and the height of the character string (see, for example, Patent Document 3). Patent Document 3 also discloses a technique in which character strings generated by combining the partial curves are recognized using a technique similar to that of Patent Documents 1 and 2, a network is built whose nodes are the partial figures recognized as characters, and the combination of recognized characters at the nodes is output by selecting an optimal path through the network. It further describes that, when the number of characters is known, the optimal path is selected from among the paths whose number of nodes equals the number of characters.

In the techniques of Patent Documents 1 and 2, the structural information of characters is used to narrow down the category of the input image, but that structural information is based on branch points and intersections of line figures, and therefore presupposes that the lines making up a character are continuous. The same applies to the technique of Patent Document 3.

Consequently, even for characters with relatively little deformation, such as characters printed or engraved on an object, the structural information cannot be extracted accurately from dot characters, which are widely used for indications such as best-before and use-by dates on food, from faint characters caused by, for example, printer malfunctions (hereinafter "faint characters"), or from characters whose lines, which should be continuous, are broken into discontinuous pieces (hereinafter "divided characters"). Even if the techniques of Patent Documents 1 to 3 are adopted, such characters may not be recognized accurately.

To address this kind of problem, the present inventor previously proposed a technique (Japanese Patent Application No. 2003-4258533) in which skeleton lines of a character/figure are extracted from the input image and approximated by approximate lines that are simple shapes, a set of mutually close approximate lines is extracted as a target cluster, and the target cluster is collated with master clusters in which standard characters/figures are represented in advance as sets of approximate lines in the same way as the target cluster. In this technique, the circumscribed rectangles of the target cluster and the master cluster are extracted beforehand and aligned so that they overlap; the degree of proximity between the approximate lines of the two is then evaluated as a shape similarity, and the master cluster with the highest shape similarity is associated with the category of the target cluster of interest.

Patent documents cited: JP-A-7-21317; Japanese Patent No. 3183949; JP-A-6-76114

As described above, the technique disclosed in Japanese Patent Application No. 2003-4258533 collates the target with master clusters expressed as line figures instead of using the character structure, and thereby enables correct recognition even of dot characters, faint characters, and divided characters. However, when the dot spacing is wide relative to the dot size, as in dot characters made of very small dots, the target cluster offers few features that can be matched against the approximate lines of the master cluster, so the shape similarity falls and the recognition rate may drop.

The present invention has been made in view of the above. Its object is to provide a character/figure recognition method that can recognize dot characters/figures, faint characters/figures, divided characters/figures, and the like, and in particular can accurately recognize dot characters/figures whose dot spacing is wide relative to the dot size, and a character/figure inspection method that can inspect the legibility of characters/figures.

The inventions of claims 1 to 6 relate to a character/figure recognition method.

The invention of claim 1 comprises a target data extraction process and a matching degree calculation process. In the target data extraction process, the skeleton lines of a character/figure are extracted from the image of the character/figure input as the recognition target and are approximated as a set of approximate lines that are simple shapes; continuous approximate lines are added to one approximate line group, approximate lines in the vicinity of that group are also added to it by evaluating the distance between each pair of non-continuous approximate lines, and the approximate line group to be handled as one unit is extracted as a target cluster. The matching degree calculation process evaluates the degree of agreement between the target cluster and a dot character/figure master, which consists of the arrangement of dots in a dot character/figure, formed by placing dots at the placement points that follow the shape of the character/figure within a matrix of multiple placement points vertically and horizontally, together with the dimensions of the circumscribed rectangle of the dot character/figure. In the matching degree calculation process, the circumscribed rectangle of the dot character/figure master is first aligned with the circumscribed rectangle set for the target cluster. A region of interest is then set with reference to each placement point of the dot character/figure master, and a dot matching degree, which quantifies the degree to which a dot is present in the target cluster, is obtained for each region of interest from the positional relationship between the approximate lines of the target cluster and that region of interest. A cluster matching degree, which expresses numerically the similarity between the dot matching degrees of the regions of interest and the arrangement of dots in the dot character/figure master, is then obtained, and when the cluster matching degree is equal to or greater than a threshold, the target cluster is recognized as the character/figure represented by the dot character/figure master.

With this method, a dot character/figure master consisting of an arrangement of dots is used as the master data (a kind of template) for collation with the image of the character/figure to be recognized, and the degree of overlap between the regions of interest, set near the placement points where dots can be placed in the dot character/figure master, and the approximate lines belonging to the target cluster is evaluated. Because the target cluster is used, dot characters/figures, faint characters/figures, divided characters/figures, and the like can be recognized. Moreover, because a dot character/figure is collated against dots in their ideal arrangement, it can be recognized regardless of how sparse or dense the dot spacing is; that is, recognition is possible even when the dot spacing is wide relative to the dot size.
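The per-region dot matching degree can be illustrated with a minimal sketch. The text does not specify the shape of the region of interest or how overlap is scored, so the circular region of radius `radius`, the linear falloff with distance, and all function names here are assumptions made only for illustration.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b (all (x, y) tuples)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Parameter of the closest point, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def placement_points(rect, rows=7, cols=5):
    """Spread rows x cols placement points evenly over rect = (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = rect
    return [(x0 + (x1 - x0) * c / (cols - 1), y0 + (y1 - y0) * r / (rows - 1))
            for r in range(rows) for c in range(cols)]

def dot_matching_degrees(points, segments, radius):
    """Assumed scoring: 1.0 when an approximate line passes through the
    placement point, falling linearly to 0.0 at the edge of a circular
    region of interest of the given radius."""
    degrees = []
    for p in points:
        d = min((point_segment_distance(p, a, b) for a, b in segments),
                default=float("inf"))
        degrees.append(max(0.0, 1.0 - d / radius))
    return degrees
```

For a target cluster consisting of a single vertical approximate line along the left edge of the circumscribed rectangle, only the seven placement points in the left column receive a nonzero degree.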

In the invention of claim 2, in the invention of claim 1, the normalized cross-correlation value between the dot matching degrees of the regions of interest and the arrangement of dots in the dot character/figure master is used as the cluster matching degree.

With this method, because the normalized cross-correlation value between the dot matching degrees and the arrangement of dots in the dot character/figure master is used as the cluster matching degree, the cluster matching degree falls both when a dot that should be present is missing and when noise is present where no dot should exist, so the similarity between the dot character/figure master and the target cluster can be evaluated easily. In addition, because the cross-correlation value is normalized, it is robust against changes in dot size caused by, for example, bleeding of the printed dot character/figure.
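A normalized cross-correlation of this kind can be sketched as follows. The text does not state which normalization is used; the zero-mean form below is an assumption, and the function name is hypothetical.

```python
import math

def normalized_cross_correlation(degrees, master):
    """Zero-mean normalized cross-correlation between the dot matching
    degrees and the 0/1 dot arrangement of the master (assumed form)."""
    n = len(degrees)
    if n == 0 or n != len(master):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_d = sum(degrees) / n
    mean_m = sum(master) / n
    num = sum((d - mean_d) * (m - mean_m) for d, m in zip(degrees, master))
    den = math.sqrt(sum((d - mean_d) ** 2 for d in degrees)
                    * sum((m - mean_m) ** 2 for m in master))
    # A constant input carries no shape information; treat it as no match.
    return num / den if den > 0.0 else 0.0
```

Scaling all dot matching degrees by a common factor leaves the value unchanged, which is one way to see the robustness to uniform changes in dot size, while a missing dot or a noise response lowers it.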

In the invention of claim 3, in the invention of claim 1 or claim 2, the arrangement of dots in the dot character/figure master consists of the positions of the dots in the dot character/figure and a direction value at each dot position that reflects the direction in which the virtual lines connecting the dots extend. In the matching degree calculation process, for each dot, a corresponding point of the target cluster is searched for within a prescribed threshold distance from the dot in the direction crossing the direction in which the virtual line extends; when corresponding points are detected, the dot character/figure master is deformed by an affine transformation so that the agreement between each dot of the master and its corresponding point increases, and the dot matching degrees are then obtained.

With this method, errors in the alignment between the target cluster and the dot character/figure master can be corrected, so a dot character/figure can be recognized accurately even when it is displaced, rotated, or changed in scale.
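One possible way to obtain such an affine transformation, once master dots have been paired with corresponding points, is a least-squares fit. The text only requires that the agreement increase; the least-squares criterion, the function names, and the small linear solver below are assumptions for illustration, and the search for corresponding points is not shown.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: points may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_affine(src, dst):
    """Least-squares affine map taking master dots (src) to their
    corresponding points (dst). Returns ((a, b, tx), (c, d, ty))."""
    rows = [(x, y, 1.0) for x, y in src]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atu = [sum(r[i] * u for r, (u, _) in zip(rows, dst)) for i in range(3)]
    atv = [sum(r[i] * v for r, (_, v) in zip(rows, dst)) for i in range(3)]
    return tuple(solve_linear(ata, atu)), tuple(solve_linear(ata, atv))

def apply_affine(params, p):
    """Apply the fitted map to a point (x, y)."""
    (a, b, tx), (c, d, ty) = params
    x, y = p
    return (a * x + b * y + tx, c * x + d * y + ty)
```

At least three non-collinear correspondences are needed; the fitted map is then applied to every placement point of the master before the dot matching degrees are computed.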

In the invention of claim 4, in the invention of any of claims 1 to 3, the image of the character/figure is a grayscale image, and pixels whose brightness is within a prescribed range and whose spatial second derivative value is equal to or greater than a prescribed threshold are the pixels from which the skeleton lines are extracted.

With this method, the spatial second derivative, which yields large values for line-like or dot-like patterns, is combined with binarization, which divides regions by brightness. The influence of edges where brightness changes abruptly and of uneven brightness that varies smoothly is thereby removed, and only the features of the lines or dots making up the character/figure to be recognized are extracted.
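The combination of a brightness range and a second-derivative threshold can be sketched as below. The 4-neighbor Laplacian kernel, the assumption of dark characters on a brighter background (so that the Laplacian is positive at a dot), and the function name are all illustrative choices not taken from the text.

```python
def select_stroke_pixels(image, low, high, laplacian_threshold):
    """Return a boolean mask of pixels whose brightness lies in [low, high]
    and whose 4-neighbor Laplacian is at least laplacian_threshold.
    image is a list of rows of brightness values; border pixels are skipped."""
    h, w = len(image), len(image[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (image[y - 1][x] + image[y + 1][x]
                   + image[y][x - 1] + image[y][x + 1]
                   - 4 * image[y][x])
            if low <= image[y][x] <= high and lap >= laplacian_threshold:
                mask[y][x] = True
    return mask
```

An isolated dark dot produces a much larger Laplacian response than the dark side of a straight step edge of the same contrast, so a suitably chosen threshold keeps the dot and rejects the edge.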

In the invention of claim 5, in the invention of any of claims 1 to 3, the image of the character/figure is a grayscale image, the brightness distribution of the region corresponding to the background is approximated by a quadratic surface, and pixels whose brightness differs from the brightness given by that quadratic surface by a prescribed threshold or more are the pixels from which the skeleton lines are extracted.

With this method, when the background has relatively smooth and simple brightness unevenness, the brightness distribution obtained from the background is approximated by a quadratic surface, and the brightness in the character/figure region is binarized by its difference from the brightness predicted from that distribution, so the influence of uneven brightness is removed at binarization. In addition, the quadratic surface can be obtained by the least squares method, which requires fewer operations than computing the spatial second derivative, so the removal of uneven brightness can be carried out with a small amount of computation.
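A least-squares fit of a quadratic surface z = a·x² + b·y² + c·xy + d·x + e·y + f to background samples can be sketched as follows. How the background samples are chosen is not described in this passage, so the caller supplies them; the function names and the plain normal-equation solver are assumptions for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: too few or degenerate samples")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def basis(x, y):
    """Quadratic surface terms for pixel (x, y)."""
    return [x * x, y * y, x * y, x, y, 1.0]

def fit_quadratic_surface(samples):
    """samples: list of (x, y, brightness) taken from the background."""
    rows = [basis(x, y) for x, y, _ in samples]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(6)] for i in range(6)]
    atz = [sum(r[i] * z for r, (_, _, z) in zip(rows, samples)) for i in range(6)]
    return solve_linear(ata, atz)

def select_by_background_difference(image, coeffs, threshold):
    """Return (x, y) of pixels whose brightness differs from the fitted
    background by at least threshold."""
    selected = []
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            predicted = sum(c * b for c, b in zip(coeffs, basis(x, y)))
            if abs(value - predicted) >= threshold:
                selected.append((x, y))
    return selected
```

In practice the pixel coordinates may need to be rescaled (for example to the range 0 to 1) to keep the normal equations well conditioned on large images.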

In the invention of claim 6, in the invention of any of claims 1 to 3, the image of the character/figure is a grayscale image, and when the pixels from which the skeleton lines are extracted are selected by binarizing brightness, a brightness range that includes the brightness of the character/figure is set so that the area occupied in the image by the pixels within that range is equal to or greater than the known area occupied by the character/figure and falls within a prescribed range of areas, and the pixels within that brightness range are the pixels from which the skeleton lines are extracted.

With this method, when the character/figure to be recognized is known, the threshold for binarization is set by referring to the cumulative frequency of the brightness histogram on the basis of the area that the character/figure occupies in the image. An appropriate threshold can therefore be set, and the character/figure region extracted accurately, even when the character/figure occupies only a small area of the image.
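Threshold selection from the cumulative histogram can be sketched as below. The sketch assumes dark characters on a brighter background, integer brightness values, and a known character area given as a pixel count; the upper limit on the area mentioned in claim 6 is not enforced here, and the function name is hypothetical.

```python
def cumulative_area_threshold(image, known_area, levels=256):
    """Return the lowest brightness level t such that the number of pixels
    with brightness <= t is at least known_area (dark characters assumed)."""
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1
    cumulative = 0
    for level, count in enumerate(hist):
        cumulative += count
        if cumulative >= known_area:
            return level
    return levels - 1
```

Pixels with brightness at or below the returned level are then treated as belonging to the character/figure.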

The inventions of claims 7 to 10 relate to a character/figure inspection method.

The invention of claim 7 has a legibility evaluation process that uses the dot matching degrees obtained at the position of the dot character/figure master at which the cluster matching degree becomes equal to or greater than the threshold in the matching degree calculation process of the character/figure recognition method according to any one of claims 1 to 6, together with the arrangement of dots in the dot character/figure master. For each placement point of the dot character/figure master, the process judges whether the presence or absence of a dot agrees with the image of the character/figure, and judges the quality of the character/figure to be recognized from the number of placement points that disagree.

With this method, the degree of difference between the character/figure to be recognized and the dot character/figure master can be evaluated dot by dot, so legibility can be evaluated even when the recognition target is a character/figure made up of an arrangement of dots.

In the invention of claim 8, in the invention of claim 7, the legibility evaluation process treats a placement point at which the presence or absence of a dot in the dot character/figure master disagrees with the image of the character/figure as a dot dropout point when the master has a dot at that placement point and as a noise point when the master has no dot there, and judges legibility to be defective when either the number of dot dropout points or the number of noise points exceeds a prescribed threshold.

With this method, missing dots and noise in the character/figure to be recognized can each be detected as defects.
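Counting dot dropout points and noise points can be sketched as follows. The text does not say how a dot matching degree is converted into a present/absent decision, so the `presence_threshold` parameter, like the function name, is an assumption.

```python
def evaluate_legibility(master, degrees, presence_threshold,
                        max_missing, max_noise):
    """Compare the 0/1 master arrangement with the dot matching degrees.
    Returns (dropout_count, noise_count, legible)."""
    missing = 0
    noise = 0
    for m, d in zip(master, degrees):
        present = d >= presence_threshold
        if m == 1 and not present:
            missing += 1      # master has a dot, image does not
        elif m == 0 and present:
            noise += 1        # image has a response where no dot belongs
    legible = missing <= max_missing and noise <= max_noise
    return missing, noise, legible
```

Legibility is judged defective as soon as either count exceeds its own limit, which matches the "either count exceeds a threshold" condition of claim 8.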

In the invention of claim 9, in the invention of claim 7 or claim 8, the placement points of important parts that affect legibility are registered in advance in the dot character/figure master, and the legibility evaluation process judges legibility to be defective when a placement point at which the presence or absence of a dot in the dot character/figure master disagrees with the image of the character/figure belongs to an important part.

With this method, dot dropout and noise are judged with the parts of a character/figure that are easily mistaken for another character/figure treated as important parts, so the method not only judges legibility but also reduces the possibility that the character/figure is misread.
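The important-part check can be sketched in a few lines. Representing the registered important parts as a list of placement-point indices, the `presence_threshold` parameter, and the function name are assumptions for illustration.

```python
def fails_on_important_points(master, degrees, important,
                              presence_threshold=0.5):
    """Return True when any registered important placement point disagrees
    with the master, whether by dropout or by noise."""
    for index in important:
        present = degrees[index] >= presence_threshold
        if present != (master[index] == 1):
            return True
    return False
```

A single disagreement at an important placement point is enough to judge legibility defective, regardless of the overall dropout and noise counts.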

In the invention of claim 10, in the invention of any of claims 7 to 9, when there are a plurality of characters/figures to be recognized and they are characters made up of arrangements of dots, the legibility evaluation process obtains, for each row of dots running along the direction in which the characters/figures are lined up, the number of placement points at which the presence or absence of a dot in the dot character/figure master disagrees with the image of the character/figure and at which the master has a dot, both for the characters/figures to be recognized and for the dot character/figure master. When the number obtained from the characters/figures to be recognized divided by the number obtained for the dot character/figure master is equal to or less than a prescribed threshold, the row of dots for which the numbers were obtained is judged to be missing from the characters/figures to be recognized.

With this method, when characters/figures are formed by arrangements of dots, a so-called line dropout, in which an entire row of dots is lost because of an abnormality in the device that prints or engraves the characters/figures, can be detected at an early stage, so an abnormality in that device can be found automatically and early.
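Line dropout detection can be sketched as a per-row ratio. The claim wording is hard to read unambiguously; this sketch assumes that the count for the recognition target is the number of master dots actually detected in a row, summed over all characters, and that the count for the master is the number of dots the masters place in that row. The parameter names and default values are hypothetical.

```python
def find_missing_rows(masters, degrees_list, rows=7, cols=5,
                      presence_threshold=0.5, ratio_threshold=0.2):
    """masters: one 0/1 list of rows*cols values per character (row-major).
    degrees_list: the matching dot matching degrees per character.
    Returns the indices of rows judged to have dropped out."""
    missing_rows = []
    for r in range(rows):
        expected = 0
        detected = 0
        for master, degrees in zip(masters, degrees_list):
            for c in range(cols):
                i = r * cols + c
                if master[i] == 1:
                    expected += 1
                    if degrees[i] >= presence_threshold:
                        detected += 1
        # Rows in which no master has a dot cannot be judged.
        if expected > 0 and detected / expected <= ratio_threshold:
            missing_rows.append(r)
    return missing_rows
```

Pooling the counts over all characters in the string makes the judgment less sensitive to a single character that happens to have few dots in a given row.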

According to the character/figure recognition method of the present invention, a dot character/figure master consisting of an arrangement of dots is used as the master data for collation with the image of the character/figure to be recognized, and the degree of overlap between the regions of interest, set near the placement points where dots can be placed in the dot character/figure master, and the approximate lines belonging to the target cluster is evaluated. Using the target cluster therefore has the advantage that dot characters, faint characters, divided characters, and the like can be recognized. Moreover, because a dot character/figure is collated against dots in their ideal arrangement, it can be recognized regardless of how sparse or dense the dot spacing is, which has the advantage that recognition is possible even when the dot spacing is wide relative to the dot size.

According to the character/figure inspection method of the present invention, the degree of difference between the character/figure to be recognized and the dot character/figure master can be evaluated dot by dot, which has the advantage that legibility can be evaluated even when the recognition target is a character/figure made up of an arrangement of dots.

FIG. 2 shows an example of the apparatus used in this embodiment. In the illustrated example, a TV camera 12 is provided as the image input device for inputting an image of a sheet-like or three-dimensional object 11. Characters (meaning characters/figures, as in the conventional configuration) are printed or engraved on the object 11 (engraving leaves the surface of the object 11 depressed or raised in an embossed form), and the TV camera 12 captures an image that includes the characters. This embodiment is mainly intended for recognizing printed or engraved characters and does not assume recognition of handwritten characters. The analog image signal captured by the TV camera 12 is input to a digital image generation device 13 and converted into a digital signal, and the digital image generation device 13 generates a grayscale image whose pixel values are brightness. This grayscale image is stored in a storage device 14, and an image processing device 15 applies the image processing described below to the grayscale image stored in the storage device 14. The storage device 14 consists of a semiconductor memory and a hard disk device and stores, in addition to the captured image of the object 11, the dot character masters needed for character recognition. The description below assumes that the characters are printed on the object 11, but the same technique can be used when the characters are engraved on the object 11.

The dot character master is a kind of template for collation with the characters printed on the object 11. As shown in FIG. 3, a matrix of m vertical × n horizontal placement points qi (7 × 5 in this embodiment; i = 1, 2, ..., 35) forms the character unit for one character, and dots di (i = 1, 2, ...) are placed at the placement points qi that follow the shape of the character within the character unit. In general, when a person recognizes an arrangement of dots as a character, it is thought that the person supplies a virtual line ri (i = 1, 2, ...) between each pair of adjacent dots and, by connecting the virtual lines ri along the character shape, recognizes the arrangement of dots di as a character made up of a series of virtual lines ri.

Accordingly, the dot character master uses data that combines the position of each dot di within the character unit, a direction value that can indicate the direction in which the virtual lines ri connecting the dots di extend (a value representing the direction shown by the double-headed arrows in FIG. 3), and the circumscribed rectangle enclosing the character unit. In the dot character master, a placement point qi at which a dot di is placed is assigned a value different from that of a placement point qi at which no dot di is placed. For example, "1" is assigned to placement points qi with a dot di and "0" to placement points qi without one. When only one virtual line ri is connected to a dot di, the direction value is the direction orthogonal to that virtual line ri; when two virtual lines ri are connected to a dot di, it is the direction that bisects the angle formed by the two virtual lines ri. The direction value therefore carries information corresponding to the normal direction, at the position of the dot di, of the character represented by the connected virtual lines ri. Moreover, when two virtual lines ri are connected to one dot di, using the direction value reduces the amount of data compared with describing each virtual line ri individually. When a character expands or contracts, the position of each dot di is considered to move in the direction represented by its direction value, so to match the size of the dot character master to the characters in the image obtained from the object 11, the dots di need only be displaced in the directions represented by their direction values.
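The rule for the direction value can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `direction_value`, the coordinate convention (x to the right, y upward), and the neighbour positions in the usage note are hypothetical and are not taken from the figures. Directions are treated as undirected, so the result is reduced to the range 0 to 180 degrees.

```python
import math


def direction_value(dot, neighbours):
    """Direction value of a dot in degrees, in the range [0, 180).

    dot        -- (x, y) position of the dot (y axis pointing up)
    neighbours -- positions of the one or two dots joined to it by virtual lines
    """
    units = []
    for nx, ny in neighbours:
        dx, dy = nx - dot[0], ny - dot[1]
        length = math.hypot(dx, dy)
        units.append((dx / length, dy / length))

    if len(units) == 1:
        # One virtual line: direction orthogonal to that line.
        ux, uy = units[0]
        bx, by = -uy, ux
    elif len(units) == 2:
        # Two virtual lines: the sum of the unit vectors bisects their angle.
        bx = units[0][0] + units[1][0]
        by = units[0][1] + units[1][1]
        if math.hypot(bx, by) < 1e-9:
            # Collinear lines: the bisector is the normal of the line.
            ux, uy = units[0]
            bx, by = -uy, ux
    else:
        raise ValueError("expected one or two connected virtual lines")

    return math.degrees(math.atan2(by, bx)) % 180.0
```

If a dot is assumed to have one neighbour directly to its right and one diagonally below-left, for example `direction_value((1, 6), [(2, 6), (0, 5)])`, the two virtual lines form an angle of 135 degrees and the bisecting direction comes out at about 112.5 degrees.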

FIG. 3 shows an example of a dot character master representing the character "2". The arrangement of dots di is expressed by pairing the presence or absence of a dot di at each placement point qi with the direction value of each dot di, and the position and dimensions of the circumscribed rectangle rc are combined with this dot arrangement to form the dot character master. In FIG. 3, for example, there is no dot at placement point q1, a dot d2 exists at placement point q2, and the direction value of dot d2 is 112.5 degrees with respect to the horizontal direction.

The basic procedure of the image processing performed by the image processing device 15 is shown in FIG. 1. First, in a target data extraction process (S1), the object 11 whose characters are to be recognized is imaged together with the characters, and the attributes of the characters to be compared with the dot character master are extracted from the image as target data. In conventional techniques, segmentation is performed as preprocessing before target data extraction, dividing the image into individual characters. In this embodiment, no segmentation is performed in the target data extraction process. Instead, dot characters, broken characters, characters containing discontinuous parts, and the like are handled as a group, which keeps them from being split into units smaller than one character. That is, if the distance between separated parts is within a prescribed threshold, the separated parts are treated as one unit. Such a unit is hereinafter called a cluster. A cluster extracted from the target data is called a target cluster. In most cases a target cluster contains at least one character, and it may contain several characters.

Once target data divided into target clusters has been obtained in the target data extraction process, a matching degree calculation process (S2) collates the dot character master with each target cluster. In this process, the dot character master is aligned with the target cluster, a dot matching degree representing the presence or absence of the dot di corresponding to each placement point qi of the dot character master is obtained, and a cluster matching degree representing the similarity between the target cluster and the dot character master as a whole is calculated from the array of dot matching degrees. In other words, the matching degree calculation process recognizes a character by evaluating how well each dot character master matches the target cluster and determining which dot character master the character represented by the target cluster corresponds to.

After the characters have been recognized, a legibility evaluation process (S3) is performed. The position at which the cluster matching degree obtained in the matching degree calculation process is highest is taken as the collation position, and the dot matching degrees are obtained again at that position. The consistency between the dot positions in the dot character master and the image of the characters obtained from the object 11 is verified, whether the characters printed on the object 11 are legible is judged, and finally the quality of the characters on the object 11 is determined.

The processing procedures of the target data extraction process, the matching degree calculation process, and the legibility evaluation process are described in detail below.

FIG. 4 shows the processing procedure of the target data extraction process. In this process, the input grayscale image is binarized (S11), and skeleton lines are extracted from the binary image (S12). A well-known technique such as the Hilditch method can be used to extract the skeleton lines, but the binarization must use a technique that leaves no extraneous features other than the characters.

To illustrate the problem with binarization, FIG. 5(a) shows a grayscale image containing uneven lightness and noise, and FIG. 5(b) shows the change in lightness along line A-A′ in FIG. 5(a). In FIG. 5(a) the character "R" has lower lightness than the background, and line A-A′ crosses the character at only one place, so ideally the lightness in FIG. 5(b) would drop at only one place. In practice, however, the lightness drops at several places because of uneven lightness (shown in FIG. 5(a) by different hatching) and noise ns. Consequently, binarizing with the constant threshold Th1 shown in FIG. 5(b) leaves portions other than the character in the binary image, as shown in FIG. 5(c). FIG. 5 does not show a dot character made of arranged dots, but recognizing a dot character requires evaluating the arrangement of dots, so if noise ns remains as points in the binary image, it becomes difficult to distinguish from the dots that make up the dot character, which hinders recognition. Therefore, one of the three techniques described below is used for binarization so that no features other than the characters remain. The examples below assume that the character region has lower lightness than the background, but the same techniques can be applied when the character region has higher lightness than the background, or when the background contains regions both brighter and darker than the character region.

The first binarization method extracts pixels that are selected by thresholding the lightness and whose spatial second-order derivative value, computed from the grayscale image, is at or above a threshold. The spatial second-order derivative can be obtained by applying a convolution filter such as a Laplacian-of-Gaussian filter to the grayscale image. FIG. 6(a) is a spatial second-derivative image obtained by applying a Laplacian-of-Gaussian filter to the grayscale image of FIG. 5(a), with each pixel value being the spatial second-order derivative. FIG. 6(b) shows the change in the spatial second-order derivative along the line in FIG. 6(a) that corresponds to line A-A′ in FIG. 5(a). Because a spatial second-derivative image can extract linear features of a specific thickness, along line A-A′ the spatial second-order derivative is large at the portion corresponding to the character, as shown in FIG. 6(b), and small where there is uneven lightness or noise. Setting an appropriate threshold Th2 on the spatial second-order derivative therefore separates the portions considered to correspond to characters from the portions with uneven lightness or noise.

However, binarizing the spatial second-order derivative with threshold Th2 alone may still include portions other than the characters. To extract only the characters, it is desirable to adopt as character pixels those pixels that are extracted both in the binary image obtained by thresholding the ordinary grayscale image and in the spatial second-derivative image. This processing extracts only the region corresponding to the character, as shown in FIG. 6(c). The threshold Th2 for the spatial second-order derivative can be set to the average of the spatial second-order derivative values of the pixels whose lightness is at or below threshold Th1, and in the spatial second-derivative image the pixels whose spatial second-order derivative is at or above threshold Th2 can be treated as white pixels (background).
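A minimal sketch of the first method follows, under several assumptions that are not part of the description: a plain 3 × 3 (4-neighbour) Laplacian stands in for the Laplacian-of-Gaussian filter, image borders are handled by repeating the edge pixels, characters are darker than the background, and a pixel is kept as a character pixel when its lightness is at or below Th1 and its second-derivative value is at or above Th2. The function names are hypothetical.

```python
def laplacian(img):
    """4-neighbour Laplacian of a 2-D list of lightness values.

    Stand-in for a Laplacian-of-Gaussian filter; edge pixels are repeated.
    A thin dark line on a bright background gives a large positive value.
    """
    h, w = len(img), len(img[0])

    def px(y, x):
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    return [[px(y - 1, x) + px(y + 1, x) + px(y, x - 1) + px(y, x + 1)
             - 4 * img[y][x] for x in range(w)] for y in range(h)]


def binarize_method1(img, th1):
    """Return a mask that is True where a pixel is taken as a character pixel."""
    lap = laplacian(img)
    dark = [(y, x) for y, row in enumerate(img)
            for x, v in enumerate(row) if v <= th1]
    if not dark:
        return [[False] * len(row) for row in img]
    # Th2: mean second-derivative value over the pixels at or below Th1.
    th2 = sum(lap[y][x] for y, x in dark) / len(dark)
    return [[v <= th1 and lap[y][x] >= th2 for x, v in enumerate(row)]
            for y, row in enumerate(img)]
```

With a one-pixel-wide dark stroke next to a region whose lightness falls off gradually, the lightness threshold alone selects both, whereas the combined condition keeps only the stroke, because the gradual region has a small second-derivative value.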

The second binarization method removes uneven lightness by approximating the background, excluding the characters, with a quadratic expression and extracting as characters the pixels whose difference from the approximating expression is at or above a threshold. First, the lightness is binarized with a threshold to roughly cut out the regions corresponding to characters, and the relationship between lightness and coordinates for the pixels classified as background is approximated by a quadratic expression. In other words, the lightness distribution is approximated by a quadric surface. It is desirable to use a quadratic expression of the following form, but other quadratic expressions can also be used.
b(x, y) = a1·x² + a2·y² + a3·x + a4·y + a5
Here b(x, y) is the lightness at coordinates (x, y), and a1 to a5 are coefficients. Determining the coefficients a1 to a5 by using the above expression as a regression equation yields an approximation of the background lightness b(x, y) with the characters excluded. The difference between the lightness b(x, y) given by this approximation and the actual lightness is computed, and the pixels whose difference is at or above a prescribed threshold are extracted as character pixels, which binarizes the image into characters and background. This technique is well suited to removing the effect of uneven lightness. It cannot remove the background when there is noise whose lightness differs greatly from the approximation or when the background contains a pattern that a quadratic expression cannot represent, but it requires less computation than obtaining the spatial second-order derivative. Therefore, when the lightness unevenness is smooth enough to be approximated by a quadratic expression, fast processing with little computation can be expected.
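The regression step can be sketched as an ordinary least-squares fit of the five coefficients a1 to a5. The sketch below is an assumption-laden illustration rather than the method of the embodiment: it solves the normal equations with a small Gauss-Jordan routine, assumes characters are darker than the background, and uses hypothetical names (`fit_background`, `binarize_method2`, `rough_th`, `diff_th`).

```python
def _solve(a, b):
    """Solve the linear system a·x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def _basis(x, y):
    # Terms of b(x, y) = a1·x² + a2·y² + a3·x + a4·y + a5
    return [x * x, y * y, x, y, 1.0]


def fit_background(img, rough_th):
    """Least-squares fit of a1..a5 over pixels brighter than rough_th."""
    ata = [[0.0] * 5 for _ in range(5)]
    atb = [0.0] * 5
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            if v > rough_th:  # roughly classified as background
                phi = _basis(x, y)
                for i in range(5):
                    atb[i] += phi[i] * v
                    for j in range(5):
                        ata[i][j] += phi[i] * phi[j]
    return _solve(ata, atb)


def binarize_method2(img, rough_th, diff_th):
    """True where the pixel is darker than the fitted background by diff_th or more."""
    a = fit_background(img, rough_th)
    return [[sum(c * p for c, p in zip(a, _basis(x, y))) - v >= diff_th
             for x, v in enumerate(row)] for y, row in enumerate(img)]
```

The rough threshold only has to leave enough true background pixels for the fit. Dark background corners that it misclassifies as characters are excluded from the regression but are still classified correctly afterwards, because the fitted surface predicts their lightness.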

The third binarization method determines the binarization threshold from the number of pixels in the region corresponding to the characters. When the region corresponding to the characters is small relative to the image size, as in FIG. 7(a), the lightness histogram looks like FIG. 7(b): the frequency of the lightness values in the character region (the frequency near peak p1) is far lower than that of the lightness values in the background region. When a histogram is used to determine a binarization threshold, a lightness between the two highest-frequency peaks in the histogram is commonly used as the threshold. In a histogram like FIG. 7(b), however, four peaks p1 to p4 appear, and it cannot be decided automatically which pair of peaks the threshold should lie between. In the example of FIG. 7 in particular, the character region has few pixels, so the peak p1 thought to correspond to the character region has a low frequency, and the method of using a lightness between the two highest-frequency peaks cannot determine an appropriate threshold.

When characters printed or engraved for purposes such as product management are recognized or inspected, the attributes of the characters are known, so the area the characters occupy in the image can be estimated. The threshold is therefore set so that the area (number of pixels) of the region whose lightness is at or below the threshold is at least the estimated area. However, when the characters are faint or noise is present, adopting the threshold that yields exactly the estimated area (assumed to be 9A in FIG. 7) may give a relatively large number of noise pixels compared with character pixels, so that too few pixels are obtained to extract the features of the characters. The threshold is therefore set somewhat higher so that more pixels are extracted than the number corresponding to the estimated area. For example, a threshold is found that is larger than threshold 9A and at which the rate of change of frequency with respect to lightness goes from at or below a specified value to at or above it (assumed to be 9C in FIG. 7), and a suitable threshold 9B between threshold 9A and threshold 9C is used for binarization. Threshold 9B can be set anywhere between thresholds 9A and 9C as appropriate. For example, it is set so that the number of pixels obtained equals the average of the number of pixels at or below threshold 9A and the number of pixels at or below threshold 9C. With this technique it may be difficult to determine the threshold when the background contains many regions whose lightness differs little from that of the characters. However, because the threshold is determined from the estimated character area, an appropriate threshold can be set even when the lightness changes with, for example, the illumination of the object 11, and the computation is less than for obtaining the spatial second-order derivative.
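One possible reading of the three thresholds 9A, 9B and 9C is sketched below for 8-bit lightness values. The interpretation of the "rate of change" as the difference between adjacent histogram bins, the parameter `slope_limit`, and the function name are assumptions made for illustration. Characters are assumed darker than the background.

```python
def area_based_threshold(pixels, est_area, slope_limit):
    """Return (t_a, t_b, t_c) for integer lightness values in 0..255.

    t_a -- lowest threshold at which the pixel count reaches est_area (9A)
    t_c -- first threshold above t_a where the histogram slope rises from
           below slope_limit to at least slope_limit (9C)
    t_b -- threshold giving the mean of the pixel counts at t_a and t_c (9B)
    """
    hist = [0] * 256
    for v in pixels:
        hist[v] += 1
    cum, total = [], 0
    for count in hist:
        total += count
        cum.append(total)

    t_a = next(t for t in range(256) if cum[t] >= est_area)

    t_c = t_a  # fallback if no rising slope is found
    for t in range(t_a + 1, 255):
        prev_slope = hist[t] - hist[t - 1]
        next_slope = hist[t + 1] - hist[t]
        if prev_slope < slope_limit <= next_slope:
            t_c = t
            break

    target = (cum[t_a] + cum[t_c]) / 2
    t_b = next(t for t in range(t_a, t_c + 1) if cum[t] >= target)
    return t_a, t_b, t_c
```

For a small dark character region with a few slightly brighter faint pixels and a large bright background, t_a stops at the bulk of the character pixels, t_c lands just below the first background peak, and t_b falls in between so that some of the faint pixels are included.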

Once a skeleton line image has been obtained by extracting skeleton lines from the binary image produced by whichever of the three binarization techniques suits the object 11, nodes and line elements are extracted from the skeleton lines, and the line elements are approximated by approximate lines of simple shape, such as straight lines and elliptical arcs. A node is represented by its coordinates and a node name. This processing is described below using a specific example.

The case in which the skeleton line of the character "R" has been extracted in the skeleton line image, as in FIG. 8(a), is used as an example. The binary skeleton line image is raster-scanned and the end points and branch points (including intersections) of the skeleton line are extracted as nodes (S13 in FIG. 4), which yields end points t1, t2 and branch points j1, j2 as shown in FIG. 8(b). After the nodes have been extracted, the line elements connecting the nodes are extracted (S14). Starting from each end point t1, t2 and each branch point j1, j2, the line is traced until another end point t1, t2 or another branch point j1, j2 is reached (S14b), and the line elements e1 to e4 are extracted and recorded (S14c). When one end of a line element is open, the traced path up to the other end point t1, t2 or branch point j1, j2 is taken as one line element (for example, line elements e3, e4). When both ends are closed, the traced path up to the end point t1, t2 or branch point j1, j2 that served as the starting point is taken as one line element (for example, line elements e1, e2). If the node at which tracing ended is an end point t1, t2, tracing stops there and resumes from an end point t1, t2 that has not yet been traced. If the node at which tracing ended is a branch point j1, j2, tracing continues from that branch point (S14a, S14d).
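Node extraction by raster scan (S13) can be illustrated with the crossing-number test commonly used on one-pixel-wide skeletons. The description does not say which test is used, so this choice, like the function name `extract_nodes`, is an assumption. The crossing number counts 0-to-1 transitions around the eight neighbours of a skeleton pixel: one transition marks an end point, and three or more mark a branch point or intersection.

```python
# (dx, dy) offsets of the 8 neighbours, clockwise starting from north.
OFFSETS = [(0, -1), (1, -1), (1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1)]


def extract_nodes(skel):
    """Raster-scan a 0/1 skeleton image and return (end_points, branch_points).

    Points are (x, y) tuples listed in raster order.
    """
    h, w = len(skel), len(skel[0])

    def at(x, y):
        return 1 if 0 <= x < w and 0 <= y < h and skel[y][x] else 0

    ends, branches = [], []
    for y in range(h):
        for x in range(w):
            if not skel[y][x]:
                continue
            ring = [at(x + dx, y + dy) for dx, dy in OFFSETS]
            crossings = sum(1 for k in range(8)
                            if ring[k] == 0 and ring[(k + 1) % 8] == 1)
            if crossings == 1:
                ends.append((x, y))
            elif crossings >= 3:
                branches.append((x, y))
    return ends, branches
```

Counting transitions rather than neighbours avoids false branch points at pixels that merely sit diagonally next to a junction. For a T-shaped skeleton the sketch reports the three stroke tips as end points and the junction pixel as the only branch point.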

Once the line elements e1 to e4 have been extracted as described above, each line element is approximated as a set of "approximate lines" of simple shape (geometric shapes such as straight lines and ellipses). Approximate lines expressed by simple mathematical formulas are used as the simple shapes, and two kinds are used: straight lines and portions of ellipses (hereinafter "elliptical arcs"). Using straight lines and elliptical arcs requires fewer parameters to express the approximate lines than using Bézier curves or spline curves, and it also reduces the processing load of various computations. In the shape shown in FIG. 8(b), line element e1 can be approximated by the two straight lines forming the upper-left corner of the character "R" and one elliptical arc running from the right end of the top straight line to branch point j2. That is, as shown in FIG. 8(c), line element e1 is approximated by two straight approximate lines l1, l2 and one elliptical-arc approximate line l3, and two nodes are introduced: a corner point c1 connecting the two straight approximate lines l1, l2, and a line connection point d1 connecting the straight approximate line l2 to the elliptical-arc approximate line l3. Line elements e2 to e4 are straight lines and are used as approximate lines l4 to l6 as they are. In short, the line elements are approximated by approximate lines that can be expressed by formulas, and corner points and line connection points are extracted (S15 in FIG. 4).

The processing that extracts skeleton-line nodes and line elements from the skeleton line image is carried out over the whole image or over a specified range within it. After the nodes and approximate lines have been extracted, the node data and approximate line data for the nodes and approximate lines in a range to be handled as one unit are grouped into a target cluster. In the node data, each node name is associated with the node's coordinates and the names of the approximate lines connected to the node. In the approximate line data, each approximate line name is associated with the parameters of the line element and the node names of the nodes connected to it. For example, when the binary image obtained by binarizing the grayscale image is as shown in FIG. 9(a), the figure represented by the target clusters is as shown in FIG. 9(b).

The target clusters are generated by the procedure shown as steps S16 and S17 in FIG. 4. First, an arbitrarily selected node is taken as the origin node (S16b), and the node at the other end of an approximate line that has the origin node at one end is detected (S16d). The node at the other end is classified as an end point, a branch point, or other (S16e). If it is an end point, the same processing is applied to another approximate line connected to the origin node (S16c). If it is a branch point, that branch point becomes the new origin node (S16g) and the same processing is performed. If the node at the other end is neither an end point nor a branch point, that node becomes the new origin node (S16f), and the node at the other end of the approximate line connected to it is classified in turn. This processing extracts every node connected to the initially selected origin node through approximate lines, no matter how many approximate lines lie between the initially selected origin node and the other node. When no unprocessed approximate lines remain (S16c), a set of node data and approximate line data that contains the initially selected origin node and is regarded as one character has been generated, and this set is temporarily stored as a cluster candidate (S16h). Generating and storing cluster candidates is repeated until every node in the image belongs to some cluster candidate (S16a). This processing extracts cluster candidates pc1 to pc33 as shown in FIG. 9(c), where each cluster candidate is enclosed in its circumscribed rectangle.
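The outcome of step S16 is a grouping of nodes that are reachable from one another through approximate lines, which can be sketched as a breadth-first traversal of the node and approximate-line data. The sketch does not reproduce the end point and branch point case analysis of S16e to S16g; it is a generic traversal that is assumed to give the same groups. The function name and the example topology in the usage note are hypothetical.

```python
from collections import deque


def cluster_candidates(node_names, approx_lines):
    """Group nodes connected through approximate lines.

    node_names   -- iterable of node names
    approx_lines -- dict mapping approximate line name -> (node_a, node_b)
    Returns a list of (sorted node names, sorted line names), one per candidate.
    """
    adjacency = {name: [] for name in node_names}
    for line, (a, b) in approx_lines.items():
        adjacency[a].append((line, b))
        adjacency[b].append((line, a))

    seen = set()
    candidates = []
    for start in node_names:          # S16a: repeat until every node is assigned
        if start in seen:
            continue
        seen.add(start)               # S16b: choose an origin node
        nodes, lines = [start], set()
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for line, other in adjacency[current]:
                lines.add(line)
                if other not in seen:
                    seen.add(other)
                    nodes.append(other)
                    queue.append(other)
        candidates.append((sorted(nodes), sorted(lines)))  # S16h: store candidate
    return candidates
```

For instance, with six nodes joined by approximate lines l1 to l6 and a separate stroke l7 between two further nodes, the function returns two cluster candidates, one holding the six connected nodes and one holding the isolated stroke.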

As is clear from FIG. 9(c), the cluster candidates pc1 to pc33 obtained in this way do not necessarily form one cluster per character. Step S17 in FIG. 4 therefore combines the cluster candidates that belong to one character. Whether cluster candidates are combined is evaluated from the distance between the nodes and approximate lines contained in each pair of cluster candidates. Two cluster candidates are taken from pc1 to pc33, and the shortest of all the node-to-node distances and approximate-line-to-approximate-line distances between the two candidates is found (S17b). When this shortest distance is at or below a prescribed threshold (S17c), the two cluster candidates are judged to belong to one unit, and they are treated as a single cluster and registered as target data (S17e). When the shortest distance is larger than the prescribed threshold, the two cluster candidates are judged to be separate characters and are registered as target data as separate clusters (S17d). For example, setting the threshold so that the shortest distances between nodes of cluster candidates pc1 to pc7 in FIG. 9(c) are smaller than the threshold causes cluster candidates pc1 to pc7 to be treated as one unit, as in FIG. 9(d). However, when a node and an approximate line are close together, as with cluster candidates pc2 and pc3, several characters may be treated as one character, so that some of the target clusters tc1 to tc4 obtained by evaluating the cluster candidates pc1 to pc33 contain several characters, as in FIG. 9(d).
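The merging in step S17 can be sketched with a union-find structure. Two simplifications are assumed here and are not part of the description: only node-to-node distances are compared (distances involving approximate lines are omitted), and merging is applied transitively, so that candidates A and C end up together when each is close to a candidate B. The function name is hypothetical.

```python
import math


def merge_candidates(candidates, threshold):
    """Merge cluster candidates whose closest nodes lie within threshold.

    candidates -- list of candidates, each a list of (x, y) node coordinates
    Returns a sorted list of groups, each a list of candidate indices.
    """
    parent = list(range(len(candidates)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(candidates)):
        for j in range(i + 1, len(candidates)):
            # S17b: shortest distance between the two candidates
            shortest = min(math.dist(p, q)
                           for p in candidates[i] for q in candidates[j])
            if shortest <= threshold:      # S17c
                parent[find(j)] = find(i)  # S17e: treat as one cluster

    groups = {}
    for i in range(len(candidates)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

With five single-dot candidates at x = 0, 2, 4, 12 and 14, a threshold of 3 yields two target clusters, while a threshold of 10 joins everything into one cluster, which mirrors how a generous threshold can place several characters in one target cluster.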

Combining cluster candidates in this way makes it possible to gather a character into a one-character cluster even when it consists of two or more adjacent parts, as with characters containing discontinuous parts such as "i" and "は" and with dot characters. The same applies to faint characters and broken characters.

Next, in the matching degree calculation process, the dot character master is aligned with the target data described above, and two quantities are obtained: the dot matching degree, which is used to judge the presence or absence of each dot in the legibility evaluation process performed later, and the cluster matching degree, which evaluates the overall similarity of the character shape needed for alignment. The procedure of the matching degree calculation process that obtains the dot matching degree and the cluster matching degree is described in detail with reference to FIG. 10.

The target cluster and the dot character master are aligned by aligning the circumscribed rectangles set for each of them (S22). Because a target cluster contains target data for at least one character and may contain targets for several characters, the number of characters in the target cluster is estimated from the ratios of the height and width of the target cluster to those of the dot character master. According to the estimated number of characters, the dot character master dm is moved within the target cluster tc as shown in FIG. 11, and the cluster matching degree described later is obtained at each position. The position at which the cluster matching degree is greatest is taken as the matching position, the dot matching degree described later is obtained at the matching position, and this dot matching degree is used in the legibility evaluation process described later.
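
The description does not say how the dimension ratios are turned into a character count or into positions for the dot character master. One possible reading, given purely as an assumption, rounds each ratio to the nearest integer and spreads the master positions evenly across the circumscribed rectangle of the target cluster. The names `estimate_char_count` and `candidate_offsets` are hypothetical.

```python
def estimate_char_count(cluster_w, cluster_h, master_w, master_h):
    """Estimate how many master-sized characters fit across and down the cluster."""
    cols = max(1, round(cluster_w / master_w))
    rows = max(1, round(cluster_h / master_h))
    return cols, rows


def candidate_offsets(cluster_w, cluster_h, master_w, master_h):
    """Top-left offsets of the master inside the cluster's circumscribed rectangle.

    Assumption: the first position is flush with the left/top edge, the last
    with the right/bottom edge, and the rest are evenly spaced in between.
    """
    cols, rows = estimate_char_count(cluster_w, cluster_h, master_w, master_h)

    def spread(extent, size, n):
        if n == 1:
            return [0.0]
        step = (extent - size) / (n - 1)
        return [k * step for k in range(n)]

    return [(x, y)
            for y in spread(cluster_h, master_h, rows)
            for x in spread(cluster_w, master_w, cols)]
```

For a target cluster 31 units wide and 7 high and a master 10 wide and 7 high, this yields three positions along one row, which corresponds to moving the master dm across a three-character target cluster tc.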

Moving the dot character master dm within the target cluster tc and finding the position at which the cluster matching degree is greatest, as described above, separates the characters in a target cluster tc that contains a plurality of characters. Here, the cluster matching degree represents the similarity between the cluster candidates pc in the target cluster tc and the dot character master dm as a whole; a high cluster matching degree indicates that the cluster candidates pc are likely to match the character represented by the dot character master dm. Because the cluster matching degree is obtained after the circumscribed rectangles of the dot character master dm and the target cluster tc have been aligned, far fewer regions need to be matched than when a template is shifted one pixel at a time to obtain the dot matching degree, and the processing load is reduced.

After the target cluster and the dot character master have been aligned, the dot character master is first corrected by deforming it so that it fits the arrangement of dots in the target cluster (S23 to S25). FIG. 12(a) shows the target cluster with its nodes drawn as white circles and the line segments connecting the nodes drawn as line elements. So that the dots d1 to d14 of the dot character master come to lie on the nodes and line elements of the target cluster, a corresponding point on the target cluster is searched for from the position of each dot d1 to d14 in the direction indicated by its direction value (see FIG. 3), and when the target cluster is found, the dot is associated with the position found on the target cluster. In other words, a corresponding point of the target cluster is searched for within a prescribed threshold distance from each dot d1 to d14 in the direction intersecting the extension direction of the virtual line, and when a corresponding point is detected, the dot is associated with that position. A dot for which the target cluster cannot be found by searching in the direction indicated by its direction value (d4, d10, d13 and d14 in the illustrated example) is associated with the nearest node within a prescribed angular range about that direction (S23).

Next, from the positional relationship between the dots d1 to d14 of the dot character master and the corresponding points of the target cluster, the parameters a1 to a6 for deforming the dot character master are obtained by the least squares method using the following equations (S24).
X = a1·x + a2·y + a3
Y = a4·x + a5·y + a6
Here (X, Y) are coordinates on the target cluster and (x, y) are coordinates on the dot character master. Because these equations are an affine transformation that compensates for position, rotation, scale and skew between the coordinate axes, the dot character master can be superimposed even on a character deformed so that the target cluster is slightly narrower than the dot character master, as in FIG. 9(b) (S25). However, even after the affine transformation it is difficult to make the dot character master coincide completely with the target cluster. Therefore, as shown in FIG. 13(a), after the dot character master has been deformed, an attention area ai (i = 1, 2, ..., 35) centered on each arrangement point qi is set (S26). In the illustrated example, the attention areas ai are squares of equal size centered on the positions of the arrangement points qi after deformation, sized so that the attention areas ai of adjacent arrangement points qi do not overlap. The attention areas ai need not be square; circles, for example, may also be used.
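
The least squares step S24 can be illustrated with the following Python sketch, which solves the normal equations of the two linear models above with Cramer's rule. This is one standard way to carry out such a fit and is not taken from the description; the function names and the tolerance used to detect degenerate input are assumptions.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, rhs):
    """Solve the 3x3 linear system m * p = rhs by Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("dots are collinear or too few: affine fit is under-determined")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution


def fit_affine(master_pts, target_pts):
    """Least-squares a1..a6 with X = a1*x + a2*y + a3 and Y = a4*x + a5*y + a6."""
    n = len(master_pts)
    sx = sum(x for x, _ in master_pts)
    sy = sum(y for _, y in master_pts)
    sxx = sum(x * x for x, _ in master_pts)
    syy = sum(y * y for _, y in master_pts)
    sxy = sum(x * y for x, y in master_pts)
    normal = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    pairs = list(zip(master_pts, target_pts))
    rhs_x = [sum(x * X for (x, _), (X, _) in pairs),
             sum(y * X for (_, y), (X, _) in pairs),
             sum(X for _, (X, _) in pairs)]
    rhs_y = [sum(x * Y for (x, _), (_, Y) in pairs),
             sum(y * Y for (_, y), (_, Y) in pairs),
             sum(Y for _, (_, Y) in pairs)]
    return tuple(solve3(normal, rhs_x) + solve3(normal, rhs_y))


def apply_affine(params, pt):
    """Map a master coordinate (x, y) to a target-cluster coordinate (X, Y)."""
    a1, a2, a3, a4, a5, a6 = params
    x, y = pt
    return (a1 * x + a2 * y + a3, a4 * x + a5 * y + a6)
```

At least three non-collinear dot correspondences are needed; with the fourteen dots d1 to d14 of the illustrated example the system is over-determined, which is why a least squares solution is used. `apply_affine` would then move each arrangement point qi to the centre of its attention area ai.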

Next, the dot matching degree, which evaluates whether the target cluster overlaps the attention area ai set for each arrangement point qi of the dot character master, is obtained (S27), and the cluster matching degree, which indicates the overall similarity between the target cluster and the dot character master, is obtained from the dot matching degrees (S28).

First, the method of obtaining the dot matching degree is described. The dot matching degree uses the length of the line elements li (i = 1 to 7) of the target cluster contained in each attention area ai. For example, the attention area a16 overlaps parts of the line elements l2 and l5 of the target cluster, so the dot matching degree for the attention area a16 is the sum of the lengths of the portions of the line elements l2 and l5 that pass through the attention area a16. The more line-element length an attention area a1 to a35 contains, the larger its dot matching degree; the value thus expresses the presence or absence of a dot in the target cluster. FIG. 13(b) shows an example of the dot matching degrees obtained for FIG. 13(a).
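
A minimal Python sketch of this length-based dot matching degree follows. It assumes a square, axis-aligned attention area and straight line elements, and it clips each line element to the square with the Liang-Barsky method, which is a choice made for the sketch and is not named in the description. The function names are hypothetical.

```python
from math import hypot


def clipped_length(segment, box):
    """Length of the part of a segment that lies inside an axis-aligned box."""
    (x0, y0), (x1, y1) = segment
    xmin, ymin, xmax, ymax = box
    dx, dy = x1 - x0, y1 - y0
    t0, t1 = 0.0, 1.0                      # visible parameter interval
    for p, q in ((-dx, x0 - xmin), (dx, xmax - x0),
                 (-dy, y0 - ymin), (dy, ymax - y0)):
        if p == 0:
            if q < 0:                      # parallel to this edge and outside it
                return 0.0
        else:
            t = q / p
            if p < 0:
                t0 = max(t0, t)            # segment enters the box here
            else:
                t1 = min(t1, t)            # segment leaves the box here
    if t0 > t1:
        return 0.0
    return (t1 - t0) * hypot(dx, dy)


def dot_matching_degree(line_elements, center, half_size):
    """Sum of line-element lengths inside the square attention area."""
    cx, cy = center
    box = (cx - half_size, cy - half_size, cx + half_size, cy + half_size)
    return sum(clipped_length(seg, box) for seg in line_elements)
```

Evaluating `dot_matching_degree` once per arrangement point qi gives one entry of the dot matching degree matrix; an attention area crossed by two line elements, as a16 is by l2 and l5, simply receives the sum of both clipped lengths.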

Because a dot matching degree is obtained for every arrangement point qi of the dot character master, a dot matching degree matrix that associates a dot matching degree with each arrangement point qi is obtained for one character of the dot character master. The dot matching degrees of the attention areas a1 to a35 in this matrix indicate how well the shape of the character from which the target cluster was extracted matches the dot character master. Once the dot matching degrees have been obtained, the cluster matching degree, which evaluates the similarity between the dot matching degree matrix and the dot character master, is therefore obtained (S28). To obtain the cluster matching degree, a binarized matrix is first formed, as shown in FIG. 14, by assigning "1" to the arrangement points qi at which the dots d1 to d14 of the dot character master exist and "0" to the remaining arrangement points qi. The normalized cross-correlation value (cross-correlation coefficient) between a dot matching degree matrix such as that of FIG. 13(b) and the binarized matrix is then calculated and used as the cluster matching degree.

The normalized cross-correlation coefficient between the dot matching degree matrix of FIG. 13(b) and the binarized matrix of FIG. 14 is 0.919, which means that the similarity between the two is 91.9%. The cross-correlation coefficient becomes small when target data is absent at a position where the dot character master has a dot, or when target data is present at a position where the dot character master has no dot. Using this value as the cluster matching degree therefore yields a feature quantity that represents the similarity of the character as a whole well.
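
The description does not give the formula for the normalized cross-correlation. The Python sketch below assumes the zero-mean form, that is, the Pearson correlation coefficient between the flattened dot matching degree matrix and the flattened binarized matrix. The function name is hypothetical, and the sketch makes no attempt to reproduce the value 0.919, since the figure data are not available here.

```python
from math import sqrt


def cluster_matching_degree(dot_degrees, master_bits):
    """Zero-mean normalized cross-correlation of two equally sized matrices.

    dot_degrees: rows of dot matching degrees (one per arrangement point).
    master_bits: rows of 1/0 marking where the dot character master has a dot.
    """
    a = [float(v) for row in dot_degrees for v in row]
    b = [float(v) for row in master_bits for v in row]
    if len(a) != len(b) or not a:
        raise ValueError("matrices must be non-empty and of equal size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = sqrt(sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0   # a constant matrix carries no shape information
```

Under this assumption the value lies between -1 and 1. A missing dot lowers the product at a position where the master holds "1", and a noise dot raises the dot matching degree where the master holds "0"; both reduce the coefficient, in line with the behaviour described above.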

Accordingly, the dot character master is aligned at a plurality of positions in the target cluster and the cluster matching degree is obtained at each. The position at which the cluster matching degree is greatest is judged to be the position at which a character similar to the dot character master exists in the target cluster, and the dot matching degree matrix obtained at that position is stored in a buffer (not shown). Specifically, each time a cluster matching degree is obtained, it is compared with the maximum cluster matching degree obtained so far (S29), and when the new value is greater, the maximum is replaced with the new value (S30). The position of the dot character master at that time is taken as the matching position (S31), and the dot matching degree matrix is registered in the buffer as the inspection dot matching degree matrix (S32). Steps S22 to S32 are repeated until the cluster matching degree has been obtained at every position in the target cluster (S33), after which processing moves to the legibility evaluation process, which judges whether the printed character is acceptable.
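
The loop of steps S29 to S33 amounts to keeping a running maximum. A short Python sketch is given below; `evaluate` is a hypothetical callback that stands in for steps S22 to S28 and returns the cluster matching degree together with the dot matching degree matrix for one position.

```python
def find_matching_position(positions, evaluate):
    """Return (position, cluster degree, dot matrix) of the best master position."""
    best_degree = float("-inf")
    best_position = None
    best_matrix = None
    for position in positions:                 # S33: visit every position
        degree, matrix = evaluate(position)    # S22 to S28 (supplied by the caller)
        if degree > best_degree:               # S29: compare with maximum so far
            best_degree = degree               # S30: update the maximum
            best_position = position           # S31: record the matching position
            best_matrix = matrix               # S32: keep the matrix for inspection
    return best_position, best_degree, best_matrix
```

The returned matrix plays the role of the inspection dot matching degree matrix held in the buffer.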

Legibility is the property of whether a character can be read, and it rests on qualitative judgment. A character may be unreadable because, for example, a large part of it is missing as in FIG. 15(a), noise other than the character is mixed in as in FIG. 15(b), or the character is severely deformed as in FIG. 15(c). The legibility evaluation process makes it possible to judge the degree of legibility arising from causes of this kind against quantitative criteria.

In the legibility evaluation process, as shown in FIG. 17, two kinds of points are extracted and counted (S41 to S47): dot missing points f1, which are arrangement points at which the dot character master has one of the dots d1 to d14 but the character printed on the object 11 has no corresponding part, and noise points f2, which are arrangement points at which the dot character master has none of the dots d1 to d14 but the character on the object 11 contains noise equivalent to a dot. That is, an appropriate threshold is applied to the dot matching degree matrix obtained in the matching degree calculation process to judge, for each dot d1 to d14 of the dot character master, whether the corresponding dot exists in the character printed on the object 11. An arrangement point of one of the dots d1 to d14 whose dot matching degree is equal to or less than the threshold is a dot missing point f1, and an arrangement point other than those of the dots d1 to d14 whose dot matching degree is equal to or greater than the threshold is a noise point f2.

After the numbers of dot missing points f1 and noise points f2 have been obtained as described above, each is compared with an appropriate threshold (S52, S53). Dot missing points f1 and noise points f2 are both locations that do not match the dot character master, so a large number of them means low legibility. Therefore, when the number of either dot missing points f1 or noise points f2 is equal to or greater than its threshold, legibility is evaluated as low and the character printed on the object 11 is judged defective (S57). When every count is below its threshold, the character printed on the object 11 is judged good (S56). The thresholds for the numbers of dot missing points f1 and noise points f2 may be, for example, 2 and 3, respectively. That is, the character is judged good if there is at most one dot missing point f1 and at most two noise points f2.
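
The classification of arrangement points and the count-based judgment can be sketched in Python as follows. The default count thresholds 2 and 3 are the example values given above; the function names and the value passed as `presence_threshold` are assumptions of the sketch.

```python
def classify_points(dot_degrees, master_bits, presence_threshold):
    """Return (dot missing points f1, noise points f2) as lists of (row, col)."""
    missing, noise = [], []
    for r, (degree_row, bit_row) in enumerate(zip(dot_degrees, master_bits)):
        for c, (degree, bit) in enumerate(zip(degree_row, bit_row)):
            if bit and degree <= presence_threshold:
                missing.append((r, c))       # master has a dot, print does not
            elif not bit and degree >= presence_threshold:
                noise.append((r, c))         # print has a dot, master does not
    return missing, noise


def is_legible(missing, noise, missing_threshold=2, noise_threshold=3):
    """Good only if both counts are below their thresholds (S52, S53, S56, S57)."""
    return len(missing) < missing_threshold and len(noise) < noise_threshold
```

With the default thresholds, one dot missing point and two noise points still pass, while a second dot missing point or a third noise point makes the character defective.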

In addition to the numbers of dot missing points f1 and noise points f2, the number of adjacent dot missing points f1 may also be used in evaluating legibility (S48 to S50, S54). The existence of adjacent dot missing points f1 means that at least two dot missing points f1 exist, so when the number of adjacent dot missing points f1 is used in the evaluation, the threshold for the number of dot missing points f1 is set to 3.

The procedure of the legibility evaluation process shown in FIG. 16 also includes a missing-line judgment (S51, S55). A missing line means that, when characters are printed (or engraved), dot missing points occur continuously along a straight line because of a fault in the device that prints or engraves the characters. Missing lines arise in a printing device whose print head carries a row of dot-shaped printing elements, each printing one dot of a character, and travels in the direction perpendicular to that row: a fault in any one printing element causes dots to be missing continuously along a straight line in the travel direction of the print head. In an inkjet printing device, for example, a clogged print head can cause a missing line.

When a line is missing, depending on the shape of the character, no more than one dot missing point may occur, so a legibility judgment based on the numbers of dot missing points f1 and noise points f2 may regard the character as good. However, an abnormality in the printing device such as a clogged print head must be dealt with early, and a missing line in particular must be detected as a serious abnormality. Therefore, when a plurality of characters are arranged in a row, the number of dots on each line in the direction in which the characters are arranged (in general, for horizontally written characters, on the straight line at each height) is obtained in advance and compared with the number of dots on the same line obtained from the printed characters to judge whether a line is missing. That is, the number of dots on each line obtained from the printed characters is divided by the number of dots on that line obtained from the dot character master, and a line is judged to be missing when the quotient is equal to or less than a prescribed threshold. FIG. 18 shows an example in which characters are printed with a print head that forms seven dots vertically; if the print head is normal, the printed characters are as shown in FIG. 18(a). The number of dots in each horizontal row is given at the right-hand end. If, as in FIG. 18(b), the dots in the third row from the top are missing because of a print head abnormality, the number of dots is 0 where 11 dots should exist, so it can be judged that a line is missing in the third row.
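
The missing-line judgment reduces to a per-row ratio test, sketched here in Python. The ratio threshold of 0.5 is an assumed value (the description only calls it a prescribed threshold), rows on which the master has no dots are skipped as an assumption, and the function names are hypothetical.

```python
def row_dot_counts(char_matrices):
    """Dots per row, summed over all characters of a horizontally written string."""
    rows = len(char_matrices[0])
    return [sum(sum(matrix[r]) for matrix in char_matrices) for r in range(rows)]


def missing_lines(master_counts, printed_counts, ratio_threshold=0.5):
    """Indices of rows whose printed/master dot ratio is at or below the threshold."""
    missing = []
    for row, (expected, found) in enumerate(zip(master_counts, printed_counts)):
        if expected == 0:
            continue                  # the master has no dots here: nothing to judge
        if found / expected <= ratio_threshold:
            missing.append(row)
    return missing
```

For the situation of FIG. 18(b), the third row would have a master count of 11 and a printed count of 0, giving a ratio of 0 and therefore a missing-line judgment for that row (row index 2 in this zero-based sketch).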

The dot matching degree is not limited to the value described above; any of the six kinds of value described below may be used instead.

(1) In FIG. 19, the line elements contained in the target cluster have been dilated to a fixed width, and the number of pixels that the dilated line elements occupy in each attention area ai can be used as the dot matching degree. With dilated line elements, a line element lying near the boundary between attention areas ai yields a dot matching degree in each of the attention areas ai on either side of the boundary, so more attention areas ai have a nonzero dot matching degree. The shape of the target cluster is therefore reflected more readily in the dot matching degrees, and recognition is more stable than without dilation of the line elements.

(2) As another dot matching degree, a probability density distribution in which the probability decreases with increasing distance from the approximate lines constituting the target cluster may be assigned to the target cluster, and the integrated probability within each attention area ai may be used as the dot matching degree of that attention area ai. FIG. 20 illustrates the concept of the probability density distribution, with the probability shown by shading (the darker the shade, the higher the probability). Because a lower probability is assigned the farther a point is from an approximate line, the dot matching degree contributed by an approximate line near the boundary of an attention area ai becomes smaller, and each attention area ai receives a reasonable dot matching degree that reflects its distance from the approximate lines.

(3) The attention areas ai may be superimposed on the grayscale image that is the original image, and the average lightness of each attention area ai may be used as its dot matching degree. This requires neither dilation nor a probability density distribution, so the dot matching degree is obtained simply.

(4) When skeleton extraction yields skeleton points, that is, a sequence of points rather than skeleton lines, the number of skeleton points within each attention area ai may be used as the dot matching degree, as shown in FIG. 21 (because a sequence of skeleton points corresponds to a skeleton line li, the figure uses the reference sign li of the skeleton line). The number of skeleton points corresponds to the length of the skeleton line li, but when skeleton points are available, counting them is simpler than computing the length of the skeleton line li.

(5) When skeleton points are used for the dot matching degree, a rectangle of prescribed width and height centered on each skeleton point may be set as shown in FIG. 22, and the total area over which these rectangles overlap each attention area ai may be used as the dot matching degree. This method is equivalent to dilating the skeleton lines and obtains the dot matching degree simply.

(6) As a further method of obtaining the dot matching degree from skeleton points, a probability density distribution in which the probability decreases with increasing distance from the center of each skeleton point may be obtained as shown in FIG. 23, and the integrated probability within each attention area ai may be used as the dot matching degree of that attention area ai. FIG. 23 illustrates the concept of the probability density distribution, with the probability shown by shading (the darker the shade, the higher the probability). This method can be expected to give the same effect as using the probability density distribution obtained for skeleton lines, with less processing because skeleton points are used instead of skeleton lines.
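
Two of the skeleton-point variants, (4) and (5), are simple enough to sketch in a few lines of Python. The sketch assumes axis-aligned rectangular attention areas given as (xmin, ymin, xmax, ymax), counts points on the boundary as inside, and sums overlapping rectangle areas without removing double-counted regions, following the wording "total area" above. The function names are hypothetical.

```python
def skeleton_point_count(points, area):
    """Variant (4): number of skeleton points inside the attention area."""
    xmin, ymin, xmax, ymax = area
    return sum(1 for x, y in points if xmin <= x <= xmax and ymin <= y <= ymax)


def skeleton_rect_overlap(points, area, width, height):
    """Variant (5): total overlap between point-centred rectangles and the area."""
    xmin, ymin, xmax, ymax = area
    total = 0.0
    for x, y in points:
        overlap_x = min(xmax, x + width / 2) - max(xmin, x - width / 2)
        overlap_y = min(ymax, y + height / 2) - max(ymin, y - height / 2)
        if overlap_x > 0 and overlap_y > 0:
            total += overlap_x * overlap_y
    return total
```

A skeleton point just outside an attention area contributes nothing under variant (4) but still contributes part of its rectangle under variant (5), which is the dilation-like effect the description attributes to that variant.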

In general, a character contains feature parts that are key to its recognition by a person or a device. A criterion concerning the feature parts (important parts) may therefore be set for each character, and the character may be judged defective whenever a feature part is a dot missing point or a noise point, regardless of the numbers of dot missing points and noise points.

For example, the character "8" in FIG. 24(a) and the character "B" in FIG. 24(b) have much in common in shape; they differ in the presence or absence of the dots d21 to d23 shown in FIG. 24(b). The dots d21 to d23 are thus important parts for legibility: if the dots d21 and d22 are missing as in FIG. 24(c), it becomes difficult to tell "8" from "B", and misreading becomes likely. Therefore, for the dot character master "8", the character is judged defective when a noise point occurs at the position of any of the dots d21 to d23, and for the dot character master "B", the character is judged defective when a dot missing point occurs at the position of any of the dots d21 to d23.

Similarly, for the characters "N" and "H" shown in FIGS. 25(a) and 25(b), the dots d24 and d25 in FIG. 25(a) and the dots d26 and d27 in FIG. 25(b) are the feature parts that distinguish them. Therefore, when dots are detected at both the positions of the dots d24 and d25 and the positions of the dots d26 and d27 as in FIG. 25(c), the character is judged defective for both "N" and "H".
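
The feature-part rule can be layered on top of the count-based judgment as a separate check. In the Python sketch below, `required_dots` holds the arrangement points that must carry a dot for a given master (such as d21 to d23 for "B") and `forbidden_dots` holds those that must stay empty (such as the same positions for "8"). The function name and the coordinates used in any call are hypothetical, since the description does not give the matrix positions of these dots.

```python
def violates_key_points(missing, noise, required_dots=(), forbidden_dots=()):
    """True if a feature part is a dot missing point or a noise point.

    missing, noise: (row, col) lists of dot missing points f1 and noise points f2.
    required_dots:  feature positions where the master has a dot.
    forbidden_dots: feature positions where the master must have no dot.
    """
    required, forbidden = set(required_dots), set(forbidden_dots)
    return (any(p in required for p in missing)
            or any(p in forbidden for p in noise))
```

A character for which this function returns True would be judged defective regardless of how few dot missing points and noise points it has overall.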

FIG. 1 is an operation explanatory diagram of an embodiment of the present invention.
FIG. 2 is a schematic configuration diagram of the apparatus used in the embodiment.
FIG. 3 is a diagram showing an example of the dot character master used in the embodiment.
FIG. 4 is an operation explanatory diagram of the target data extraction process in the embodiment.
FIGS. 5 to 7 are operation explanatory diagrams of the embodiment.
FIG. 8 is an operation explanatory diagram showing the process of generating target clusters in the embodiment.
FIG. 9 is an operation explanatory diagram showing the process of generating cluster candidates in the embodiment.
FIG. 10 is an operation explanatory diagram showing the matching degree calculation process in the embodiment.
FIGS. 11 to 15 are operation explanatory diagrams of the embodiment.
FIG. 16 is an operation explanatory diagram showing the legibility evaluation process in the embodiment.
FIGS. 17 and 18 are operation explanatory diagrams of the embodiment.
FIGS. 19 to 23 are operation explanatory diagrams showing the concept of calculating the dot matching degree in the embodiment.
FIGS. 24 and 25 are operation explanatory diagrams of the embodiment.

Explanation of symbols

11 Object
12 TV camera
13 Digital image generation device
14 Storage device
15 Image processing device

Claims

1. A character/graphic recognition method comprising: a target data extraction process of extracting a skeleton line of a character/graphic from an image of the character/graphic input as a recognition target, approximating the skeleton line as a set of approximate lines of simple shape, grouping continuous approximate lines into one approximate line group, and, by evaluating the distance between each pair of approximate lines that are not continuous, also adding approximate lines in the vicinity of the approximate line group to that approximate line group, thereby extracting the approximate line groups as target clusters; and a matching degree calculation process of evaluating the degree of coincidence between the target cluster and a dot character/graphic master, the dot character/graphic master consisting of the dot arrangement of a dot character/graphic, which is formed by placing dots at arrangement points along the shape of the character/graphic in a matrix of arrangement points arranged vertically and horizontally, and the dimensions of the circumscribed rectangle of the dot character/graphic; wherein, in the matching degree calculation process, after the circumscribed rectangle of the dot character/graphic master is aligned with the circumscribed rectangle set for the target cluster, an attention area is set with reference to each arrangement point of the dot character/graphic master, a dot matching degree that quantifies the degree of presence or absence of a dot in the target cluster is obtained for each attention area from the positional relationship between the approximate lines of the target cluster and the attention area, a cluster matching degree that quantifies the similarity between the dot matching degrees of the attention areas and the dot arrangement contained in the dot character/graphic master is obtained, and when the cluster matching degree exceeds a threshold, the target cluster is recognized as the character/graphic represented by the dot character/graphic master.

2. The character/graphic recognition method according to claim 1, wherein a normalized cross-correlation value between the dot matching degree in each region of interest and the dot arrangement included in the dot character/graphic master is used as the cluster matching degree.
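As an illustration of the normalized cross-correlation named in claim 2, the following is a minimal sketch rather than the patented implementation. It assumes the dot matching degrees and the master dot arrangement have both been flattened into equal-length sequences (one entry per arrangement point) and that the zero-mean form of normalized cross-correlation is intended; the function name `normalized_cross_correlation` and the sample values are hypothetical.

```python
import math

def normalized_cross_correlation(dot_degrees, master_dots):
    """Zero-mean normalized cross-correlation of two equal-length sequences.

    dot_degrees: dot matching degree per region of interest (assumed 0..1).
    master_dots: 1 where the master has a dot at that arrangement point, else 0.
    Returns a value in [-1, 1]; 0.0 if either sequence is constant.
    """
    if len(dot_degrees) != len(master_dots):
        raise ValueError("sequences must have the same length")
    n = len(dot_degrees)
    mean_d = sum(dot_degrees) / n
    mean_m = sum(master_dots) / n
    cov = sum((d - mean_d) * (m - mean_m)
              for d, m in zip(dot_degrees, master_dots))
    var_d = sum((d - mean_d) ** 2 for d in dot_degrees)
    var_m = sum((m - mean_m) ** 2 for m in master_dots)
    denom = math.sqrt(var_d * var_m)
    return cov / denom if denom > 0 else 0.0

# Hypothetical six-point master and measured dot matching degrees.
master = [1, 0, 1, 1, 0, 1]
degrees = [0.9, 0.1, 0.8, 1.0, 0.0, 0.7]
cluster_matching_degree = normalized_cross_correlation(degrees, master)
```

Under claim 1, the target cluster would be recognized as the master's character only when this value exceeds the chosen threshold.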

3. The character/graphic recognition method according to claim 1 or 2, wherein the dot arrangement in the dot character/graphic master consists of the positions of the dots in the dot character/graphic and a direction value at the position of each dot that reflects the extension direction of a virtual line connecting the dots, and, in the collation degree calculation process, a corresponding point of the target cluster lying within a prescribed threshold distance from each dot in the direction intersecting the extension direction of the virtual line is searched for, and, when corresponding points are detected, the dot matching degree is obtained after the dot character/graphic master is deformed by an affine transformation so as to increase the degree of coincidence between each dot of the dot character/graphic master and its corresponding point.

4. The character/graphic recognition method according to any one of claims 1 to 3, wherein the image of the character/graphic is a grayscale image, and pixels whose lightness is within a prescribed range and whose spatial second-order differential value is equal to or greater than a prescribed threshold are the targets for extracting the skeleton lines.
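The pixel selection in the claim above can be sketched as follows. This is a hedged illustration only: it assumes a 4-neighbour discrete Laplacian as the spatial second-order differential, assumes dark characters on a brighter background (so the Laplacian is positive on a dot), and leaves border pixels unselected. The function name and all parameter names are hypothetical.

```python
def select_skeleton_candidates(image, light_min, light_max, laplacian_min):
    """Return a boolean mask of pixels that pass both tests.

    image: 2-D list of lightness values (rows of equal length).
    A pixel is selected when its lightness lies in [light_min, light_max]
    and its 4-neighbour Laplacian is at least laplacian_min.
    """
    h, w = len(image), len(image[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            v = image[y][x]
            # Discrete second-order differential (assumed form).
            lap = (image[y - 1][x] + image[y + 1][x]
                   + image[y][x - 1] + image[y][x + 1] - 4 * v)
            if light_min <= v <= light_max and lap >= laplacian_min:
                mask[y][x] = True
    return mask

# Hypothetical 5x5 image: one dark dot on a uniform bright background.
img = [[200] * 5 for _ in range(5)]
img[2][2] = 50
mask = select_skeleton_candidates(img, 0, 100, 300)
```

Combining the two tests means a dark but flat region (for example a shadow) fails the Laplacian test, while a sharp but bright feature fails the lightness test.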

5. The character/graphic recognition method according to claim 1, wherein the image of the character/graphic is a grayscale image, the lightness distribution of the region corresponding to the background is approximated by a quadric surface, and pixels whose lightness differs from the lightness represented by the quadric surface by a prescribed threshold or more are the targets for extracting the skeleton lines.
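One way to picture the quadric-surface background model in the claim above is a least-squares fit of lightness to the six terms 1, x, y, x², xy, y². The sketch below is an assumption-laden illustration: the claim does not say how background samples are chosen or whether the difference is signed, so the caller supplies the samples and the absolute difference is used. All function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError if the system is singular.
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def basis(x, y):
    """Terms of the quadric surface at pixel (x, y)."""
    return [1.0, x, y, x * x, x * y, y * y]

def fit_quadric(samples):
    """Least-squares fit; samples is a list of (x, y, lightness) tuples."""
    ata = [[0.0] * 6 for _ in range(6)]
    atb = [0.0] * 6
    for x, y, v in samples:
        p = basis(x, y)
        for i in range(6):
            atb[i] += p[i] * v
            for j in range(6):
                ata[i][j] += p[i] * p[j]
    return solve_linear(ata, atb)

def foreground_mask(image, coeffs, diff_min):
    """True where lightness differs from the fitted surface by >= diff_min."""
    return [[abs(v - sum(c * p for c, p in zip(coeffs, basis(x, y)))) >= diff_min
             for x, v in enumerate(row)]
            for y, row in enumerate(image)]
```

A fit of this kind absorbs slow lighting gradients across the object, so a single difference threshold can serve the whole image.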

6. The character/graphic recognition method according to claim 1, wherein the image of the character/graphic is a grayscale image, and, in selecting the pixels from which the skeleton lines are extracted by binarizing the lightness, a lightness range that includes the lightness of the character/graphic is set such that the area occupied in the image by pixels within the lightness range is equal to or greater than the known area occupied by the character/graphic and falls within a prescribed range, and the pixels within the lightness range are the targets for extracting the skeleton lines.
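The area-based choice of a lightness range described in the claim above resembles a percentile-style threshold search over the lightness histogram. The following sketch is one possible reading, not the patented procedure: it assumes integer lightness values from 0 to 255, dark characters on a brighter background (so the range starts at 0), and interprets "within a prescribed range" as an upper area limit `max_area`. All names are hypothetical.

```python
def lightness_range_by_area(image, known_area, max_area, levels=256):
    """Find the narrowest range (0, upper) whose pixel count reaches known_area.

    Returns (0, upper) if that count is also <= max_area, otherwise None.
    image: 2-D list of integer lightness values in 0..levels-1.
    """
    hist = [0] * levels
    for row in image:
        for v in row:
            hist[v] += 1
    count = 0
    for upper in range(levels):
        count += hist[upper]
        if count >= known_area:
            # The area condition is met; accept only if not too large.
            return (0, upper) if count <= max_area else None
    return None

# Hypothetical 4x4 image: four dark character pixels, twelve background pixels.
img = [[30, 30, 30, 40],
       [200, 200, 200, 200],
       [200, 200, 200, 200],
       [200, 200, 200, 200]]
found = lightness_range_by_area(img, known_area=4, max_area=6)
```

Returning `None` when the area overshoots the limit signals that no range separates the character from the background under these assumptions.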

7. A character/graphic inspection method comprising a legibility evaluation process that, in the character/graphic recognition method according to claim 1, uses the dot matching degrees obtained at the position of the dot character/graphic master when the cluster matching degree is equal to or greater than the threshold in the collation degree calculation process, together with the arrangement of dots in the dot character/graphic master, to determine for each arrangement point of the dot character/graphic master whether the presence or absence of a dot matches the image of the character/graphic, and judges the quality of the legibility of the character/graphic to be recognized based on the number of arrangement points that do not match.

8. The character/graphic inspection method according to claim 7, wherein the legibility evaluation process treats an arrangement point at which the presence or absence of a dot in the dot character/graphic master does not match the image of the character/graphic as a dot missing point when it is an arrangement point where a dot exists in the dot character/graphic master, and as a noise point when it is an arrangement point where no dot exists in the dot character/graphic master, and judges the legibility to be poor when either the number of dot missing points or the number of noise points exceeds a prescribed threshold.
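The classification of mismatching arrangement points into dot missing points and noise points can be sketched as below. This is an illustrative sketch under stated assumptions: the claims do not say how a dot matching degree is turned into a present/absent decision, so a hypothetical cut-off `presence_min` is used, and separate limits `max_missing` and `max_noise` stand in for the prescribed threshold.

```python
def evaluate_legibility(master_dots, dot_degrees, presence_min,
                        max_missing, max_noise):
    """Count dot missing points and noise points and judge legibility.

    master_dots: 1 where the master has a dot at the arrangement point, else 0.
    dot_degrees: dot matching degree measured at the same arrangement points.
    Returns (legible, missing, noise).
    """
    missing = 0
    noise = 0
    for has_dot, degree in zip(master_dots, dot_degrees):
        detected = degree >= presence_min  # assumed presence decision
        if has_dot and not detected:
            missing += 1   # master has a dot, image does not
        elif not has_dot and detected:
            noise += 1     # image has a dot where the master has none
    legible = missing <= max_missing and noise <= max_noise
    return legible, missing, noise

# Hypothetical five-point example: one missing dot and one noise point.
result = evaluate_legibility([1, 1, 0, 0, 1], [0.9, 0.2, 0.1, 0.8, 0.7],
                             presence_min=0.5, max_missing=1, max_noise=0)
```

Keeping the two counts separate lets an inspection line tolerate, say, a faint dot while still rejecting stray ink.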

9. The character/graphic inspection method according to claim 7, wherein arrangement points of important parts relating to legibility are registered in advance in the dot character/graphic master, and the legibility evaluation process judges the legibility to be poor when an arrangement point at which the presence or absence of a dot in the dot character/graphic master does not match the image of the character/graphic belongs to an important part.

10. The character/graphic inspection method according to any one of claims 7 to 9, wherein, when there are a plurality of characters/graphics to be recognized and each is composed of rows of dots, the legibility evaluation process obtains for a row of dots, from the determination of whether the presence or absence of a dot in the dot character/graphic master matches the image of the character/graphic, the number of arrangement points at which a dot exists in the dot character/graphic master, both for the characters/graphics to be recognized and for the dot character/graphic master, and, when the value obtained by dividing the number obtained for the characters/graphics to be recognized by the number obtained for the dot character/graphic master is equal to or less than a threshold, determines that the row of dots for which the numbers were obtained is missing in the characters/graphics to be recognized.