JP2005148987A - Object identifying method and device, program and recording medium

Info

Publication number: JP2005148987A
Application number: JP2003383579A
Authority: JP
Inventors: Yoshinori Kusachi; 良規草地; Akira Suzuki; 章鈴木; Naoki Ito; 直己伊藤; Kenichi Arakawa; 賢一荒川; Shingo Ando; 慎吾安藤
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2003-11-13
Filing date: 2003-11-13
Publication date: 2005-06-09
Anticipated expiration: 2023-11-13
Also published as: JP4300098B2

Abstract

PROBLEM TO BE SOLVED: To provide an object identifying method and device, a program, and a recording medium that improve the ability to discriminate characters in landscape images.
SOLUTION: The object identifying method and device (identifying device 1) segment an image under consideration (region image under consideration) while changing the position (position under consideration) and size (size under consideration) in a target image; extract features (target feature vector) from the segmented image; compress the feature vector according to the repetition count (target compressed feature vector); calculate the distance (matching degree) obtained when the target compressed feature vector is projected onto a previously input subspace (dictionary 3); and then (1) determine the spatial connectivity of the matching degree of each object, (2) detect the maximum peak within each connected region, and (3) register the detected peak as a candidate. Processes (1) to (3) are repeated a designated number of times.
COPYRIGHT: (C)2005,JPO&NCIPI

Description

本発明は、画像内にどのようなオブジェクトが写っているかを識別する画像識別技術を利用した産業応用システム例えば景観中文字認識システムに関するものである。 The present invention relates to industrial application systems, for example a system for recognizing characters in a landscape, that use an image identification technique for identifying what kind of object appears in an image.

従来の景観に写った文字を認識する技術は、大きく分けて２つに分類される。 Conventional techniques for recognizing characters that appear in a landscape are roughly classified into two types.

１つの方法は、景観の中から文字が写った領域を切り出し、その領域を２値化し、従来の文字認識技術により判別するという技術（以下、文字切り出し方式と称する）である。この方法には特許公報１（特開平６−１３１４９２号公報）に開示されたナンバープレート認識方法がある。 One method is a technique (hereinafter referred to as a character cut-out method) in which an area in which characters are reflected from a landscape is cut out, the area is binarized, and is determined by a conventional character recognition technique. As this method, there is a license plate recognition method disclosed in Japanese Patent Laid-Open No. 6-131492.

もう一つは、文字の変形テンプレートを用意し、画像全面をスキャニングするという技術（以下、テンプレートマッチング方式と称する）である。この方法には特許公報２（特開２００１−３０７０２１号公報）に「パターン列抽出方法及びナンバープレート認識方法」がある。 The other is a technique of preparing deformed character templates and scanning the entire image (hereinafter referred to as the template matching method). An example of this method is the “pattern row extraction method and license plate recognition method” disclosed in Patent Document 2 (Japanese Patent Laid-Open No. 2001-307021).
Patent Document 1: 特開平６−１３１４９２号公報 (JP-A-6-131492)
Patent Document 2: 特開２００１−３０７０２１号公報 (JP-A-2001-307021)

しかしながら、前記文字切り出し方式においては、背景が複雑なテクスチャーを有する場合や文字同士がくっ付いている場合に、文字領域を切り出すこと自体が非常に難しい。また、文字領域に陰影が存在する場合には、２値化に失敗し易いという問題がある。そのため、文字切り出し方式は、適用できる景観画像に大きな制約があった。 However, in the character cutout method, it is very difficult to cut out the character area itself when the background has a complicated texture or when the characters are attached to each other. In addition, when there is a shadow in the character area, there is a problem that binarization tends to fail. Therefore, the character cutout method has a large restriction on the landscape image that can be applied.

テンプレートマッチング方式においては、マッチングの計算自体に多大な時間を要するため、実用的なシステムを構築するには、判別するカテゴリ数が限られるという問題がある。また、景観中文字のようにフォント等の幾何学的形状が非常に多様な場合は、事前に万全なテンプレートを用意することは困難であり、実用的ではないという問題がある。さらに、例えば看板の端を「１」として識別してしまうような、本来は文字ではないところを文字として識別してしまうという問題がある。 In the template matching method, since the matching calculation itself requires a lot of time, there is a problem that the number of categories to be determined is limited in order to construct a practical system. In addition, when the geometric shapes such as fonts are very diverse, such as characters in a landscape, it is difficult to prepare a thorough template in advance, which is not practical. Furthermore, there is a problem that, for example, the end of a signboard is identified as “1”, and a place that is not originally a character is identified as a character.

本発明は、かかる事情に鑑みなされたもので、その目的は、景観中文字の識別能力を高めたオブジェクト識別方法とその装置、プログラム及び記録媒体の提供にある。 The present invention has been made in view of such circumstances, and an object of the present invention is to provide an object identification method, an apparatus, a program, and a recording medium that improve the ability to identify characters in a landscape.

そこで、本発明のオブジェクト識別方法とその装置は、入力画像のある領域に対して特徴ベクトルを求め、辞書との照合により照合度を算出し、この照合度の分布傾向を利用して候補の絞り込みを行うことで、識別の高速化、誤抽出の削減を実現している。 Therefore, the object identification method and apparatus according to the present invention obtain a feature vector for a certain area of an input image, calculate a matching degree by matching with a dictionary, and narrow down candidates using the distribution tendency of the matching degree. As a result, the speed of identification and the reduction of erroneous extraction are realized.

すなわち、本発明は、オブジェクトを識別する段階において、
（１）対象画像において注目する位置（注目位置）及び大きさ（注目サイズ）を変更しながら注目する画像（注目領域画像）を切り出し、この切り出し画像から特徴（対象特徴ベクトル）を抽出し、下記繰り返しの回数に応じて特徴ベクトルを圧縮し（対象圧縮特徴ベクトル）した後に、予め入力された部分空間（認識辞書）を用いて対象圧縮特徴ベクトルを部分空間に投影した際の距離（照合度）を算出し、この各オブジェクトの照合度の空間的な連結性を判定し、
（２）連結領域内での最大のピークを検出し、
（３）この検出されたピークを候補として登録し、
そして、（１）〜（３）の工程を指定された回数繰り返し実行している。
That is, in the step of identifying an object, the present invention:
(1) cuts out an image of interest (attention area image) while changing the position of interest (attention position) and size (attention size) in the target image, extracts features (target feature vector) from the cut-out image, compresses the feature vector according to the repetition count described below (target compressed feature vector), calculates the distance (matching degree) obtained when the target compressed feature vector is projected onto a previously input subspace (recognition dictionary), and determines the spatial connectivity of the matching degree of each object;
(2) detects the maximum peak within each connected region;
(3) registers the detected peak as a candidate;
and repeats steps (1) to (3) a specified number of times.

前記特徴抽出の過程では、各画像の横方向の微分と縦方向の微分成分を計算して微分の方向と強さを算出し、次いで、各画像の定められた領域内の各画素の微分方向を定められた段階に量子化し、次いで、微分の強さを段階毎に累積加算した微分方向ヒストグラム作成し、微分方向ヒストグラムをベクトルとみなしてその大きさを定められた値に正規化している。 In the feature extraction process, the horizontal direction differential and the vertical direction differential component of each image are calculated to calculate the direction and strength of the differential, and then the differential direction of each pixel in the defined region of each image Is quantized to a predetermined stage, and then a differential direction histogram is created by accumulating and adding the strength of the differentiation for each stage, and the differential direction histogram is regarded as a vector and the magnitude thereof is normalized to a predetermined value.

辞書との照合度は、注目位置同士が近ければ急激には変化せず、滑らかに変化する。本発明は、注目位置同士が近いものの中で、そのピークを検出し、後段で詳細な辞書と比較するため、辞書とは似ているのだが、より良い候補が近辺にある候補を削減することができ、識別処理の高速化を実現できる。 The matching degree with the dictionary does not change abruptly between attention positions that are close to each other; it varies smoothly. The present invention detects the peak among nearby attention positions and compares only that peak with a detailed dictionary at a later stage. Candidates that resemble the dictionary but have a better candidate nearby are thereby eliminated, which speeds up the identification process.

また、辞書との照合度を算出してその空間的な連結性を判定し連結領域内でピークを検出するため、文字領域以外での誤検出を抑制することができる。例えば、看板の端を「１」と誤検出してしまう誤りは看板の端全体に出現するが、本ピーク検出によりその誤り候補位置を１点に絞り込むことができる。また、粗探索においてピーク検出を行うことによって多くの候補を削除できるので、識別処理を高速化できる。 In addition, since the matching degree with the dictionary is calculated, its spatial connectivity is determined, and the peak is detected within each connected region, erroneous detection outside character regions can be suppressed. For example, the error of mistaking the edge of a signboard for “1” appears along the entire edge of the signboard, but this peak detection narrows the erroneous candidate positions down to a single point. In addition, since many candidates can be eliminated by performing peak detection during the coarse search, the identification process can be sped up.

さらに、圧縮率の高いベクトルにおいては、識別率が低い反面、投影距離を高速に求めることができ、多くの候補を高速に除外できる。圧縮率の低いベクトルにおいては、識別率が高い反面、投影距離の計算にコストがかかる。本発明によると、複数の圧縮率により、精度が悪い高速処理と、精度を重視した低速処理を組み合わせることが可能となる。これにより全体の処理を高速且つ高精度に実現できる。 Furthermore, a vector with a high compression rate has a low identification rate, but can obtain a projection distance at a high speed and can exclude many candidates at a high speed. A vector with a low compression rate has a high identification rate but is expensive in calculating the projection distance. According to the present invention, it is possible to combine high-speed processing with poor accuracy and low-speed processing with an emphasis on accuracy by a plurality of compression rates. Thereby, the entire process can be realized at high speed and with high accuracy.
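The combination of fast, coarse stages and slow, accurate stages described above can be sketched as a simple cascade. This is a minimal illustration and not the patented procedure itself: the function name `coarse_to_fine`, the representation of a stage as a `(distance_fn, threshold)` pair, and the omission of the peak-detection step are all assumptions made for brevity.

```python
def coarse_to_fine(candidates, stages):
    """Filter candidates through successive stages, cheapest first.

    Each stage is a hypothetical (distance_fn, threshold) pair. A candidate
    survives a stage only if its distance is at or below the threshold, so
    later (more expensive) stages see fewer candidates.
    """
    for distance_fn, threshold in stages:
        candidates = [c for c in candidates if distance_fn(c) <= threshold]
    return candidates
```

Because each stage only sees the survivors of the previous one, the expensive low-compression distance is computed for a small fraction of the positions and sizes scanned.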

また、正規化した微分方向ヒストグラムは、２値化を必要としないため、陰影の混入に強い。さらに、文字のフォントの多様な形状変動に対して、正規化した微分方向ヒストグラムは大きく変動しないため、代表的なフォントを用いて事前に万全な認識辞書を作成することが可能である。 Moreover, since the normalized differential direction histogram does not require binarization, it is resistant to shadows. Furthermore, since the normalized differential direction histogram does not vary greatly with respect to various shapes of character fonts, it is possible to create a complete recognition dictionary using a typical font in advance.

尚、本発明のオブジェクト識別方法とその装置はコンピュータにその手順及び手段を実行するためのプログラムによっても実現でき、さらには、このプログラムをコンピュータ読み取り可能な記録媒体に記録すること、ネットワークを通して提供することも可能である。記録媒体としては、フレキシブルディスクや、ＨＤＤ、ＭＯ、ＲＯＭ、メモリカード、ＣＤ、ＤＶＤ、リムーバルディスク等が例示される。 The object identification method and apparatus of the present invention can be realized by a program for executing the procedure and means on a computer, and further, the program is recorded on a computer-readable recording medium and provided through a network. It is also possible. Examples of the recording medium include a flexible disk, HDD, MO, ROM, memory card, CD, DVD, and removable disk.

本発明によれば、辞書との照合度を算出してその空間的な連結性を判定し連結領域内でピークを検出するので、文字領域以外での誤検出を抑制することができる。また、粗探索においてピーク検出を行うことによって多くの候補を削除できるので、識別処理を高速化することができる。 According to the present invention, since the degree of collation with the dictionary is calculated to determine its spatial connectivity and the peak is detected in the connected region, it is possible to suppress erroneous detection outside the character region. In addition, since many candidates can be deleted by performing peak detection in the rough search, the identification process can be speeded up.

さらに、圧縮率の高いベクトルにおいては、識別率が低い反面、投影距離を高速に求めることができるので、多くの候補を高速に除外できる。圧縮率の低いベクトルにおいては、識別率が高い反面、投影距離の計算にコストがかかるが、本発明によれば、精度を保ったまま、画像中に存在する多数オブジェクトの抽出及び識別を高速化できる。 Furthermore, a vector with a high compression rate has a low identification rate, but its projection distance can be obtained quickly, so many candidates can be excluded at high speed. A vector with a low compression rate has a high identification rate, but calculating its projection distance is expensive. According to the present invention, the extraction and identification of the many objects present in an image can be sped up while accuracy is maintained.

また、本発明によって正規化した微分方向ヒストグラムは、２値化を必要としないので、陰影の混入に強い。さらに、文字のフォントの多様な形状変動に対して、正規化した微分方向ヒストグラムは大きく変動しないので、代表的なフォントを用いて事前に万全な辞書を作成することが可能である。 In addition, the differential direction histogram normalized according to the present invention does not require binarization, and thus is resistant to shadows. Furthermore, since the normalized differential direction histogram does not vary greatly with respect to various shape variations of character fonts, it is possible to create a complete dictionary in advance using typical fonts.

本発明の実施の形態について図面を参照しながら説明する。 Embodiments of the present invention will be described with reference to the drawings.

図１は、本発明を景観中文字認識翻訳システムの概略構成図である。 FIG. 1 is a schematic configuration diagram of a character recognition / translation system for landscapes according to the present invention.

本システムにより、ユーザは撮影した文字の画像を基にその文字の翻訳情報をみることができる。但し、文字の翻訳辞書を有していることが前提となる。システムは、本発明の認識装置及び翻訳情報蓄積検索装置から構成される。 With this system, the user can view translation information of a character based on the photographed character image. However, it is premised on having a character translation dictionary. The system is composed of a recognition device and a translation information storage / retrieval device of the present invention.

本発明に係る識別装置１は、ユーザ２によって入力された画像と認識辞書（図１においては辞書３と表記）を利用し、画像に撮影された文字を識別する。 The identification device 1 according to the present invention uses an image input by the user 2 and a recognition dictionary (denoted as the dictionary 3 in FIG. 1) to identify characters photographed in the image.

翻訳装置４は、文字列と翻訳情報を蓄積しておき、文字列から翻訳情報を検索する手段であり、一般のデータベースにより構築できるため、本実施形態例ではその詳細な説明は省略する。 The translation device 4 is a means for accumulating character strings and translation information and retrieving translation information from the character strings, and can be constructed with a general database. Therefore, detailed description thereof is omitted in this embodiment.

以下の実施形態例では、「電」、「信」、「話」の３種類の文字を識別する事例について説明する。但し、本実施形態例は、オブジェクトを３種類に限定するものではなく、何種類にでも拡張可能である。 In the following embodiment, an example of identifying three characters, “電” (den), “信” (shin), and “話” (wa), is described. However, this embodiment does not limit the objects to three types and can be extended to any number of types.

図２は、請求項１に係る発明の実施形態例を示した概略構成図であって、特に識別装置１の構成を説明したものである。 FIG. 2 is a schematic configuration diagram showing an embodiment of the invention according to claim 1, and particularly describes the configuration of the identification device 1.

識別装置１は、入力手段１０と全カテゴリ登録手段１１と繰り返し制御手段１２と画像切り出し手段１３と特徴抽出手段１４と照合度算出手段１５と連結領域内ピーク検出手段１６と候補カテゴリ更新手段１７とを備える。 The identification device 1 includes an input means 10, an all category registration means 11, an iterative control means 12, an image cutout means 13, a feature extraction means 14, a matching degree calculation means 15, a connected region peak detection means 16, and a candidate category update means 17.

入力手段１０は、識別したい対象画像を入力する。 The input means 10 inputs a target image to be identified.

全カテゴリ登録手段１１は、定められた位置及びその大きさに対して全カテゴリを候補カテゴリとして登録する。 All category registration means 11 registers all categories as candidate categories for the determined position and its size.

繰り返し制御手段１２は、入力された数（Ｉ）分の後述の画像切り出し手段１３、特徴抽出手段１４、照合度算出手段１５、連結領域内ピーク検出手段１６、候補カテゴリ更新手段１７の処理を繰り返し実行制御する。 The iterative control means 12 controls repeated execution, for the input number of times (I), of the processes of the image cutout means 13, feature extraction means 14, matching degree calculation means 15, connected region peak detection means 16, and candidate category update means 17, all described later.

画像切り出し手段１３は、対象画像において注目する位置（注目位置）及び大きさ（注目サイズ）を変更しながら注目する画像（切り出し画像）を切り出す。 The image cutout means 13 cuts out the image of interest (cut-out image) while changing the position of interest (attention position) and size (attention size) in the target image.

特徴抽出手段１４は、切り出し画像から特徴（対象特徴ベクトル）を抽出する。 The feature extraction unit 14 extracts a feature (target feature vector) from the cut-out image.

照合度算出手段１５は、予め作成してある繰り返しの回数（ｉ）に応じた各オブジェクトの認識辞書（辞書３）と対象特徴ベクトルを比較して、その照合度合いを表す照合度を算出する。 The matching level calculation means 15 compares the recognition dictionary (dictionary 3) of each object and the target feature vector according to the number of repetitions (i) created in advance, and calculates a matching level representing the matching level.

連結領域内ピーク検出手段１６は、各オブジェクトの照合度の空間的な連結性を判定し、連結領域内での照合度の最大ピークを検出する。 The connected region peak detection means 16 determines the spatial connectivity of the matching degree of each object, and detects the maximum peak of the matching degree in the connected region.

候補カテゴリ更新手段１７は、検出されたピークを候補として登録する。 The candidate category update means 17 registers the detected peak as a candidate.

全カテゴリ登録手段１１によるカテゴリの登録例について説明する。 An example of category registration by all category registration means 11 will be described.

図８は対象画像の一例である。図９は、対象画像中の注目点（画像中の灰色の画素）を示した図であって、「定められた位置」の一例を示すものである。注目点を中心として、定められた大きさの矩形画像を定義されている。図１０は、「定められた大きさ」の例であって、サイズＡとサイズＢが示されている。 FIG. 8 is an example of the target image. FIG. 9 is a diagram showing a point of interest (a gray pixel in the image) in the target image, and shows an example of a “predetermined position”. A rectangular image having a predetermined size is defined around the attention point. FIG. 10 is an example of “determined size”, and size A and size B are shown.

初期の候補カテゴリは、以下のフォーマットで登録される。 The initial candidate category is registered in the following format.

「画像中の注目位置（ｘ）、画像中の注目位置（ｙ）、注目サイズ、候補カテゴリ１、候補カテゴリ２ … 候補カテゴリＱ」 “Attention position (x) in image, attention position (y) in image, attention size, candidate category 1, candidate category 2, ..., candidate category Q”

本例では、３つのカテゴリがあるため、候補カテゴリは以下のようになる。 In this example, since there are three categories, the candidate categories are as follows.

「注目位置ｘ、注目位置ｙ、各注目サイズ、電、信、話」。つまり、各画素、各注目サイズを切り出した矩形画像の文字カテゴリの候補は「電」、「信」、「話」であることを意味する。 “Attention position x, attention position y, each attention size, 電, 信, 話”. That is, for the rectangular image cut out at each pixel and each attention size, the candidate character categories are “電”, “信”, and “話”.
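The registration format above can be illustrated with a small data structure. The class name `Candidate`, the function name `register_all_categories`, and the use of string labels for sizes are hypothetical choices for this sketch; the source only specifies the fields of each record.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Candidate:
    x: int        # attention position (x) in the image
    y: int        # attention position (y) in the image
    size: str     # attention size label, e.g. "A" or "B"
    categories: list = field(default_factory=list)  # candidate categories


def register_all_categories(positions, sizes, categories):
    """Register every category as a candidate at every (position, size)."""
    return [Candidate(x, y, s, list(categories))
            for (x, y), s in product(positions, sizes)]


# Two attention points, two sizes, three character categories.
initial = register_all_categories([(0, 0), (4, 0)], ["A", "B"],
                                  ["電", "信", "話"])
```

Each record holds its own copy of the category list, so later stages can prune the candidates of one position and size without affecting the others.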

次に、照合度算出手段１５に入力される認識辞書について説明する。 Next, the recognition dictionary input to the matching degree calculation unit 15 will be described.

繰り返しの回数（ｉ）に応じた各オブジェクトの認識辞書は、例えば、次のように作成する。各オブジェクトの複数画像から、解像度の異なる特徴ベクトルや圧縮度合いを変えた特徴ベクトルを生成し、特徴ベクトル群を主成分分析することによって作成する。繰り返しの回数ｉ＝１には、解像度の低い認識辞書を対応させ、繰り返しの回数ｉ＝Ｉには、解像度の高い認識辞書を対応させる。この方法の一例として、以下のオブジェクト認識方法がある。 The recognition dictionary for each object corresponding to the number of repetitions (i) is created as follows, for example. A feature vector having a different resolution or a feature vector with a different degree of compression is generated from a plurality of images of each object, and the feature vector group is created by principal component analysis. A recognition dictionary with a low resolution is associated with the number of repetitions i = 1, and a recognition dictionary with a high resolution is associated with the number of repetitions i = I. As an example of this method, there is the following object recognition method.

この方法は、オブジェクトを含む複数画像を用いて複数のオブジェクトを登録し、画像中に１個以上存在する登録されたオブジェクトを識別するものであって、前記複数のオブジェクトを登録する過程は、複数画像から特徴ベクトルを抽出する特徴抽出過程と、前記抽出した全ての特徴ベクトルに対して主成分分析を行う主成分分析過程と、計算された特徴ベクトルから圧縮した圧縮主成分ベクトルを出力する圧縮主成分出力過程と、各特徴ベクトルを入力された異なる圧縮率により圧縮して出力する圧縮過程と、前記圧縮された各オブジェクトの圧縮特徴ベクトルの部分空間を求めて出力する部分空間生成過程と、指定された複数圧縮率の個数分だけ前記圧縮過程と部分空間生成過程を複数回繰り返す制御過程とを有している。 In this method, a plurality of objects are registered using a plurality of images including the objects, and one or more registered objects existing in the image are identified. The process of registering the plurality of objects includes a plurality of processes. A feature extraction process for extracting feature vectors from an image, a principal component analysis process for performing principal component analysis on all the extracted feature vectors, and a compressed principal vector for outputting compressed principal component vectors compressed from the calculated feature vectors A component output process, a compression process in which each feature vector is compressed and output at different input compression rates, a subspace generation process in which a subspace of the compressed feature vector of each compressed object is obtained and output, and designation There is a control process in which the compression process and the partial space generation process are repeated a plurality of times by the number of the plurality of compression ratios.

前記圧縮過程としては、全オブジェクトの特徴ベクトルを全て入力する全特徴入力過程と、入力された圧縮率から圧縮する特徴ベクトルの次元数（Ｎ）を求めて上位Ｎ個の主成分ベクトルと特徴ベクトルの内積を計算し、各内積値をベクトルとみなした圧縮特徴ベクトルを出力する圧縮特徴出力過程とを有するものがある。 As the compression process, all the feature input processes for inputting all feature vectors of all objects, and the number of dimensions (N) of the feature vectors to be compressed from the input compression rate, the top N principal component vectors and feature vectors are obtained. And a compressed feature output process for outputting a compressed feature vector in which each inner product value is regarded as a vector.

前記部分空間生成過程としては、各オブジェクトの圧縮特徴ベクトルを入力する各オブジェクト圧縮特徴入力過程と、前記圧縮特徴ベクトルに対して主成分分析を行う主成分分析過程と、前記分析した部分空間主成分を出力する部分空間主成分出力過程とを有するものがある。 As the subspace generation process, each object compression feature input process for inputting the compression feature vector of each object, a principal component analysis process for performing principal component analysis on the compressed feature vector, and the analyzed subspace principal component And a subspace principal component output process for outputting.

オブジェクトを識別する過程としては、識別したい対象画像を入力する入力過程と、対象画像において注目する位置及びそのサイズを変更しながら注目する領域画像を切り出す注目領域画像切り出し過程と、領域画像から対象特徴ベクトルを抽出する特徴抽出過程と、全カテゴリを候補カテゴリとして登録する全カテゴリ登録過程と、指定された複数の圧縮率により候補を絞り込む候補識別過程と、入力された複数圧縮率の個数分だけ前記候補識別過程の処理を繰り返す制御過程とを有するものがある。 The process of identifying an object includes an input process of inputting a target image to be identified, a target area image cutting process of cutting out a target area image while changing the position and size of interest in the target image, and target features from the area image. A feature extraction process for extracting vectors, an all category registration process for registering all categories as candidate categories, a candidate identification process for narrowing down candidates by a plurality of specified compression ratios, and the number of input multiple compression ratios Some have a control process that repeats the candidate identification process.

前記候補識別過程としては、入力された圧縮率から次元数（Ｎ）を求めて上位Ｎ個の前記主成分と対象特徴ベクトルの内積を計算し、各内積値をベクトルとみなした対象圧縮特徴ベクトルを算出する対象圧縮特徴ベクトル算出過程と、各候補カテゴリの該当する圧縮率に対する部分空間主成分を用いて対象圧縮特徴ベクトルを部分空間に投影した際の投影距離を算出する投影距離算出過程と、入力された閾値以下の各候補の投影距離の上位Ｋ個を新しい候補とする候補カテゴリ更新過程と、入力された閾値以下の候補を識別結果として出力する識別結果出力過程と、を有するものがある。 As the candidate identification process, the number of dimensions (N) is obtained from the input compression rate, the inner product of the top N principal components and the target feature vector is calculated, and the target compressed feature vector in which each inner product value is regarded as a vector A target compression feature vector calculation process for calculating the target compression feature vector, and a projection distance calculation process for calculating a projection distance when the target compression feature vector is projected onto the partial space using the main component of the partial space for the compression rate corresponding to each candidate category; Some have a candidate category update process in which the top K projection distances of each candidate below the input threshold are new candidates, and an identification result output process that outputs candidates below the input threshold as an identification result .

図３は、請求項２及び３に係る発明の実施形態例を示した概略構成図であって、特に照合度算出手段１５の概略構成を示している。 FIG. 3 is a schematic configuration diagram showing an embodiment of the invention according to claims 2 and 3, and particularly shows a schematic configuration of the matching degree calculation means 15.

照合度算出手段１５は、圧縮手段１５１と投影距離算出手段１５２と順位算出手段１５３と順位カット手段１５４とを備える。 The collation degree calculation unit 15 includes a compression unit 151, a projection distance calculation unit 152, a rank calculation unit 153, and a rank cut unit 154.

圧縮手段１５１は、繰り返しの回数（ｉ）に応じた入力された圧縮率（例えば対応する辞書と同じ圧縮率）により特徴ベクトルを圧縮した対象圧縮特徴ベクトルを算出する。 The compression unit 151 calculates a target compressed feature vector obtained by compressing the feature vector with an input compression rate (for example, the same compression rate as that of the corresponding dictionary) according to the number of repetitions (i).

投影距離算出手段１５２は、各候補カテゴリの該当する圧縮率に対するあらかじめ入力された部分空間（認識辞書）を用いて対象圧縮特徴ベクトルを部分空間に投影した際の距離（投影距離）を算出する。 The projection distance calculation unit 152 calculates a distance (projection distance) when the target compression feature vector is projected onto the partial space using the partial space (recognition dictionary) input in advance for the compression rate corresponding to each candidate category.

順位算出手段１５３は、各候補カテゴリの投影距離による順位を求める。 The rank calculation means 153 obtains a rank according to the projection distance of each candidate category.

順位カット手段１５４は、指定された順位以下の候補カテゴリの投影距離を大きくする。 The rank cutting means 154 increases the projection distance of candidate categories below the specified rank.
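The ranking and rank-cut steps can be sketched as follows. The source says only that the projection distance of low-ranked categories is "increased"; replacing it with infinity is an assumption of this sketch, as are the function name `rank_and_cut` and the parameter `keep`.

```python
import math


def rank_and_cut(distances, keep):
    """Rank categories by ascending projection distance and penalize the rest.

    distances: dict mapping category -> projection distance.
    keep: number of top-ranked categories whose distance is left unchanged.
    Categories ranked below `keep` get an infinite distance (an assumption).
    """
    ranked = sorted(distances, key=distances.get)
    return {c: (distances[c] if i < keep else math.inf)
            for i, c in enumerate(ranked)}
```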

照合度は、矩形画像とカテゴリとの類似性を表すものであり、各サイズの各注目位置毎に算出する。 The collation degree represents the similarity between the rectangular image and the category, and is calculated for each position of interest of each size.

図１１は、照合度算出手段１５による算出結果の一例を示したものである。 FIG. 11 shows an example of a calculation result by the matching degree calculation means 15.

サイズＡでの「電」の照合度、サイズＢでの照合度の例が示されている。図において、黒い画素ほど照合度が高い（投影距離が小さい）ことを意味する。この時点で、ある位置の照合度が閾値以下であるか、識別順位が決められた順位以下である場合は、その位置は候補として削除される。 An example of the matching degree of “den” at size A and the matching degree at size B is shown. In the figure, the darker the pixel, the higher the matching degree (the shorter the projection distance). At this time, if the collation degree at a certain position is equal to or lower than the threshold value or equal to or lower than the determined rank, the position is deleted as a candidate.

圧縮には、予め特徴ベクトル群を主成分分析して求めた上位Ｋ個の主成分を用いて、特徴変換することにより圧縮する。Ｋを変更することにより圧縮率を変更できる。主成分分析は、既知のものを採用すればよく、例えば、大津展之ほか著の「パターン認識」（朝倉書店発行，ｐｐ．３５）に開示されたものを用いるとよい。また、部分空間は、予め圧縮した特徴ベクトル群に対して再び主成分分析を行い、その主成分の上位Ｌ個により部分空間を構成する。Ｌは繰り返しの回数ｉにより変化するパラメータである。 The compression is performed by performing feature conversion using the top K principal components obtained by performing principal component analysis on the feature vector group in advance. The compression rate can be changed by changing K. For the principal component analysis, a known one may be employed, for example, the one disclosed in “Pattern Recognition” (published by Asakura Shoten, pp. 35) by Nobuyuki Otsu et al. In the partial space, the principal component analysis is performed again on the pre-compressed feature vector group, and the partial space is configured by the top L of the principal components. L is a parameter that varies depending on the number of repetitions i.
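The compression step can be sketched as a projection onto the leading principal components. The sketch assumes the principal components have already been computed and are supplied as a list of vectors ordered by decreasing variance; the names `dot` and `compress` are hypothetical, and no mean subtraction is applied because the source does not mention one.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def compress(feature, principal_components, k):
    """Compress a feature vector to k dimensions.

    Each output element is the inner product of the feature vector with one
    of the top-k principal components. A smaller k means higher compression.
    """
    return [dot(pc, feature) for pc in principal_components[:k]]
```

Calling `compress` with a small `k` in early repetitions and a larger `k` in later ones gives the variable compression rate described above.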

投影距離は、以下の方法により算出すればよい。この方法は各候補カテゴリの該当する圧縮率に対する部分空間主成分を用いて対象圧縮特徴ベクトルを部分空間に投影した際の投影距離を算出する。 The projection distance may be calculated by the following method. In this method, the projection distance when the target compression feature vector is projected onto the partial space using the principal component of the partial space corresponding to the compression rate corresponding to each candidate category is calculated.

先ず、入力された圧縮率から次元数（Ｎ）を求めて上位Ｎ個の主成分と対象特徴ベクトル（ＩＦ）の内積を計算し、各内積値をベクトルとみなした対象圧縮特徴ベクトル（ＡＩＦ）を算出する。次いで、各候補カテゴリの該当する圧縮率に対する部分空間主成分を用いて対象圧縮特徴ベクトルを部分空間に投影した際の距離（投影距離）を算出する。各カテゴリの投影距離Ｌ（ｃ）は数１式により算出される。この式において、第ｒ部分空間主成分をＢＡＦ（ｃ，ｒ）と表し、ＢＡＦ（ｃ，ｒ）はＮ次元のベクトルで表現される（ただし、ｒ＝１〜Ｎ）。また、各圧縮率に対応したＲは事前に入力された定数である。 First, the number of dimensions (N) is obtained from the input compression rate, the inner product of the top N principal components and the target feature vector (IF) is calculated, and the target compressed feature vector (AIF) in which each inner product value is regarded as a vector. Is calculated. Next, a distance (projection distance) when the target compression feature vector is projected onto the partial space using the partial space principal component corresponding to the compression rate corresponding to each candidate category is calculated. The projection distance L (c) of each category is calculated by the equation (1). In this equation, the r-th subspace principal component is represented as BAF (c, r), and BAF (c, r) is represented by an N-dimensional vector (where r = 1 to N). R corresponding to each compression rate is a constant input in advance.
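Equation 1 is not reproduced in this text. The sketch below therefore assumes the form commonly used in subspace methods: the squared length of the compressed vector minus the sum of its squared inner products with the subspace basis vectors. It further assumes that the basis vectors BAF(c, r) are orthonormal and that the subspace passes through the origin. The function names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def projection_distance(aif, basis):
    """Squared distance from the compressed vector to a category subspace.

    aif: target compressed feature vector (AIF).
    basis: orthonormal subspace principal components BAF(c, r) of one
           category c (orthonormality is an assumption of this sketch).
    """
    energy = dot(aif, aif)                            # |AIF|^2
    projected = sum(dot(aif, b) ** 2 for b in basis)  # energy inside subspace
    return energy - projected
```

A distance of zero means the compressed vector lies entirely within the category's subspace, which corresponds to the highest matching degree.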

図４は、請求項４に係る発明の実施形態例における特徴抽出手段の概略構成図である。 FIG. 4 is a schematic configuration diagram of the feature extraction means in the embodiment of the invention according to claim 4.

特徴抽出手段１４は、入力手段１４１と微分強度方向計算手段１４２と微分方向ヒストグラム化手段１４３と微分ヒストグラム正規化手段１４４とを備える。 The feature extracting unit 14 includes an input unit 141, a differential intensity direction calculating unit 142, a differential direction histogram forming unit 143, and a differential histogram normalizing unit 144.

入力手段１４１は、複数画像を入力する。 The input unit 141 inputs a plurality of images.

微分強度方向計算手段１４２は、各画像の横方向の微分と縦方向の微分成分を計算して微分の方向と強さを算出する。 The differential intensity direction calculating unit 142 calculates the horizontal and vertical differential components of each image and, from them, the direction and strength of the differential.

微分方向ヒストグラム化手段１４３は、各画像に対し、定められた領域内の各画素の微分方向を定められた段階に量子化し、微分の強さを段階毎に累積加算した微分方向ヒストグラムを作成する。 The differential direction histogram forming means 143 quantizes the differential direction of each pixel in a predetermined area for each image at a predetermined stage, and creates a differential direction histogram in which the strength of the differential is cumulatively added for each stage. .

微分ヒストグラム正規化手段１４４は、前記作成された微分方向ヒストグラムをベクトルとみなしてその大きさを定められた値に正規化する。 The differential histogram normalizing unit 144 regards the created differential direction histogram as a vector and normalizes the magnitude to a predetermined value.

図５は微分強度方向計算手段１４２の動作例を説明したものである。 FIG. 5 illustrates an example of the operation of the differential intensity direction calculation unit 142.

原画像Ｉの横をｘ軸、縦をｙ軸と考える。画像は横Ｘピクセル×縦Ｙピクセルであり、画像サイズはＸ×Ｙとなる。先ず、原画像に対し、ソーベルオペレータを作用させ、ｘ方向の微分を計算したｘ方向微分画像Ｄｘとｙ方向の微分を計算したｙ方向微分画像Ｄｙを生成する。但し、ソーベルオペレータを用いるのは一例であって、その他の方法であってもよい。次に、微分強度画像Ｄｉと微分方向画像Ｄｄの各画素を以下の数２式で算出する。 The horizontal direction of the original image I is considered to be the x axis and the vertical direction is considered to be the y axis. The image has horizontal X pixels × vertical Y pixels, and the image size is X × Y. First, a Sobel operator is applied to the original image to generate an x-direction differential image Dx in which the x-direction differential is calculated and a y-direction differential image Dy in which the y-direction differential is calculated. However, using the Sobel operator is merely an example, and other methods may be used. Next, each pixel of the differential intensity image Di and the differential direction image Dd is calculated by the following equation (2).
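Equation 2 is not reproduced in this text. The following sketch assumes the usual definitions, namely strength Di = sqrt(Dx² + Dy²) and direction Dd = atan2(Dy, Dx), applied to 3×3 Sobel responses. The function name `gradient`, the list-of-lists image representation, and leaving border pixels at zero are assumptions made for brevity.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def gradient(image):
    """Return (Di, Dd): per-pixel differential strength and direction.

    image: list of rows of grey values. Directions are in radians.
    Border pixels are left at zero (an assumption of this sketch).
    """
    h, w = len(image), len(image[0])
    di = [[0.0] * w for _ in range(h)]
    dd = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            dx = sum(SOBEL_X[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            dy = sum(SOBEL_Y[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            di[y][x] = math.hypot(dx, dy)   # differential strength
            dd[y][x] = math.atan2(dy, dx)   # differential direction
    return di, dd
```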

微分方向ヒストグラム化手段の動作例を図６に示す。 An example of the operation of the differential direction histogram forming means is shown in FIG.

各画素において、まず、角度を量子化する。次に、その画素の強度を、その画素が所属する領域の、該当する角度のヒストグラムに、加算する。以上により、微分方向ヒストグラムが作成される。例では、画像を４（ｎ）分割した領域内で方向を５（ｍ）段階に量子化してヒストグラムを作成しており、特徴は、２０（ｎ×ｍ）次元のベクトルＧで表現される。 In each pixel, the angle is first quantized. Next, the intensity of the pixel is added to the histogram of the corresponding angle in the region to which the pixel belongs. Thus, a differential direction histogram is created. In the example, a histogram is created by quantizing the direction into 5 (m) steps within an area obtained by dividing the image into 4 (n), and the feature is expressed by a vector G of 20 (n × m) dimensions.

微分方向ヒストグラムは、以下の数３式計算により正規化し、特徴ベクトル（Ｆ）とする。但し、ベクトルＧのｋ次元目のスカラー値をＧ（ｋ）と表す。 The differential direction histogram is normalized by the following equation 3 to obtain a feature vector (F). However, the scalar value of the kth dimension of the vector G is represented as G (k).
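The histogram and normalization steps can be sketched together. Equation 3 is not reproduced in this text, so the sketch assumes Euclidean normalization to a fixed length. It also assumes that directions lie in [-π, π] and are split into m equal bins, and that the image is divided into a square grid of regions. The function names and parameters are hypothetical.

```python
import math


def direction_histogram(di, dd, n_side=2, m=5):
    """Build the differential direction histogram G.

    di, dd: strength and direction images of equal size.
    n_side: regions per side (n = n_side * n_side regions in total).
    m: number of direction quantization steps.
    Returns a vector of n * m accumulated strengths.
    """
    h, w = len(di), len(di[0])
    g = [0.0] * (n_side * n_side * m)
    for y in range(h):
        for x in range(w):
            region = (y * n_side // h) * n_side + (x * n_side // w)
            # Map the angle from [-pi, pi] to a bin index in 0..m-1.
            b = int((dd[y][x] + math.pi) / (2 * math.pi) * m) % m
            g[region * m + b] += di[y][x]
    return g


def normalize(g, target=1.0):
    """Scale G so that its Euclidean length equals `target`."""
    norm = math.sqrt(sum(v * v for v in g))
    if norm == 0.0:
        return list(g)  # nothing to scale in a flat region
    return [target * v / norm for v in g]
```

With the defaults (four regions and five direction steps) the result is a 20-dimensional vector, matching the example in the text.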

FIG. 7 is a schematic configuration diagram of the connected-region peak detection means in an embodiment of the invention according to claim 5.

The connected-region peak detection means 16 includes repetition control means A 161, repetition control means B 162, right lateral connectivity determination means 163, lower connectivity determination means 164, region extraction means 165, and peak point output means 166.

The repetition control means A 161 controls the repeated execution of the processing of the repetition control means B 162, described below, over all positions of interest and sizes of interest.

The repetition control means B 162 controls the repeated execution of the processing by the right lateral connectivity determination means 163 and the lower connectivity determination means 164, described below, for each category registered as a candidate at the position of interest and size of interest.

The right lateral connectivity determination means 163 determines connection when the matching degrees at the position of interest and at the position immediately to its right are equal to or less than the threshold TH.

The lower connectivity determination means 164 determines connection when the matching degrees at the position of interest and at the position immediately below it are equal to or less than the predetermined threshold TH.

The region extraction means 165 extracts each group of connected positions of interest as one region.

The peak point output means 166 finds and outputs the position and size at which the matching degree is maximum within each region.

As an embodiment of the invention according to claim 6, the connected-region peak detection means 16 may further include front connectivity determination means.

In the connected-region peak detection means 16 shown in FIG. 7, the repetition control means A 161 repeats the processing of the repetition control means B 162 over all positions of interest and sizes of interest. The repetition control means B 162 then repeats the processing of the right lateral connectivity determination means 163 and the lower connectivity determination means 164 for each category registered as a candidate at the position of interest and size of interest. The right lateral connectivity determination means 163 determines connection when the matching degrees at the position of interest and at the position to its right are equal to or less than the threshold TH. The lower connectivity determination means 164 then determines connection when the matching degrees at the position of interest and at the position below it are equal to or less than the predetermined threshold TH.

Here, the front connectivity determination means (not shown) determines connection when the matching degrees at the position of interest and at the same position at the size one step above the size of interest are equal to or less than a predetermined threshold.

The region extraction means 165 then extracts each group of connected positions of interest as one region, and the peak point output means 166 finds and outputs the position and size at which the matching degree is maximum within each region.

FIG. 12 shows an example of the connection determination at point P. In this example, the positions to the right of and below point P, and the same position at size B one step above, are determined to be connected to it.

FIG. 13 shows an example of the connection determination at all points. In this example, point P is determined to be connected to the positions to its left and right and above and below it, and to the same position at size B one step above (point Q). Likewise, point Q is determined to be connected to the positions to its left and right and above and below it, and to the same position at size A one step below (point P).

FIG. 14 shows an example of the region extraction result: connected positions are extracted as a single region, and each region is given a label. The pixels of the region containing points P and Q are labeled region 1, and the pixels of the region containing points R and S are labeled region 2.

FIG. 15 shows an example of the peak point output result: point P, which has the minimum projection distance within region 1, and point R, which has the minimum projection distance within region 2, are output as peak points.
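The connectivity determination, region extraction, and peak point output described above can be sketched in Python for a single category as follows. The sketch assumes that the matching degree is a projection distance, so that smaller values indicate a better match and the peak of a region is the point with the smallest distance, as in the FIG. 15 example. The dictionary layout, the function name `detect_peaks`, and the use of a union-find structure for labeling are choices made for the example and are not taken from the source.

```python
def detect_peaks(degrees, threshold):
    """Label connected regions and return one peak per region.

    degrees:   dict mapping (x, y, size_index) to a matching degree, here
               treated as a projection distance (smaller means a better match).
    threshold: the value TH; two neighbouring points are joined when both
               degrees are at or below it.
    """
    parent = {point: point for point, d in degrees.items() if d <= threshold}

    def find(point):
        while parent[point] != point:
            parent[point] = parent[parent[point]]   # path halving
            point = parent[point]
        return point

    for (x, y, s) in list(parent):
        # Right, lower, and one-size-up neighbours, as in the three determinations.
        for neighbour in ((x + 1, y, s), (x, y + 1, s), (x, y, s + 1)):
            if neighbour in parent:
                parent[find(neighbour)] = find((x, y, s))

    peaks = {}
    for point in parent:
        root = find(point)
        if root not in peaks or degrees[point] < degrees[peaks[root]]:
            peaks[root] = point
    return sorted(peaks.values())
```

Checking only the right, lower, and one-size-up neighbours is enough to link every adjacent pair, because each pair is visited once from its left, upper, or smaller-size member.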

FIG. 16 is a schematic configuration diagram of the candidate category update means in the object recognition device according to claim 7.

The candidate category update means 17 shown in the figure includes peripheral region calculation means 171 and peripheral region registration means 172. The peripheral region calculation means 171 calculates the peripheral region of each peak point obtained. The peripheral region registration means 172 registers the peak point and its peripheral region as candidates.

FIG. 17 shows an example of the result of the peripheral region calculation means 171. In this example, the positions above, below, to the left of, and to the right of the peak point, and the position one size step above it, are defined as the peripheral region. This is only one example, and other definitions, such as a polyhedron, may be used.
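The peripheral region of the FIG. 17 example can be sketched in Python as follows. Positions are assumed to lie on an integer grid of `width` × `height` points with `num_sizes` size steps, and neighbours outside that grid are dropped; these bounds and the function name `peripheral_region` are assumptions made for the example.

```python
def peripheral_region(peak, width, height, num_sizes):
    """Return the peak plus its neighbours: up, down, left, right, one size up."""
    x, y, s = peak
    candidates = [
        (x, y, s),
        (x, y - 1, s), (x, y + 1, s),
        (x - 1, y, s), (x + 1, y, s),
        (x, y, s + 1),
    ]
    # Drop neighbours that fall outside the grid of positions and sizes.
    return [
        (cx, cy, cs) for cx, cy, cs in candidates
        if 0 <= cx < width and 0 <= cy < height and 0 <= cs < num_sizes
    ]
```

A different neighbourhood shape, as the text allows, would only change the `candidates` list.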

Operation examples of the components of the object recognition device described above are given below.

FIG. 18 is a flowchart showing an operation example of the identification device 1.

Step 1) Input the target image
Step 2) Register all categories
Step 3) Repeat [1] for the number of points of interest V × the number of assumed sizes W
Step 4) Cut out the region of interest
Step 5) Extract features
Step 6) End of repeat [1]
Step 7) Repeat [2] for the input number I
Step 8) Calculate the matching degree
Step 9) Detect peaks
Step 10) Update the candidate categories
Step 11) End of repeat [2]
Step 12) Output the identification result
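The control flow of the steps above can be sketched in Python as a small driver. The five stage functions (`crop`, `extract`, `match`, `detect`, `update`) are hypothetical callables supplied by the caller; they stand in for the image cutout, feature extraction, matching degree calculation, peak detection, and candidate category update means, and their signatures are assumptions made for the example rather than interfaces defined in the source.

```python
def identify(image, positions, sizes, categories, iterations,
             crop, extract, match, detect, update):
    """Run the control flow of the flowchart with caller-supplied stages.

    crop, extract, match, detect and update are hypothetical callables that
    stand in for the image cutout, feature extraction, matching degree
    calculation, peak detection and candidate update means.
    """
    # Step 2: every category starts as a candidate at every position and size.
    candidates = {(p, s): set(categories) for p in positions for s in sizes}

    # Steps 3-6: cut out each region of interest and extract its features once.
    features = {(p, s): extract(crop(image, p, s))
                for p in positions for s in sizes}

    peaks = []
    for i in range(1, iterations + 1):                # steps 7-11
        degrees = {key: match(features[key], candidates[key], i)
                   for key in features}
        peaks = detect(degrees)
        candidates = update(candidates, peaks, i)
    return peaks                                      # step 12
```

Passing the iteration index `i` to `match` and `update` reflects that the dictionary and compression depend on the number of repetitions, and that the update stage can stop on the final pass.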
FIG. 19 is a flowchart showing an operation example of the feature extraction means 14.

Step 1) Repeat [1] for the number of image pixels P
Step 2) Calculate the differential intensity
Step 3) End of repeat [1]
Step 4) Repeat [2] for the number of divisions n
Step 5) Calculate the differential direction histogram
Step 6) End of repeat [2]
Step 7) Normalize the differential histogram
FIG. 20 is a flowchart showing an operation example of the matching degree calculation means 15.

Step 1) Compression
Step 2) Repeat [1] for the number of categories C
Step 3) Calculate the projection distance
Step 4) End of repeat [1]
Step 5) Repeat [1] for the number of categories C
Step 6) Calculate the rank
Step 7) End of repeat [1]
Step 8) Rank cut
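The steps above can be sketched in Python as follows. Several details are not given in the source and are assumed here: compression is replaced by a placeholder that keeps the first `keep_dims` components, each category's subspace is given as a list of orthonormal basis vectors, the projection distance is the Euclidean distance from the vector to that subspace, and the rank cut raises the distance of low-ranked categories to infinity. All function and parameter names are hypothetical.

```python
import math


def projection_distance(vector, basis):
    """Distance from `vector` to the subspace spanned by orthonormal `basis`."""
    energy = sum(v * v for v in vector)
    projected = sum(
        sum(v * b for v, b in zip(vector, axis)) ** 2 for axis in basis
    )
    return math.sqrt(max(energy - projected, 0.0))


def matching_degrees(feature, dictionary, keep_dims, keep_rank, penalty=math.inf):
    """Return {category: distance} after compression and rank cut.

    dictionary: {category: list of orthonormal basis vectors of length keep_dims}.
    keep_dims:  placeholder compression that keeps the first keep_dims components.
    keep_rank:  number of best-ranked categories whose distance is left unchanged.
    penalty:    value assigned to the remaining categories (assumed to be infinity).
    """
    compressed = feature[:keep_dims]                  # step 1: compression
    distances = {
        category: projection_distance(compressed, basis)   # steps 2-4
        for category, basis in dictionary.items()
    }
    ranked = sorted(distances, key=distances.get)     # steps 5-7: ranking
    for category in ranked[keep_rank:]:               # step 8: rank cut
        distances[category] = penalty
    return distances
```

Raising the distance of low-ranked categories keeps them from passing the connectivity threshold in the subsequent peak detection.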
FIG. 21 is a flowchart showing an operation example of the connected-region peak detection means 16.

Step 1) Repeat [1] for the number of candidates J
Step 2) Right lateral connectivity determination
Step 3) Lower connectivity determination
Step 4) Front connectivity determination
Step 5) End of repeat [1]
Step 6) Region extraction (labeling)
Step 7) Repeat [2] for the number of regions
Step 8) Peak point output
Step 9) End of repeat [2]
FIG. 22 is a flowchart showing an operation example of the candidate category update means 17.

Step 1) Check whether i equals I; if yes, end
Step 2) Peripheral region calculation
Step 3) Peripheral region registration
Following the flowcharts above, an identification result such as the following can be obtained for the example target image of FIG. 8.

[5, 5, 10, 電, 1000]
[10, 5, 10, 信, 900]
[15, 5, 10, 電, 820]
[20, 5, 10, 話, 1200]

It goes without saying that the object identification method described in the embodiments above can be implemented by writing the processing steps shown in FIGS. 1 to 22 as a computer program and having a computer execute it. A program for realizing those functions on a computer, or a program for causing a computer to execute the processing steps, can be recorded on a computer-readable recording medium, such as a flexible disk, MO, ROM, memory card, CD, DVD, removable disk, or HDD, and stored or distributed. The program can also be provided over a network, such as the Internet or electronic mail.

The present invention can then be carried out by installing the program on a computer from such a recording medium, or by downloading it from a network and installing it on the computer. Installation is performed per computer; when there are several target computers, for example because there are several devices or systems, the program is naturally installed on each computer for the processing portion it requires. In that case, the program may be recorded on a recording medium for each computer or downloaded via a network.

FIG. 1: Schematic configuration diagram of an in-landscape character recognition and translation system using the present invention.
FIG. 2: Schematic configuration diagram showing an embodiment of the invention according to claim 1.
FIG. 3: Schematic configuration diagram of the matching degree calculation means 15 in an embodiment of the invention according to claims 2 and 3.
FIG. 4: Schematic configuration diagram of the feature extraction means in an embodiment of the invention according to claim 4.
FIG. 5: Explanatory diagram of an operation example of the differential intensity direction calculation means.
FIG. 6: Explanatory diagram of an operation example of the differential direction histogram forming means.
FIG. 7: Schematic configuration diagram of the connected-region peak detection means in an embodiment of the invention according to claim 5.
FIG. 8: An example of a target image.
FIG. 9: Diagram showing the points of interest in the target image.
FIG. 10: An example of the "predetermined size".
FIG. 11: An example of a calculation result of the matching degree calculation means.
FIG. 12: An example of the connection determination at point P.
FIG. 13: An example of the connection determination at point P.
FIG. 14: An example of the region extraction result.
FIG. 15: An example of the peak point output result.
FIG. 16: Schematic configuration diagram of the candidate category update means in the object recognition device according to claim 7.
FIG. 17: An example of the result of the peripheral region calculation means.
FIG. 18: Flowchart showing an operation example of the identification device.
FIG. 19: Flowchart showing an operation example of the feature extraction means.
FIG. 20: Flowchart showing an operation example of the matching degree calculation means.
FIG. 21: Flowchart showing an operation example of the connected-region peak detection means.
FIG. 22: Flowchart showing an operation example of the candidate category update means.

Explanation of symbols

1 ... identification device, 2 ... user, 3 ... dictionary, 4 ... translation device
10 ... input means, 11 ... all-category registration means, 12 ... repetition control means, 13 ... image cutout means, 14 ... feature extraction means, 15 ... matching degree calculation means, 16 ... connected-region peak detection means, 17 ... candidate category update means
151 ... compression means, 152 ... projection distance calculation means, 153 ... rank calculation means, 154 ... rank cut means
141 ... input means, 142 ... differential intensity direction calculation means, 143 ... differential direction histogram forming means, 144 ... differential histogram normalization means
161 ... repetition control means A, 162 ... repetition control means B, 163 ... right lateral connectivity determination means, 164 ... lower connectivity determination means, 165 ... region extraction means, 166 ... peak point output means

Claims

An object identification device for identifying a plurality of objects in an image, comprising:
all-category registration means for receiving an identification target image and registering all categories as candidate categories for each predetermined position and size in the image;
control means for repeatedly executing processing by the following means;
image cutout means for cutting out an image of interest while changing the position of interest and its size in the target image;
feature extraction means for extracting a target feature vector from the cut-out image;
matching degree calculation means for comparing the target feature vector with a recognition dictionary of each object, created in advance according to the number of repetitions, and calculating a matching degree that represents how well they match;
connected-region peak detection means for determining the spatial connectivity of the matching degree of each object and detecting the maximum peak of the matching degree within each connected region; and
candidate category update means for registering the detected peak as a candidate.

The object identification device according to claim 1, wherein the matching degree calculation means includes:
compression means for calculating a target compressed feature vector by compressing the feature vector at an input compression rate corresponding to the number of repetitions; and
projection distance calculation means for calculating the distance when the target compressed feature vector is projected onto a subspace, using a subspace input in advance for the compression rate corresponding to each candidate category.

The object identification device according to claim 1, wherein the matching degree calculation means includes:
compression means for calculating a target compressed feature vector by compressing the feature vector at an input compression rate corresponding to the number of repetitions;
projection distance calculation means for calculating the projection distance when the target compressed feature vector is projected onto a subspace, using a subspace input in advance for the compression rate corresponding to each candidate category;
rank calculation means for calculating a rank for each candidate category according to its projection distance; and
rank cut means for increasing the projection distance of candidate categories at or below a specified rank.

The object identification device according to any one of claims 1 to 3, wherein the feature extraction means includes:
differential intensity direction calculation means for calculating the horizontal and vertical differential components of each image and calculating the direction and intensity of the differential;
differential direction histogram forming means for quantizing, for each image, the differential direction of each pixel in a predetermined region into a predetermined number of steps and creating a differential direction histogram in which the differential intensity is cumulatively added for each step; and
normalization means for treating the created differential direction histogram as a vector and normalizing its magnitude to a predetermined value.

The object identification device according to claim 1, wherein the connected-region peak detection means includes:
right lateral connectivity determination means for determining connection when the matching degrees at the position of interest and at the position to its right are equal to or less than a predetermined threshold;
lower connectivity determination means for determining connection when the matching degrees at the position of interest and at the position below it are equal to or less than a predetermined threshold;
region extraction means for extracting a group of connected positions of interest as one region;
peak point output means for finding and outputting the position and size at which the matching degree is maximum within each region; and
control means for controlling a step of repeatedly executing the processing by the right lateral connectivity determination means and the lower connectivity determination means for each category registered as a candidate at the position of interest and its size, and a step of repeatedly executing the processing of that step at all positions of interest and their sizes.

The object identification device according to claim 1, wherein the connected-region peak detection means includes:
right lateral connectivity determination means for determining connection when the matching degrees at the position of interest and at the position to its right are equal to or less than a predetermined threshold;
lower connectivity determination means for determining connection when the matching degrees at the position of interest and at the position below it are equal to or less than a predetermined threshold;
front connectivity determination means for determining connection when the matching degrees at the position of interest and at the same position at the size one step above are equal to or less than a predetermined threshold;
region extraction means for extracting a group of connected positions of interest as one region;
candidate point output means for finding and outputting the position and size at which the matching degree is maximum within each region; and
control means for controlling a step of repeatedly executing the processing by the right lateral connectivity determination means, the lower connectivity determination means, the front connectivity determination means, and the back connectivity determination means for each category registered as a candidate at the position of interest and its size, and a step of repeatedly executing the processing of that step at all positions of interest and their sizes.

The object identification device according to claim 1, wherein the candidate category update means includes:
peripheral region calculation means for calculating the peripheral region of the obtained peak point; and
peripheral region registration means for registering the peak point and its peripheral region as candidates.

An object identification method for identifying a plurality of objects in an image, the method using all-category registration means, image cutout means, feature extraction means, matching degree calculation means, connected-region peak detection means, and control means for controlling these means, and comprising:
a step of registering all categories as candidate categories for each predetermined position and size in the input identification target image;
a step in which the control means repeatedly executes processing by the following means;
a step of cutting out an image of interest while changing the position of interest and its size in the target image;
a step in which the feature extraction means extracts a target feature vector from the cut-out image;
a step of comparing the target feature vector with a recognition dictionary of each object, created in advance according to the number of repetitions, and calculating a matching degree that represents how well they match;
a step in which the connected-region peak detection means determines the spatial connectivity of the matching degree of each object and detects the maximum peak of the matching degree within each connected region; and
a step in which the candidate category update means registers the detected peak as a candidate.

The object identification method according to claim 8, wherein the step of calculating the matching degree by the matching degree calculation means includes:
a step of calculating a target compressed feature vector by compressing the feature vector at an input compression rate corresponding to the number of repetitions; and
a step of calculating the distance when the target compressed feature vector is projected onto a subspace, using a subspace input in advance for the compression rate corresponding to each candidate category.

The object identification method according to claim 8, wherein the step of calculating the matching degree by the matching degree calculation means includes:
a step of calculating a target compressed feature vector by compressing the feature vector at an input compression rate corresponding to the number of repetitions;
a step of calculating the projection distance when the target compressed feature vector is projected onto a subspace, using a subspace input in advance for the compression rate corresponding to each candidate category;
a step of calculating a rank for each candidate category according to its projection distance; and
a step of increasing the projection distance of candidate categories at or below a specified rank.

The object identification method according to claim 8, wherein the step of extracting the target feature vector by the feature extraction means includes:
a step of calculating the horizontal and vertical differential components of each image and calculating the direction and intensity of the differential;
a step of quantizing, for each image, the differential direction of each pixel in a predetermined region into a predetermined number of steps and creating a differential direction histogram in which the differential intensity is cumulatively added for each step; and
a step of treating the created differential direction histogram as a vector and normalizing its magnitude to a predetermined value.

The object identification method according to claim 8, wherein the step of detecting the maximum peak of the matching degree by the connected-region peak detection means includes:
a step of determining connection when the matching degrees at the position of interest and at the position to its right are equal to or less than a predetermined threshold;
a step of determining connection when the matching degrees at the position of interest and at the position below it are equal to or less than a predetermined threshold;
a step of extracting a group of connected positions of interest as one region;
a step of finding and outputting the position and size at which the matching degree is maximum within each region;
a step of repeatedly executing the processing by the right lateral connectivity determination means and the lower connectivity determination means for each category registered as a candidate at the position of interest and its size; and
a step of repeatedly executing the processing of that step at all positions of interest and their sizes.

The object identification method according to claim 8, wherein the step of detecting the maximum peak of the matching degree by the connected-region peak detection means includes:
a step of determining connection when the matching degrees at the position of interest and at the position to its right are equal to or less than a predetermined threshold;
a step of determining connection when the matching degrees at the position of interest and at the position below it are equal to or less than a predetermined threshold;
a step of determining connection when the matching degrees at the position of interest and at the same position at the size one step above are equal to or less than a predetermined threshold;
a step of extracting a group of connected positions of interest as one region;
a step of finding and outputting the position and size at which the matching degree is maximum within each region;
a step of repeatedly executing the processing by the right lateral connectivity determination means, the lower connectivity determination means, the front connectivity determination means, and the back connectivity determination means for each category registered as a candidate at the position of interest and its size; and
a step of repeatedly executing the processing of that step at all positions of interest and their sizes.

The object identification method according to claim 8, wherein the step performed by the candidate category update means includes:
a step of calculating the peripheral region of the obtained peak point; and
a step of registering the peak point and its peripheral region as candidates.

A program for causing a computer to execute the object identification method according to any one of claims 8 to 14.

A computer-readable recording medium on which is recorded a program for causing a computer to execute the object identification method according to any one of claims 8 to 14.