JP2004151771A

JP2004151771A - Method of recognizing object in image

Info

Publication number: JP2004151771A
Application number: JP2002313142A
Authority: JP
Inventors: Shinya Sato; 信也佐藤
Original assignee: EARTH BEAT Inc
Current assignee: EARTH BEAT Inc
Priority date: 2002-10-28
Filing date: 2002-10-28
Publication date: 2004-05-27

Abstract

PROBLEM TO BE SOLVED: To provide a method of recognizing an object in an image that enables proper processing of a recognized object image based on the parameters of that image.

SOLUTION: An image to be recognized is input, a recognition template image is read from a database in which recognition template images are stored, and the recognition template image is compared with the image to be recognized to carry out object recognition. The recognized image obtained by the object recognition is then used as a recognition template image, and this template image and the parameters at the time of recognition are registered in the database in association with each other.

COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
本発明は、画像中に表示された物体を認識するための技術に関する。
【０００２】
【従来の技術】
デジタル処理された写真画像データ中から物体を認識する従来技術としては、情報処理装置内にあらかじめ多数のテンプレート画像を用意しておき、パターンマッチング等の比較技術により、撮影された写真中の画像が特定の物体であると認識するものが一般的だった。
【０００３】
この種の従来技術としてたとえば、画像処理と物体認識の精度を両立させるために、複数台のカメラを用いて物体を撮影し、これらのカメラの位置状態に基づいて３次元処理を行い当該物体を把握するものがある（特許文献１参照）。
【０００４】
【特許文献１】
特開２００２−８３２９７号公報
【０００５】
【発明が解決しようとする課題】
本発明は、このような従来技術を改良したものであり、認識した物体画像のパラメータに基づいて該物体画像の適切な処理を可能とする物体認識技術の提供を課題とする。
【０００６】
【課題を解決するための手段】
本発明は、第１に、テンプレート画像を被認識画像から自動的に作成するものである。すなわち、被認識画像を入力し、認識テンプレート画像を記憶したデータベースから前記認識テンプレート画像を読み出し、この認識テンプレート画像を前記被認識画像と比較することにより物体認識を行い、前記物体認識で得られた認識画像を認識テンプレート画像とし、前記テンプレート画像と認識時のパラメータとを関連付けて前記データベースに登録するものである。このような処理を実行することにより、たとえば被認識画像自体から作成したテンプレート画像で比較を行え、精度良く物体認識を行うことが可能となる。
【０００７】
また、本発明は、第２に、物体認識によって得られた認識画像を認識時のパラメータに基づいて移動或は変形するものである。すなわち被認識画像を入力し、認識テンプレート画像を記憶したデータベースから前記認識テンプレート画像を読み出し、この認識テンプレート画像を前記被認識画像と比較することにより物体認識を行い、この物体認識で得た認識画像をパラメータに基づいて移動或いは変形するものである。このような処理を実行することにより、たとえば画像の隅で認識された物体の画像（認識画像）を中央に移動させた画像や、認識画像を拡大した画像を出力することが可能となる。
【０００８】
【発明の実施の形態】
《実施形態１》
以下、図面に基づいて、本発明の実施の形態を説明する。
【０００９】
図１は、本発明のシステム構成を示している。本システムは同図に示すように、テンプレート画像作成部と被認識画像入力部とからなる入力インターフェース部と、物体画像認識エンジン部とで構成されている。
【００１０】
本システムは、汎用のコンピュータで物体認識プログラムを実行することによってよって実現されている。具体的には、該コンピュータのメインメモリやＣＰＵ（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）等が物体認識プログラムに従って後述の如く処理を行うことにより、上記入力インターフェース部と物体画像認識エンジン部の機能を実現している。このシステム（コンピュータ）の入力装置としては、マウスまたはデジタイザ等の座標入力手段、キーボード、スキャナ、カメラ、ＶＴＲ等を用いることができる。また、記憶手段としては半導体メモリやハードディスク装置等を備えており、この記憶手段内に、本実施例の機能を実現するための各種プログラムやテンプレートデータベース、被認識画像が格納されている。
【００１１】
なお、本システムは、汎用のコンピュータでソフト的に実現するものに限らず、上記入力インターフェース部と物体画像認識エンジン部の機能を備えた電子回路（ハードウェア）で構成しても良い。
【００１２】
（テンプレート画像作成部）
テンプレート画像作成部は、認識用のテンプレートを作成し、物体画像認識エンジン部に入力するようになっており、機能部としては、テンプレートデータベースと、画像入力部と、画像変換部と、量子化器と、テンプレート画像展開部とで構成されている。
【００１３】
ここで、テンプレートデータベースは、認識用のテンプレート画像を記憶している。
また、画像入力部は、このデータベースから認識用のテンプレート画像を読み出して入力を受け付ける。
【００１４】
画像変換部は、上記で入力された画像データ（ＲＧＢのカラーデータ）を、効率の良く認識処理を行うためにグレー・スケール（２５６レベル、８ｂｉｔ相当）または、輝度信号Ｙに変換する。
【００１５】
量子化器は、図２の量子化テーブルに従って前記画像の量子化を行う。
更にテンプレート画像展開部は、同一画像認識や類似画像認識など、認識目的に応じてテンプレート画像の回転や変形などを行う。
【００１６】
画像登録部は、物体画像認識エンジン部で認識した認識画像をテンプレートデータベースに登録する。
具体的には、以下のような処理を行う。
▲１▼同一画像認識の場合、データベースから読み込んだテンプレート画像をそのまま利用する。
【００１７】
▲２▼類似画像認識の場合、このテンプレート画像画像データを正規化して利用する。正規化の方法は、ｘ座標を３２点の画素に正規化し、ｙ座標をこれの縮尺に合わせ正規化する。
【００１８】
▲３▼テンプレート画像が線画の場合、上記▲１▼と▲２▼の処理後、グレー・スケール（２５６レベル、８ｂｉｔ相当）または、輝度信号Ｙに変換し、これをさらに必要に応じ量子化し効率のよい認識処理を行う。
【００１９】
▲４▼認識する物体の向きの違いに対応するために、認識テンプレート画像から、その画像変形による複数のテンプレート画像を作成する。この方法では、まず類似画像認識のための変形された複数のテンプレート画像を作成する。この変形された認識テンプレート画像も併用することにより、たとえば、物体正面画像と共に、横方向や上下方向から見た物体画像も認識可能となる
【００２０】
変形テンプレート画像作成は、基本となる認識テンプレート画像に対し上下方向と左右方向に対し行われる。これにより、同一の物体画像に対し上下左右から見た画像に対しても的確に認識処理を行うことが可能となる
【００２１】
認識用テンプレート画像を作成後に、類似物体画像（変形されたオブジェクトとしてモデル化）の認識も可能とするため以下の手順で認識用テンプレート画像を変形する。
【００２２】
ここでは、以下の一般的に用いられている式から、類似画像認識のための変形されたテンプレート画像を作成する。
ｘ＝（（Ｘ − ｘ０）ＣＯＳθ − （Ｙ − ｙ０）ＳＩＮθ ）／ａ＋ｘ０
ｙ＝（（Ｘ − ｘ０）ＳＩＮθ − （Ｙ − ｙ０）ＣＯＳθ ）／ｂ＋ｙ０
【００２３】
上式では、任意の画像上の点（ｘ０，ｙ０）を中心にして横方向にａ倍、縦方向にｂ倍の拡大・縮小処理を行う。また、中心点（ｘ０，ｙ０）に対しθ回転も行うことができるが、縦横方向だけの処理の場合、θ＝０で演算を行う。ここで、座標（ｘ、ｙ）が拡大・縮小された画像位置であるが、これに対応するもとの画像位置（Ｘ、Ｙ）にある画素データを、そのまま拡大・縮小画素として利用する事ができる利点がある。
【００２４】
この変形された認識テンプレート画像も併用することにより、たとえば、物体正面画像と共に、横方向や上下方向から見た物体画像も認識可能となる；
【００２５】
図３は、変形画像テンプレート画像の作成方法を示す具体例である。
【００２６】
同図では、基になる画像テンプレート画像を縦に４分割、横に４分割してから、各分割領域の縮尺を変えることでそれぞれの変形画像テンプレート画像を作成している。このとき、プログラム上では基になる画像テンプレート画像の縮尺および、各変形方向に対する縮尺はテーブルとして可変可能とする。
【００２７】
このような、変形画像テンプレート画像を作成しておくことにより、たとえば、同一人物の顔画像が含まれている全ての画像ファイルを検索する場合、その人物の１画像を入力することにより、その人物が上下左右から写っている画像も検索可能となる。これにより、検索するために複数のテンプレート画像を用意することなく、効率的な画像検索が可能となる。
【００２８】
実際のプログラム上では基になる画像テンプレート画像の縮尺および、各変形方向に対する縮尺はテーブルとして可変可能である。また、ブロックの分割数も可変となる。
【００２９】
これは、コンピュータのＣＰＵ性能向上速度が早く、また複数のＣＰＵを搭載したサーバ等も考慮されていることによる。実際の変形されたテンプレート画像例を図４に示す。同図では、基になる画像テンプレート画像を４分割し、それぞれの分割領域において縮尺率を変更している。すなわち、右図の場合、縮尺率は左側の領域から順番に、１．７５→１．２５→０．７５→０．２５となっている。このように変形させることにより、人物の正面からの顔写真画像しか与えられていなくても、当該人物が左方向または右方向に向いた顔の状態をテンプレート画像として用意することができる。
【００３０】
（物体画像認識エンジン部）
物体認識エンジンは、認識画像中の物体を種々のアルゴリズムに基づいて認識する機能部である。物体認識のアルゴリズムについては種々の公知技術があるのでここでは説明を省略する。
また、出力部は、物体認識エンジンで認識した認識画像をパラメータに基づいて移動或いは変形して出力する。
【００３１】
（被認識画像入力部）
被認識画像入力部は、認識対象となる画像データを入力するインターフェース部であり、ビットマップ画像入力部と、画像変換部とで構成されている。これらの各構成部は図１で説明した認識用画像テンプレート画像作成部における画像入力部と画像変換部と同様であるので説明を省略する。なお、入力画像がアナログ信号の場合には、画像処理し易いようにビットマップ画像に変換する。
【００３２】
（物体認識方法）
次に、本システムにおける物体認識方法について説明する。
本システムの物体認識方法は、被認識画像からテンプレート画像を作成する第１の工程と、作成したテンプレート画像を被認識画像と比較し物体認識を行う第２の工程と、この物体認識で得た認識画像を出力する第３の工程を有している。
【００３３】
以下、それぞれの工程について説明する。
（１）．被認識画像からテンプレート画像を作成する工程
図５、図６に処理の流れを示す。
処理が開始されると、先ずテンプレート画像作成部の画像入力部がテンプレートデータベースからデフォルトのテンプレート画像を読み込み、物体画像認識エンジン部に入力する（Ｓ１）。
【００３４】
また、被認識画像入力部の画像入力部が被認識画像の入力を受け付ける（Ｓ２）。被認識画像入力部は、この被認識画像を構成している画像、即ち動画であればフレーム毎に画像変換し、物体画像認識エンジン部に入力する。同様に、被認識画像が連続した静止画であれば、静止画一枚毎に入力する。なお、この動画と連続した静止画とは、以下の処理も同一であるので、動画についてのみ説明する。物体画像認識エンジン部は、このフレームを物体認識エンジンに読み込む（Ｓ３）。
物体認識エンジンは、前記デフォルトの認識テンプレート画像を前記被認識画像と比較することにより物体認識を行う。
【００３５】
テンプレート画像と被認識画像の特徴が一致した場合、該被認識画像から一致部分を認識画像として抽出し、該認識時のパラメータと共にテンプレート作成部の登録手段に出力する。登録手段は、該認識画像をテンプレート画像として該パラメータと共にテンプレートデータベースに登録する（Ｓ４）。このとき登録されるパラメータとしては、図７に示すように、認識画像のフレーム位置（動画を構成する何番目のフレームかを示す）、認識物体位置（認識した物体がフレーム中のどこに位置していたかをＸ−Ｙ座標で示す）、認識物体サイズ（認識物体の大きさ、テンプレート画像に対する倍率やフレームに占める割合などでも良い）、認識画像の傾き（テンプレート画像に対する回転角度）、サムネイルデータ（認識画像を含んだフレームの縮小画像）などである。
【００３６】
そしてこの認識を所定のフレーム毎に繰り返す（Ｓ５）。
全てのフレームについて認識処理を終了した場合には、次の被認識画像についてステップＳ２〜Ｓ５の処理を繰り返し、全ての被対象画像を処理した場合には、認識処理を終了する（Ｓ６）。
【００３７】
（２）．作成したテンプレート画像で物体認識を行う工程
図８に処理の流れを示す。
上記テンプレート画像の作成が終了すると、このテンプレート画像を用いた認識処理が開始される。先ずテンプレート画像作成部の画像入力部がテンプレートデータベースから先のステップで作成したテンプレート画像を読み込み、物体画像認識エンジン部に入力する（Ｓ１１）。このときテンプレート画像作成部の画像変換部は、テンプレート画像のパラメータ（サイズ）に基づいて正規化を行う。また、画像展開部は、テンプレート画像のパラメータ（傾き）に基づいて傾きを無くすように回転処理を行う。
【００３８】
また、被認識画像入力部の画像入力部は、被認識画像の入力を受け付け、（Ｓ１２）。この被認識画像を構成しているフレーム毎に画像変換部で適宜変換し、物体画像認識エンジン部に入力する。物体画像認識エンジン部は、このフレームを物体認識エンジンに読み込む（Ｓ１３）。
物体認識エンジンは、前記テンプレート画像を前記被認識画像のフレームと比較することにより物体認識を行い、テンプレート画像と一致した部分の画像（認識画像）或はこの認識画像を含むフレームをこの認識時のパラメータと共に出力部へ出力する（Ｓ１４）。このとき登録されるパラメータは、図７と同様である。
そしてこの認識を所定のフレーム毎に繰り返す（Ｓ１５）。このとき全てのフレーム或は所定数毎のフレームについて認識を行うものでも良いし、テンプレート画像のパラメータ（フレーム位置）に基づいて上述の処理（１）でテンプレート画像が作成されたフレーム、即ち前記デフォルトのテンプレート画像と一致したフレームについて認識を行うものでも良い。
【００３９】
全てのフレームについて認識処理を終了した場合には、次の被認識画像についてステップＳ１２〜Ｓ１５の処理を繰り返し、全ての被対象画像を処理した場合には、認識処理を終了する（Ｓ１６）。
この物体認識を行う工程は、第１のテンプレート画像と被認識画像とを比較して物体認識の候補を索出し、これで索出された画像と第２のテンプレート画像とを比較して絞込み、必要に応じて第３テンプレート画像、第４のテンプレート画像というように順次繰り返して絞り込み検索を行うものでも良い。
また、該物体認識を行う工程は、複数のテンプレート画像と被認識画像とを同時に比較して物体認識を行うものでも良い。
【００４０】
（３）．物体認識で得た認識画像を出力する工程
出力手段は、前記物体認識で得たフレーム（認識画像）を前記物体認識で得たパラメータに基づいて移動或いは変形する。
【００４１】
例えば、
機能１．認識画像位置に基づいて認識物体のフレーム中の位置を判別し、認識物体を中央に表示するように移動する。
機能２．認識画像のサイズに基づき、所定のサイズとなるように拡大・縮小を行う。
機能３．認識画像の角度に基づき、認識した物体の傾きを補正する。即ち認識画像位置を中心に回転移動させる。
図９はこれらの機能により移動或いは変形した例を示している。
図９（Ａ）が元のフレームであり、四角で囲んだ部分が認識画像である。
図９（Ｂ）は、機能１と２を施した例である。このように画面の端に小さく写った認識物体を画面中央に大きく見易く表示することが可能となる。
また、図９（Ｃ）は機能１，２，３を施した例である。この機能により、更に傾いて録画または写っている画像を、傾きを無くした見やすい画像表示にする事が可能となる。
【００４２】
以上述べたように、本実施例によれば、認識処理の度にテンプレート画像を入力しなくてもデフォルトの画像に基づいて認識用のテンプレートを自動的に作成することができる。また、この被認識画像から作成したテンプレートで認識を行うことで、より詳細な認識を行うことが可能である。
【００４３】
例えば、デフォルトのテンプレート（人物）を映画やドラマの画像（被認識画像）と比較し、この被認識画像から人物のテンプレート画像を作成し、この作成したテンプレート画像を用いて同一人であるか否かの認識を行う。これにより特定の人の登場シーンを抽出することが可能となる。また、同一でない人を認識することで該映画やドラマのキャストを抽出できる。
【００４４】
また、画像の出力手段は、表示に限らず、印刷や送信等、任意に設定できる
更に、認識画像をパラメータに基づいて見易すく表示することができる。
【００４５】
《実施形態２》
図１０は実施形態２としての物体画像認識サーバシステムの概略構成図である。
同図に示すように、画像認識サーバ（画像認識装置）は、インターネット等のネットワークを介してユーザの端末と接続し、ＨＴＭＬ等によるウエブページを提供する所謂ウエブサーバである。また、該画像認識サーバは、ユーザ端末からの画像について物体認識処理を行う画像認識システムを備えている。該画像認識システムについては、前述の実施形態１と略同一であるので、重複する部分の説明を省略する。
【００４６】
なお、該画像認識システムのテンプレートデータベースには、動物・女優・歌手などの画像と、パラメータ（画像に係るコメント）とを対応付けて記憶している。
【００４７】
図１１に示すように、ユーザ端末から被認識画像としてユーザの顔の画像が画像認識サーバに入力されると、例えば動物のテンプレート画像を読み出し、物体認識エンジンで比較し、被認識画像と類似しているテンプレートのパラメータ（コメント）を出力部に出力する。
【００４８】
出力部は、このコメントをユーザ端末に送信し表示させる。
これにより、性格チェックや、俳優・歌手などとのそっくり度チェックなどのサービスが提供可能となる。
【００４９】
尚、本発明の物体認識方法、物体認識システム、物体認識プログラム、物体認識装置は、上述の図示例にのみ限定されるものではなく、本発明の要旨を逸脱しない範囲内において種々変更を加え得ることは勿論である。
【００５０】
例えば、以下に付記した構成であっても前述の実施形態と同様の効果が得られる。
（付記１）
被認識画像を入力し、
認識テンプレート画像を記憶したデータベースから前記認識テンプレート画像を読み出し、
前記認識テンプレート画像を前記被認識画像と比較することにより物体認識を行い、
前記物体認識で得られた認識画像を認識テンプレート画像とし、
前記テンプレート画像と認識時のパラメータとを関連付けて前記データベースに登録する画像中の物体認識方法。
【００５１】
（付記２）
前記被認識画像が、複数の連続画像から構成される場合、該連続画像について所定の間隔で前記物体認識を行う付記１記載の画像中の物体認識方法。
【００５２】
（付記３）
被認識画像を入力し、
認識テンプレート画像を記憶したデータベースから前記認識テンプレート画像を読み出し、
前記認識テンプレート画像を前記被認識画像と比較することにより物体認識を行い、
前記物体認識で得た認識画像を前記物体認識で得たパラメータに基づいて移動或いは変形する画像中の物体認識方法。
【００５３】
（付記４）
前記パラメータが、認識画像のフレーム位置、認識物体位置、認識物体サイズ、認識物体の傾きのうち、少なくとも一つを有する付記１から３の何れかに記載の画像中の物体認識方法。
【００５４】
（付記５）
前記認識画像を縦または横方向に分割し、前記分割領域毎に異なる縮尺率で認識画像を変換する付記１から４の何れかに記載の画像中の物体認識方法。
【００５５】
（付記６）
前記認識画像を
ｘ＝（（Ｘ − ｘ０）ＣＯＳθ − （Ｙ − ｙ０）ＳＩＮθ ）／ａ＋ｘ０
ｙ＝（（Ｘ − ｘ０）ＳＩＮθ − （Ｙ − ｙ０）ＣＯＳθ ）／ｂ＋ｙ０
（ただし、画像上の任意の点ｘ０，ｙ０を中心として横方向にａ倍、縦方向にｂ倍したものとする）
上記の式に基づき、拡大処理または縮小処理を行う付記１から５に記載の画像中の物体認識方法。
【００５６】
（付記７）第１のテンプレート画像と被認識画像とを比較して物体認識の候補を索出し、
さらに第２のテンプレート画像、第３テンプレート画像というように複数の画像と前記被認識画像とを順次比較して絞り込み検索を行う付記１から６に記載の画像中の物体認識方法。
【００５７】
（付記８）少なくとも２以上のテンプレート画像と被認識画像とを同時に比較して物体認識を行う付記１から７に記載の画像中の物体認識方法。
【００５８】
（付記９）
被認識画像中の物体を認識するコンピュータ実行可能なプログラムであって、認識テンプレート画像を記憶したデータベースから前記認識テンプレート画像を読み出し、
前記認識テンプレート画像を前記被認識画像と比較することにより物体認識を行い、
前記物体認識で得られた認識画像を認識テンプレート画像とし、
前記テンプレート画像と認識時のパラメータとを関連付けて前記データベースに登録するコンピュータ実行可能なプログラム。
【００５９】
（付記１０）
前記被認識画像が、複数の連続画像から構成される場合、該連続画像について所定の間隔で前記物体認識を行う付記９記載のコンピュータ実行可能なプログラム。
【００６０】
（付記１１）
被認識画像中の物体を認識するコンピュータ実行可能なプログラムであって、
被認識画像を入力し、
認識テンプレート画像を記憶したデータベースから前記認識テンプレート画像を読み出し、
前記認識テンプレート画像を前記被認識画像と比較することにより物体認識を行い、
前記物体認識で得た認識画像を前記物体認識で得たパラメータに基づいて移動或いは変形するコンピュータ実行可能なプログラム。
【００６１】
（付記１２）
前記パラメータが、認識画像のフレーム位置、認識物体位置、認識物体サイズ、認識物体の傾きのうち、少なくとも一つを有する付記９から１１の何れかに記載のコンピュータ実行可能なプログラム。
【００６２】
（付記１３）
認識テンプレート画像を被認識画像と比較することにより被認識画像中の物体を認識する物体認識装置であって、
被認識画像の入力を受け付ける手段と、
認識テンプレート画像を記憶したデータベースと、
前記データベースから読み出した認識テンプレート画像を前記被認識画像と比較する手段と、
前記物体認識で得られた認識画像を認識テンプレート画像とし、前記テンプレート画像と認識時のパラメータとを関連付けて前記データベースに登録する手段とを備える物体認識装置。
【００６３】
（付記１４）
前記被認識画像が、複数の連続画像から構成される場合、該連続画像について所定の間隔で前記物体認識を行う付記１３記載の物体認識装置。
【００６４】
（付記１５）
認識テンプレート画像を被認識画像と比較することにより被認識画像中の物体を認識する物体認識装置であって、
被認識画像の入力を受け付ける手段と、
認識テンプレート画像を記憶したデータベースと、
前記データベースから読み出した認識テンプレート画像を前記被認識画像と比較する手段と、
前記物体認識で得た認識画像を前記物体認識で得たパラメータに基づいて移動或いは変形する手段とを備えた物体認識装置。
【００６５】
（付記１６）
前記パラメータが、認識画像のフレーム位置、認識物体位置、認識物体サイズ、認識物体の傾きのうち、少なくとも一つを有する付記１３から１５の何れかに記載の物体認識装置。
本発明において、上記の各構成は、可能な限り組み合わせることができる。
【発明の効果】
本発明によれば、認識した物体画像のパラメータに基づいて該物体画像の適切な処理を可能とする物体認識技術を提供することができる。
【図面の簡単な説明】
【図１】本発明のシステム構成を示す図
【図２】入力インターフェース部における画像変換の変換式
【図３】テンプレート画像の変形方法を示す図
【図４】変形テンプレート画像の具体例を示す図
【図５】テンプレート画像の作成方法を示す図
【図６】テンプレート画像の作成方法の説明図
【図７】認識時のパラメータを示す図
【図８】物体認識処理の説明図
【図９】認識画像の移動・変形例を示す図
【図１０】本発明に係る実施形態２の概略構成図
【図１１】実施形態２に係るシステムの説明図

[0001]
[Technical Field of the Invention]
The present invention relates to a technique for recognizing an object displayed in an image.
[0002]
[Prior art]
A common conventional technique for recognizing an object in digitally processed photographic image data is to prepare a large number of template images in advance in an information processing apparatus and to use a comparison technique such as pattern matching to recognize that an image in a photograph is a specific object.
[0003]
One conventional technique of this type, intended to achieve both image processing and object recognition accuracy, photographs an object using a plurality of cameras and performs three-dimensional processing based on the positions of these cameras to identify the object (see Patent Document 1).
[0004]
[Patent Document 1]
JP-A-2002-83297
[0005]
[Problems to be solved by the invention]
An object of the present invention is to improve such a conventional technique, and to provide an object recognition technique capable of appropriately processing an object image based on parameters of the recognized object image.
[0006]
[Means for Solving the Problems]
First, the present invention automatically creates a template image from the image to be recognized. That is, an image to be recognized is input, a recognition template image is read from a database storing recognition template images, object recognition is performed by comparing the recognition template image with the image to be recognized, the recognized image obtained by the object recognition is used as a recognition template image, and this template image is registered in the database in association with the parameters at the time of recognition. By performing such processing, comparison can be performed, for example, with a template image created from the image to be recognized itself, so that object recognition can be performed with high accuracy.
[0007]
Secondly, the present invention moves or deforms a recognized image obtained by object recognition based on the parameters at the time of recognition. That is, an image to be recognized is input, a recognition template image is read from a database storing recognition template images, object recognition is performed by comparing the recognition template image with the image to be recognized, and the recognized image obtained by this object recognition is moved or deformed based on the parameters. By executing such processing, it is possible, for example, to output an image in which an object recognized in a corner of the image (the recognized image) has been moved to the center, or an image in which the recognized image has been enlarged.
[0008]
BEST MODE FOR CARRYING OUT THE INVENTION
<< Embodiment 1 >>
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[0009]
FIG. 1 shows the system configuration of the present invention. As shown in FIG. 1, the system includes an input interface unit including a template image creation unit and a recognized image input unit, and an object image recognition engine unit.
[0010]
The present system is realized by executing an object recognition program on a general-purpose computer. More specifically, the functions of the input interface unit and the object image recognition engine unit are realized by the main memory, CPU (Central Processing Unit), and the like of the computer performing processing according to an object recognition program as described below. As an input device of this system (computer), coordinate input means such as a mouse or a digitizer, a keyboard, a scanner, a camera, a VTR, and the like can be used. The storage means includes a semiconductor memory, a hard disk device, and the like, and various programs for realizing the functions of the present embodiment, a template database, and an image to be recognized are stored in the storage means.
[0011]
Note that the present system is not limited to a software implementation on a general-purpose computer, and may instead be configured as an electronic circuit (hardware) having the functions of the input interface unit and the object image recognition engine unit.
[0012]
(Template image creation unit)
The template image creation unit creates a template for recognition and inputs it to the object image recognition engine unit. Its functional units are a template database, an image input unit, an image conversion unit, a quantizer, and a template image developing unit.
[0013]
Here, the template database stores a template image for recognition.
The image input unit reads a recognition template image from this database and accepts it as input.
[0014]
The image conversion unit converts the input image data (RGB color data) into a gray scale (256 levels, corresponding to 8 bits) or a luminance signal Y for efficient recognition processing.
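The conversion from RGB to a luminance signal Y can be sketched as follows. The text does not say which weighting is used, so the ITU-R BT.601 coefficients below are an assumption, and the function names `rgb_to_luma` and `to_grayscale` are hypothetical.

```python
def rgb_to_luma(pixel):
    """Convert one (R, G, B) pixel, each 0-255, to an 8-bit luminance value Y."""
    r, g, b = pixel
    # BT.601 weights: an assumption, the source does not name a formula.
    y = 0.299 * r + 0.587 * g + 0.114 * b
    # Round to the nearest integer and clamp to the 256-level (8-bit) range.
    return min(255, max(0, int(round(y))))


def to_grayscale(image):
    """Apply rgb_to_luma to every pixel of an image given as rows of RGB tuples."""
    return [[rgb_to_luma(p) for p in row] for row in image]
```

Working on a single 8-bit channel instead of three color channels cuts the amount of data the later comparison has to touch by roughly two thirds, which is presumably the efficiency the text refers to.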
[0015]
The quantizer quantizes the image according to the quantization table of FIG. 2.
Further, the template image developing unit performs rotation or deformation of the template image according to the recognition purpose, such as recognition of the same image or similar image.
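Table-driven quantization of the gray-scale image might look like the sketch below. The quantization table of FIG. 2 is not reproduced in this text, so the boundaries in `QUANT_BOUNDS` are purely hypothetical placeholders, as is the function name `quantize`.

```python
from bisect import bisect_right

# Hypothetical table (FIG. 2 is not available here): each entry is the first
# gray value of the next level, giving 4 levels 0-63, 64-127, 128-191, 192-255.
QUANT_BOUNDS = [64, 128, 192]


def quantize(gray_image, bounds=QUANT_BOUNDS):
    """Map each 8-bit gray value to the index of the table interval it falls in."""
    return [[bisect_right(bounds, v) for v in row] for row in gray_image]
```

Keeping the boundaries in a table rather than in code means the number and width of the levels can be changed without touching the quantizer itself.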
[0016]
The image registration unit registers the recognition image recognized by the object image recognition engine unit in the template database.
Specifically, the following processing is performed.
(1) For identical-image recognition, the template image read from the database is used as it is.
[0017]
(2) For similar-image recognition, the template image data is normalized before use. In the normalization, the x dimension is normalized to 32 pixels and the y dimension is scaled by the same ratio.
[0018]
(3) If the template image is a line drawing, then after the processing of (1) and (2) above it is converted to gray scale (256 levels, equivalent to 8 bits) or to a luminance signal Y, and is further quantized as necessary so that recognition can be performed efficiently.
[0019]
(4) To cope with differences in the orientation of the object to be recognized, a plurality of template images are created from the recognition template image by deforming it. In this method, a plurality of deformed template images for similar-image recognition are created first. By also using these deformed recognition template images, it becomes possible to recognize, for example, images of the object viewed from the side or from above or below as well as the frontal image.
[0020]
Deformed template images are created from the base recognition template image in both the vertical and horizontal directions. This allows recognition to be performed accurately even on images of the same object viewed from above, below, left, or right.
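The normalization in step (2) above can be sketched as a resize that fixes the width at 32 pixels and scales the height by the same ratio. Nearest-neighbor sampling and the name `normalize_width` are assumptions made for illustration; the text does not specify an interpolation method.

```python
def normalize_width(image, target_w=32):
    """Resize a gray-scale image (list of rows) to target_w pixels wide,
    scaling the height by the same ratio (nearest-neighbor sampling)."""
    src_h, src_w = len(image), len(image[0])
    target_h = max(1, round(src_h * target_w / src_w))
    return [
        [image[y * src_h // target_h][x * src_w // target_w]
         for x in range(target_w)]
        for y in range(target_h)
    ]
```

For example, a 64 x 48 template becomes 32 x 24, so templates of different original sizes can be compared at a common scale.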
[0021]
After the recognition template image has been created, it is deformed by the following procedure so that similar object images (modeled as deformed objects) can also be recognized.
[0022]
Here, a modified template image for similar image recognition is created from the following commonly used equation.
x = ((X − x0) cos θ − (Y − y0) sin θ) / a + x0
y = ((X − x0) sin θ − (Y − y0) cos θ) / b + y0
[0023]
In the above equations, enlargement or reduction is performed by a factor of a in the horizontal direction and by a factor of b in the vertical direction around an arbitrary point (x0, y0) on the image. A rotation by θ about the center point (x0, y0) can also be performed, but when processing only in the vertical and horizontal directions the calculation is performed with θ = 0. Here the coordinates (x, y) are the position in the enlarged or reduced image, and there is the advantage that the pixel data at the corresponding original image position (X, Y) can be used as it is as the enlarged or reduced pixel.
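A minimal sketch of the θ = 0 case follows, under stated assumptions. It reads the equations as an inverse mapping: for each pixel of the output image, the source pixel is looked up at ((x − x0)/a + x0, (y − y0)/b + y0) and copied unchanged. Which coordinate pair plays which role is an interpretation, and the cos term of the second equation is taken as positive so that θ = 0 reduces to pure scaling (as printed, the second equation carries a minus sign). The name `scale_about_point` is hypothetical.

```python
import math


def scale_about_point(image, a, b, x0, y0, fill=0):
    """Scale a gray-scale image by a (horizontal) and b (vertical) about
    (x0, y0), copying source pixels as-is (nearest neighbor, theta = 0)."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: where does this output pixel come from?
            src_x = math.floor((x - x0) / a + x0 + 0.5)
            src_y = math.floor((y - y0) / b + y0 + 0.5)
            if 0 <= src_x < w and 0 <= src_y < h:
                out[y][x] = image[src_y][src_x]
    return out
```

With a = 2 the content around x0 is stretched to twice its width, and with a = 0.5 it is squeezed, leaving `fill` pixels where no source pixel exists.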
[0024]
By also using this deformed recognition template image, it becomes possible to recognize, for example, images of the object viewed from the side or from above or below as well as the frontal image.
[0025]
FIG. 3 is a specific example showing a method for creating a deformed image template image.
[0026]
In the figure, the original image template image is divided vertically into four parts and four parts horizontally, and then the respective modified image template images are created by changing the scale of each divided area. At this time, in the program, the scale of the base image template image and the scale for each deformation direction can be changed as a table.
[0027]
By creating such deformed template images, when searching, for example, for all image files that contain the face of a given person, inputting a single image of that person also makes it possible to find images in which the person is photographed from above, below, left, or right. Thus, an efficient image search can be performed without preparing a plurality of template images for the search.
[0028]
In an actual program, the scale of the base image template image and the scale for each deformation direction can be changed as a table. Further, the number of divisions of the blocks is also variable.
[0029]
This variability is provided because CPU performance is improving rapidly and servers equipped with multiple CPUs are also taken into consideration. FIG. 4 shows an example of actual deformed template images. In the figure, the base template image is divided into four parts, and the scale ratio is changed in each divided area. That is, in the case of the right-hand figure, the scale ratios are 1.75 → 1.25 → 0.75 → 0.25 in order from the leftmost area. By deforming the image in this way, even if only a frontal face photograph of a person is given, a template image of that person's face turned to the left or to the right can be prepared.
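The strip-wise deformation described for FIG. 4 can be sketched as below: the image is cut into vertical strips and each strip is resampled horizontally by its own scale ratio. The default ratios are the ones quoted in the text; the function names, the nearest-neighbor resampling, and the requirement that the width divide evenly into strips are assumptions.

```python
def resample_row(row, new_w):
    """Nearest-neighbor resampling of one row of pixels to new_w pixels."""
    w = len(row)
    return [row[i * w // new_w] for i in range(new_w)]


def deform_by_strips(image, scales=(1.75, 1.25, 0.75, 0.25)):
    """Split each row into len(scales) equal strips and rescale each strip
    horizontally by its own ratio (assumes the width divides evenly)."""
    n = len(scales)
    strip_w = len(image[0]) // n
    out = []
    for row in image:
        new_row = []
        for k, s in enumerate(scales):
            strip = row[k * strip_w:(k + 1) * strip_w]
            new_w = max(1, round(strip_w * s))
            new_row.extend(resample_row(strip, new_w))
        out.append(new_row)
    return out
```

Because the four quoted ratios sum to 4, the deformed image keeps the original width: the left side is stretched and the right side compressed, which approximates a face turned to one side. Reversing the tuple gives the opposite direction.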
[0030]
(Object image recognition engine)
The object recognition engine is a functional unit that recognizes an object in a recognition image based on various algorithms. Since there are various known techniques for the object recognition algorithm, the description is omitted here.
The output unit moves or deforms the recognized image recognized by the object recognition engine based on the parameter and outputs the image.
[0031]
(Recognized image input unit)
The recognized image input unit is an interface unit for inputting the image data to be recognized, and consists of a bitmap image input unit and an image conversion unit. These components are the same as the image input unit and the image conversion unit of the template image creation unit described with reference to FIG. 1, so their description is omitted. If the input image is an analog signal, it is converted to a bitmap image so that it can be processed easily.
[0032]
(Object recognition method)
Next, an object recognition method in the present system will be described.
The object recognition method of the present system has a first step of creating a template image from the image to be recognized, a second step of performing object recognition by comparing the created template image with the image to be recognized, and a third step of outputting the recognized image obtained by this object recognition.
[0033]
Hereinafter, each step will be described.
(1). Step of creating a template image from the image to be recognized
FIGS. 5 and 6 show the flow of processing.
When the process is started, first, the image input unit of the template image creation unit reads a default template image from the template database and inputs the default template image to the object image recognition engine unit (S1).
[0034]
Further, the image input unit of the recognized image input unit receives the image to be recognized (S2). The recognized image input unit converts the images that make up the image to be recognized, that is, each frame in the case of a moving image, and inputs them to the object image recognition engine unit. Similarly, if the image to be recognized is a sequence of still images, each still image is input one at a time. Since the subsequent processing is the same for moving images and for sequences of still images, only moving images are described below. The object image recognition engine unit reads each frame into the object recognition engine (S3).
The object recognition engine performs object recognition by comparing the default recognition template image with the recognized image.
[0035]
If the features of the template image and the image to be recognized match, the matching part is extracted from the image to be recognized as a recognized image and output, together with the parameters at the time of recognition, to the registration means of the template image creation unit. The registration means registers the recognized image in the template database as a template image together with these parameters (S4). As shown in FIG. 7, the parameters registered at this time include the frame position of the recognized image (which frame of the moving image it came from), the recognized object position (where in the frame the recognized object was located, in X-Y coordinates), the recognized object size (the size of the recognized object, which may also be expressed as a magnification relative to the template image or as the proportion of the frame it occupies), the inclination of the recognized image (rotation angle relative to the template image), and thumbnail data (a reduced image of the frame containing the recognized image).
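One way to hold the FIG. 7 parameters alongside each registered template is sketched below. The field names, types, and the in-memory `TemplateDatabase` class are hypothetical; the text only lists which parameters are stored, not how.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class RecognitionParams:
    frame_index: int              # which frame of the moving image
    position: Tuple[int, int]     # X-Y position of the object in the frame
    size: float                   # e.g. magnification relative to the template
    tilt_deg: float               # rotation angle relative to the template
    thumbnail: List[List[int]] = field(default_factory=list)  # reduced frame


@dataclass
class TemplateRecord:
    image: List[List[int]]        # the recognized image, reused as a template
    params: RecognitionParams


class TemplateDatabase:
    """In-memory stand-in for the template database."""

    def __init__(self):
        self.records: List[TemplateRecord] = []

    def register(self, image, params):
        """Store a template with its parameters (S4) and return its index."""
        self.records.append(TemplateRecord(image, params))
        return len(self.records) - 1
```

Keeping the parameters in the same record as the image is what later lets the system normalize size, remove tilt, or jump straight to the frames where a template was found.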
[0036]
This recognition is repeated for each predetermined frame (S5).
When the recognition processing has been completed for all the frames, the processing of steps S2 to S5 is repeated for the next image to be recognized, and when all the images to be processed have been processed, the recognition processing ends (S6).
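Steps S1 to S6 above can be sketched as a loop over frames. The text deliberately leaves the matching algorithm open, so the sum-of-absolute-differences (SAD) matcher here is only a stand-in, and all names (`sad`, `find_match`, `create_templates`, `max_sad`) are hypothetical.

```python
def sad(template, frame, ox, oy):
    """Sum of absolute differences between template and frame at offset (ox, oy)."""
    return sum(abs(template[y][x] - frame[oy + y][ox + x])
               for y in range(len(template))
               for x in range(len(template[0])))


def find_match(template, frame, max_sad=0):
    """Return (score, ox, oy) of the best position within max_sad, else None."""
    th, tw = len(template), len(template[0])
    fh, fw = len(frame), len(frame[0])
    best = None
    for oy in range(fh - th + 1):
        for ox in range(fw - tw + 1):
            score = sad(template, frame, ox, oy)
            if score <= max_sad and (best is None or score < best[0]):
                best = (score, ox, oy)
    return best


def create_templates(default_template, frames, step=1, max_sad=0):
    """Scan every step-th frame (S3, S5); on a match, crop the matched region
    and record it with its frame index and position (S4)."""
    th, tw = len(default_template), len(default_template[0])
    registered = []
    for index in range(0, len(frames), step):
        hit = find_match(default_template, frames[index], max_sad)
        if hit is None:
            continue
        _, ox, oy = hit
        crop = [row[ox:ox + tw] for row in frames[index][oy:oy + th]]
        registered.append({"image": crop, "frame_index": index,
                           "position": (ox, oy)})
    return registered
```

With a nonzero `max_sad`, the cropped region differs from the default template, and it is this crop taken from the footage itself that becomes the new, more specific template.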
[0037]
(2). Step of performing object recognition with the created template image
FIG. 8 shows the flow of processing.
When the creation of the template image is completed, a recognition process using the template image is started. First, the image input unit of the template image creation unit reads the template image created in the previous step from the template database and inputs the template image to the object image recognition engine unit (S11). At this time, the image conversion unit of the template image creation unit performs normalization based on the parameters (size) of the template image. In addition, the image developing unit performs a rotation process based on a parameter (inclination) of the template image so as to eliminate the inclination.
[0038]
Further, the image input unit of the recognized image input unit receives the input of the recognized image (S12). The image conversion unit appropriately converts each frame constituting the image to be recognized and inputs the converted image to the object image recognition engine unit. The object image recognition engine reads this frame into the object recognition engine (S13).
The object recognition engine performs object recognition by comparing the template image with each frame of the image to be recognized, and outputs the image of the portion that matches the template image (the recognized image), or the frame containing this recognized image, to the output unit together with the parameters at the time of recognition (S14). The parameters registered at this time are the same as those in FIG. 7.
This recognition is repeated for each predetermined frame (S15). Recognition may be performed on all frames or on every predetermined number of frames, or, based on the template image parameter (frame position), only on the frames from which template images were created in process (1) above, that is, the frames that matched the default template image.
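The three ways of choosing which frames to examine in S15 can be sketched as a small selector. The function name, the mode strings, and the parameter names are hypothetical.

```python
def frames_to_scan(total_frames, mode="all", interval=1,
                   template_frame_indices=()):
    """Return the frame indices to run recognition on.

    mode "all"             : every frame
    mode "interval"        : every interval-th frame
    mode "template_frames" : only frames where a template was created in
                             process (1), taken from the stored frame position
    """
    if mode == "all":
        return list(range(total_frames))
    if mode == "interval":
        return list(range(0, total_frames, interval))
    if mode == "template_frames":
        return sorted({i for i in template_frame_indices
                       if 0 <= i < total_frames})
    raise ValueError("unknown mode: " + mode)
```

The third mode is the cheapest: it revisits only frames already known to contain something matching the default template, at the cost of missing frames that only the refined template would match.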
[0039]
When the recognition processing has been completed for all frames, the processing of steps S12 to S15 is repeated for the next image to be recognized, and when all the images to be processed have been processed, the recognition processing ends (S16).
In this object recognition step, a first template image may be compared with the image to be recognized to find object recognition candidates, the images found in this way may then be compared with a second template image to narrow them down, and this may be repeated as necessary with a third template image, a fourth template image, and so on, to perform a narrowing-down search.
In the object recognition step, the object recognition may be performed by simultaneously comparing a plurality of template images and the image to be recognized.
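The sequential narrowing-down search described above can be sketched as repeated filtering. The `matches` callback stands in for whatever comparison the object recognition engine performs, and both it and the name `narrow_down` are hypothetical.

```python
def narrow_down(candidates, templates, matches):
    """Filter candidates with the first template, then filter the survivors
    with the second template, and so on, stopping early if none remain."""
    remaining = list(candidates)
    for template in templates:
        remaining = [c for c in remaining if matches(template, c)]
        if not remaining:
            break
    return remaining
```

A toy usage example, with strings standing in for images and substring tests standing in for image comparison: `narrow_down(["red car", "red bike", "blue car"], ["red", "car"], lambda t, c: t in c)` keeps only `"red car"`. Each later template is compared against fewer candidates, which is what makes the cascade cheaper than matching every template against everything.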
[0040]
(3). Step of outputting the recognized image obtained by object recognition
The output means moves or deforms the frame (recognized image) obtained by the object recognition based on the parameters obtained by the object recognition.
[0041]
For example,
Function 1. The position of the recognition object in the frame is determined based on the recognition image position, and the recognition object is moved so as to be displayed at the center.
Function 2. Based on the size of the recognition image, enlargement / reduction is performed to a predetermined size.
Function 3. The inclination of the recognized object is corrected based on the angle of the recognized image. That is, it is rotated and moved around the recognition image position.
FIG. 9 shows an example of moving or deforming by these functions.
FIG. 9A shows an original frame, and a portion surrounded by a square is a recognized image.
FIG. 9B is an example in which functions 1 and 2 have been applied. In this way, a recognized object that appears small at the edge of the screen can be displayed large and clearly in the center of the screen.
FIG. 9C shows an example in which functions 1, 2, and 3 have been applied. With function 3, an image that was recorded or photographed at a tilt can additionally be displayed without the tilt so that it is easy to view.
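Functions 1 to 3 amount to deriving a shift, a scale factor, and a rotation from the stored parameters. A minimal sketch, assuming the object size is stored as a width in pixels and the tilt in degrees (the text allows other representations), with a hypothetical function name:

```python
def display_transform(frame_size, obj_center, obj_size, tilt_deg, target_size):
    """Derive the corrections for functions 1-3 from recognition parameters."""
    fw, fh = frame_size
    cx, cy = obj_center
    return {
        # Function 1: shift that moves the object to the frame center.
        "shift": (fw / 2 - cx, fh / 2 - cy),
        # Function 2: scale factor that brings the object to target_size.
        "scale": target_size / obj_size,
        # Function 3: rotate by the opposite of the measured tilt.
        "rotation_deg": -tilt_deg,
    }
```

For a 640 x 480 frame with a 50-pixel object centered at (100, 80) and tilted by 15 degrees, a target size of 200 pixels gives a shift of (220, 160), a scale of 4, and a rotation of -15 degrees. Applying the resulting transform to the pixels is a separate step not shown here.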
[0042]
As described above, according to the present embodiment, a template for recognition can be automatically created based on a default image without inputting a template image each time recognition processing is performed. Further, by performing recognition using a template created from the image to be recognized, more detailed recognition can be performed.
[0043]
For example, a default template (a person) is compared with the images of a movie or drama (the image to be recognized), template images of people are created from those images, and the created template images are then used to recognize whether or not a person is the same individual. This makes it possible to extract the scenes in which a specific person appears. Also, by recognizing people who are not the same, the cast of the movie or drama can be extracted.
[0044]
Further, the image output means is not limited to display, and can be set arbitrarily, such as printing or transmission. Further, the recognition image can be displayed based on the parameters so as to be easily viewed.
[0045]
<< Embodiment 2 >>
FIG. 10 is a schematic configuration diagram of an object image recognition server system as the second embodiment.
As shown in FIG. 10, the image recognition server (image recognition device) is a so-called web server that connects to a user terminal via a network such as the Internet and provides web pages in HTML or the like. The image recognition server also includes an image recognition system that performs object recognition processing on images from the user terminal. Since this image recognition system is substantially the same as that of Embodiment 1, the description of the overlapping parts is omitted.
[0046]
The template database of the image recognition system stores images of animals, actresses, singers, and the like, and parameters (comments related to the images) in association with each other.
[0047]
As shown in FIG. 11, when an image of the user's face is input from the user terminal to the image recognition server as the image to be recognized, template images of, for example, animals are read out and compared by the object recognition engine, and the parameter (comment) of the template that is similar to the image to be recognized is output to the output unit.
[0048]
The output unit transmits the comment to the user terminal and displays the comment.
This makes it possible to provide services such as a personality check and a degree of similarity check with actors and singers.
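The lookup in Embodiment 2 can be sketched as picking the stored template most similar to the submitted face and returning its comment. The similarity measure below (one minus the normalized mean absolute difference of equal-sized gray-scale images) is an assumption chosen for brevity, and the function names are hypothetical.

```python
def similarity(a, b):
    """1.0 for identical equal-sized 8-bit gray-scale images, lower otherwise."""
    total = sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    count = len(a) * len(a[0])
    return 1.0 - total / (255 * count)


def best_comment(face, templates):
    """templates is a list of (image, comment) pairs; return the comment of
    the template most similar to the submitted face image."""
    _, comment = max(templates, key=lambda t: similarity(face, t[0]))
    return comment
```

The similarity value itself could also be returned to the user as the "look-alike" score the service description mentions.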
[0049]
Note that the object recognition method, object recognition system, object recognition program, and object recognition device of the present invention are not limited to the illustrated examples described above, and various changes can of course be made without departing from the gist of the present invention.
[0050]
For example, the same effects as those of the above-described embodiment can be obtained even with the configuration described below.
(Appendix 1)
A method for recognizing an object in an image, comprising:
inputting an image to be recognized;
reading a recognition template image from a database storing the recognition template image;
performing object recognition by comparing the recognition template image with the image to be recognized;
using the recognition image obtained by the object recognition as a recognition template image; and
registering the template image in the database in association with the parameters at the time of recognition.
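The steps of Appendix 1 can be sketched as follows. This is only an illustration under stated assumptions: images are small grayscale grids, the comparison is a brute-force sliding window with a mean absolute difference score, and the names `recognize_and_register`, `match_score`, `crop` and the threshold value are hypothetical.

```python
def match_score(template, window):
    """Mean absolute difference between two equal-sized grids (lower is better)."""
    h, w = len(template), len(template[0])
    return sum(abs(template[r][c] - window[r][c]) for r in range(h) for c in range(w)) / (h * w)

def crop(image, top, left, h, w):
    """Cut an h x w window out of the image."""
    return [row[left:left + w] for row in image[top:top + h]]

def recognize_and_register(image, database, threshold=20.0):
    """Slide each stored template over the image and register the best hit as a new template."""
    best = None
    for entry in database:
        t = entry["image"]
        h, w = len(t), len(t[0])
        for top in range(len(image) - h + 1):
            for left in range(len(image[0]) - w + 1):
                score = match_score(t, crop(image, top, left, h, w))
                if best is None or score < best[0]:
                    best = (score, top, left, h, w)
    if best is None or best[0] > threshold:
        return None  # nothing similar enough was found
    score, top, left, h, w = best
    # Parameters at the time of recognition, stored together with the new template.
    params = {"position": (left, top), "size": (w, h), "score": score}
    database.append({"image": crop(image, top, left, h, w), "params": params})
    return params

database = [{"image": [[250, 250], [250, 250]], "params": {}}]  # default template
frame = [
    [0, 0, 0, 0],
    [0, 240, 245, 0],
    [0, 235, 250, 0],
    [0, 0, 0, 0],
]
found = recognize_and_register(frame, database)
```

After the call, the database holds the default template plus a template cut from the input image itself, which later comparisons can use.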
[0051]
(Appendix 2)
The method for recognizing an object in an image according to Appendix 1, wherein, when the image to be recognized consists of a plurality of continuous images, the object recognition is performed on the continuous images at predetermined intervals.
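Recognition at predetermined intervals can be illustrated with a short sketch. The function name `frames_at_interval` and the interval of 4 are assumptions chosen for the example; the text does not specify how the interval is set.

```python
def frames_at_interval(frames, interval):
    """Yield (index, frame) for every `interval`-th frame, starting with the first."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    for index in range(0, len(frames), interval):
        yield index, frames[index]

# Ten placeholder frames standing in for a sequence of continuous images.
video = [f"frame{i}" for i in range(10)]
sampled = [index for index, _ in frames_at_interval(video, 4)]
```

Only the sampled frames would be passed to the object recognition step, which reduces the amount of comparison work for long sequences.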
[0052]
(Appendix 3)
A method for recognizing an object in an image, comprising:
inputting an image to be recognized;
reading a recognition template image from a database storing the recognition template image;
performing object recognition by comparing the recognition template image with the image to be recognized; and
moving or deforming the recognition image obtained by the object recognition based on parameters obtained by the object recognition.
[0053]
(Appendix 4)
The method for recognizing an object in an image according to any one of Appendices 1 to 3, wherein the parameters include at least one of a frame position of the recognition image, a position of the recognized object, a size of the recognized object, and a tilt of the recognized object.
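One way to picture the parameters and their use for moving a recognition image is sketched below. The field layout is an assumption: "frame position" is read here as the top-left corner of the bounding frame and "object position" as the object's center, and `offset_to_center` and `shift_image` are hypothetical helpers.

```python
from dataclasses import dataclass

@dataclass
class RecognitionParams:
    frame_position: tuple   # (x, y) of the bounding frame's top-left corner (assumed meaning)
    object_position: tuple  # (x, y) of the object's center (assumed meaning)
    object_size: tuple      # (width, height) in pixels
    tilt_degrees: float     # tilt of the recognized object

def offset_to_center(params, image_size):
    """Return the whole-pixel (dx, dy) shift that moves the object's center to the image center."""
    cx, cy = params.object_position
    img_w, img_h = image_size
    return round(img_w / 2 - cx), round(img_h / 2 - cy)

def shift_image(image, dx, dy, fill=0):
    """Shift a grid by whole pixels; pixels that become uncovered are set to `fill`."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            nr, nc = r + dy, c + dx
            if 0 <= nr < h and 0 <= nc < w:
                out[nr][nc] = image[r][c]
    return out

# A 2 x 2 object recognized in the top-left corner of a 4 x 4 image.
image = [
    [9, 9, 0, 0],
    [9, 9, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
params = RecognitionParams((0, 0), (1.0, 1.0), (2, 2), 0.0)
dx, dy = offset_to_center(params, (len(image[0]), len(image)))
centered = shift_image(image, dx, dy)
```

The sketch only handles translation; the size and tilt fields would feed an enlargement or rotation step in the same way.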
[0054]
(Appendix 5)
The method for recognizing an object in an image according to Appendix 1, wherein the recognition image is divided in the vertical or horizontal direction, and the recognition image is converted at a different scale for each of the divided areas.
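Converting each divided area at its own scale can be sketched as below. The sketch assumes the image is split into side-by-side bands of columns and that each band is resized horizontally with nearest-neighbour sampling; the function names, the split position and the scale factors are all hypothetical.

```python
def scale_columns(band, scale):
    """Nearest-neighbour resize of a band of columns by `scale` in the horizontal direction."""
    width = len(band[0])
    new_width = max(1, round(width * scale))
    return [[row[min(width - 1, int(x / scale))] for x in range(new_width)] for row in band]

def scale_by_region(image, boundaries, scales):
    """Split `image` at the column indices in `boundaries` and scale each band by its own factor."""
    edges = [0, *boundaries, len(image[0])]
    if len(scales) != len(edges) - 1:
        raise ValueError("one scale factor is needed per band")
    bands = []
    for left, right, scale in zip(edges, edges[1:], scales):
        bands.append(scale_columns([row[left:right] for row in image], scale))
    # Join the resized bands back together row by row.
    return [sum((band[r] for band in bands), []) for r in range(len(image))]

image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
]
# Left half enlarged to double width, right half reduced to half width.
converted = scale_by_region(image, boundaries=[2], scales=[2.0, 0.5])
```

A use of this kind of conversion would be to enlarge the area containing the recognized object while compressing the rest of the picture.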
[0055]
(Appendix 6)
x = ((X − x0) cos θ − (Y − y0) sin θ) / a + x0
y = ((X − x0) sin θ − (Y − y0) cos θ) / b + y0
(where the image is scaled by a factor of a in the horizontal direction and by a factor of b in the vertical direction about an arbitrary point (x0, y0) on the image)
The method for recognizing an object in an image according to any one of Appendices 1 to 5, wherein enlargement or reduction processing is performed based on the above equations.
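The coordinate conversion of Appendix 6 can be tried out with a small sketch that maps each output pixel (X, Y) back to the source point (x, y) it is sampled from. Two points are assumptions rather than statements from the text: the sketch uses the conventional plus sign for the cosine term of y, whereas the equation as printed shows a minus sign there (with θ = 0 the printed sign would mirror the image vertically instead of scaling it), and the nearest-neighbour sampling in `enlarge` is chosen only for brevity.

```python
import math

def source_coordinate(X, Y, x0, y0, a, b, theta=0.0):
    """Map an output pixel (X, Y) to the source point (x, y); theta is in radians."""
    dx, dy = X - x0, Y - y0
    x = (dx * math.cos(theta) - dy * math.sin(theta)) / a + x0
    y = (dx * math.sin(theta) + dy * math.cos(theta)) / b + y0  # "+" is an assumption
    return x, y

def enlarge(image, x0, y0, a, b, theta=0.0, fill=0):
    """Scale by a (horizontal) and b (vertical) about (x0, y0) using nearest-neighbour sampling."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for Y in range(h):
        for X in range(w):
            x, y = source_coordinate(X, Y, x0, y0, a, b, theta)
            sx, sy = math.floor(x + 0.5), math.floor(y + 0.5)
            if 0 <= sx < w and 0 <= sy < h:
                out[Y][X] = image[sy][sx]
    return out

image = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]
doubled = enlarge(image, x0=0, y0=0, a=2, b=2)
```

Values of a and b greater than 1 enlarge the image about (x0, y0), and values between 0 and 1 reduce it.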
[0056]
(Appendix 7)
The method for recognizing an object in an image according to any one of Appendices 1 to 6, wherein candidates for object recognition are found by comparing a first template image with the image to be recognized, and a refined search is then performed by sequentially comparing further images, such as a second template image and a third template image, with the image to be recognized.
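The refined search of Appendix 7 can be illustrated as a chain of filters: the first template produces candidate positions, and each further template keeps only the candidates that also match it. The sketch assumes all templates have the same size and uses a mean absolute difference with per-template thresholds; these choices and all names are hypothetical.

```python
def mean_abs_diff(template, window):
    """Mean absolute difference between two equal-sized grids (lower is better)."""
    h, w = len(template), len(template[0])
    return sum(abs(template[r][c] - window[r][c]) for r in range(h) for c in range(w)) / (h * w)

def crop(image, top, left, h, w):
    """Cut an h x w window out of the image."""
    return [row[left:left + w] for row in image[top:top + h]]

def refined_search(image, templates, thresholds):
    """Return the (top, left) positions that pass every template in turn."""
    h, w = len(templates[0]), len(templates[0][0])
    candidates = [
        (top, left)
        for top in range(len(image) - h + 1)
        for left in range(len(image[0]) - w + 1)
    ]
    for template, threshold in zip(templates, thresholds):
        # Each stage only re-examines the survivors of the previous stage.
        candidates = [
            (top, left)
            for top, left in candidates
            if mean_abs_diff(template, crop(image, top, left, h, w)) <= threshold
        ]
    return candidates

image = [[100, 90, 0, 100, 100, 0]]
templates = [[[100, 100]], [[102, 98]]]  # coarse first template, stricter second template
thresholds = [30, 3]
coarse = refined_search(image, templates[:1], thresholds[:1])
refined = refined_search(image, templates, thresholds)
```

Because later templates are compared only at the surviving positions, a loose first template can be used without making the later, stricter comparisons expensive.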
[0057]
(Appendix 8)
The method for recognizing an object in an image according to any one of Appendices 1 to 7, wherein object recognition is performed by simultaneously comparing two or more template images with the image to be recognized.
[0058]
(Appendix 9)
A computer-executable program for recognizing an object in an image to be recognized, the program causing a computer to execute:
reading a recognition template image from a database storing the recognition template image;
performing object recognition by comparing the recognition template image with the image to be recognized;
using the recognition image obtained by the object recognition as a recognition template image; and
registering the template image in the database in association with the parameters at the time of recognition.
[0059]
(Appendix 10)
The computer-executable program according to Appendix 9, wherein, when the image to be recognized consists of a plurality of continuous images, the object recognition is performed on the continuous images at predetermined intervals.
[0060]
(Appendix 11)
A computer-executable program for recognizing an object in an image to be recognized, the program causing a computer to execute:
inputting the image to be recognized;
reading a recognition template image from a database storing the recognition template image;
performing object recognition by comparing the recognition template image with the image to be recognized; and
moving or deforming the recognition image obtained by the object recognition based on parameters obtained by the object recognition.
[0061]
(Appendix 12)
The computer-executable program according to any one of Appendices 9 to 11, wherein the parameters include at least one of a frame position of the recognition image, a position of the recognized object, a size of the recognized object, and a tilt of the recognized object.
[0062]
(Appendix 13)
An object recognition device that recognizes an object in an image to be recognized by comparing a recognition template image with the image to be recognized, the device comprising:
means for receiving input of the image to be recognized;
a database storing recognition template images;
means for comparing the recognition template image read from the database with the image to be recognized; and
means for using the recognition image obtained by the object recognition as a recognition template image and registering the template image in the database in association with the parameters at the time of recognition.
[0063]
(Appendix 14)
The object recognition device according to Appendix 13, wherein, when the image to be recognized consists of a plurality of continuous images, the object recognition is performed on the continuous images at predetermined intervals.
[0064]
(Appendix 15)
An object recognition device that recognizes an object in an image to be recognized by comparing a recognition template image with the image to be recognized, the device comprising:
means for receiving input of the image to be recognized;
a database storing recognition template images;
means for comparing the recognition template image read from the database with the image to be recognized; and
means for moving or deforming the recognition image obtained by the object recognition based on the parameters obtained by the object recognition.
[0065]
(Appendix 16)
The object recognition device according to any one of Appendices 13 to 15, wherein the parameters include at least one of a frame position of the recognition image, a position of the recognized object, a size of the recognized object, and a tilt of the recognized object.
In the present invention, the above configurations can be combined to the extent possible.
[Effect of the Invention]
According to the present invention, it is possible to provide an object recognition technology that enables appropriate processing of an object image based on parameters of the recognized object image.
[Brief description of the drawings]
FIG. 1 is a diagram showing the system configuration of the present invention.
FIG. 2 shows the conversion formula for image conversion in the input interface unit.
FIG. 3 is a diagram showing a method of transforming a template image.
FIG. 4 is a diagram showing a specific example of a transformed template image.
FIG. 5 is a diagram showing a method of creating a template image.
FIG. 6 is an explanatory diagram of a method of creating a template image.
FIG. 7 is a diagram showing parameters at the time of recognition.
FIG. FIG. 10 is a diagram showing a movement / modification example of an image.
FIG. 10 is a schematic configuration diagram of a second embodiment according to the present invention.
FIG. 11 is an explanatory diagram of a system according to the second embodiment.

Claims

1. A method for recognizing an object in an image, comprising:
inputting an image to be recognized;
reading a recognition template image from a database storing the recognition template image;
performing object recognition by comparing the recognition template image with the image to be recognized;
using the recognition image obtained by the object recognition as a recognition template image; and
registering the template image in the database in association with the parameters at the time of recognition.

2. The object recognition method in an image according to claim 1, wherein when the image to be recognized is composed of a plurality of continuous images, the object recognition is performed on the continuous images at predetermined intervals.

3. A method for recognizing an object in an image, comprising:
inputting an image to be recognized;
reading a recognition template image from a database storing the recognition template image;
performing object recognition by comparing the recognition template image with the image to be recognized; and
moving or deforming the recognition image obtained by the object recognition based on parameters obtained by the object recognition.

4. The method according to claim 1, wherein the parameter includes at least one of a frame position of the recognized image, a recognized object position, a recognized object size, and a tilt of the recognized object.

5. The method according to claim 1, wherein the recognition image is divided in a vertical or horizontal direction, and the recognition image is converted at a different scale for each of the divided areas.

6. The method for recognizing an object in an image according to claim 1, wherein enlargement or reduction processing is performed based on the following equations:
x = ((X − x0) cos θ − (Y − y0) sin θ) / a + x0
y = ((X − x0) sin θ − (Y − y0) cos θ) / b + y0
(where the image is scaled by a factor of a in the horizontal direction and by a factor of b in the vertical direction about an arbitrary point (x0, y0) on the image)

7. The method for recognizing an object in an image according to claim 1, wherein candidates for object recognition are found by comparing a first template image with the image to be recognized, and a refined search is then performed by sequentially comparing further images, such as a second template image and a third template image, with the image to be recognized.

8. The method for recognizing an object in an image according to claim 1, wherein the object recognition is performed by simultaneously comparing at least two or more template images and the image to be recognized.

9. A computer-executable object recognition program for recognizing an object in an image to be recognized, the program causing a computer to execute:
reading a recognition template image from a database storing the recognition template image;
performing object recognition by comparing the recognition template image with the image to be recognized;
using the recognition image obtained by the object recognition as a recognition template image; and
registering the template image in the database in association with the parameters at the time of recognition.

10. The storage medium according to claim 9, wherein, when the image to be recognized consists of a plurality of continuous images, the object recognition is performed on the continuous images at predetermined intervals.

11. A computer-executable object recognition program for recognizing an object in an image to be recognized, the program causing a computer to execute:
inputting the image to be recognized;
reading a recognition template image from a database storing the recognition template image;
performing object recognition by comparing the recognition template image with the image to be recognized; and
moving or deforming the recognition image obtained by the object recognition based on parameters obtained by the object recognition.

12. The object recognition program according to any one of claims 9 to 11, wherein the parameters include at least one of a frame position of the recognition image, a position of the recognized object, a size of the recognized object, and a tilt of the recognized object.

13. An object recognition device that recognizes an object in an image to be recognized by comparing a recognition template image with the image to be recognized, the device comprising:
means for receiving input of the image to be recognized;
a database storing recognition template images;
means for comparing the recognition template image read from the database with the image to be recognized; and
means for using the recognition image obtained by the object recognition as a recognition template image and registering the template image in the database in association with the parameters at the time of recognition.

14. The object recognition device according to claim 13, wherein when the image to be recognized is composed of a plurality of continuous images, the object recognition is performed on the continuous images at predetermined intervals.

15. An object recognition device that recognizes an object in an image to be recognized by comparing a recognition template image with the image to be recognized, the device comprising:
means for receiving input of the image to be recognized;
a database storing recognition template images;
means for comparing the recognition template image read from the database with the image to be recognized; and
means for moving or deforming the recognition image obtained by the object recognition based on the parameters obtained by the object recognition.

16. The object recognition device according to claim 13, wherein the parameter includes at least one of a frame position of the recognition image, a recognition object position, a recognition object size, and a tilt of the recognition object.