JP4643872B2 - Image recognition method and apparatus

Info

Publication number: JP4643872B2
Application number: JP2001288574A
Authority: JP
Inventors: 洪涛門
Original assignee: 洪涛門
Priority date: 2001-09-21
Filing date: 2001-09-21
Publication date: 2011-03-02
Anticipated expiration: 2021-09-21
Also published as: JP2003099778A

Description

【０００１】
【発明の属する技術分野】
この発明は、特定の範疇に属する写真等の画像を検出するための画像認識方法及び装置、並びにこれらを実現するためのプログラムに関する。
【０００２】
【従来の技術】
特開平１１−２５２７１号公報には、写真から画像内オブジェクトを抽出するための方法が開示されている。この方法では、階調の連続性を利用して徐々に階調エッジ画像を拡大して画像内オブジェクトを得ている。この方法を利用すれば、特定の画像に類似する画像を自動的に検索することができる。つまり、写真画像等からある程度意味のあるオブジェクトを適宜抽出することができ、多数の写真画像等中から目的とする特定画像（キー画像）に類似する画像を自動的検索することもできる。
【０００３】
【発明が解決しようとする課題】
しかし、上記公報に開示の方法では、例えば猥褻画像のように特定カテゴリに属する画像を的確に検出することができない。すなわち、カテゴリは、抽象的であいまいな概念的広がりを有するものであり、特定カテゴリに属するか否かの基準も必ずしも明確にできない場合が多くその基準も時代とともに推移する可能性があることから、ある画像が特定カテゴリに属するか否かを自動的に適切に判定することはできていない。また、判定の対象が生物等の画像である場合、その姿勢の変化、背景物体の種類・有無など画像に多様性が不可避的に存在するが、このような多様性に対応できる体系的なカテゴリ判定法がまだ存在しない。
【０００４】
そこで、本発明は、特定カテゴリに属するオブジェクトを自動的に抽出することができる画像認識方法等を提供することを目的とする。
【０００５】
また、本発明は、特定カテゴリに属する画像を自動的に検出することができる画像認識方法等を提供することを目的とする。
【０００６】
【課題を解決するための手段】
上記課題を解決するため、本発明の画像認識方法は、記憶した複数の基準パターンと、多様な画像の集合体である学習用画像中から抽象的な所定の範疇に属する画像であるか否かの判定に寄与するものを部品化した状態で自動的に切り出して抽出した幾何的な形状のオブジェクトパターンとを形状に関して回転角度の要素を除外して比較して、例えば前記基準パターンと特定のオブジェクトパターンとの関連性を数値化した値が所定以上になった場合に当該特定のオブジェクトパターンを前記複数の基準パターンの内容に追加する工程と、検査画像中からオブジェクトパターンを抽出する工程と、前記検査画像中から抽出したオブジェクトパターンと前記基準パターンとを比較する工程と、前記検査画像中から抽出したオブジェクトパターンと前記基準パターンとの関連性を数値化した値が所定以上であるものが存在する場合に、前記検査画像が所定の範疇に属する画像であると判定する工程とを備える画像認識方法であって、自動的な切り出しによって抽出された前記幾何的な形状のオブジェクトパターンのうち、所定のオブジェクトパターンを、前記基準パターンとの比較の前に除去し、自動的な切り出しによって抽出された前記幾何的な形状のオブジェクトパターンのうち、除去されないで残ったオブジェクトパターンのランキングを行い、当該ランキングの結果に基づきランキング上位と判定されたオブジェクトパターンを、前記基準テンプレートのうち順位の下のものと入れ替え可能とし、入れ替えた場合に、新たな前記基準テンプレートとして保存する。
【０００７】
上記方法では、前記基準パターンと特定のオブジェクトパターンとの関連性が所定以上になった場合に、当該特定のオブジェクトパターンを前記複数の基準パターンの内容に追加するので、複数の基準パターンを適切なものに更新しつつ検査画像中から抽出したオブジェクトパターンとの関連性を判定することができる。よって、抽象的であいまいな特定カテゴリに属するオブジェクト等を高い確度で自動的に検出することができる。なお、ここでオブジェクトパターンについて「追加」とは、基準パターンにオブジェクトパターンを追加して基準パターンを増やすことのほか、基準パターンのいずれかを追加のオブジェクトパターンと置換することを含む。
【０００８】
上記方法の具体的な態様では、前記検査画像中から抽出したオブジェクトパターンと前記基準パターンとの関連性が所定以上になった場合に、前記検査画像が所定の範疇に属する画像であると判定する工程をさらに備える。この場合、多数の画像から猥褻画像等の所定の範疇に属する画像をある程度の確度で自動的に検出することができる。
【０００９】
上記方法の別の具体的な態様では、例えば前記基準パターンと特定のオブジェクトパターンとの関連性が所定以上になった場合において、さらに前記学習用画像が前記所定の範疇に属する典型画像であると判断された場合に、当該特定のオブジェクトパターンを前記複数の基準パターンの内容に追加する。この場合、典型画像を利用して、上述の複数の基準パターンをより適切なものに更新することができる。
【００１０】
上記方法の別の具体的な態様では、前記所定の範疇に属する追加画像中から抽出したオブジェクトパターンが前記所定の範疇に属する原因となると判断された場合に、前記追加画像中から抽出したオブジェクトパターンを前記複数の基準パターンの内容に追加する工程をさらに備える。この場合、典型画像中の重要なオブジェクトパターンを選択することができ、上述の複数の基準パターンをより適切なものに更新することができる。
【００１１】
上記方法の別の具体的な態様では、前記検査画像中から抽出したオブジェクトパターンと前記基準パターンとの類似性が所定以上になった場合に、前記検査画像中から抽出したオブジェクトパターンを保存する工程をさらに備える。この場合、多数の画像から肖像等の所定の範疇に属するオブジェクトパターンをある程度の確度で自動的に保存することができる。
【００１２】
上記方法の別の具体的な態様では、前記検査画像中から抽出した複数のオブジェクトパターン間の相対的な配置関係を要素として、前記類似性を判断する。この場合、肖像等の所定の範疇に属するオブジェクトパターンをより高い確度で認識することができる。
【００１３】
本発明の画像認識装置は、記憶した複数の基準パターンと、多様な画像の集合体である学習用画像中から抽象的な所定の範疇に属する画像であるか否かの判定に寄与するものを部品化した状態で自動的に切り出して抽出した幾何的な形状のオブジェクトパターンとを形状に関して回転角度の要素を除外して比較して、前記基準パターンと特定のオブジェクトパターンとの関連性を数値化した値が所定以上になった場合に、当該特定のオブジェクトパターンを前記複数の基準パターンの内容に追加する手段と、検査画像中からオブジェクトパターンを抽出する手段と、前記検査画像中から抽出したオブジェクトパターンと前記基準パターンとを比較する手段と、前記検査画像中から抽出したオブジェクトパターンと前記基準パターンとの関連性を数値化した値が所定以上であるものが存在する場合に、前記検査画像が所定の範疇に属する画像であると判定する手段とを備える画像認識装置であって、自動的な切り出しによって抽出された前記幾何的な形状のオブジェクトパターンのうち、所定のオブジェクトパターンを、前記基準パターンとの比較の前に除去し、自動的な切り出しによって抽出された前記幾何的な形状のオブジェクトパターンのうち、除去されないで残ったオブジェクトパターンのランキングを行い、当該ランキングの結果に基づきランキング上位と判定されたオブジェクトパターンを、前記基準テンプレートのうち順位の下のものと入れ替え可能とし、入れ替えた場合に、新たな前記基準テンプレートとして保存する。
【００１４】
上記装置では、前記基準パターンと特定のオブジェクトパターンとの関連性が所定以上になった場合に、当該特定のオブジェクトパターンを前記複数の基準パターンの内容に追加するので、複数の基準パターンを適切なものに更新しつつ検査画像中から抽出したオブジェクトパターンとの関連性を判定することができる。よって、抽象的であいまいな特定カテゴリに属するオブジェクト等を高い確度で自動的に検出することができる。
【００１５】
本発明のコンピュータプログラムは、コンピュータを、記憶した複数の基準パターンと多様な画像の集合体である学習用画像中から抽象的な所定の範疇に属する画像であるか否かの判定に寄与するものを部品化した状態で自動的に切り出して抽出した幾何的な形状のオブジェクトパターンとを形状に関して回転角度の要素を除外して比較して、前記基準パターンと特定のオブジェクトパターンとの関連性を数値化した値が所定以上になった場合に、当該特定のオブジェクトパターンを前記複数の基準パターンの内容に追加する手段と、検査画像中からオブジェクトパターンを抽出する手段と、前記検査画像中から抽出したオブジェクトパターンと前記基準パターンとを比較する手段と、前記検査画像中から抽出したオブジェクトパターンと前記基準パターンとの関連性を数値化した値が所定以上であるものが存在する場合に、前記検査画像が所定の範疇に属する画像であると判定する手段として機能させるためのコンピュータプログラムであって、自動的な切り出しによって抽出された前記幾何的な形状のオブジェクトパターンのうち、所定のオブジェクトパターンを、前記基準パターンとの比較の前に除去し、自動的な切り出しによって抽出された前記幾何的な形状のオブジェクトパターンのうち、除去されないで残ったオブジェクトパターンのランキングを行い、当該ランキングの結果に基づきランキング上位と判定されたオブジェクトパターンを、前記基準テンプレートのうち順位の下のものと入れ替え可能とし、入れ替えた場合に、新たな前記基準テンプレートとして保存する。
【００１６】
上記コンピュータプログラムを組み込んだコンピュータでは、前記基準パターンと特定のオブジェクトパターンとの関連性が所定以上になった場合に、当該特定のオブジェクトパターンを前記複数の基準パターンの内容に追加するので、複数の基準パターンを適切なものに更新しつつ検査画像中から抽出したオブジェクトパターンとの関連性を判定することができる。よって、抽象的であいまいな特定カテゴリに属するオブジェクト等を高い確度で自動的に検出することができる。
【００１７】
【発明の実施の形態】
〔第１実施形態〕
図１は、第１実施形態に係る画像認識装置及び方法を実現するためのネットワークシステムの全体構造を示す。
【００１８】
このネットワークシステムは、通信ネットワークであるインターネットＩＮと、このインターネットＩＮに接続されて画像を含む各種情報を送信する多数のＷＥＢサーバＮＳと、多数のＷＥＢサーバＮＳが提供する画像情報を自動的に監視することができる監視システムＭＳとからなる。
【００１９】
ＷＥＢサーバＮＳは、インターネットＩＮに直接的若しくは間接的に接続するための通信装置を備えたコンピュータシステムであり、インターネットＩＮを介して各種ホームページを公開している。なお、ＷＥＢサーバＮＳが提供するホームページには、各種カテゴリに属する画像が含まれている。
【００２０】
監視システムＭＳも、インターネットＩＮに直接的若しくは間接的に接続するための通信装置を備えたコンピュータシステムであり、多数のＷＥＢサーバＮＳが提供するホームページにアクセス可能になっている。なお、この監視システムＭＳは、多数のＷＥＢサーバＮＳが提供するホームページ中に例えば猥褻画像が含まれているか否かを自動的に監視するための画像検出装置すなわち画像認識装置である。
【００２１】
図２は、図１に示す監視システムＭＳの構造を概念的に説明するブロック図である。図示の監視システムＭＳは、一般的なコンピュータと同様に、ＣＰＵ２１、入力装置２２、表示装置２３、記憶装置２４、及び通信制御装置２５を備えている。
【００２２】
ＣＰＵ２１は、バス１００を介して、表示装置２３、記憶装置２４や通信制御装置２５との間で相互にデータの授受が可能になっている。また、ＣＰＵ２１は、入力装置２２からの指示に基づいて、記憶装置２４や通信制御装置２５から所定のプログラムやデータを読み出し、これらプログラム及びデータに基づく各種処理を実行する。
【００２３】
具体的に説明すると、ＣＰＵ２１は、事前学習プログラムにおいて、入力装置２２からの指示に基づいて、入力装置２２、記憶装置２４等を介して入力された学習用画像中から適宜抽出したオブジェクトパターンを基準パターンである基準テンプレートに追加して、基準テンプレートの適正化を図ることができる。また、ＣＰＵ２１は、ホームページ監視プログラムにおいて、入力装置２２からの指示に基づいて、通信制御装置２５を介して入手したホームページデータに含まれる画像中から適宜オブジェクトパターンを抽出し、このオブジェクトパターンと学習によって更新した基準テンプレートとを比較して画像の猥褻性を自動判定する。
【００２４】
入力装置２２は、キーボード等から構成され、表示装置２３を利用したＧＩＵ操作により、監視システムＭＳを操作するオペレータの意思を反映した指令信号をＣＰＵ２１に出力する。
【００２５】
表示装置２３は、ＣＰＵ２１から入力されるデータに基づいて駆動信号を生成する表示駆動回路と、表示駆動回路から入力される駆動信号に基づいて必要な表示を行うＣＲＴ等により構成され、ＣＰＵ２１からの指令信号に基づいて必要な表示を行う。
【００２６】
記憶装置２４は、監視システムＭＳを動作させる基本プログラム等を複数記憶しているＲＯＭと、アプリケーションプログラム、入力指示、入力データ、処理結果等を一時格納するワークメモリ等のＲＡＭとを備える。さらに、記憶装置２４は、磁気的、或いは光学的な手法によってアプリケーションプログラムやデータを保持することができる記録媒体を駆動するためのドライブを備えており、駆動される記録媒体は、記憶装置２４に固定的に設けたもの、若しくは着脱自在に装着するものとできる。なお、上記アプリケーションプログラムには、事前学習プログラム、ホームページ監視プログラムが含まれ、上記データには、各種画像を含むデータベース等が含まれる。
【００２７】
通信制御装置２５は、ＬＡＮアダプタ等によって構成され、図示を省略するファイヤ・ウォールを介して、インターネットＩＮとの間においてＴＣＰ/ＩＰプロトコルによる通信を可能にしている。
【００２８】
以下、図１に示す監視システムＭＳの主要な動作について説明する。図３は、事前の学習を説明するフローチャートであり、図４〜図６は、図３のフローチャートをさらに具体的に説明するフローチャートである。
【００２９】
監視システムＭＳでは、まず、ＣＰＵ２１が、入力装置２２からの指示に基づいて、記憶装置２４に記憶したデータから初期テンプレートを準備する（ステップＳ２）。初期テンプレートは、ユーザの判断を必要としない固定的なものであり、典型的な猥褻オブジェクトを初期テンプレートとすることができる。具体的には、猥褻画像を構成すると考えられる胸、股、尻等の人体各部の部分画像を予め初期テンプレートとして記憶装置２４に保存しておくことで初期テンプレートとしている。なお、初期テンプレートをユーザ側で準備することもできる。この場合、予め猥褻画像を準備し、この猥褻画像からオブジェクトを切り出し、これを初期画像とする。
【００３０】
次に、ＣＰＵ２１は、入力装置２２からの指示に基づいて、記憶装置２４や通信制御装置２５を介して典型的な猥褻画像を含む複数の学習用サンプル画像を取り込む（ステップＳ４）。なお、取り込まれる複数の学習用サンプル画像は、一般的な猥褻画像のみを含むものではなく、一般的には猥褻と認められないような画像まで含む多様な画像の集合体を構成する。
【００３１】
次に、ＣＰＵ２１は、各学習用サンプル画像から自動的に猥褻なオブジェクトを切り出すとともに、この猥褻オブジェクトをユーザ判断を取り入れつつ当初記憶している初期テンプレートに追加して記憶装置２４に保存することにより、テンプレートの修正を行う（ステップＳ６）。このように、初期テンプレートを多様な学習用サンプル画像とユーザ判断とを利用して修正することで、国や地域、時代の変遷に適合する基準テンプレートすなわち基準パターンを得ることができる。すなわち、以上の工程により、ユーザ側の基準で猥褻と認める画像を検出するための基本的な基準テンプレート(猥褻画像に寄与するものを部品化した画像データ）を学習・準備することができる。
【００３２】
なお、上記ステップＳ４、Ｓ６は、複数回繰返すことができる。これにより、基準テンプレートを複数回更新することができ、よりユーザの判断に近い判定を可能にする基準テンプレートを得ることができる。
【００３３】
また、上記ステップＳ４で、取り込まれる複数の学習用サンプル画像を一般的な猥褻画像のみを含むものとしておくこともできる。この場合、基準テンプレートを猥褻画像に関係するものに精度良く限定することができる。その一方、ユーザの判断が学習用サンプル画像の範囲や傾向の影響を受ける場合が生じやすくなるという問題もある。この解決策としては、学習用サンプル画像の数を増やし、基準テンプレートの数を増やすことが考えられる。
【００３４】
図４は、図３のステップＳ６におけるテンプレートの修正処理を説明するフローチャートである。まず、ＣＰＵ２１は、記憶装置２４等から学習用サンプル画像のうち未評価の任意の特定画像を読み出す（ステップＳ６１）。
【００３５】
次に、ＣＰＵ２１は、図３のステップＳ２で得た初期テンプレートを利用して、ステップＳ６１で読み出した特定画像の猥褻度を判定する（ステップＳ６２）。後に詳細に説明するが、猥褻度の判定にあたっては、対象となる特定画像からオブジェクトを自動的に切り出し、切り出したオブジェクトと初期テンプレートとの相関性すなわち関連性を数値化した猥褻度を出力する。
【００３６】
次に、ＣＰＵ２１は、すべての学習用サンプル画像について猥褻度を判定したか否かを判断し（ステップＳ６３）、すべての学習用サンプル画像について猥褻度評価が終了するまで、ステップＳ６１、Ｓ６２を繰り返す。
【００３７】
次に、ＣＰＵ２１は、各学習用サンプル画像について得た猥褻度にもとづいて、一群の学習用サンプル画像について猥褻度のランキングを行い、表示装置２３にランキング結果を表示させる（ステップＳ６４）。
【００３８】
次に、ＣＰＵ２１は、各学習用サンプル画像の猥褻度ランキングに際して得られたワースト・オブジクトのリストを作成し、表示装置２３にそのリストや対応オブジェクトを表示させる（ステップＳ６５）。
【００３９】
その後、ＣＰＵ２１は、ランキング上位と判定された猥褻画像に対応するワースト・オブジェクトを、初期テンプレートを構成するオブジェクトと入れ替え、基準テンプレートとして記憶装置２４に保存する（ステップＳ６６）。例えば、初期テンプレートが１０であり、ワースト・オブジェクトを５つ追加する場合、初期テンプレートは残り５つとなる。なお、初期テンプレートには順位が付してあり、ワースト・オブジェクトの追加に際して削除される初期テンプレートは順位の下のものとなっている。
【００４０】
次に、ＣＰＵ２１は、表示装置２３を介して、ユーザに対しランキングと関係なく全学習用サンプル画像から特に猥褻と認めるものを適宜選択させる（ステップＳ６７）。具体的には、入力装置２２及び表示装置２３を利用して、ユーザが、ランキング表示された学習用サンプル画像から自己の判断で猥褻と考えるものを例えば上位５つ選択する。
【００４１】
ユーザがステップＳ６７で猥褻画像を選択した場合、ＣＰＵ２１は、選択された猥褻画像に対応するオブジェクトを基本テンプレートを構成するオブジェクトに追加或いは入れ替えて、基準テンプレートとして記憶装置２４に再保存する（ステップＳ６８）。例えば、ステップＳ６６で初期テンプレートを更新した基準プレートが１０であれば、このステップＳ６８で猥褻画像を構成するオブジェクトを５つ追加することになる。
【００４２】
ユーザがステップＳ６７で猥褻画像を選択しなかった場合、ＣＰＵ２１は、ステップＳ６６で得た基準テンプレートをそのまま最終的な基準テンプレートとして保存し、処理を終了する。
【００４３】
図５は、図４のステップＳ６２における猥褻度判定の処理を説明するフローチャートである。まず、ＣＰＵ２１は、判定の対象となる特定画像からオブジェクトを自動的に切り出す（ステップＳ６２１）。オブジェクトの切出に際しては、公知の各種技術を用いることができる。例えば、「従来技術」の欄で引用した公報に開示のように、対象画像を階調化したエッジ画像を取り出し、階調の低いエッジ画像から連続的に階調が増加する領域を拡大統合することによってオブジェクトを切り出すことができる。
【００４４】
次に、ＣＰＵ２１は、ステップＳ６２１で得たオブジェクトから無意味なオブジェクトを除去する（ステップＳ６２２）。具体的には、小さなオブジェクトや背景を構成すると考えられるオブジェクトが、検出された一群のオブジェクトから除去される。例えば、オブジェクトの専有面積比が所定のしきい値以下の場合や、専有面積比が所定のしきい値以上であっても画像の隅部分に偏って検出されたオブジェクトは、微小オブジェクトや背景オブジェクトと判断され、処理速度向上のために判定に利用しない。
【００４５】
次に、ＣＰＵ２１は、ステップＳ６２２で残った各オブジェクトに付いて、オブジェクト特性を計算する（ステップＳ６２３）。このオブジェクト特性は、色、形状、テクスチャ、サイズ等の各種パラメータからなる多次元ベクトルとして表現される。なお、形状に関しては、回転角度を要素として含まないものとすることができる。これにより、男性や女性の肉体の一部である対象物の姿勢等の要素がオブジェクト特性に含まれないようにすることができる。
【００４６】
次に、ＣＰＵ２１は、ステップＳ６２３で特性を算出した各オブジェクトについて、全基準テンプレートとの相関をとりつつ類似性の総合評価を行うことにより、画像全体としての猥褻度を計算する（ステップＳ６２４）。
【００４７】
図６は、図５のステップＳ６２４における猥褻度判定の処理を説明するフローチャートである。まず、ＣＰＵ２１は、オブジェクト特性を算出した特定オブジェクトについて相関度を算出する（ステップＳ６２４ａ）。具体的には、この特定オブジェクトの特性と特定の基準テンプレートの特性との差を各パラメータの重み付けを考慮して線形的或いは非線形的に求め、相関度すなわち類似性を総合的に評価する。なお、基準テンプレートのオブジェクト特性は、ステップＳ６２３で説明したと同様の手法によって予め計算されている。
【００４８】
次に、ＣＰＵ２１は、ステップＳ６２４ａで計算の対象となった特定オブジェクトについて、すべての基準テンプレートとの間で相関度の計算が終了したか否かを判断する（ステップＳ６２４ｂ）。
【００４９】
ステップＳ６２４ｂで全ての基準テンプレートについての相関度計算が終了していないと判断された場合、ステップＳ６２４ａの相関度計算を、残った基準テンプレートについて順次繰返す。
【００５０】
ステップＳ６２４ｂで全ての基準テンプレートについての相関度計算が終了していると判断された場合、ＣＰＵ２１は、判定対象の画像から切り出して得た全ての対象オブジェクトについて、相関度の計算が終了したか否かを判断する（ステップＳ６２４ｃ）。
【００５１】
ステップＳ６２４ｂで全ての対象オブジェクトについての相関度計算が終了していないと判断された場合、ステップＳ６２４ａの相関度計算を、残った対象オブジェクトについて順次繰返す。
【００５２】
ステップＳ６２４ｂで全ての対象オブジェクトについての相関度計算が終了していると判断された場合、ＣＰＵ２１は、各対象オブジェクト毎に得た基準オブジェクトに対する相関度を、サイズや配置を考慮して線形的或いは非線形的にさらに加算することにより、関連性に関する平均的指数すなわち特定画像全体としての猥褻性の程度を示す猥褻度を得る（ステップＳ６２４ｄ）。なお、非線形的な処理では、高い相関度のものが存在すれば、単なる加算よりも高い猥褻度を示し、低い相関度のものが存在すれば、単なる加算よりも低い猥褻度を示すような処理、例えば相関度の数値を３乗や４乗した値を加算するような評価を行うことができる。
【００５３】
図７は、ＷＥＢサイトの監視を説明するフローチャートである。まず、ＣＰＵ２１は、通信制御装置２５を介して適当なＷＥＢサイトに自動的に接続する（ステップＳ１１）。
【００５４】
次に、ＣＰＵ２１は、通信制御装置２５を介して上記ＷＥＢサイトのホームページの画像データ（すなわち検査画像）をダウンロードし、画像ファイルを記憶装置２４に保存する（ステップＳ１２）。画像ファイルは、ＷＥＢサイト単位で収集することができ、ある既定数のホームページの画像データを一括して収集することもできる。例えば、特定のＷＥＢサイトに関してルートディレクトリを定め、それ以下からjpg、gif等のイメージファイルのフルパスを記録したイメージファイルの一覧表を作成し、記憶装置２４に設けた特定ディレクトリに統一したファイル形式で格納することができる。
【００５５】
次に、ＣＰＵ２１は、図３のステップＳ６で得た基準テンプレートを利用して、画像ファイルについて猥褻度を判定する（ステップＳ１３）。猥褻度の判定は、図５に示すものとほぼ同様であるが、単一ではなく多数の画像ファイルについて同様の処理で猥褻度を判定する点が異なる。
【００５６】
次に、ＣＰＵ２１は、これら画像ファイル中に猥褻画像が存在するか否かを判断する（ステップＳ１４）。具体的には、各画像ファイルに関する猥褻度があるしきい値を超えたか否かを判断し、いずれかの画像ファイルで猥褻度が所定のしきい値を超えていれば、猥褻画像が存在すると判断する。
【００５７】
ステップＳ１４で猥褻画像を検出したと判断した場合、ＣＰＵ２１は、表示装置２３に猥褻画像の検出を警告表示する（ステップＳ１５）。この際、ＣＰＵ２１は、検出した猥褻画像をサムネールやファイルリストとして、ＷＥＢサイトのアドレス等とともに表示装置２３に表示させることができ、これらの表示情報を記憶装置２４に保存する。この結果は、例えば離れた場所で別のコンピュータシステムを操作する管理者にメール等の手段を利用して自動的に通報される。
【００５８】
次に、ＣＰＵ２１は、監視対象のＷＥＢサイトの全てについて猥褻度判定が終了したか否かを判断し、全ての対象ＷＥＢサイトについて猥褻度判定が終了していないと判断した場合、ステップＳ１１〜Ｓ１５の処理を繰返す。
【００５９】
以上により、監視対象のＷＥＢサイトに猥褻な画像が含まれるか否かを自動的に監視することができる。この際、本実施形態の方法によれば、猥褻画像を構成する対象の姿勢や背景の影響がキャンセルされ、高い検出精度を得ることができる。
【００６０】
現在まで、インターネット上で公開されている画像が猥褻画像であるかどうかは、人間の肉眼によって判断されている。これは、大変な労力のかかる作業であり、日々に変わるインターネットのコンテンツのチェックには限界がある。安心してネットサーフィンしたいユーザーのために、健全なサービスの提供を目指すインターネットコンテンツ業者にとって、肉眼作業以外の手段で猥褻画像を自動的に検出できる技術の開発が急務になって来ている。最近では、未成年のユーザーを含む一般コンシューマ向けにＩＰＳ(インターネット・プロバイダー・サービス)を提供する事業業者にとって、猥褻画像に対するチェックが義務つけられている。本実施形態の監視システムＭＳは、従来の肉眼による識別方法に代わるものであり、コンピュータによるＷＥＢサイト上における猥褻画像の自動検出が可能になる。すなわち、本監視システムＭＳは、従来の人間の肉眼による判断しかできなかった猥褻画像の識別を、コンピュータによる自動検出、自動認識として実現することができる。これにより、大変な労力とコストを削減でき、インターネットコンテンツの管理、監視の自動化に大きく貢献することができる。また、画像コンテンツの管理、監視の自動化により、一層合理的で低コスト、柔軟性を持つシステムが構築できる。さらに、ＩＰＳ業者にとっては、猥褻画像の放任による行政指導や営業停止の処分を受ける可能性も一段少なくなるという利益が生じる。
【００６１】
図８は、図３〜図７に示す動作におけるデータの流れを視覚的に説明する図であり、図８（ａ）は、監視システムＭＳの事前学習処理を示し、図８（ｂ）は、監視システムＭＳのホームページ監視処理を示す。
【００６２】
図８（ａ）に示す事前学習処理では、自動学習部が、学習用サンプル画像を初期テンプレートを利用して猥褻画像と非猥褻画像とに分類し、その結果をユーザー判断で修正することによって、基準テンプレートを得る。
【００６３】
図８（ｂ）に示すホームページ監視処理では、自動判定部が、図８（ａ）で得た基準テンプレートに基づいて、監視対象画像を自動的に猥褻画像と非猥褻画像とに分別する。
【００６４】
なお、上記実施形態では、ＷＥＢサイトにおいて猥褻画像を検出したが、それ以外のネットワーク、例えば社内のコンピュータネットワーク等において、猥褻画像ファイルが存在すれば、これを簡易・確実に検出することもできる。
【００６５】
〔第２実施形態〕
第２実施形態に係る画像認識装置び方法は、第１実施形態の装置び方法を肖像切り出しのために変形したものである。
【００６６】
事前学習処理は、基本的に第１実施形態のものと同様である。したがって、全体的な処理としては、図３と同様のものが行われ、図４に対応する処理として図９に示す処理が行われ、図５に対応する処理として図１０に示す処理が行われ、図６に対応する処理として図１１に示す処理が行われる。
【００６７】
なお、肖像の初期テンプレートは、肖像から適宜切り出したオブジェクトとすることができ、顔、首、髪の毛等を含む。このように、髪の毛も顔画像のテンプレートに含めるのは、実際に使用される顔写真が髪の毛も含むためである。学習用サンプル画像は、例えば人物の正面スタイルの顔写真とすることができ、女性、男性、髪の毛の薄い又は濃い等、典型的なものについて何パターンかの画像を用意する。
【００６８】
また、学習用サンプル画像のランキング後の画像再追加に際しては、切り出したい肖像の性別、民族等を考慮したものとすることができる。また、テンプレートの修正によって得られる基準テンプレートは、一枚の学習サンプル画像から得られる一群のオブジェクト（例えば、顔、首を含めた皮膚色オブジェクト、髪の毛オブジェクト）及びこれらの位置関係を組み合わせた複合的なものとなっている。このような基準テンプレートを構成するオブジェクトの輪郭は、場合によって手動で調整することもできる。結果的に得られた複数枚の複合的な基準テンプレートは、肖像の画像すなわち顔画像オブジェクトの自動識別及び切り出しの際に使用される。
【００６９】
図１２は、肖像の切り出しを説明するフローチャートである。まず、ＣＰＵ２１は、ＷＥＢサイトや記憶装置２４中の画像データを読み出す（ステップＳ２０１）。
【００７０】
次に、ＣＰＵ２１は、切り出しの処理対象範囲の選択に関する処理を行う（ステップＳ２０２）。具体的には、画像データに対応するプレビュー画像を表示装置２３に表示させる。このプレビュー画像は、何段階かで拡大・縮小表示可能であり、適当なツールを用いてプレビュー画像中で範囲選択を行うことができる。この際、ＣＰＵ２１は、選択範囲や原画像が一定サイズ以下の場合、画質が良くない旨の警告を表示装置２３に行わせる。
【００７１】
次に、ＣＰＵ２１は、対象画像のサイズをチェックする（ステップＳ２０３）。対象画像のサイズが一定値を超えていなければ、そのまま原画像で以後の処理を行う。対象画像のサイズが一定値を超えている場合、▲１▼対象画像のサイズを変更しないで後の処理を行う超高解像度モードと、▲２▼対象画像のサイズを縮小する一般モードとをオペレータに適宜選択させる。
【００７２】
次に、ＣＰＵ２１は、図９の処理で得た基準テンプレートを利用して、画像ファイルから得た対象画像について肖像度を判定する（ステップＳ２０４）。肖像度の判定は、図１０に示すものとほぼ同様である。
【００７３】
次に、ＣＰＵ２１は、これら対象画像中に肖像が存在するか否かを判断する（ステップＳ２０５）。具体的には、対象画像について得た肖像度があるしきい値を超えたか否かや、オブジェクトの配置的要素等を判断し、これらの肖像度等が所定の基準を満たせば、肖像が存在すると判断する。
【００７４】
肖像度の算出及び肖像の存否判断に際しては、切り出したオブジェクトについて、相関度すなわち類似度が最も大きくなる基準テンプレートが、顔、首を含めた皮膚色のテンプレートオブジェクトであれば、顔面候補オブジェクトと認定する。また、類似度が最も大きくなる基準テンプレートが、髪の毛のテンプレートオブジェクトであれば、髪の毛候補オブジェクトと認定する。この際、類似度が画像の中心領域からの距離より重み付け修正される。すべての顔面候補オブジェクトについて、一つ以上の髪の毛候補オブジェクトと隣り合って、隣り合う部分の輪郭線の合計が全体のうち所定の範囲内の値でなければ、同顔面候補オブジェクトを肖像度算出の対象から除外する。すべての髪の毛候補オブジェクトについても、同様な除外処理を行って、切り出したオブジェクトの配置関係を考慮して自動的な弁別を行う。このとき、顔面候補オブジェクトと髪の毛候補オブジェクトとの数が必ずしも一致する必要はない。なお、顔面候補オブジェクトの数が既定の最大値を超えているなら、類似度の低いものとこれに隣接する髪の毛候補オブジェクトを更に除外する。
【００７５】
なお、上記説明は、頭部のオブジェクト抽出を意味するが、衣服のオブジェクトも同様に抽出することができる。この場合、顔面候補オブジェクトに隣接して髪の毛候補オブジェクトの反対側に、所定の予想範囲にあるオブジェクトを衣服候補オブジェクトとして抽出する。この予想範囲は、例えば顔面候補オブジェクトの大きさから算定するものとする。
【００７６】
ステップＳ２０４で肖像を検出したと判断した場合、ＣＰＵ２１は、表示装置２３に肖像画から得たオブジェクトを表示する（ステップＳ２０６）。この際、すべての切り出しオブジェクトを表示することもできるが、対象画像の指定範囲にある一部オブジェクトのみを表示することもできる。オブジェクトの表示方法は、対象画像中のオブジェクトが存在する領域を楕円等で範囲表示し、そこにあるオブジェクトを重なりなくツリー表示するものとすることができる。切り出したオブジェクトのうち不要なものがあれば、上記のようなツリー表示のオブジェクトのいずれかを入力装置２２で指定し、オブジェクトのグループから除去することができる。
【００７７】
次に、ＣＰＵ２１は、切り出したオブジェクトの輪郭修正を受け付ける処理を行う（ステップＳ２０７）。ＣＰＵ２１は、切り出したオブジェクトに対応して、表示装置２３に表示した対象画像のプレビューに重ねて、鮮明な色彩で輪郭線を表示する。この輪郭線は、入力装置２２からの指示に基づいて修正することができる。例えば、輪郭線に沿って先鋭度が極値をとるポイントを一定密度以下で自動検出し、これらのポイントを連結することによって輪郭線用の操作ハンドルを得る。そして、隣接ポイント間には、入力装置２２からの指示に基づいて強制的にポイントを形成することができ、逆に不要なポイントは間引くこともできる。このようにして得た操作ハンドルは、入力装置２２を利用して修正することができ、ユーザの判断を付加した適切な輪郭検出が可能になる。
【００７８】
次に、ＣＰＵ２１は、輪郭を決定したオブジェクトを幾何的に切り出し、切り出した画像を記憶装置２４に保存する処理を行う（ステップＳ２０８）。オブジェクトは、上記のようにして得た輪郭に沿って切り出すこともできるが、任意の矩形領域、楕円領域、円形領域内に輪郭が収まるような画像として切り抜くことができる。
【００７９】
以上により、適当な画像ファイルから肖像を自動的に簡易に切り出して保存、蓄積することができる。このようにして得た肖像ファイルは、携帯電話やパソコン間で行われる電子メールに顔写真ファイルとして添付することができる。また、このような肖像ファイルは、各種画像作成を簡易なものとする。例えば、記念用の顔写真の寄せ書き、顔写真のシール等の多様な用途で、上記のような肖像ファイルが活かされる。
【００８０】
以上実施形態に即して本発明を説明したが、本発明は上記実施形態に限定されるものではない。例えば上記実施形態では、猥褻画像や肖像画を自動検出したが、他の抽象的な範疇の画像を自動検出する場合にも、本発明の方法を適用することができる。
【図面の簡単な説明】
【図１】第１実施形態の画像認識装置及び方法が活用されるネットワークシステムの全体構造を示す。
【図２】図１に示す監視システムの構造を概念的に説明するブロック図である。
【図３】事前の学習を説明するフローチャートである。
【図４】事前の学習の一部を説明するフローチャートである。
【図５】事前の学習の一部を説明するフローチャートである。
【図６】事前の学習の一部を説明するフローチャートである。
【図７】ＷＥＢサイトの監視を説明するフローチャートである。
【図８】（ａ）は、監視システムの事前学習処理を示し、（ｂ）は、監視システムのホームページ監視処理を示す。
【図９】第２実施形態の画像認識装置における事前学習の一部を説明するフローチャートである。
【図１０】第２実施形態の画像認識装置における事前学習の一部を説明するフローチャートである。
【図１１】第２実施形態の画像認識装置における事前学習の一部を説明するフローチャートである。
【図１２】第２実施形態の画像認識装置における肖像画像の切り出しを説明するフローチャートである。
【符号の説明】
２２入力装置
２３表示装置
２４記憶装置
２５通信制御装置
２１ＣＰＵ
ＩＮインターネット
ＭＳ監視システム
[0001]
TECHNICAL FIELD OF THE INVENTION
The present invention relates to an image recognition method and apparatus for detecting an image such as a photograph belonging to a specific category, and a program for realizing them.
[0002]
[Prior art]
Japanese Patent Application Laid-Open No. 11-25271 discloses a method for extracting in-image objects from a photograph. In this method, an in-image object is obtained by gradually enlarging a gradation edge image using the continuity of gradation. With this method, images similar to a specific image can be searched for automatically. That is, reasonably meaningful objects can be extracted from photographic images and the like, and images similar to a target specific image (key image) can be automatically retrieved from a large number of photographic images.
[0003]
[Problems to be solved by the invention]
However, the method disclosed in the above publication cannot accurately detect images belonging to a specific category, such as obscene images. A category has an abstract and ambiguous conceptual extent, the criteria for membership often cannot be made clear, and those criteria may shift over time; consequently, it has not been possible to determine automatically and appropriately whether a given image belongs to a specific category. Moreover, when the subject of the determination is an image of a living organism or the like, the images inevitably vary in posture, in the type and presence or absence of background objects, and so on, yet no systematic category determination method that can cope with such diversity exists.
[0004]
Accordingly, an object of the present invention is to provide an image recognition method and the like that can automatically extract objects belonging to a specific category.
[0005]
It is another object of the present invention to provide an image recognition method and the like that can automatically detect images belonging to a specific category.
[0006]
[Means for Solving the Problems]
In order to solve the above problems, the image recognition method of the present invention comprises: a step of comparing a plurality of stored reference patterns with geometrically shaped object patterns, which are automatically cut out and extracted, as componentized parts that contribute to determining whether an image belongs to an abstract predetermined category, from learning images forming a collection of diverse images, the comparison excluding the rotation-angle element with respect to shape, and, for example, adding a specific object pattern to the contents of the plurality of reference patterns when a value quantifying the relevance between the reference patterns and that specific object pattern is equal to or greater than a predetermined value; a step of extracting object patterns from an inspection image; a step of comparing the object patterns extracted from the inspection image with the reference patterns; and a step of determining that the inspection image is an image belonging to the predetermined category when there exists an extracted object pattern for which the value quantifying its relevance to the reference patterns is equal to or greater than a predetermined value. In this method, predetermined object patterns among the geometrically shaped object patterns extracted by automatic cutout are removed before the comparison with the reference patterns; the object patterns that remain without being removed are ranked; an object pattern determined to rank high on the basis of the ranking result can be swapped with a lower-ranked one of the reference templates; and, when swapped, the result is stored as a new reference template.
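The requirement to compare shapes while excluding the rotation-angle element can be illustrated with a descriptor that does not change when a contour is rotated. The following is only a minimal sketch under assumptions not stated in the source: a contour is taken to be a list of (x, y) points, and the descriptor (statistics of centroid-to-point distances, normalized by their mean) as well as the names `radial_descriptor`, `shape_distance`, and `rotate` are hypothetical choices made for illustration.

```python
import math

def radial_descriptor(points):
    """Rotation-independent summary of a contour (assumed descriptor, not from the source)."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Distances from the centroid are unchanged by any rotation of the contour.
    dists = [math.hypot(x - cx, y - cy) for x, y in points]
    mean = sum(dists) / n
    if mean == 0:
        raise ValueError("all contour points coincide")
    var = sum((d - mean) ** 2 for d in dists) / n
    # Dividing by the mean additionally removes overall scale (an extra assumption).
    return (math.sqrt(var) / mean, min(dists) / mean, max(dists) / mean)

def shape_distance(a, b):
    """Euclidean distance between two descriptors; 0 means identical up to rotation and scale."""
    da, db = radial_descriptor(a), radial_descriptor(b)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(da, db)))

def rotate(points, angle):
    """Rotate a contour about the origin by `angle` radians (helper for the demonstration)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

rectangle = [(0, 0), (4, 0), (4, 2), (0, 2), (2, 0), (2, 2), (0, 1), (4, 1)]
square = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 0), (1, 2), (0, 1), (2, 1)]
same_shape = shape_distance(rectangle, rotate(rectangle, 0.7))  # close to zero
other_shape = shape_distance(rectangle, square)                 # clearly larger
```

A descriptor of this kind ignores the pose of the object, which is the property the method relies on; a practical system would presumably combine it with other parameters such as color, texture, and size.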
[0007]
In the above method, when the relevance between the reference patterns and a specific object pattern becomes equal to or greater than a predetermined value, the specific object pattern is added to the contents of the plurality of reference patterns. The relevance to object patterns extracted from an inspection image can therefore be determined while the plurality of reference patterns is updated to more appropriate ones. As a result, objects and the like belonging to a specific category that is abstract and ambiguous can be detected automatically with high accuracy. Here, “adding” an object pattern includes not only adding the object pattern to the reference patterns so that their number increases, but also replacing any one of the reference patterns with the object pattern being added.
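The two meanings of "adding" described above can be sketched as follows. This is an illustration only: the names `ReferencePattern`, `add_pattern`, and `capacity`, and the policy of replacing the lowest-ranked entry once the set is full, are assumptions introduced here; the text states only that adding covers both growing the set and replacing one of its members.

```python
from dataclasses import dataclass

@dataclass
class ReferencePattern:
    name: str
    rank_score: float  # higher value = higher rank (assumed convention)

def add_pattern(references, candidate, relevance, threshold, capacity):
    """Return an updated copy of `references`; the input list is not modified."""
    if relevance < threshold:
        return list(references)                 # relevance too low: nothing is added
    if len(references) < capacity:
        return list(references) + [candidate]   # "adding" that increases the set
    # Set is full: "adding" by replacing the lowest-ranked existing pattern.
    lowest = min(range(len(references)), key=lambda i: references[i].rank_score)
    updated = list(references)
    updated[lowest] = candidate
    return updated

refs = [ReferencePattern("a", 0.9), ReferencePattern("b", 0.2)]
new = ReferencePattern("c", 0.5)
grown = add_pattern(refs, new, relevance=0.8, threshold=0.5, capacity=3)
replaced = add_pattern(refs, new, relevance=0.8, threshold=0.5, capacity=2)
```

Returning a copy keeps the previous reference set available, which may be convenient if an update needs to be reviewed before it is saved.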
[0008]
In a specific aspect of the above method, the method further includes a step of determining that the inspection image is an image belonging to a predetermined category when the relevance between an object pattern extracted from the inspection image and the reference patterns is equal to or greater than a predetermined value. In this case, images belonging to a predetermined category, such as obscene images, can be automatically detected from a large number of images with a certain degree of accuracy.
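The determination step can be sketched as a simple threshold test over all pairs of extracted object patterns and reference patterns. This is a hedged illustration, not the patented implementation: object patterns are assumed to be tuples of numeric features, and `toy_relevance`, its weights, and the threshold value are hypothetical.

```python
def belongs_to_category(object_patterns, reference_patterns, relevance, threshold):
    """True if at least one object pattern reaches the threshold against some reference."""
    return any(
        relevance(obj, ref) >= threshold
        for obj in object_patterns
        for ref in reference_patterns
    )

def toy_relevance(obj, ref, weights=(1.0, 1.0)):
    """Assumed score in (0, 1]: equals 1 for identical features and falls as they differ."""
    diff = sum(w * abs(a - b) for w, a, b in zip(weights, obj, ref))
    return 1.0 / (1.0 + diff)

references = [(0.8, 0.3), (0.1, 0.9)]
matching_image = [(0.5, 0.5), (0.78, 0.31)]   # second object is close to a reference
unrelated_image = [(0.5, 0.5), (0.4, 0.6)]

hit = belongs_to_category(matching_image, references, toy_relevance, threshold=0.9)
miss = belongs_to_category(unrelated_image, references, toy_relevance, threshold=0.9)
```

Passing the relevance function as an argument keeps the decision rule separate from the choice of features and weights, which the text leaves open.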
[0009]
In another specific aspect of the above method, for example, when the relevance between the reference patterns and a specific object pattern is equal to or greater than a predetermined value and, in addition, the learning image is judged to be a typical image belonging to the predetermined category, the specific object pattern is added to the contents of the plurality of reference patterns. In this case, the plurality of reference patterns described above can be updated to more appropriate ones by making use of typical images.
[0010]
In another specific aspect of the above method, the method further includes a step of adding an object pattern extracted from an additional image belonging to the predetermined category to the contents of the plurality of reference patterns when that object pattern is judged to be a cause of the image belonging to the predetermined category. In this case, important object patterns in typical images can be selected, and the plurality of reference patterns described above can be updated to more appropriate ones.
[0011]
In another specific aspect of the above method, the method further includes a step of storing an object pattern extracted from the inspection image when the similarity between that object pattern and the reference patterns is equal to or greater than a predetermined value. In this case, object patterns belonging to a predetermined category, such as portraits, can be automatically saved from a large number of images with a certain degree of accuracy.
[0012]
In another specific aspect of the above method, the similarity is determined using a relative arrangement relationship between a plurality of object patterns extracted from the inspection image as an element. In this case, an object pattern belonging to a predetermined category such as a portrait can be recognized with higher accuracy.
[0013]
The image recognition apparatus of the present invention comprises: means for comparing a plurality of stored reference patterns with geometrically shaped object patterns, which are automatically cut out and extracted, as componentized parts that contribute to determining whether an image belongs to an abstract predetermined category, from learning images forming a collection of diverse images, the comparison excluding the rotation-angle element with respect to shape, and for adding a specific object pattern to the contents of the plurality of reference patterns when a value quantifying the relevance between the reference patterns and that specific object pattern is equal to or greater than a predetermined value; means for extracting object patterns from an inspection image; means for comparing the object patterns extracted from the inspection image with the reference patterns; and means for determining that the inspection image is an image belonging to the predetermined category when there exists an extracted object pattern for which the value quantifying its relevance to the reference patterns is equal to or greater than a predetermined value. In this apparatus, predetermined object patterns among the geometrically shaped object patterns extracted by automatic cutout are removed before the comparison with the reference patterns; the object patterns that remain without being removed are ranked; an object pattern determined to rank high on the basis of the ranking result can be swapped with a lower-ranked one of the reference templates; and, when swapped, the result is stored as a new reference template.
[0014]
In the above apparatus, when the relevance between the reference patterns and a specific object pattern becomes equal to or greater than a predetermined value, the specific object pattern is added to the contents of the plurality of reference patterns. The relevance to object patterns extracted from an inspection image can therefore be determined while the plurality of reference patterns is updated to more appropriate ones. As a result, objects and the like belonging to a specific category that is abstract and ambiguous can be detected automatically with high accuracy.
[0015]
The computer program of the present invention causes a computer to function as: means for comparing a plurality of stored reference patterns with geometrically shaped object patterns, which are automatically cut out and extracted, as componentized parts that contribute to determining whether an image belongs to an abstract predetermined category, from learning images forming a collection of diverse images, the comparison excluding the rotation-angle element with respect to shape, and for adding a specific object pattern to the contents of the plurality of reference patterns when a value quantifying the relevance between the reference patterns and that specific object pattern is equal to or greater than a predetermined value; means for extracting object patterns from an inspection image; means for comparing the object patterns extracted from the inspection image with the reference patterns; and means for determining that the inspection image is an image belonging to the predetermined category when there exists an extracted object pattern for which the value quantifying its relevance to the reference patterns is equal to or greater than a predetermined value. In this computer program, predetermined object patterns among the geometrically shaped object patterns extracted by automatic cutout are removed before the comparison with the reference patterns; the object patterns that remain without being removed are ranked; an object pattern determined to rank high on the basis of the ranking result can be swapped with a lower-ranked one of the reference templates; and, when swapped, the result is stored as a new reference template.
[0016]
In the computer in which the computer program is incorporated, when the relationship between the reference pattern and the specific object pattern becomes a predetermined value or more, the specific object pattern is added to the contents of the plurality of reference patterns. The relevance with the object pattern extracted from the inspection image can be determined while updating the reference pattern to an appropriate one. Therefore, it is possible to automatically detect objects belonging to a specific category that is abstract and ambiguous with high accuracy.
[0017]
DETAILED DESCRIPTION OF THE INVENTION
[First Embodiment]
FIG. 1 shows the overall structure of a network system for realizing the image recognition apparatus and method according to the first embodiment.
[0018]
This network system consists of the Internet IN, which is a communication network; a large number of WEB servers NS that are connected to the Internet IN and transmit various kinds of information including images; and a monitoring system MS that can automatically monitor the image information provided by the large number of WEB servers NS.
[0019]
The WEB server NS is a computer system provided with a communication device for connecting directly or indirectly to the Internet IN, and publishes various homepages via the Internet IN. The home page provided by the WEB server NS includes images belonging to various categories.
[0020]
The monitoring system MS is also a computer system provided with a communication device for connecting directly or indirectly to the Internet IN, and can access the homepages provided by the large number of WEB servers NS. The monitoring system MS is an image detection device, that is, an image recognition device, for automatically monitoring whether, for example, obscene images are included in the homepages provided by the large number of WEB servers NS.
[0021]
FIG. 2 is a block diagram conceptually illustrating the structure of the monitoring system MS shown in FIG. The illustrated monitoring system MS includes a CPU 21, an input device 22, a display device 23, a storage device 24, and a communication control device 25, similarly to a general computer.
[0022]
The CPU 21 can exchange data with the display device 23, the storage device 24, and the communication control device 25 via the bus 100. Further, the CPU 21 reads a predetermined program or data from the storage device 24 or the communication control device 25 based on an instruction from the input device 22 and executes various processes based on the program and data.
[0023]
More specifically, under the pre-learning program, the CPU 21 can, based on an instruction from the input device 22, add object patterns appropriately extracted from learning images input via the input device 22, the storage device 24, or the like to the reference template, which is the reference pattern, and thereby optimize the reference template. Under the homepage monitoring program, the CPU 21, based on an instruction from the input device 22, appropriately extracts object patterns from images included in homepage data obtained via the communication control device 25, and compares these object patterns with the reference template updated by learning in order to determine the obscenity of the images automatically.
[0024]
The input device 22 is composed of a keyboard and the like, and outputs to the CPU 21 command signals reflecting the intention of the operator who operates the monitoring system MS through GUI operations using the display device 23.
[0025]
The display device 23 includes a display drive circuit that generates drive signals based on data input from the CPU 21 and a CRT or the like that performs the necessary display based on the drive signals input from the display drive circuit, and it performs the necessary display based on command signals from the CPU 21.
[0026]
The storage device 24 includes a ROM that stores a plurality of basic programs and the like for operating the monitoring system MS, and a RAM such as a work memory that temporarily stores application programs, input instructions, input data, processing results, and the like. The storage device 24 further includes a drive for driving a recording medium that can hold application programs and data by magnetic or optical means; the recording medium driven may be one that is fixed in the storage device 24 or one that is detachably mounted. The application programs include the pre-learning program and the homepage monitoring program, and the data includes a database containing various images.
[0027]
The communication control device 25 is configured by a LAN adapter or the like, and enables communication with the Internet IN using the TCP/IP protocol via a firewall (not shown).
[0028]
Hereinafter, main operations of the monitoring system MS shown in FIG. 1 will be described. FIG. 3 is a flowchart for explaining pre-learning, and FIGS. 4 to 6 are flowcharts for explaining the flowchart of FIG. 3 more specifically.
[0029]
In the monitoring system MS, first, the CPU 21 prepares an initial template from the data stored in the storage device 24 based on an instruction from the input device 22 (step S2). The initial template is a fixed template that does not require user judgment, and typical obscene objects can be used as the initial template. Specifically, partial images of parts of the human body, such as the chest, crotch, and buttocks, that are considered to constitute an obscene image are stored in advance in the storage device 24 as the initial template. The initial template can also be prepared by the user. In this case, an obscene image is prepared in advance, an object is cut out from this image, and the object is used as the initial template.
[0030]
Next, the CPU 21 captures a plurality of learning sample images, including typical obscene images, via the storage device 24 and the communication control device 25 based on an instruction from the input device 22 (step S4). Note that the plurality of learning sample images to be captured do not consist only of typical obscene images, but generally constitute a collection of various images, including images that are not recognized as obscene.
[0031]
Next, the CPU 21 automatically cuts out obscene objects from each learning sample image, adds them to the initially stored initial template while taking the user's judgment into account, and saves the result in the storage device 24, thereby correcting the template (step S6). By correcting the initial template in this way using various learning sample images and user judgments, it is possible to obtain a reference template, that is, a reference pattern, adapted to differences between countries, regions, and eras. That is, through the above-described steps, a basic reference template for detecting images that are recognized as obscene by the user's standards can be learned and prepared.
[0032]
The above steps S4 and S6 can be repeated a plurality of times. Thereby, the reference template can be updated a plurality of times, and a reference template that enables a determination closer to the user's determination can be obtained.
[0033]
In addition, in step S4, the plurality of learning sample images that are captured may consist only of typical obscene images. In this case, the reference template can be accurately limited to objects related to obscene images. On the other hand, there is also the problem that the user's judgment is likely to be affected by the range and tendency of the learning sample images. As a solution, it is conceivable to increase the number of learning sample images and the number of reference templates.
[0034]
FIG. 4 is a flowchart for explaining the template correction processing in step S6 of FIG. 3. First, the CPU 21 reads an arbitrary, not yet evaluated specific image among the learning sample images from the storage device 24 or the like (step S61).
[0035]
Next, the CPU 21 determines the degree of obscenity of the specific image read in step S61 using the initial template obtained in step S2 of FIG. 3 (step S62). As will be described in detail later, in determining the degree of obscenity, objects are automatically cut out from the target specific image, and the degree of correlation between each cut-out object and the initial template, that is, their degree of association, is output.
[0036]
Next, the CPU 21 determines whether the degree of obscenity has been determined for all of the learning sample images (step S63), and steps S61 and S62 are repeated until the evaluation has been completed for all of the learning sample images.
[0037]
Next, the CPU 21 ranks the group of learning sample images according to the degree of obscenity obtained for each learning sample image, and displays the ranking result on the display device 23 (step S64).
[0038]
Next, the CPU 21 creates a list of worst objects obtained at the time of ranking the learning sample images, and displays the list and corresponding objects on the display device 23 (step S65).
[0039]
Thereafter, the CPU 21 replaces objects constituting the initial template with the worst objects corresponding to the obscene images determined to be high in the ranking, and stores the result in the storage device 24 as the reference template (step S66). For example, when there are 10 initial templates and five worst objects are added, five of the initial templates remain. The initial templates are ranked, and the initial templates deleted when the worst objects are added are the lower-ranked ones.
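The replacement in step S66 can be sketched as follows. This is only a minimal illustration, assuming that the templates are held in a list ordered from highest to lowest rank and that the total number of templates stays fixed; the function name and the position given to the newly added objects are hypothetical and are not specified in the description.

```python
def replace_lowest_ranked(templates, new_objects):
    """Replace the lowest-ranked templates with newly selected objects.

    templates   : list ordered from highest rank to lowest rank (assumption)
    new_objects : objects to add, e.g. the "worst objects" of step S65
    The pool size is kept constant, as in the 10-template example.
    """
    capacity = len(templates)
    # Never add more objects than the pool can hold.
    new_objects = list(new_objects)[:capacity]
    # Keep only the higher-ranked templates; the lower-ranked ones are dropped.
    survivors = templates[: capacity - len(new_objects)]
    # Placing the new objects after the survivors is an arbitrary choice here.
    return survivors + new_objects


initial = [f"init{i}" for i in range(10)]   # rank 0 is the highest
worst = [f"worst{i}" for i in range(5)]
reference = replace_lowest_ranked(initial, worst)
print(reference)
```

With 10 initial templates and five added objects, the five highest-ranked initial templates remain, which matches the example given for step S66.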
[0040]
Next, the CPU 21 causes the user, via the display device 23, to select as appropriate the images that the user particularly recognizes as obscene from among all the learning sample images, regardless of the ranking (step S67). Specifically, using the input device 22 and the display device 23, the user selects from the displayed ranked sample images, for example, the top five images that he or she judges to be obscene.
[0041]
When the user selects obscene images in step S67, the CPU 21 adds the objects corresponding to the selected obscene images to the objects constituting the reference template, or replaces the latter with the former, and re-saves the result in the storage device 24 as the reference template (step S68). For example, if there are 10 reference templates after the initial template was updated in step S66, five objects constituting obscene images are added in step S68.
[0042]
If the user does not select any obscene image in step S67, the CPU 21 stores the reference template obtained in step S66 as it is as the final reference template, and ends the process.
[0043]
FIG. 5 is a flowchart for explaining the process of determining the degree of obscenity in step S62 of FIG. 4. First, the CPU 21 automatically cuts out objects from the specific image to be determined (step S621). Various known techniques can be used for cutting out the objects. For example, as disclosed in the publication cited in the “Prior Art” section, an edge image with gradation is extracted from the target image, and regions are expanded and merged starting from low-gradation edges in the direction in which the gradation increases continuously, whereby the objects can be cut out.
[0044]
Next, the CPU 21 removes meaningless objects from the objects obtained in step S621 (step S622). Specifically, small objects and objects that are considered to constitute the background are removed from the detected group of objects. For example, an object is not used for the determination when its occupied-area ratio is less than or equal to a predetermined threshold, or when, even though its occupied-area ratio is greater than or equal to a predetermined threshold, it is considered to constitute the background. This improves the processing speed.
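The removal in step S622 can be sketched as a simple filter. This is an illustrative sketch only: the `CutObject` class, the `is_background` flag, and the default threshold of 0.02 are hypothetical, since the description does not state how background objects are recognized or what threshold is used.

```python
from dataclasses import dataclass


@dataclass
class CutObject:
    label: str
    area: int            # number of pixels occupied by the object
    is_background: bool  # hypothetical flag set by some earlier background test


def remove_meaningless(objects, image_area, min_ratio=0.02):
    """Keep objects whose area ratio exceeds min_ratio and that are not background."""
    if image_area <= 0:
        raise ValueError("image_area must be positive")
    return [
        o for o in objects
        if o.area / image_area > min_ratio and not o.is_background
    ]


objects = [
    CutObject("speck", 50, False),      # too small, removed
    CutObject("torso", 5000, False),    # kept
    CutObject("wall", 60000, True),     # large but background, removed
]
kept = remove_meaningless(objects, image_area=100_000)
print([o.label for o in kept])
```

Filtering before the characteristic calculation means that later steps only process the objects that can contribute to the determination.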
[0045]
Next, the CPU 21 calculates an object characteristic for each object remaining after step S622 (step S623). This object characteristic is expressed as a multidimensional vector composed of various parameters such as color, shape, texture, and size. Regarding shape, the rotation angle is not included as an element. This prevents factors such as the posture of an object that is a part of a male or female body from being included in the object characteristic.
[0046]
Next, the CPU 21 comprehensively evaluates the similarity of each object whose characteristic was calculated in step S623 by correlating it with all the reference templates, thereby calculating the degree of obscenity of the image as a whole (step S624).
[0047]
FIG. 6 is a flowchart for explaining the process of determining the degree of obscenity in step S624 of FIG. 5. First, the CPU 21 calculates the degree of correlation for a specific object for which the object characteristic has been calculated (step S624a). Specifically, the difference between the characteristic of the specific object and the characteristic of a specific reference template is obtained linearly or nonlinearly in consideration of the weight of each parameter, and the degree of correlation, that is, the similarity, is comprehensively evaluated. Note that the object characteristics of the reference templates are calculated in advance by the same method as described for step S623.
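The weighted comparison in step S624a can be illustrated with the following sketch. It assumes that each object characteristic is a plain list of numbers and that the weighted difference is mapped to a similarity in the range (0, 1]; the weighted Euclidean distance and the mapping 1 / (1 + distance) are choices made here for illustration and are not taken from the description.

```python
import math


def correlation(obj_features, template_features, weights):
    """Similarity between an object and a reference template.

    Uses a weighted Euclidean distance (an assumed metric) and maps it to
    (0, 1], where 1.0 means the two characteristic vectors are identical.
    """
    if not (len(obj_features) == len(template_features) == len(weights)):
        raise ValueError("feature vectors and weights must have equal length")
    squared = sum(
        w * (a - b) ** 2
        for a, b, w in zip(obj_features, template_features, weights)
    )
    return 1.0 / (1.0 + math.sqrt(squared))


# Hypothetical two-parameter characteristics, e.g. (color, size).
print(correlation([1.0, 2.0], [1.0, 2.0], [1.0, 1.0]))  # identical vectors
print(correlation([0.0, 0.0], [3.0, 4.0], [1.0, 1.0]))  # distance 5
print(correlation([0.0, 0.0], [3.0, 4.0], [0.0, 1.0]))  # first parameter ignored
```

Setting a weight to zero removes that parameter from the comparison, which is one simple way to give different parameters different importance.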
[0048]
Next, the CPU 21 determines whether or not the calculation of the degree of correlation with all the reference templates has been completed for the specific object subjected to the calculation in step S624a (step S624b).
[0049]
If it is determined in step S624b that the correlation degree calculation has not been completed for all the reference templates, the correlation degree calculation in step S624a is sequentially repeated for the remaining reference templates.
[0050]
If it is determined in step S624b that the correlation calculation for all the reference templates has been completed, the CPU 21 determines whether the calculation of the degree of correlation has been completed for all target objects cut out from the determination target image (step S624c).
[0051]
If it is determined in step S624c that the correlation calculation for all target objects has not been completed, the correlation calculation in step S624a is sequentially repeated for the remaining target objects.
[0052]
If it is determined in step S624c that the correlation calculation for all target objects has been completed, the CPU 21 adds up, linearly or nonlinearly and taking size and arrangement into consideration, the degrees of correlation with the reference objects obtained for the respective target objects, thereby obtaining an average index of relevance, that is, the degree of obscenity of the specific image as a whole (step S624d). In the nonlinear processing, the presence of an object with a high degree of correlation yields a higher degree of obscenity than mere addition would, and the presence of an object with a low degree of correlation yields a lower degree of obscenity than mere addition would. For example, the correlation values can be raised to the third or fourth power before being added.
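The nonlinear addition in step S624d can be sketched as follows. The sketch assumes that each target object has already been reduced to a single correlation value in the range [0, 1] and that the powered values are averaged; the averaging, the function name, and the omission of size and arrangement are simplifications introduced here.

```python
def image_degree(correlations, power=3):
    """Overall degree for one image from per-object correlation values.

    Each value is raised to `power` (3 or 4 in the description's example)
    before averaging, so strongly matching objects dominate the result.
    """
    if not correlations:
        return 0.0
    return sum(c ** power for c in correlations) / len(correlations)


one_strong_match = image_degree([1.0, 0.0])   # linear mean would be 0.5
two_weak_matches = image_degree([0.5, 0.5])   # linear mean would be 0.5
print(one_strong_match, two_weak_matches)
```

Both inputs have the same linear mean, but after cubing, the image containing one strongly matching object scores higher than the image containing two moderately matching objects, which is the effect the nonlinear processing is meant to produce.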
[0053]
FIG. 7 is a flowchart for explaining monitoring of the WEB site. First, the CPU 21 automatically connects to an appropriate WEB site via the communication control device 25 (step S11).
[0054]
Next, the CPU 21 downloads the image data (that is, the inspection images) of the homepages of the WEB site via the communication control device 25 and stores the image files in the storage device 24 (step S12). Image files can be collected on a WEB site basis, or image data of a predetermined number of homepages can be collected in a batch. For example, a root directory is defined for a specific WEB site, a list recording the full paths of image files such as jpg and gif files is created, and the files can be stored, with their file format unified, in a specific directory provided in the storage device 24.
[0055]
Next, the CPU 21 determines the degree of obscenity of each image file using the reference template obtained in step S6 of FIG. 3 (step S13). The determination of the degree of obscenity is almost the same as that shown in FIG. 5, except that the degree of obscenity is determined by the same process for a large number of image files instead of a single one.
[0056]
Next, the CPU 21 determines whether an obscene image exists among these image files (step S14). Specifically, it is determined whether the degree of obscenity of each image file exceeds a certain threshold, and if the degree of obscenity exceeds the predetermined threshold in any of the image files, it is determined that an obscene image exists.
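The threshold test in step S14 can be sketched as follows. The dictionary layout, the file paths, and the threshold value are hypothetical and serve only to illustrate the comparison of per-file degrees against a single threshold.

```python
def flag_images(degrees, threshold):
    """Return the paths of image files whose degree exceeds the threshold."""
    return sorted(path for path, degree in degrees.items() if degree > threshold)


def site_has_flagged_image(degrees, threshold):
    """True if at least one image file exceeds the threshold."""
    return len(flag_images(degrees, threshold)) > 0


# Hypothetical degrees computed in step S13 for three downloaded files.
degrees = {
    "site/img/a.jpg": 0.12,
    "site/img/b.gif": 0.81,
    "site/img/c.jpg": 0.40,
}
flagged = flag_images(degrees, threshold=0.5)
print(flagged, site_has_flagged_image(degrees, threshold=0.5))
```

Returning the list of flagged paths, rather than only a yes or no answer, makes it straightforward to show the detected files as a file list in the subsequent warning step.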
[0057]
If it is determined in step S14 that an obscene image has been detected, the CPU 21 displays a warning of obscene image detection on the display device 23 (step S15). At this time, the CPU 21 can display the detected obscene image as a thumbnail or in a file list on the display device 23 together with the address of the WEB site and the like, and stores the display information in the storage device 24. This result is automatically notified to a manager who operates another computer system at a remote location by means such as e-mail.
[0058]
Next, the CPU 21 determines whether the determination of the degree of obscenity has been completed for all the monitoring target WEB sites, and if it is determined that the determination has not been completed for all the target WEB sites, the processing of steps S11 to S15 is repeated.
[0059]
As described above, it is possible to automatically monitor whether an obscene image is included in the monitoring target WEB sites. At this time, according to the method of the present embodiment, the influence of the posture of the objects constituting an obscene image and of the background is canceled, and high detection accuracy can be obtained.
[0060]
To date, whether an image published on the Internet is an obscene image has been determined by the human eye. This work is very labor-intensive, and there is a limit to how thoroughly Internet content that changes every day can be checked. This is a pressing issue for Internet content providers that aim to provide sound services for users who want to surf the Internet with peace of mind. Recently, businesses that provide IPS (Internet Provider Service) for general consumers, including minor users, have been required to check for obscene images. The monitoring system MS of the present embodiment replaces the conventional identification by the naked eye and enables a computer to automatically detect obscene images on a WEB site. That is, the monitoring system MS can realize the identification of obscene images, which could previously be done only by the human naked eye, as automatic detection and automatic recognition by a computer. This can greatly reduce labor and cost, and can greatly contribute to the automation and management of Internet content. Furthermore, by automating image content management and monitoring, a more rational, low-cost, and flexible system can be constructed. In addition, IPS operators benefit from a reduced possibility of receiving administrative guidance or a business suspension for leaving obscene images unattended.
[0061]
FIG. 8 is a diagram for visually explaining the flow of data in the operations shown in FIGS. 3 to 7. FIG. 8 (a) shows the pre-learning process of the monitoring system MS, and FIG. 8 (b) shows the homepage monitoring process of the monitoring system MS.
[0062]
In the pre-learning process shown in FIG. 8 (a), the automatic learning unit classifies the learning sample images into obscene images and non-obscene images using the initial template, and corrects the result based on user judgment to obtain the reference template.
[0063]
In the homepage monitoring process shown in FIG. 8 (b), the automatic determination unit automatically separates the monitoring target images into obscene images and non-obscene images based on the reference template obtained in FIG. 8 (a).
[0064]
In the above embodiment, obscene images are detected on WEB sites. However, if an obscene image file exists in another network, for example, an in-house computer network, it can likewise be detected easily and reliably.
[0065]
[Second Embodiment]
The image recognition apparatus and method according to the second embodiment is a modification of the apparatus and method according to the first embodiment for portrait extraction.
[0066]
The pre-learning process is basically the same as that of the first embodiment. Therefore, the overall process is the same as that shown in FIG. 3, the process shown in FIG. 9 is executed as the process corresponding to FIG. 4, the process shown in FIG. 10 is executed as the process corresponding to FIG. 5, and the process shown in FIG. 11 is executed as the process corresponding to FIG. 6.
[0067]
The initial portrait template can consist of objects appropriately cut out from portraits, and includes a face, a neck, hair, and the like. The hair is included in the face image template in this way because face photographs in actual use include the hair. The learning sample images may be, for example, frontal face photographs of people, and several patterns of images are prepared for typical cases such as women, men, and thin or dark hair.
[0068]
In addition, when images are re-added after the ranking of the learning sample images, the gender, ethnicity, and the like of the portrait to be cut out can be taken into account. The reference template obtained by correcting the template is a composite that combines a group of objects obtained from one learning sample image (for example, a skin-color object including the face and neck, and a hair object) with their positional relationship. The contours of the objects constituting such a reference template can be manually adjusted in some cases. The plurality of composite reference templates obtained as a result are used for automatic identification and cutting out of portrait images, that is, face image objects.
[0069]
FIG. 12 is a flowchart for explaining portrait cutout. First, the CPU 21 reads image data in the WEB site or the storage device 24 (step S201).
[0070]
Next, the CPU 21 performs a process related to selection of a cut-out processing target range (step S202). Specifically, a preview image corresponding to the image data is displayed on the display device 23. This preview image can be enlarged / reduced in several stages, and a range can be selected in the preview image using an appropriate tool. At this time, the CPU 21 causes the display device 23 to warn that the image quality is not good when the selection range or the original image is smaller than a certain size.
[0071]
Next, the CPU 21 checks the size of the target image (step S203). If the size of the target image does not exceed a certain value, the subsequent processing is performed on the original image as it is. If the size of the target image exceeds the certain value, the user is allowed to select as appropriate between (1) an ultra-high resolution mode in which the subsequent processing is performed without changing the size of the target image, and (2) a general mode in which the size of the target image is reduced.
[0072]
Next, the CPU 21 determines the portrait degree of the target image obtained from the image file using the reference template obtained in the process of FIG. 9 (step S204). The determination of the portrait degree is almost the same as that shown in FIG.
[0073]
Next, the CPU 21 determines whether a portrait exists in the target image (step S205). Specifically, it is judged whether the portrait degree obtained for the target image exceeds a certain threshold, together with factors such as the arrangement of the objects, and if the predetermined criteria are satisfied, it is judged that a portrait exists.
[0074]
When calculating the portrait degree and determining the existence of a portrait, if the reference template that maximizes the correlation, that is, the similarity, is a skin-color template object including the face and neck, the object is recognized as a face candidate object. Likewise, if the reference template having the highest similarity is a hair template object, the object is recognized as a hair candidate object. At this time, the similarity is weighted and corrected based on the distance from the center area of the image. For each face candidate object, unless it is adjacent to one or more hair candidate objects and the total length of the contour lines of the adjacent portions is within a predetermined range, that face candidate object is excluded from the portrait degree calculation. The same exclusion process is also performed on all hair candidate objects, so that automatic discrimination is performed in consideration of the arrangement of the cut-out objects. At this time, the number of face candidate objects and the number of hair candidate objects do not necessarily match. Note that if the number of face candidate objects exceeds a predetermined maximum value, those having low similarity, and the hair candidate objects adjacent to them, are further excluded.
[0075]
Although the above description concerns the extraction of head objects, clothing objects can be extracted in the same manner. In this case, an object within a predetermined expected range, located on the side of the face candidate object opposite to the adjacent hair candidate object, is extracted as a clothing candidate object. The expected range is calculated from, for example, the size of the face candidate object.
[0076]
When it is determined in step S205 that a portrait has been detected, the CPU 21 displays the objects obtained from the portrait on the display device 23 (step S206). At this time, all the cut-out objects can be displayed, or only the objects within a specified range of the target image can be displayed. As the display method, the area where an object exists in the target image can be indicated as a range with an ellipse or the like, and the objects can be shown in a tree display without overlapping. If any of the cut-out objects is unnecessary, it can be designated in the tree display described above using the input device 22 and removed from the object group.
[0077]
Next, the CPU 21 performs processing for accepting contour correction of the cut-out object (step S207). The CPU 21 displays a contour line in a clearly visible color on the preview of the target image displayed on the display device 23, corresponding to the cut-out object. This contour line can be corrected based on an instruction from the input device 22. For example, points at which the sharpness is extreme along the contour line are automatically detected at or below a certain density, and operation handles for the contour line are obtained by connecting these points. Between adjacent points, a point can be forcibly created based on an instruction from the input device 22, and conversely, unnecessary points can be thinned out. The operation handles thus obtained can be adjusted using the input device 22, so that an appropriate contour reflecting the user's judgment can be obtained.
[0078]
Next, the CPU 21 performs a process of geometrically cutting out the object whose contour has been determined and saving the cut-out image in the storage device 24 (step S208). The object can be cut out along the contour obtained as described above, or it can be cut out as an image in which the contour fits within an arbitrary rectangular, elliptical, or circular area.
[0079]
As described above, portraits can be automatically cut out from an appropriate image file and saved. A portrait file thus obtained can be attached as a face photo file to an e-mail exchanged between mobile phones or personal computers. Such a portrait file also makes it easy to create various images. For example, the portrait file described above is used for various purposes such as commemorative face photo writing and face photo stickers.
[0080]
Although the present invention has been described based on the above embodiments, the present invention is not limited to the above embodiments. For example, in the above-described embodiments, obscene images or portraits are automatically detected, but the method of the present invention can also be applied to the automatic detection of images of other abstract categories.
[Brief description of the drawings]
FIG. 1 shows an overall structure of a network system in which an image recognition apparatus and method according to a first embodiment are utilized.
FIG. 2 is a block diagram conceptually illustrating the structure of the monitoring system shown in FIG.
FIG. 3 is a flowchart illustrating pre-learning.
FIG. 4 is a flowchart illustrating a part of prior learning.
FIG. 5 is a flowchart illustrating a part of prior learning.
FIG. 6 is a flowchart illustrating a part of prior learning.
FIG. 7 is a flowchart for explaining monitoring of a WEB site.
FIG. 8A shows the pre-learning process of the monitoring system, and FIG. 8B shows the homepage monitoring process of the monitoring system.
FIG. 9 is a flowchart illustrating a part of pre-learning in the image recognition apparatus according to the second embodiment.
FIG. 10 is a flowchart illustrating a part of pre-learning in the image recognition apparatus according to the second embodiment.
FIG. 11 is a flowchart illustrating a part of pre-learning in the image recognition apparatus according to the second embodiment.
FIG. 12 is a flowchart for explaining portrait image clipping in the image recognition apparatus according to the second embodiment.
[Explanation of symbols]
22 Input device
23 Display device
24 storage devices
25 Communication control device
21 CPU
IN Internet
MS monitoring system

Claims

1. An image recognition method comprising:
a step of comparing a plurality of stored reference patterns with geometrically shaped object patterns, with respect to shape and excluding the element of rotation angle, the object patterns being automatically cut out and extracted in a componentized state from learning images, which are a collection of various images, as items that contribute to determining whether an image belongs to an abstract predetermined category, and adding a specific object pattern to the contents of the plurality of reference patterns when a value obtained by quantifying the relationship between the reference patterns and the specific object pattern is a predetermined value or more;
a step of extracting an object pattern from an inspection image;
a step of comparing the object pattern extracted from the inspection image with the reference patterns; and
a step of determining that the inspection image is an image belonging to the predetermined category when there is an object pattern for which a value obtained by quantifying the relationship between the object pattern extracted from the inspection image and the reference patterns is a predetermined value or more,
wherein, among the geometrically shaped object patterns extracted by automatic cutout, a predetermined object pattern is removed before the comparison with the reference patterns, and
wherein the object patterns that remain without being removed among the geometrically shaped object patterns extracted by automatic cutout are ranked, an object pattern determined to be high in the ranking based on the ranking result can be replaced with a lower-ranked one of the reference templates, and, when replaced, is stored as a new reference template.

2. The image recognition method according to claim 1, wherein a value obtained by quantifying the relationship between the plurality of reference patterns and the specific object pattern is determined by weighting parameters including color and shape.

3. An image recognition apparatus comprising:
means for comparing a plurality of stored reference patterns with geometrically shaped object patterns, with respect to shape and excluding the element of rotation angle, the object patterns being automatically cut out and extracted in a componentized state from learning images, which are a collection of various images, as items that contribute to determining whether an image belongs to an abstract predetermined category, and for adding a specific object pattern to the contents of the plurality of reference patterns when a value obtained by quantifying the relationship between the reference patterns and the specific object pattern is a predetermined value or more;
means for extracting an object pattern from an inspection image;
means for comparing the object pattern extracted from the inspection image with the reference patterns; and
means for determining that the inspection image is an image belonging to the predetermined category when there is an object pattern for which a value obtained by quantifying the relationship between the object pattern extracted from the inspection image and the reference patterns is a predetermined value or more,
wherein, among the geometrically shaped object patterns extracted by automatic cutout, a predetermined object pattern is removed before the comparison with the reference patterns, and
wherein the object patterns that remain without being removed among the geometrically shaped object patterns extracted by automatic cutout are ranked, an object pattern determined to be high in the ranking based on the ranking result can be replaced with a lower-ranked one of the reference templates, and, when replaced, is stored as a new reference template.

4. A computer program for causing a computer to function as:
means for comparing a plurality of stored reference patterns with geometrically shaped object patterns, with respect to shape and excluding the element of rotation angle, the object patterns being automatically cut out and extracted in a componentized state from learning images, which are a collection of various images, as items that contribute to determining whether an image belongs to an abstract predetermined category, and for adding a specific object pattern to the contents of the plurality of reference patterns when a value obtained by quantifying the relationship between the reference patterns and the specific object pattern is a predetermined value or more;
means for extracting an object pattern from an inspection image;
means for comparing the object pattern extracted from the inspection image with the reference patterns; and
means for determining that the inspection image is an image belonging to the predetermined category when there is an object pattern for which a value obtained by quantifying the relationship between the object pattern extracted from the inspection image and the reference patterns is a predetermined value or more,
wherein, among the geometrically shaped object patterns extracted by automatic cutout, a predetermined object pattern is removed before the comparison with the reference patterns, and
wherein the object patterns that remain without being removed among the geometrically shaped object patterns extracted by automatic cutout are ranked, an object pattern determined to be high in the ranking based on the ranking result can be replaced with a lower-ranked one of the reference templates, and, when replaced, is stored as a new reference template.