JP2021101321A

JP2021101321A - Information processing device, method and program

Info

Publication number: JP2021101321A
Application number: JP2019233161A
Authority: JP
Inventors: 和希緑川; Kazuki Midorikawa
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2019-12-24
Filing date: 2019-12-24
Publication date: 2021-07-08

Abstract

To precisely extract a foreground from an image. SOLUTION: An information processing device 123 includes an extracting unit 403 and an evaluating unit 404. The extracting unit 403 extracts a foreground candidate region 416, which is a candidate for the foreground, on the basis of an input image 411 and a background image 413 corresponding to the input image 411. For the contour region corresponding to the contour of the foreground candidate region, the evaluating unit 404 counts the pixels for which the difference between the pixel value of the contour region and the pixel value of the corresponding region in the background image is equal to or smaller than a first threshold, and determines a contour region for which this count is equal to or greater than a second threshold to be a foreground. SELECTED DRAWING: Figure 4

Description

The present invention relates to an information processing technique, and more particularly to a technique for extracting a main subject from an input image.

Foreground-background separation, which separates a specific person or object serving as the main subject (the foreground) from everything else (the background), is widely used to extract a main subject from an input image captured by a camera. Foreground-background separation methods include methods that use features of the subject such as edges and corners, methods that use the difference between background and foreground pixel values, and methods that learn the properties of the foreground or the background and separate them accordingly.

When a specific object or living thing is extracted as the foreground against a background formed by part of a building or an industrial machine, the region where the foreground can exist may be constrained, or the color or texture of the region where the foreground appears may differ from that of other regions. When the region where the foreground exists is known in advance, the part of the image outside that region can be excluded from processing and only the part corresponding to that region processed. In addition, Patent Document 1 discloses a technique that selects candidate objects (foreground) using the similarity between the average pixel value of the area surrounding a candidate object (foreground candidate region) and the average pixel value of the input image excluding the candidate object.

Patent Document 1: Japanese Patent Application Laid-Open No. 2017-204276 (JP-A-2017-204276)

However, with the technique of Patent Document 1, when the background consists of a plurality of colors or when part of a foreground is overlapped by another foreground, calculating the average pixel value of the area surrounding the candidate object does not give a correct result, and the foreground may not be extracted accurately from the input image.

The present invention provides a technique for accurately extracting the foreground from an input image.

An image processing apparatus according to one aspect of the present invention includes: extraction means for extracting a candidate region that is a candidate for the foreground on the basis of an input image and a background image corresponding to the input image; and determination means for determining, as the foreground, a contour region for which, within the candidate region, the number of pixels in the area where the difference between the pixel value of the contour region corresponding to the contour of the candidate region and the pixel value of the region of the background image corresponding to the contour region is equal to or less than a first threshold is equal to or greater than a second threshold.

According to the present invention, the foreground can be accurately extracted from an input image.

A diagram showing a configuration example of an image processing system that generates a virtual viewpoint image.
A diagram showing an example of the internal configuration of the camera adapter.
A diagram showing a hardware configuration example of the information processing device.
A block diagram showing a logical configuration example of the information processing device.
A diagram showing an example of a processing target.
A flowchart showing the flow of processing executed by the information processing device.
A flowchart showing the flow of processing executed by the information processing device.

Hereinafter, embodiments for carrying out the present invention will be described with reference to the drawings. The components described in the embodiments are merely examples and are not intended to limit the scope of the present invention. Moreover, not every combination of the components described in the embodiments is essential to the means for solving the problem, and various modifications and changes are possible.

<< Embodiment 1 >>
(System configuration)
FIG. 1 is a diagram showing a configuration example of an image processing system that generates a virtual viewpoint image. The image processing system 100 includes imaging modules 110a to 110z, a database (DB) 250, a server 270, a control device 300, a switching hub 180, and an end user terminal 190. That is, the image processing system 100 has three functional domains: an image acquisition domain, a data storage domain, and an image generation domain. The image acquisition domain includes the imaging modules 110a to 110z, the data storage domain includes the DB 250 and the server 270, and the image generation domain includes the control device 300 and the end user terminal 190.

The control device 300 manages the operating state of, and controls parameter settings for, each block constituting the image processing system 100 through a network. Here, the network may be GbE (Gigabit Ethernet) or 10GbE conforming to the IEEE standard for Ethernet (registered trademark), or may be configured by combining an InfiniBand interconnect, an industrial local area network, and the like. The network is not limited to these and may be of another type.

First, the operation of transmitting the captured images of the 26 sets of imaging modules 110a to 110z from the imaging module 110z to the server 270 will be described. The imaging modules 110a to 110z have one camera each, cameras 112a to 112z. In the following, the 26 sets of imaging modules 110a to 110z may be referred to simply as "imaging module 110" without distinction. Similarly, the devices in each imaging module 110 may be referred to as "camera 112" and "camera adapter 120". Although the number of imaging modules 110 is 26 here, this is only an example and the number is not limited to this.

The imaging modules 110a to 110z are connected in a daisy chain. As captured images move to higher resolutions such as 4K and 8K and to higher frame rates, the volume of image data grows; this connection form reduces the number of connection cables and saves wiring labor under those conditions. The connection form is arbitrary. For example, a star network configuration may be used in which the imaging modules 110a to 110z are each connected to the switching hub 180 and data is transmitted and received between the imaging modules 110 via the switching hub 180.

In the present embodiment, each imaging module 110 is composed of a camera 112 and a camera adapter 120, but the configuration is not limited to this. For example, an imaging module may also have a microphone, a pan head, and an external sensor. Further, although the camera 112 and the camera adapter 120 are separate in the present embodiment, they may be integrated in the same housing. The captured image obtained by the camera 112a in the imaging module 110a undergoes the image processing described later in the camera adapter 120a and is then transmitted to the camera adapter 120b of the imaging module 110b. Similarly, the imaging module 110b transmits the captured image obtained by the camera 112b, together with the captured image acquired from the imaging module 110a, to the imaging module 110c. By continuing this operation, the captured images of all 26 sets are passed from the imaging module 110z to the switching hub 180 and then transmitted to the server 270.
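The daisy-chain forwarding described above can be sketched as follows. This is only an illustration under simplifying assumptions: images are represented by placeholder strings, and the function name `relay_through_chain` is hypothetical and does not come from the patent.

```python
def relay_through_chain(captured_images):
    """Simulate daisy-chain forwarding between imaging modules.

    Each module adds its own captured image to the images received from
    the previous module and forwards the bundle to the next module.
    Returns the bundle that the last module hands to the switching hub.
    """
    bundle = []  # the first module has received nothing from upstream
    for image in captured_images:
        bundle = bundle + [image]  # own image joins the upstream images
    return bundle


# 26 modules labelled a..z, each contributing one placeholder image
images = [f"image_{chr(ord('a') + i)}" for i in range(26)]
delivered = relay_through_chain(images)
```

The bundle grows by one image per hop, which is why the last module in the chain carries the images of all modules.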

In the present embodiment, the description assumes that processing up to the evaluation of whether a region is the foreground is performed within each camera adapter 120. However, the configuration is not limited to this; the server 270 that has received the 26 sets of captured images may generate the silhouette images corresponding to the individual captured images.

(Camera adapter configuration)
Next, the details of the camera adapter 120 will be described. FIG. 2 is a functional block diagram showing an example of the internal configuration of the camera adapter 120. The camera adapter 120 includes a network adapter 121, a transmission unit 122, an information processing device 123, and a camera control unit 124.

The network adapter 121 performs data communication with other camera adapters 120, the server 270, and the control device 300. For example, it conforms to the Ordinary Clock of the IEEE 1588 standard, stores time stamps of data transmitted to and received from the server 270, and performs time synchronization with the server 270. Time synchronization with a time server may instead be realized by another standard such as EtherAVB or by a proprietary protocol. In the present embodiment, a NIC (Network Interface Card) is used as the network adapter 121, but the network adapter is not limited to this.

The transmission unit 122 controls data transmission to the switching hub 180 and other destinations via the network adapter 121. The transmission unit 122 has a function of compressing transmitted and received data by applying a predetermined compression method, compression rate, and frame rate, and a function of decompressing compressed data. It also has a function of determining the routing destination of received data and of data processed by the information processing device 123, and a function of transmitting the data to the determined routing destination. It further has a function of creating a message for transferring image data to another camera adapter 120 or to the server 270. The message contains meta information about the image data, including the time code or sequence number at the time the image was sampled, the data type, and the identifier of the camera 112. The transmitted image data may be compressed. In addition, the transmission unit 122 receives messages from other camera adapters 120 and, according to the data type contained in each message, restores data that was fragmented to the packet size specified by the transmission protocol into image data.
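As a rough illustration of the message handling described above, the sketch below models a message carrying the listed meta information and shows fragmentation to a packet size followed by restoration. The class name `ImageMessage`, its field names, and the helper functions are hypothetical; the patent does not specify a message layout or a transmission protocol.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ImageMessage:
    # Meta information named in the text; the field names are assumptions.
    sequence_number: int  # time code or sequence number at sampling time
    data_type: str        # kind of data carried by the message
    camera_id: str        # identifier of the camera 112
    payload: bytes        # image data, possibly compressed


def fragment(payload: bytes, packet_size: int) -> List[bytes]:
    """Split a payload into chunks of at most `packet_size` bytes."""
    if packet_size <= 0:
        raise ValueError("packet_size must be positive")
    return [payload[i:i + packet_size]
            for i in range(0, len(payload), packet_size)]


def reassemble(fragments: List[bytes]) -> bytes:
    """Restore the original payload from fragments received in order."""
    return b"".join(fragments)


message = ImageMessage(sequence_number=1, data_type="foreground",
                       camera_id="112a", payload=bytes(range(10)))
packets = fragment(message.payload, packet_size=4)
restored = reassemble(packets)
```

The sketch assumes fragments arrive in order; a real transport would also need to handle reordering and loss.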

The information processing device 123 performs processing to generate an image of an object, which is the foreground, on the basis of initialization information and image data captured by the camera 112 under the control of the camera control unit 124. It also performs processing such as dynamic calibration. Because each of the plurality of camera adapters 120 generates the foreground, the load on the image processing system 100 can be distributed. Dynamic calibration is calibration performed during imaging and includes color correction processing for suppressing color variation between cameras and blur correction processing (electronic image stabilization) for stabilizing the image position against blur caused by camera vibration.

The camera control unit 124 is connected to the camera 112 and performs control of the camera 112, acquisition of captured images, provision of a synchronization signal, time setting, and the like. Control of the camera 112 includes, for example, setting and referring to imaging parameters (number of pixels, color depth, frame rate, white balance, and so on), acquiring state information of the camera 112 (imaging, stopped, synchronizing, error, and so on), starting and stopping imaging, and adjusting focus.

(Hardware configuration of information processing device)
FIG. 3 is a diagram showing a hardware configuration example of the information processing device 123. The information processing device 123 includes a CPU 1231, a RAM 1232, a ROM 1233, a secondary storage device 1234, an input IF 1235, and an output IF 1236. The components are connected via a bus 1237 so that they can transmit and receive data to and from one another.

The CPU 1231 executes a program stored in the ROM 1233 or the secondary storage device 1234 and controls the information processing device 123 as a whole. The RAM 1232 functions as the main memory when the CPU 1231 executes a program and is used as a temporary storage area. The ROM 1233 stores the control program of the information processing device 123. The secondary storage device 1234 is a storage medium such as an HDD (Hard Disk Drive) or SSD (Solid State Drive) and stores image data, various programs, and the like.

The input IF 1235 is an interface supporting standards such as USB and IEEE 1394 and connects the camera 112, an external device (not shown), an input device (not shown), and the like to the bus 1237. The output IF 1236 is an interface supporting standards such as DVI and HDMI (registered trademark) and connects a display device (not shown) to the CPU 1231.

The information processing device 123 may have one or more pieces of dedicated hardware other than the CPU 1231, or a GPU (Graphics Processing Unit). In that case, the GPU or the dedicated hardware may perform at least part of the processing otherwise performed by the CPU 1231. Examples of dedicated hardware include ASICs (application-specific integrated circuits) and DSPs (digital signal processors).

A hardware configuration example of the information processing device 123 has been described above. The hardware configuration of the information processing device 123 is not limited to this configuration. The CPU 1231 may read a program stored in the ROM 1233, the secondary storage device 1234, or the like into the RAM 1232 and execute it, so that the CPU 1231 functions as each unit shown in FIG. 4, described later. That is, the information processing device 123 may realize each module shown in FIG. 4 as a software module.

(Logical configuration of information processing device)
FIG. 4 is a block diagram showing a logical configuration example of the information processing device 123 according to the present embodiment. The information processing device 123 functions as the logical configuration shown in FIG. 4 when the CPU 1231 executes the program stored in the ROM 1233 using the RAM 1232 as work memory. Not all of the processing described below needs to be executed by the CPU 1231; the information processing device 123 may be configured so that part or all of the processing is performed by one or more processing circuits other than the CPU 1231.

In the information processing device 123 of FIG. 4, the following evaluation is performed using a background image captured by the camera 112 with no foreground present, such as the specific person or object that is the main subject, together with one or more background colors designated in advance for the region of that image from which the foreground is to be extracted. That is, the information processing device 123 evaluates whether or not a foreground candidate region extracted from the input image is the foreground. The background image may be an image captured with no foreground present for a particular time of day, such as morning, noon, or evening, or a particular time of year, such as January to December, and may be a prepared image consisting only of the background. Although not described explicitly in the present embodiment, instead of using a predetermined background image, the background image may be updated by dynamic background update, in which the background image is updated when triggered by a change in the background, for example every fixed period or every fixed number of frames.

The information processing device 123 has an input unit 401, a holding unit 402, an extraction unit 403, and an evaluation unit 404. The input unit 401 receives the image data input to the information processing device 123, which includes an input image 411 and initialization information 412. The input image 411 is an image obtained by capturing a subject with the camera 112. The initialization information 412 is information used for the operation of the information processing device 123. In the present embodiment, the initialization information 412 is used in the processing shown in FIG. 6, described later, and includes the initial value of the background image 413, the background color 414, a first threshold that defines the range of the background color, and a second threshold that the evaluation unit 404 uses as the foreground determination condition. The first threshold and the second threshold are each set in the evaluation unit 404 when the information processing device 123 is initialized. The initial value of the background image 413 is a background image captured in advance by the camera with no foreground present, such as the specific person or object that is the main subject. The background color 414 is the background color of the region of the background image 413 from which the foreground is extracted; it consists of one or more background colors designated in advance and indicates the pixel value of the background. Part or all of the above image data may be received from the camera 112, from an external device, or from the secondary storage device 1234. The input image 411 received by the input unit 401 is output to the extraction unit 403. The initialization information 412 received by the input unit 401 is output to the holding unit 402.

The holding unit 402 holds the initialization information 412 input from the input unit 401. The initialization information 412 held in the holding unit 402 includes the initial value of the background image 413, the background color 414, the first threshold that defines the range of the background color, and the second threshold that the evaluation unit 404 uses as the foreground determination condition. When the background image is updated by dynamic background update, the background image held by the holding unit 402 is replaced with the updated background image input via the input unit 401. In this case, the background color 414 held by the holding unit 402 is likewise replaced with the updated background color corresponding to the updated background image. The background image 413 held by the holding unit 402 is output to the extraction unit 403. The background color 414, the first threshold, and the second threshold held by the holding unit 402 are output to the evaluation unit 404.
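The contents of the holding unit and the effect of a dynamic background update can be pictured with the small sketch below. The class and attribute names (`InitializationInfo`, `HoldingUnit`, `update_background`) are hypothetical, and grayscale integer pixel values are assumed purely for brevity.

```python
from dataclasses import dataclass


@dataclass
class InitializationInfo:
    background_image: list   # rows of pixel values (initial background image)
    background_colors: list  # one or more designated background colors
    first_threshold: int     # defines the range treated as background color
    second_threshold: int    # count required by the foreground decision


class HoldingUnit:
    """Holds the initialization information and applies background updates."""

    def __init__(self, info):
        self.info = info

    def update_background(self, new_image, new_colors):
        # The background image and its background colors are replaced
        # together; the two thresholds are left unchanged.
        self.info.background_image = new_image
        self.info.background_colors = new_colors


holder = HoldingUnit(InitializationInfo(
    background_image=[[100, 100], [250, 250]],
    background_colors=[100],
    first_threshold=5,
    second_threshold=4,
))
holder.update_background([[90, 90], [240, 240]], [90])
```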

Using the input image 411 input from the input unit 401 and the background image 413 input from the holding unit 402, the extraction unit 403 extracts from the input image 411 a foreground candidate region 416, which indicates a region that is a candidate for the foreground. The foreground candidate region 416 extracted by the extraction unit 403 is output to the evaluation unit 404.
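This paragraph does not say how the candidate regions are obtained from the two images. One common way, shown here only as a sketch and not as the method of the embodiment, is to threshold the per-pixel difference from the background and group the changed pixels into 4-connected regions. The function name `extract_candidate_regions`, the parameter `diff_threshold`, and the use of grayscale integer pixels are all assumptions made for illustration.

```python
from collections import deque


def extract_candidate_regions(input_image, background_image, diff_threshold):
    """Return candidate regions, each as a set of (row, col) pixels."""
    height, width = len(input_image), len(input_image[0])
    # Mark pixels whose value differs noticeably from the background.
    changed = [
        [abs(input_image[r][c] - background_image[r][c]) > diff_threshold
         for c in range(width)]
        for r in range(height)
    ]
    seen = set()
    regions = []
    for r in range(height):
        for c in range(width):
            if not changed[r][c] or (r, c) in seen:
                continue
            # Breadth-first search collects one 4-connected region.
            region, queue = set(), deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                region.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and changed[ny][nx] and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            regions.append(region)
    return regions


background = [[10] * 5 for _ in range(4)]
frame = [row[:] for row in background]
frame[1][1] = frame[1][2] = 200  # a two-pixel object
frame[3][4] = 200                # an isolated changed pixel
regions = extract_candidate_regions(frame, background, diff_threshold=30)
```

In this toy frame the two adjacent changed pixels form one candidate region and the isolated pixel forms another.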

The evaluation unit 404 extracts the contour of the foreground candidate region 416, acquires the coordinate information (pixels) of that contour, and acquires the pixels of the background image corresponding to the contour coordinates from the background image 413 held by the holding unit 402 as a contour background image 415. The contour is also referred to as a contour region. Then, for each foreground candidate region 416, the evaluation unit 404 counts, as the number of background-color contour pixels, the pixels for which the difference in pixel value between the contour background image 415 and the background color 414 is equal to or less than the first threshold and for which the designated background color flag is set to on. The evaluation unit 404 classifies as foreground each foreground candidate region whose contour background image has a number of background-color contour pixels equal to or greater than the second threshold. A foreground candidate region whose contour background image has a number of background-color contour pixels less than the second threshold may be explicitly classified as background, or may be left as a foreground candidate region and handled with lower priority than the foreground.
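The counting and classification performed by the evaluation unit can be sketched as below. Several details are assumptions made for illustration: regions are sets of `(row, col)` coordinates, pixel values are grayscale integers, and a contour pixel is taken to be any region pixel with a 4-neighbour outside the region. The function names are hypothetical.

```python
def contour_of(region):
    """Pixels of `region` that have at least one 4-neighbour outside it."""
    return {
        (y, x) for (y, x) in region
        if any(n not in region
               for n in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)))
    }


def count_background_color_contour_pixels(region, background_image,
                                          background_colors, first_threshold):
    """Count contour pixels whose background value is near a designated color."""
    count = 0
    for (y, x) in contour_of(region):
        value = background_image[y][x]  # pixel of the contour background image
        # Plays the role of the designated-background-color flag being on.
        if any(abs(value - color) <= first_threshold
               for color in background_colors):
            count += 1
    return count


def is_foreground(region, background_image, background_colors,
                  first_threshold, second_threshold):
    """A candidate is foreground when the count reaches the second threshold."""
    return count_background_color_contour_pixels(
        region, background_image, background_colors, first_threshold
    ) >= second_threshold


# Left half of the background is close to the designated color, right half is not.
background = [[100, 100, 250, 250] for _ in range(4)]
candidate = {(1, 1), (1, 2), (2, 1), (2, 2)}  # 2x2 block straddling both halves
n = count_background_color_contour_pixels(candidate, background, [98], 5)
```

Here two of the four contour pixels lie over the designated background color, so the candidate is classified as foreground only if the second threshold is 2 or less.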

Here, the method by which the evaluation unit 404 classifies the foreground from among the foreground candidate regions will be described concretely with reference to FIG. 5. FIG. 5(a) is a diagram showing an example of a background image used in the information processing device 123. FIG. 5(b) is a diagram showing an example of an input image input to the information processing device 123. The background image of FIG. 5(a) corresponds to the input image of FIG. 5(b). FIG. 5(c) is a diagram showing an example of a background-color contour image corresponding to the foreground candidate regions of FIG. 5(b). The method of counting the number of background-color contour pixels will be described with reference to FIGS. 5(a) and 5(b).

In the background image 500 of FIG. 5(a), the background regions 512 and 510, shown in gray, are regions whose color is close to the background color designated for foreground extraction, and the background region 511, shown in white, is a region whose color differs from the designated background color. Examples of the background regions 510 and 512 include the turf of a stadium. In the background regions 512 and 510, the difference between the designated background color (the background pixel value) and each pixel value in the region is equal to or less than the value set as the first threshold. The input image 504 shown in FIG. 5(b) contains three regions: a foreground candidate region 501, a foreground candidate region 502, and a foreground candidate region 503. The background-color contour pixels 507 shown in FIG. 5(c) consist of the pixels corresponding to the contour of the foreground candidate region 501 in the input image 504. Similarly, the background-color contour pixels 508 shown in FIG. 5(c) consist of the pixels corresponding to the contour of the foreground candidate region 502 in the input image 504. The background-color contour pixels 509 shown in FIG. 5(c) consist of the pixels corresponding to the foreground candidate region 503 in the input image 504.

The number of background-color contour pixels 507 corresponding to the foreground candidate region 501 is 4: three pixels where the contour of the foreground candidate region 501 overlaps the background region 510, plus one pixel where it overlaps the background region 512. Similarly, the number of background-color contour pixels 508 corresponding to the foreground candidate region 502 is 8, because the contour of the foreground candidate region 502 does not overlap the background region 512 and overlaps the background region 510 at eight pixels. The number of background-color contour pixels 509 corresponding to the foreground candidate region 503 is 1, because the contour of the foreground candidate region 503 does not overlap the background region 512 and overlaps the background region 510 at one pixel.
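To see how these counts feed the foreground decision, the short example below applies a second threshold to them. The counts 4, 8, and 1 come from the description of FIG. 5; the threshold value of 4 is a hypothetical choice for illustration, since no value is given in this passage.

```python
# Background-color contour pixel counts from the FIG. 5 description
counts = {501: 4, 502: 8, 503: 1}

second_threshold = 4  # hypothetical value, not taken from the patent

# A candidate region is classified as foreground when its count is
# equal to or greater than the second threshold.
foreground = sorted(region for region, n in counts.items()
                    if n >= second_threshold)
remaining = sorted(region for region, n in counts.items()
                   if n < second_threshold)
```

With this assumed threshold, regions 501 and 502 would be classified as foreground, while region 503 would either be classified as background or kept as a lower-priority candidate.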

（情報処理装置における動作） (Operation in information processing device)
図６は、情報処理装置１２３で実行される処理の流れを示すフローチャートである。情報処理装置１２３は、ＲＯＭ１２３３に格納されたプログラムをＣＰＵ１２３１がＲＡＭ１２３２をワークメモリとして実行することで、図４に示す各部として機能し、図６のフローチャートに示す一連の処理を実行する。抽出部４０３には、入力画像４１１が入力部４０１から入力されているとする。また、保持部４０２には、初期化情報４１２が入力部４０１から入力されているとする。なお、以下に示す処理の全てがＣＰＵ１２３１によって実行される必要はなく、処理の一部または全部が、ＣＰＵ１２３１以外の一つ又は複数の処理回路によって行われるように情報処理装置１２３が構成されてもよい。以降、各処理の説明における記号「Ｓ」は、フローチャートにおけるステップであることを意味する。以下、図６のフローチャートを用いて、情報処理装置１２３における動作の詳細を説明する。Ｓ６０１からＳ６０６の処理は、入力画像から前景候補を抽出する処理に含まれる。 FIG. 6 is a flowchart showing the flow of processing executed by the information processing apparatus 123. The CPU 1231 executes the program stored in the ROM 1233 using the RAM 1232 as work memory, whereby the information processing apparatus 123 functions as each unit shown in FIG. 4 and executes the series of processes shown in the flowchart of FIG. 6. It is assumed that the input image 411 has been input to the extraction unit 403 from the input unit 401, and that the initialization information 412 has been input to the holding unit 402 from the input unit 401. Note that not all of the processes described below need to be executed by the CPU 1231; the information processing apparatus 123 may be configured so that some or all of the processes are performed by one or more processing circuits other than the CPU 1231. Hereinafter, the symbol "S" in the description of each process denotes a step in the flowchart. The operation of the information processing apparatus 123 is described in detail below with reference to the flowchart of FIG. 6. The processes of S601 to S606 are included in the process of extracting foreground candidates from the input image.

Ｓ６０１では、抽出部４０３は、入力部４０１から入力された入力画像４１１において注目画素を特定する。注目画素の特定は、ラスタ―順で行ってもよいし、それ以外の順番に行ってもよい。 In S601, the extraction unit 403 identifies the pixel of interest in the input image 411 input from the input unit 401. The pixel of interest may be specified in raster order or in any other order.

Ｓ６０２では、抽出部４０３は、Ｓ６０１で特定された注目画素が背景の条件を満たすかどうかを判定する。注目画素が背景の条件を満たすかどうかの判定には、入力画像の注目画素と、入力画像に対応し、予め取得した背景画像とを用いた、前景背景分離技術である背景差分のように既知の手法が用いられる。抽出部４０３は、注目画素が背景の条件を満たすとの判定結果を得た場合（Ｓ６０２のＹＥＳ）、処理をＳ６１３に移行する。Ｓ６１３では、抽出部４０３は、Ｓ６０２の処理で背景の条件を満たすと判定された注目画素を背景に分類する。抽出部４０３によって注目画素が背景に分類された後、処理をＳ６０６に移行する。他方、抽出部４０３は、注目画素が背景の条件を満たさないとの判定結果を得た場合（Ｓ６０２のＮＯ）、処理をＳ６０３に移行する。 In S602, the extraction unit 403 determines whether or not the pixel of interest specified in S601 satisfies the background condition. This determination uses a known technique such as background subtraction, a foreground/background separation technique that uses the pixel of interest in the input image and a background image that corresponds to the input image and was acquired in advance. When the extraction unit 403 determines that the pixel of interest satisfies the background condition (YES in S602), it shifts the process to S613. In S613, the extraction unit 403 classifies the pixel of interest determined in S602 to satisfy the background condition as background. After the extraction unit 403 has classified the pixel of interest as background, the process shifts to S606. On the other hand, when the extraction unit 403 determines that the pixel of interest does not satisfy the background condition (NO in S602), it shifts the process to S603.

Ｓ６０３では、抽出部４０３は、Ｓ６０２の処理で背景の条件を満たさなかった着目画素を前景候補に分類する。Ｓ６０２の処理で背景の条件を満たさなかった着目画素は、抽出部４０３によって前景候補として抽出されるともいえる。 In S603, the extraction unit 403 classifies the pixel of interest that does not satisfy the background condition in the processing of S602 as a foreground candidate. It can be said that the pixel of interest that does not satisfy the background condition in the processing of S602 is extracted as a foreground candidate by the extraction unit 403.

Ｓ６０４では、抽出部４０３は、Ｓ６０３の処理で前景候補に分類された注目画素に対応する背景画像の画素値と、予め指定された１つ以上の背景色（背景の画素値）の差分をそれぞれチェックし、チェックした差分が第一の閾値以下であるか否かを判定する。抽出部４０３は、注目画素に対応する背景画像の画素値と、予め指定された背景色の何れかとの差分が第一の閾値以下であるとの判定結果を得た場合（Ｓ６０４のＹＥＳ）、処理をＳ６０５に移行する。抽出部４０３は、注目画素に対応する背景画像の画素値と、予め指定された何れの背景色との差分が第一の閾値以下ではないとの判定結果を得た場合（Ｓ６０４のＮＯ）、Ｓ６０５をスキップして処理をＳ６０６に移行する。 In S604, the extraction unit 403 checks the difference between the pixel value of the background image corresponding to the pixel of interest classified as a foreground candidate in S603 and each of one or more background colors (background pixel values) designated in advance, and determines whether each checked difference is equal to or less than the first threshold value. When the extraction unit 403 determines that the difference between the pixel value of the background image corresponding to the pixel of interest and any one of the designated background colors is equal to or less than the first threshold value (YES in S604), it shifts the process to S605. When the extraction unit 403 determines that the difference from none of the designated background colors is equal to or less than the first threshold value (NO in S604), it skips S605 and shifts the process to S606.

Ｓ６０５では、抽出部４０３は、Ｓ６０３の処理で前景候補に分類された画素のうち、Ｓ６０４の処理で第一の閾値以下であると判定された画素に対して指定色背景フラグをオンにセットする。 In S605, the extraction unit 403 sets the designated background color flag to ON for each pixel that was classified as a foreground candidate in S603 and whose difference was determined in S604 to be equal to or less than the first threshold value.

Ｓ６０６では、抽出部４０３は、入力画像全体に対して前景候補の抽出処理が終了したか否かを判定する。抽出部４０３は、入力画像において注目画素に特定されず未処理の画素があり、入力画像全体に対して前景候補の抽出処理が終了していないとの判定結果を得た場合（Ｓ６０６のＮＯ）、処理をＳ６０１に戻す。そして、抽出部４０３は、入力画像において注目画素として特定されていない未処理の画素に対してＳ６０１からＳ６０５、Ｓ６１３の処理を実行する。他方、抽出部４０３は、入力画像の全ての画素が注目画素として特定され前景候補または背景に分類されており、前景候補の抽出処理が入力画像全体に対して終了したとの判定結果を得た場合（Ｓ６０６のＹＥＳ）、処理をＳ６０７へ移行する。 In S606, the extraction unit 403 determines whether or not the foreground candidate extraction process has been completed for the entire input image. When the extraction unit 403 obtains a determination result that the foreground candidate extraction process has not been completed for the entire input image because there are unprocessed pixels that are not specified as the pixels of interest in the input image (NO in S606). , The process is returned to S601. Then, the extraction unit 403 executes the processes S601 to S605 and S613 on the unprocessed pixels that are not specified as the pixels of interest in the input image. On the other hand, the extraction unit 403 obtained a determination result that all the pixels of the input image were specified as the pixels of interest and classified as the foreground candidate or the background, and the extraction process of the foreground candidate was completed for the entire input image. If (YES in S606), the process shifts to S607.
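
The per-pixel loop of S601 to S606 (with S613) can be sketched in Python as follows. This is only an illustrative sketch, not the implementation of the embodiment: the function names, the sum-of-absolute-differences color metric, and the separate `subtraction_threshold` used for the background condition of S602 are all assumptions, since the description leaves the background subtraction method and the difference metric open.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # assumed RGB pixel representation

BACKGROUND = 0
FOREGROUND_CANDIDATE = 1


def color_diff(a: Pixel, b: Pixel) -> int:
    """Sum of absolute channel differences (an assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def classify_pixels(
    input_image: Sequence[Sequence[Pixel]],
    background_image: Sequence[Sequence[Pixel]],
    background_colors: Sequence[Pixel],
    subtraction_threshold: int,  # hypothetical threshold for S602
    first_threshold: int,
) -> Tuple[List[List[int]], List[List[bool]]]:
    """Return per-pixel labels and designated background color flags."""
    height, width = len(input_image), len(input_image[0])
    labels = [[BACKGROUND] * width for _ in range(height)]
    flags = [[False] * width for _ in range(height)]
    for y in range(height):          # S601: visit pixels in raster order
        for x in range(width):
            bg = background_image[y][x]
            # S602: background condition via simple background subtraction
            if color_diff(input_image[y][x], bg) <= subtraction_threshold:
                continue             # S613: pixel stays labeled as background
            labels[y][x] = FOREGROUND_CANDIDATE  # S603
            # S604: compare the background pixel with each designated color
            if any(color_diff(bg, c) <= first_threshold
                   for c in background_colors):
                flags[y][x] = True   # S605: set the flag to ON
    return labels, flags             # S606: all pixels processed
```

In this sketch the flag is kept in a separate boolean map with the same shape as the image, which makes the later per-region counting a simple lookup.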

Ｓ６０７では、抽出部４０３は、処理対象の入力画像に、前景候補に分類された画素があるかどうかを判定する。抽出部４０３は、処理対象の入力画像に、前景候補に分類された画素がないとの判定結果を得た場合、本フローを終了する。他方、抽出部４０３は、処理対象の入力画像に、前景候補に分類された画素があるとの判定結果を得た場合、処理をＳ６０８に移行する。 In S607, the extraction unit 403 determines whether or not the input image to be processed has pixels classified as foreground candidates. When the extraction unit 403 obtains a determination result that the input image to be processed does not have pixels classified as foreground candidates, the extraction unit 403 ends this flow. On the other hand, when the extraction unit 403 obtains a determination result that the input image to be processed has pixels classified as foreground candidates, the extraction unit 403 shifts the processing to S608.

Ｓ６０８では、抽出部４０３は、前景候補として抽出した画素を領域毎にグルーピングして前景候補領域４１６を特定する。そして、抽出部４０３は、特定した前景候補領域４１６を評価部４０４に出力する。評価部４０４は、前景候補領域４１６から抽出した輪郭に対応する背景画像の画素にて、Ｓ６０５にて指定背景色フラグがオンに設定された画素を背景色輪郭画素として計数し、前景候補領域ごとに背景色輪郭画素数を記録する。各前景候補領域の背景色輪郭画素数の記録先は、後述のＳ６１０の判定処理にて評価部４０４による読み出しが可能であれば、どの機能部であってもよい。このようにＳ６０１からＳ６０８の処理を実行することにより、入力画像は、第一領域、第二領域、第三領域の３つの領域に分類されることになる。すなわち、第一領域は、前景候補に分類され指定背景色フラグがオンにセットされた画素で構成される領域である。第二領域は、前景候補に分類されるが指定背景色フラグがオンにセットされていない画素で構成される領域である。第三領域は、背景に分類される画素で構成される領域である。 In S608, the extraction unit 403 groups the pixels extracted as foreground candidates into regions to specify the foreground candidate regions 416. The extraction unit 403 then outputs the specified foreground candidate regions 416 to the evaluation unit 404. Among the pixels of the background image corresponding to the contour extracted from each foreground candidate region 416, the evaluation unit 404 counts the pixels for which the designated background color flag was set to ON in S605 as background color contour pixels, and records the number of background color contour pixels for each foreground candidate region. The number of background color contour pixels of each foreground candidate region may be recorded in any functional unit, as long as the evaluation unit 404 can read it out in the determination process of S610 described later. By executing the processes of S601 to S608 in this way, the input image is classified into three regions: a first region, a second region, and a third region. The first region consists of pixels classified as foreground candidates for which the designated background color flag is set to ON. The second region consists of pixels classified as foreground candidates for which the designated background color flag is not set to ON. The third region consists of pixels classified as background.
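
The grouping and counting of S608 can be sketched as below. The description does not fix how pixels are grouped or how the contour is extracted, so this sketch assumes 4-connected grouping and defines a contour pixel as a region pixel with at least one 4-neighbor outside the region; the function names are hypothetical.

```python
from collections import deque
from typing import List, Sequence, Set, Tuple

Coord = Tuple[int, int]  # (row, column)


def neighbors4(y: int, x: int) -> Tuple[Coord, ...]:
    """The four axis-aligned neighbors of a pixel."""
    return ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))


def group_regions(labels: Sequence[Sequence[int]]) -> List[Set[Coord]]:
    """Group foreground candidate pixels (label 1) into 4-connected regions."""
    h, w = len(labels), len(labels[0])
    seen = [[False] * w for _ in range(h)]
    regions: List[Set[Coord]] = []
    for y in range(h):
        for x in range(w):
            if labels[y][x] != 1 or seen[y][x]:
                continue
            region: Set[Coord] = set()
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:  # breadth-first flood fill
                cy, cx = queue.popleft()
                region.add((cy, cx))
                for ny, nx in neighbors4(cy, cx):
                    if (0 <= ny < h and 0 <= nx < w
                            and labels[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


def contour_of(region: Set[Coord]) -> Set[Coord]:
    """Region pixels that touch at least one pixel outside the region."""
    return {p for p in region
            if any(n not in region for n in neighbors4(*p))}


def count_background_color_contour_pixels(
    region: Set[Coord], flags: Sequence[Sequence[bool]]
) -> int:
    """Count contour pixels whose designated background color flag is ON."""
    return sum(1 for (y, x) in contour_of(region) if flags[y][x])
```

A single isolated pixel is its own contour under this definition, which matches the one-pixel count given for the foreground candidate region 503.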

Ｓ６０９からＳ６１２の処理は、前景候補領域から前景を抽出する処理に含まれる。 The processes of S609 to S612 are included in the process of extracting the foreground from the foreground candidate region.

Ｓ６０９では、評価部４０４は、着目前景候補領域を特定する。着目前景候補領域の特定は、左端から順番に行ってもよいし、それ以外の順番に行ってもよい。 In S609, the evaluation unit 404 specifies the foreground candidate region of interest. The foreground candidate area of interest may be specified in order from the left end, or may be specified in any other order.

Ｓ６１０では、評価部４０４は、Ｓ６０９で特定した着目前景候補領域に対応する、Ｓ６０８で記録した背景色輪郭画素数（着目前景候補領域の背景色輪郭画素数）が第二の閾値以上であるか否かを判定する。評価部４０４は、着目前景候補領域の背景色輪郭画素数が第二の閾値以上ではないとの判定結果を得た場合（Ｓ６１０のＮＯ）、Ｓ６１１をスキップして処理をＳ６１２に移行する。他方、評価部４０４は、着目前景候補領域の背景色輪郭画素数が第二の閾値以上であるとの判定結果を得た場合（Ｓ６１０のＹＥＳ）、処理をＳ６１１に移行する。 In S610, the evaluation unit 404 determines whether or not the number of background color contour pixels recorded in S608 for the foreground candidate region of interest specified in S609 is equal to or greater than the second threshold value. When the evaluation unit 404 determines that the number of background color contour pixels of the foreground candidate region of interest is not equal to or greater than the second threshold value (NO in S610), it skips S611 and shifts the process to S612. On the other hand, when the evaluation unit 404 determines that the number of background color contour pixels of the foreground candidate region of interest is equal to or greater than the second threshold value (YES in S610), it shifts the process to S611.

Ｓ６１１では、評価部４０４は、Ｓ６１０の処理で背景色輪郭画素数が第二の閾値以上であると判定された着目前景候補領域を前景に分類する。Ｓ６１０の処理にて背景色輪郭画素数が第二の閾値以上であると判定された着目前景候補領域は、抽出部４０３によって前景として抽出されるともいえる。 In S611, the evaluation unit 404 classifies the foreground candidate region of interest, which is determined by the process of S610 to have the number of background color contour pixels equal to or greater than the second threshold value, into the foreground. It can be said that the foreground candidate region of interest, for which the number of background color contour pixels is determined to be equal to or greater than the second threshold value in the processing of S610, is extracted as the foreground by the extraction unit 403.

Ｓ６１２では、評価部４０４は、全ての前景候補領域に対して前景かどうかのチェックである前景の抽出処理が終了したか否かを判定する。評価部４０４は、全ての前景候補領域に対して前景かどうかのチェックを終了していないとの判定結果を得た場合（Ｓ６１２のＮＯ）、処理をＳ６０９に戻す。そして、評価部４０４は、入力画像において注目前景候補領域として特定されていない未処理の前景候補領域に対してＳ６０９からＳ６１２の処理を実行する。他方、評価部４０４は、全ての前景候補領域が注目前景候補領域として特定され前景またはそれ以外に分類されており、前景の抽出処理が全ての前景候補領域に対して行われてチェック終了との判定結果を得た場合（Ｓ６１２のＹＥＳ）、本フローを終了する。 In S612, the evaluation unit 404 determines whether or not the foreground extraction process, which is a check for whether or not the foreground is foreground, has been completed for all the foreground candidate regions. When the evaluation unit 404 obtains a determination result that the check for the foreground has not been completed for all the foreground candidate regions (NO in S612), the evaluation unit 404 returns the process to S609. Then, the evaluation unit 404 executes the processes S609 to S612 on the unprocessed foreground candidate region that is not specified as the attention foreground candidate region in the input image. On the other hand, in the evaluation unit 404, all the foreground candidate areas are specified as the attention foreground candidate areas and classified into the foreground or others, and the foreground extraction process is performed for all the foreground candidate areas and the check is completed. When the determination result is obtained (YES in S612), this flow ends.

ここで、上記Ｓ６１０の判定処理について、図５に示す前景候補領域を例に具体的に説明する。Ｓ６１０では、図５（ｂ）に示す各前景候補領域５０１、５０２、５０３は、対応する輪郭の背景色輪郭画素数が第二の閾値以上であるかに応じて、次のように分類されることになる。 Here, the determination process of S610 is described concretely using the foreground candidate regions shown in FIG. 5 as an example. In S610, each of the foreground candidate regions 501, 502, and 503 shown in FIG. 5B is classified as follows, according to whether the number of background color contour pixels of its contour is equal to or greater than the second threshold value.

第二の閾値が８に設定された場合には、背景色輪郭画素数が８である背景色輪郭画素５０８に対応する前景候補領域５０２が前景に分類されて抽出されることになる。なお、第二の設定値未満となる背景色輪郭画素５０７、５０９それぞれに対応する前景候補領域５０１、５０３は、評価部４０４によって明示的に背景に分類されて扱われる、または、前景候補領域として残されて、前景より優先度を下げて取り扱われる。 When the second threshold value is set to 8, the foreground candidate region 502, which corresponds to the background color contour pixels 508 and has 8 background color contour pixels, is classified as the foreground and extracted. The foreground candidate regions 501 and 503, which correspond to the background color contour pixels 507 and 509 and whose counts are less than the second threshold value, are either explicitly classified as background by the evaluation unit 404, or left as foreground candidate regions and handled with a lower priority than the foreground.

第二の閾値が３に設定された場合には、背景色輪郭画素数が４、８である背景色輪郭画素５０７、５０８に対応する前景候補領域５０１、５０２が前景に分類されて抽出されることになる。なお、第二の閾値未満となる背景色輪郭画素５０９に対応する前景候補領域５０３は、評価部４０４によって明示的に背景に分類されて扱われる、または、前景候補領域として残されて、前景より優先度を下げて取り扱われる。 When the second threshold value is set to 3, the foreground candidate regions 501 and 502, which correspond to the background color contour pixels 507 and 508 and have 4 and 8 background color contour pixels respectively, are classified as the foreground and extracted. The foreground candidate region 503, which corresponds to the background color contour pixels 509 and whose count is less than the second threshold value, is either explicitly classified as background by the evaluation unit 404, or left as a foreground candidate region and handled with a lower priority than the foreground.
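
The two worked cases above can be checked with a few lines of Python. The counts 4, 8, and 1 are the background color contour pixel counts given for the foreground candidate regions 501, 502, and 503; the dictionary layout and the function name `select_foreground` are assumptions made for illustration.

```python
from typing import Dict, List

# Background color contour pixel counts from the FIG. 5 example,
# keyed by the reference numeral of each foreground candidate region.
contour_counts: Dict[int, int] = {501: 4, 502: 8, 503: 1}


def select_foreground(counts: Dict[int, int], second_threshold: int) -> List[int]:
    """S610/S611: keep regions whose count is at least the second threshold."""
    return [region for region, n in counts.items() if n >= second_threshold]


print(select_foreground(contour_counts, 8))  # [502]
print(select_foreground(contour_counts, 3))  # [501, 502]
```

Regions that are not selected are not deleted here; as the text notes, they may be classified as background or kept as lower-priority candidates.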

以上説明したように、前景候補領域の輪郭に対応する背景画像と予め指定された背景色で画素値の差が第一の閾値以下である前景候補領域と、前景候補領域の輪郭とに関する情報に基づき、前景候補領域を前景であると判定することで次の効果を奏する。前景が出現する領域の背景がそれ以外の領域と異なる特徴を持つ場合、前景を精度よく抽出することができる。ズームやカメラの移動などで被写体に変化があった場合であっても、背景とみなす領域の特徴が同じであれば図６に示す処理を継続して実行可能である。また、複数の色があったり、日照変化で明るさが変わったりしても、背景を除外できる。 As described above, the information regarding the background image corresponding to the contour of the foreground candidate region, the foreground candidate region in which the difference in pixel values between the background colors specified in advance is equal to or less than the first threshold value, and the contour of the foreground candidate region Based on this, the following effect is obtained by determining that the foreground candidate area is the foreground. When the background of the area where the foreground appears has different characteristics from the other areas, the foreground can be extracted accurately. Even if the subject changes due to zooming or moving the camera, the process shown in FIG. 6 can be continuously executed as long as the characteristics of the area regarded as the background are the same. Also, even if there are multiple colors or the brightness changes due to changes in sunshine, the background can be excluded.

また、背景色輪郭画素数が第二の閾値以上となる前景候補領域を前景に分類することによって、ノイズや誤差が多い前景候補領域を前景から除外することができる。 Further, by classifying the foreground candidate region in which the number of background color contour pixels is equal to or greater than the second threshold value into the foreground, the foreground candidate region having a large amount of noise and error can be excluded from the foreground.

前景が出現する領域の背景がそれ以外の領域と異なる特徴を持つ場合に、画像から前景を精度よく抽出することができる。ズームやカメラの移動等で被写体に変化があった場合であっても、背景とみなす領域の特徴が同じであれば継続して動作が可能である。また、複数の色があったり、日照変化で明るさが変わったりした場合も、背景を除外できる。 When the background of the area where the foreground appears has different characteristics from the other areas, the foreground can be extracted accurately from the image. Even if the subject changes due to zooming or moving the camera, continuous operation is possible as long as the characteristics of the area regarded as the background are the same. Also, if there are multiple colors or the brightness changes due to changes in sunshine, the background can be excluded.

ある程度大きさが決まった前景候補領域を前景として扱う場合、本実施形態に示す方法を用いることで抽出したい形状と比較して大きすぎたり小さすぎたりする前景候補領域を処理対象から除外して、前景のノイズを減らすことができる。ある程度大きさが決まったとは、前景が人であり、入力画像の画角（幅方向の画素数と高さ方向の画素数）、カメラから前景までの焦点距離、人の大きさや形状などを基に、前景が所定（所望）の大きさでマッピングされるのを演算して決められる場合を含む。抽出したい形状と比較して大きすぎる前景候補領域は、抽出したい形状の大きさと完全一致ではなく、抽出したい形状の大きさに数％マージンを加算した大きさより大きい領域を示している。抽出したい形状と比較して小さすぎる前景候補領域とは、抽出したい形状の大きさと完全一致ではなく、抽出したい形状の大きさに数％マージンを減算した大きさより小さい領域を示している。 When foreground candidate regions of a roughly known size are to be treated as the foreground, the method of this embodiment can exclude from processing those foreground candidate regions that are too large or too small compared with the shape to be extracted, thereby reducing foreground noise. A "roughly known size" includes the case where the foreground is a person and the predetermined (desired) size at which the foreground is mapped can be computed from the angle of view of the input image (the number of pixels in the width and height directions), the focal length from the camera to the foreground, the size and shape of the person, and so on. A foreground candidate region that is too large compared with the shape to be extracted does not mean one that merely fails to match the size of that shape exactly; it means a region larger than the size of the shape plus a margin of a few percent. Likewise, a foreground candidate region that is too small means a region smaller than the size of the shape minus a margin of a few percent.

＜＜実施形態２＞＞ << Embodiment 2 >>
実施形態１では、背景色輪郭画素数が第二の閾値以上である前景候補領域を前景に分類して抽出する場合について説明した。本実施形態では、前景候補領域の輪郭の輪郭長と背景色輪郭画素数との比率が第三の閾値以上である前景候補領域を前景に分類して抽出する場合について説明する。実施形態１と共通の部分については、説明を省略する。 The first embodiment described the case where a foreground candidate region whose number of background color contour pixels is equal to or greater than the second threshold value is classified as the foreground and extracted. The present embodiment describes the case where a foreground candidate region for which the ratio between the contour length of its contour and its number of background color contour pixels is equal to or greater than a third threshold value is classified as the foreground and extracted. Descriptions of the parts common to the first embodiment are omitted.

（情報処理装置の論理構成） (Logical configuration of information processing device)
次に、本実施形態の情報処理装置の論理構成は、評価部４０４以外に関し、図４に示す実施形態１の情報処理装置１２３の論理構成と同じであり、その説明を省略する。 Except for the evaluation unit 404, the logical configuration of the information processing apparatus of the present embodiment is the same as that of the information processing apparatus 123 of the first embodiment shown in FIG. 4, and its description is omitted.

本実施形態の評価部４０４は、前景候補領域４１６からその輪郭を抽出して前景候補領域４１６の輪郭の座標情報（画素）を取得し、輪郭の座標情報に対応する背景画像の画素を輪郭背景画像４１５として保持部４０２に保持される背景画像４１３から取得する。そして、評価部４０４は、前景候補領域４１６のそれぞれについて、輪郭背景画像４１５と背景色４１４とで画素値の差が第一の閾値以下となり、指定背景色フラグがオンに設定された画素を背景色輪郭画素数として計数する。評価部４０４は、輪郭の画素数（輪郭長）と、輪郭に対応する背景色輪郭画素数との比率を前景候補領域ごとに導出して記録する。導出した比率の記録先は、後述の評価部４０４による読み出しが可能であれば、どの機能部であってもよい。評価部４０４は、輪郭長と背景色輪郭画素数の比率が第三の閾値以上であるか否かを判定する。評価部４０４は、輪郭長と背景色輪郭画素数の比率が第三の閾値以上であるとの判定結果を得た場合、対応する前景候補領域を前景に分類する。なお、評価部４０４は、輪郭長と背景色輪郭画素数の比率が第三の閾値未満となる前記比率に対応する前景候補領域を明示的に背景に分類して扱ってもよいし、前景候補領域として残しておき、前景より優先度を下げて取り扱ってもよい。 The evaluation unit 404 of the present embodiment extracts the contour from the foreground candidate region 416 to acquire the coordinate information (pixels) of that contour, and acquires the pixels of the background image corresponding to the contour coordinate information, as the contour background image 415, from the background image 413 held by the holding unit 402. Then, for each foreground candidate region 416, the evaluation unit 404 counts, as the number of background color contour pixels, the pixels for which the difference in pixel value between the contour background image 415 and the background color 414 is equal to or less than the first threshold value and the designated background color flag is set to ON. The evaluation unit 404 derives and records, for each foreground candidate region, the ratio between the number of pixels of the contour (contour length) and the number of background color contour pixels corresponding to the contour. The derived ratio may be recorded in any functional unit, as long as the evaluation unit 404 can read it out later. The evaluation unit 404 determines whether or not the ratio between the contour length and the number of background color contour pixels is equal to or greater than the third threshold value. When the evaluation unit 404 determines that the ratio is equal to or greater than the third threshold value, it classifies the corresponding foreground candidate region as the foreground. A foreground candidate region whose ratio is less than the third threshold value may be explicitly classified as background by the evaluation unit 404, or may be left as a foreground candidate region and handled with a lower priority than the foreground.

（情報処理装置における動作） (Operation in information processing device)
本実施形態に係る情報処理装置による処理について、図７を用いて説明する。図７は、本実施形態の情報処理装置で実行される処理の流れを示すフローチャートである。情報処理装置１２３は、ＲＯＭ１２３３に格納されたプログラムをＣＰＵ１２３１がＲＡＭ１２３２をワークメモリとして実行することで、図４に示す各部として機能し、図７のフローチャートに示す一連の処理を実行する。抽出部４０３には、入力画像４１１が入力部４０１から入力されているとする。また、保持部４０２には、初期化情報４１２が入力部４０１から入力されているとする。なお、以下に示す処理の全てがＣＰＵ１２３１によって実行される必要はなく、処理の一部または全部が、ＣＰＵ１２３１以外の一つ又は複数の処理回路によって行われるように情報処理装置１２３が構成されてもよい。実施形態１と同じ処理内容のステップには同一符号を付記する。Ｓ６０１からＳ６０６の処理は、入力画像から前景候補を抽出する処理に含まれる。 The processing performed by the information processing apparatus according to the present embodiment is described with reference to FIG. 7. FIG. 7 is a flowchart showing the flow of processing executed by the information processing apparatus of the present embodiment. The CPU 1231 executes the program stored in the ROM 1233 using the RAM 1232 as work memory, whereby the information processing apparatus 123 functions as each unit shown in FIG. 4 and executes the series of processes shown in the flowchart of FIG. 7. It is assumed that the input image 411 has been input to the extraction unit 403 from the input unit 401, and that the initialization information 412 has been input to the holding unit 402 from the input unit 401. Note that not all of the processes described below need to be executed by the CPU 1231; the information processing apparatus 123 may be configured so that some or all of the processes are performed by one or more processing circuits other than the CPU 1231. Steps with the same processing content as in the first embodiment are given the same reference numerals. The processes of S601 to S606 are included in the process of extracting foreground candidates from the input image.

本実施形態のＳ６０１からＳ６０７では、実施形態１と同様な処理が行われる。 In S601 to S607 of the present embodiment, the same processing as that of the first embodiment is performed.

続いて、Ｓ７０８では、抽出部４０３は、前景候補として抽出した画素を領域毎にグルーピングして前景候補領域４１６を特定する。そして、抽出部４０３は、特定した前景候補領域４１６を評価部４０４に出力する。評価部４０４は、前景候補領域４１６から抽出した輪郭の画素数（輪郭長）と、輪郭に対応する背景画像の画素であって、指定背景色フラグがオンに設定された画素を背景色輪郭画素として計数して得た背景色輪郭画素数との比率を前景候補領域ごとに記録する。比率の記録先は、評価部４０４による読み出しが可能であれば、どの機能部であってもよい。 Subsequently, in S708, the extraction unit 403 groups the pixels extracted as foreground candidates into regions to specify the foreground candidate regions 416. The extraction unit 403 then outputs the specified foreground candidate regions 416 to the evaluation unit 404. For each foreground candidate region, the evaluation unit 404 records the ratio between the number of pixels of the contour extracted from the foreground candidate region 416 (contour length) and the number of background color contour pixels, which is obtained by counting, among the pixels of the background image corresponding to the contour, the pixels for which the designated background color flag is set to ON. The ratio may be recorded in any functional unit, as long as the evaluation unit 404 can read it out.

Ｓ６０９、Ｓ７１０、Ｓ６１１からＳ６１２の処理は、前景候補領域から前景を抽出する処理に含まれる。 The processes of S609, S710, and S611 to S612 are included in the process of extracting the foreground from the foreground candidate region.

続いて、本実施形態のＳ６０９では、実施形態１と同様な処理が行われる。 Subsequently, in S609 of the present embodiment, the same processing as that of the first embodiment is performed.

続いて、Ｓ７１０では、評価部４０４は、Ｓ６０９で特定した着目前景候補領域に対応する、Ｓ７０８で記録した、輪郭長と背景色輪郭画素数の比率が第三の閾値以上であるか否かを判定する。評価部４０４は、輪郭長と背景色輪郭画素数の比率が第三の閾値以上ではないとの判定結果を得た場合（Ｓ７１０のＮＯ）、Ｓ６１１をスキップして処理をＳ６１２に移行する。他方、評価部４０４は、輪郭長と背景色輪郭画素数の比率が第三の閾値以上であるとの判定結果を得た場合（Ｓ７１０のＹＥＳ）、処理をＳ６１１に移行する。 Subsequently, in S710, the evaluation unit 404 determines whether or not the ratio between the contour length and the number of background color contour pixels, recorded in S708 for the foreground candidate region of interest specified in S609, is equal to or greater than the third threshold value. When the evaluation unit 404 determines that the ratio is not equal to or greater than the third threshold value (NO in S710), it skips S611 and shifts the process to S612. On the other hand, when the evaluation unit 404 determines that the ratio is equal to or greater than the third threshold value (YES in S710), it shifts the process to S611.

続いて、本実施形態のＳ６１１、Ｓ６１２では、実施形態１と同様な処理が行われる。Ｓ６１２にて、評価部４０４は、前景の抽出処理が全ての前景候補領域に対して行われてチェック終了との判定結果を得た場合（Ｓ６１２のＹＥＳ）、本フローを終了する。 Subsequently, in S611 and S612 of the present embodiment, the same processing as in the first embodiment is performed. In S612, the evaluation unit 404 ends this flow when the foreground extraction process is performed on all the foreground candidate regions and a determination result that the check is completed is obtained (YES in S612).

ここで、上記Ｓ７１０の判定処理について、図５に示す前景候補領域を例に具体的に説明する。Ｓ７１０の処理の前に、図５（ｂ）に示す前景候補領域５０１、５０２、５０３それぞれの輪郭長と背景色輪郭画素数との比率は、評価部４０４によって導出されているとする。なお、評価部４０４によって導出された前景候補領域の輪郭長と背景色輪郭画素数との比率は、Ｓ７０８にて、記録されることになる。 Here, the determination process of S710 will be specifically described using the foreground candidate region shown in FIG. 5 as an example. Before the processing of S710, it is assumed that the ratio of the contour length of each of the foreground candidate regions 501, 502, and 503 shown in FIG. 5B to the number of background color contour pixels is derived by the evaluation unit 404. The ratio of the contour length of the foreground candidate region derived by the evaluation unit 404 to the number of background color contour pixels is recorded in S708.

前景候補領域５０１の輪郭の輪郭長が１２であり、前景候補領域５０１の輪郭に対応する背景色輪郭画素数が４であるため、前景候補領域５０１の輪郭長と背景色輪郭画素数の比率は、３３．３％と導出される。前景候補領域５０２の輪郭の輪郭長が１２であり、前景候補領域５０２の輪郭に対応する背景色輪郭画素数が８であるため、前景候補領域５０２の輪郭長と背景色輪郭画素数の比率は、６６．６％と導出される。前景候補領域５０３の輪郭の輪郭長が１であり、前景候補領域５０３の輪郭に対応する背景色輪郭画素数が１であるため、前景候補領域５０３の輪郭長と背景色輪郭画素数の比率は、１００％と導出される。 Since the contour length of the foreground candidate region 501 is 12 and the number of background color contour pixels corresponding to its contour is 4, the ratio between the contour length and the number of background color contour pixels of the foreground candidate region 501 is derived as 33.3%. Since the contour length of the foreground candidate region 502 is 12 and the number of background color contour pixels corresponding to its contour is 8, the ratio for the foreground candidate region 502 is derived as 66.6%. Since the contour length of the foreground candidate region 503 is 1 and the number of background color contour pixels corresponding to its contour is 1, the ratio for the foreground candidate region 503 is derived as 100%.

Ｓ７１０では、図５（ｂ）に示す各前景候補領域５０１、５０２、５０３は、対応する輪郭の輪郭長と背景色輪郭画素数との比率が第三の閾値以上であるかに応じて、次のように分類されることになる。第三の閾値が５０％に設定された場合、前景候補領域の輪郭長と背景色輪郭画素数の比率が６６．６％、１００％となる前景候補領域５０２、５０３とが前景に分類されて抽出されることになる。なお、第三の閾値未満となる背景色輪郭画素５０７に対応する前景候補領域５０１は、評価部４０４によって明示的に背景に分類されて扱われる、または、前景候補領域として残されて、前景より優先度を下げて取り扱われる。 In S710, each of the foreground candidate regions 501, 502, and 503 shown in FIG. 5B is classified as follows, according to whether the ratio between the contour length of its contour and its number of background color contour pixels is equal to or greater than the third threshold value. When the third threshold value is set to 50%, the foreground candidate regions 502 and 503, whose ratios are 66.6% and 100% respectively, are classified as the foreground and extracted. The foreground candidate region 501, which corresponds to the background color contour pixels 507 and whose ratio is less than the third threshold value, is either explicitly classified as background by the evaluation unit 404, or left as a foreground candidate region and handled with a lower priority than the foreground.
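
The ratio test of S710 can be sketched as follows, using the contour lengths (12, 12, 1) and background color contour pixel counts (4, 8, 1) stated for the FIG. 5 example. The worked figures imply that the ratio is the number of background color contour pixels divided by the contour length, and the sketch assumes that reading; the names `contour_lengths`, `contour_counts`, and `select_by_ratio` are hypothetical.

```python
from typing import Dict, List

# Values from the FIG. 5 example, keyed by region reference numeral.
contour_lengths: Dict[int, int] = {501: 12, 502: 12, 503: 1}
contour_counts: Dict[int, int] = {501: 4, 502: 8, 503: 1}


def background_color_ratio(count: int, contour_length: int) -> float:
    """Fraction of contour pixels that are background color contour pixels."""
    return count / contour_length


def select_by_ratio(
    lengths: Dict[int, int], counts: Dict[int, int], third_threshold: float
) -> List[int]:
    """S710/S611: keep regions whose ratio is at least the third threshold."""
    return [
        region
        for region in lengths
        if background_color_ratio(counts[region], lengths[region]) >= third_threshold
    ]


print(select_by_ratio(contour_lengths, contour_counts, 0.5))  # [502, 503]
```

Because the ratio is normalized by the contour length, the same threshold applies to large and small regions alike, which is why the one-pixel region 503 passes here although it fails the count-based test of the first embodiment.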

以上説明したように、前景候補領域の輪郭長と背景色輪郭画素数の比率を用いて前景候補領域をフィルタリングすることで、指定した背景色の領域上に抽出の範囲を限定して、前景候補領域の大きさに関わらず前景を精度よく抽出することができる。また、輪郭長と背景色輪郭画素数との比率を用いることによって、撮像中にズームを変更した場合も前景抽出の条件を変更することなく、変更前と同様、入力画像から前景を抽出することができる。 As described above, filtering the foreground candidate regions using the ratio between the contour length and the number of background color contour pixels limits the extraction range to areas of the designated background color, so that the foreground can be extracted accurately regardless of the size of the foreground candidate region. In addition, because the ratio between the contour length and the number of background color contour pixels is used, the foreground can be extracted from the input image in the same way as before even if the zoom is changed during imaging, without changing the foreground extraction conditions.

[Other Embodiments]
Embodiment 1 described a process that classifies as foreground those foreground candidate regions whose number of background-color contour pixels is equal to or greater than the second threshold. Embodiment 2 described a process that classifies as foreground those foreground candidate regions whose ratio of background-color contour pixels to contour length is equal to or greater than the third threshold. The two conditions may also be combined: a foreground candidate region may be classified as foreground when its number of background-color contour pixels is equal to or greater than the second threshold and its ratio of background-color contour pixels to contour length is equal to or greater than the third threshold. That is, a first extraction process using the second threshold may be performed, followed by a second extraction process using the third threshold applied to the result of the first. Alternatively, the second extraction process using the third threshold may be performed first, followed by the first extraction process using the second threshold applied to its result.

The case where the second extraction process follows the first extraction process is described concretely below, using the foreground candidate regions shown in FIG. 5 as an example. Assume that the second threshold, used in the first extraction process, is set to 3, and that the third threshold, used in the second extraction process, is set to 50%.

In the first extraction process, each of the foreground candidate regions 501, 502, and 503 shown in FIG. 5(b) is classified according to whether the number of background-color contour pixels on its contour is equal to or greater than the second threshold of 3. The foreground candidate regions 501 and 502, which correspond to the background-color contour pixels 507 and 508 and have 4 and 8 background-color contour pixels respectively, are classified as foreground and extracted.

Next, in the second extraction process, each of the remaining foreground candidate regions 501 and 502 is classified according to whether its ratio of background-color contour pixels to contour length is equal to or greater than the third threshold of 50%. Only the foreground candidate region 502, whose ratio is 66.6%, is classified as foreground and extracted.

The foreground candidate region 503, which corresponds to the background-color contour pixels 509 and does not meet the combined condition on the second and third thresholds, and the foreground candidate region 501, which corresponds to the background-color contour pixels 507 and meets the second-threshold condition but not the third-threshold condition, are handled as follows. The foreground candidate regions 501 and 503 are either explicitly classified as background by the evaluation unit 404, or left as foreground candidate regions and handled with a lower priority than the foreground. This reduces false detection of noise in areas that should be extracted as background, that is, in areas that should not be extracted as foreground.
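The two-stage flow in this example (count filter first, then ratio filter) can be traced with a short sketch. The dictionary layout and function names are hypothetical; the counts 4 and 8 and the thresholds 3 and 50% come from the text, while the contour lengths and the count for region 503 are assumed values that merely reproduce the stated ratios.

```python
SECOND_THRESHOLD = 3   # minimum number of background-color contour pixels
THIRD_THRESHOLD = 0.5  # minimum ratio of those pixels to the contour length

# Assumed measurements per foreground candidate region (see lead-in).
regions = {
    501: {"bg_pixels": 4, "contour_length": 10},  # ratio 40%
    502: {"bg_pixels": 8, "contour_length": 12},  # ratio 66.6%
    503: {"bg_pixels": 2, "contour_length": 2},   # ratio 100%
}


def first_extraction(ids):
    """Keep regions with enough background-color contour pixels."""
    return [i for i in ids if regions[i]["bg_pixels"] >= SECOND_THRESHOLD]


def second_extraction(ids):
    """Keep regions whose background-color ratio is high enough."""
    return [
        i for i in ids
        if regions[i]["bg_pixels"] / regions[i]["contour_length"] >= THIRD_THRESHOLD
    ]


all_ids = sorted(regions)
after_first = first_extraction(all_ids)      # 503 drops out here
foreground = second_extraction(after_first)  # 501 drops out here
# Rejected regions are either treated as background or kept as
# lower-priority candidates, depending on the chosen policy.
rejected = [i for i in all_ids if i not in foreground]
```

With these values, the first stage keeps regions 501 and 502, the second stage keeps only 502, and regions 501 and 503 end up in the rejected list.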

The case where the first extraction process follows the second extraction process is described concretely below, using the foreground candidate regions shown in FIG. 5 as an example. Assume that the third threshold, used in the second extraction process, is set to 50%, and that the second threshold, used in the first extraction process, is set to 3.

In the second extraction process, each of the foreground candidate regions 501, 502, and 503 is classified according to whether its ratio of background-color contour pixels to contour length is equal to or greater than the third threshold of 50%. The foreground candidate regions 502 and 503, whose ratios are 66.6% and 100% respectively, are classified as foreground and extracted.

Next, in the first extraction process, each of the remaining foreground candidate regions 502 and 503 shown in FIG. 5(b) is classified according to whether the number of background-color contour pixels on its contour is equal to or greater than the second threshold of 3. The foreground candidate region 502, which corresponds to the background-color contour pixels 508 and has 8 background-color contour pixels, is classified as foreground and extracted.

The foreground candidate region 501, which corresponds to the background-color contour pixels 507 and does not meet the combined condition on the third and second thresholds, and the foreground candidate region 503, which corresponds to the background-color contour pixels 509 and meets the third-threshold condition but not the second-threshold condition, are handled as follows. The foreground candidate regions 501 and 503 are either explicitly classified as background by the evaluation unit 404, or left as foreground candidate regions and handled with a lower priority than the foreground. This reduces false detection of noise in areas that should be extracted as background, that is, in areas that should not be extracted as foreground.
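Because each extraction process is a simple pass/fail filter on a region's own measurements, applying the two processes in either order should leave the same regions. The sketch below checks this on a small grid of hypothetical (background-color pixel count, contour length) pairs. The function names and the grid are assumptions for illustration only; the thresholds 3 and 50% are the ones used in the examples above.

```python
def passes_count(bg_pixels, contour_length, second_threshold=3):
    """First extraction process: absolute count of background-color contour pixels."""
    return bg_pixels >= second_threshold


def passes_ratio(bg_pixels, contour_length, third_threshold=0.5):
    """Second extraction process: ratio of those pixels to the contour length."""
    return bg_pixels / contour_length >= third_threshold


def cascade(regions, stages):
    """Apply the given filter stages one after another."""
    survivors = list(regions)
    for stage in stages:
        survivors = [r for r in survivors if stage(*r)]
    return survivors


# Every (bg_pixels, contour_length) pair with contour lengths 1..12.
grid = [(bg, length) for length in range(1, 13) for bg in range(length + 1)]

count_then_ratio = cascade(grid, [passes_count, passes_ratio])
ratio_then_count = cascade(grid, [passes_ratio, passes_count])
same_result = count_then_ratio == ratio_then_count
```

On this grid both orders keep, for example, the pair (8, 12) and reject (4, 10) and (2, 2). The order can still matter for cost: running the cheaper or more selective filter first leaves fewer regions for the second stage to evaluate.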

The present invention can also be realized by a process in which a program that implements one or more functions of the above-described embodiments is supplied to a system or device via a network or a storage medium, and one or more processors in a computer of that system or device read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more of the functions.

123 Information processing device
403 Extraction unit
404 Evaluation unit

Claims

An information processing device comprising:
extraction means for extracting a candidate region that is a candidate for a foreground, based on an input image and a background image corresponding to the input image; and
determination means for determining, as the foreground, a contour region corresponding to the contour of the candidate region when the number of pixels in the contour region whose pixel values differ from the pixel values of the corresponding region of the background image by no more than a first threshold value is equal to or greater than a second threshold value.

The information processing apparatus according to claim 1, wherein the determination means determines, as the foreground, a contour region in which the number of pixels whose difference is equal to or less than the first threshold value is equal to or greater than the second threshold value, and in which the ratio of that number of pixels to the number of pixels in the contour region is equal to or greater than a third threshold value.

An information processing device comprising:
extraction means for extracting a candidate region that is a candidate for a foreground, based on an input image and a background image corresponding to the input image; and
determination means for determining, as the foreground, a contour region corresponding to the contour of the candidate region when the ratio of the number of pixels in the contour region whose pixel values differ from the pixel values of the corresponding region of the background image by no more than a first threshold value to the number of pixels in the contour region is equal to or greater than a third threshold value.

The information processing apparatus according to claim 3, wherein the determination means determines, as the foreground, a contour region in which the ratio of the number of pixels whose difference is equal to or less than the first threshold value to the number of pixels in the contour region is equal to or greater than the third threshold value, and in which that number of pixels is equal to or greater than a second threshold value.

The information processing apparatus according to any one of claims 1 to 4, wherein the input image is a captured image obtained by imaging with an imaging means,
the information processing apparatus further comprising acquisition means for acquiring the captured image.

The information processing apparatus according to any one of claims 1 to 5, further comprising a holding means for holding the background image corresponding to the input image.

An information processing method comprising:
an extraction step of extracting a candidate region that is a candidate for a foreground, based on an input image and a background image corresponding to the input image; and
a determination step of determining, as the foreground, a contour region corresponding to the contour of the candidate region when the number of pixels in the contour region whose pixel values differ from the pixel values of the corresponding region of the background image by no more than a first threshold value is equal to or greater than a second threshold value.

An information processing method comprising:
an extraction step of extracting a candidate region that is a candidate for a foreground, based on an input image and a background image corresponding to the input image; and
a determination step of determining, as the foreground, a contour region corresponding to the contour of the candidate region when the ratio of the number of pixels in the contour region whose pixel values differ from the pixel values of the corresponding region of the background image by no more than a first threshold value to the number of pixels in the contour region is equal to or greater than a third threshold value.

A program that causes a computer to function as each means of the information processing apparatus according to any one of claims 1 to 6.