JP2009266169A - Information processor and method, and program

Info

Publication number: JP2009266169A
Application number: JP2008118394A
Authority: JP
Inventors: Takamasa Yamano; 高将山野; Tomoyuki Otsuki; 知之大月; Kenji Takahashi; 健治高橋; Tetsujiro Kondo; 哲二郎近藤
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2008-04-30
Filing date: 2008-04-30
Publication date: 2009-11-12

Abstract

PROBLEM TO BE SOLVED: To extract a moving object region more easily and accurately. SOLUTION: In a moving object region extraction device 11, a full-screen motion calculation unit 22 calculates, for each of a plurality of frames constituting a moving image, the full-screen motion representing the motion of the entire processing screen to be processed, and a moving object region extraction unit 25 decides whether each block obtained by dividing the processing screen is a moving object region, based on the full-screen motion and on the activity, an index representing the complexity of luminance changes inside the block. The invention can be applied, for example, to an information processor. COPYRIGHT: (C)2010,JPO&INPIT

Description

本発明は、情報処理装置及び方法、並びにプログラムに関し、特に、より簡単かつ正確に動物体領域を抽出することができるようになった情報処理装置及び方法、並びにプログラムに関する。 The present invention relates to an information processing apparatus, method, and program, and more particularly, to an information processing apparatus, method, and program that can extract a moving object region more easily and accurately.

画像内から、動いている物体（オブジェクト）の領域である動物体領域を抽出する抽出手法が、従来から多く提案されている。具体的には、例えば、動物体領域の抽出手法として、動きベクトル情報を用いる方法が数多く提案されている。 Conventionally, many extraction methods for extracting a moving object region, which is a moving object (object) region, from an image have been proposed. Specifically, for example, many methods using motion vector information have been proposed as a method for extracting a moving object region.

特許文献１には、隣接ブロックの動きの相関性から動物体領域を抽出する手法が開示されている。特許文献２には、過去の動きベクトルを参照し、それらと類似する動きベクトルをもつ領域を動物体領域として入力画像から分離する方法が開示されている。 Patent Document 1 discloses a technique for extracting a moving object region from the correlation between the motions of adjacent blocks. Patent Document 2 discloses a method that refers to past motion vectors and separates regions having similar motion vectors from the input image as moving object regions.

また、いわゆる背景差分法、すなわち、動物体の写っていない背景画像を生成し、その背景画像と入力画像との差分を計算することで、動物体を検出する手法が数多く提案されている（例えば、特許文献３参照）。 In addition, many techniques based on the so-called background difference method have been proposed, which detect a moving object by generating a background image in which the moving object does not appear and calculating the difference between that background image and the input image (see, for example, Patent Document 3).
特開2004-207786号公報 特開2002-27480号公報 特開2002-157599号公報 JP 2004-207786 A, JP 2002-27480 A, JP 2002-157599 A

しかしながら、上述した特許文献１や２を含む動きベクトル情報を用いる従来の手法では、一般に、動物体領域の抽出結果が動き検出の精度に強く依存するため、動き検出が失敗し易い平坦部や繰り返しパターン部がある場合、あるいは、物体の変形が激しい場合、動物体の誤検出を生じ易いという問題があった。 However, in conventional methods that use motion vector information, including Patent Documents 1 and 2 described above, the extraction result for the moving object region generally depends strongly on the accuracy of motion detection. Consequently, when the image contains flat portions or repetitive patterns, where motion detection tends to fail, or when an object deforms severely, erroneous detection of the moving object is likely to occur.

また、上述した特許文献３を含む従来の背景差分法では、一般に、背景を生成するために処理が複雑かつ重くなりがちであった。また、静止背景であれば良いが、背景自体に動きがある場合、すなわち例えば、カメラのパンチルト、ズーム、回転等が生じた場合、精度良く背景を生成することが困難であった。その結果、必然的に、動物体の誤検出を生じ易いという問題があった。 Further, in conventional background difference methods, including Patent Document 3 described above, the processing required to generate the background generally tends to be complicated and computationally heavy. A static background poses no problem, but when the background itself moves, for example when the camera pans, tilts, zooms, or rotates, it is difficult to generate the background accurately. As a result, erroneous detection of the moving object is inevitably likely to occur.

本発明は、このような状況に鑑みてなされたものであり、より簡単かつ正確に動物体を検出することができるようにするものである。 The present invention has been made in view of such a situation, and makes it possible to detect a moving object more easily and accurately.

本発明の一側面の情報処理装置は、動画像を構成する複数の単位画像のそれぞれについて、処理対象とする処理画面の全体についての動きを表す全画面動きを算出する全画面動き算出手段と、前記全画面動き算出手段により算出された前記全画面動きと、前記処理画面を分割することで得られるブロック内の輝度の変化の複雑度を表す指標であるアクティビティとに基づいて、前記ブロックが動物体領域か否かを判定する判定手段とを備える。 An information processing apparatus according to one aspect of the present invention includes: full-screen motion calculation means for calculating, for each of a plurality of unit images constituting a moving image, a full-screen motion representing the motion of the entire processing screen to be processed; and determination means for determining whether a block obtained by dividing the processing screen is a moving object region, based on the full-screen motion calculated by the full-screen motion calculation means and on an activity, which is an index representing the complexity of the luminance change within the block.

本発明の一側面の情報処理装置には、前記第１の単位画像の前記ブロック毎に、前記第２の単位画像上の探索先との差分についての評価値を演算し、前記評価値と前記探索先とを対応付けた評価値テーブルを生成し、前記評価値テーブルに含まれる探索先の中から、前記評価値テーブルに含まれる評価値に基づいて、検出すべき探索先を検出し、検出した探索先を各前記ブロックの動きとする動き検出手段をさらに設け、前記全画面動き算出手段には、前記動き検出手段により検出された前記ブロックの動きから前記全画面動きを算出させ、前記判定手段には、前記ブロック毎の前記評価値テーブルに含まれる前記全画面動きベクトル先に対応する評価値を、前記全画面動きベクトル先との差分として用いさせることができる。 The information processing apparatus according to one aspect of the present invention calculates an evaluation value for a difference from a search destination on the second unit image for each block of the first unit image, and the evaluation value and the An evaluation value table in which a search destination is associated is generated, and a search destination to be detected is detected from the search destinations included in the evaluation value table based on the evaluation value included in the evaluation value table, and is detected. A motion detection unit that uses the search destination as the motion of each block, and the full screen motion calculation unit calculates the full screen motion from the motion of the block detected by the motion detection unit, and the determination The means may use an evaluation value corresponding to the full screen motion vector destination included in the evaluation value table for each block as a difference from the full screen motion vector destination.

本発明の一側面の情報処理装置には、前記アクティビティを計算するアクティビティ計算手段をさらに設けることができる。 The information processing apparatus according to one aspect of the present invention may further include activity calculation means for calculating the activity.

本発明の一側面の情報処理方法またはプログラムは、動画像を構成する複数の単位画像のそれぞれについて、処理対象とする処理画面の全体についての動きを表す全画面動きを算出し、算出された前記全画面動きと、前記処理画面を分割することで得られるブロック内の輝度の変化の複雑度を表す指標であるアクティビティとに基づいて、前記ブロックが動物体領域か否かを判定するステップを含む情報処理方法または情報処理装置に制御を行うコンピュータに実行させるプログラム。 An information processing method or program according to one aspect of the present invention calculates a full screen motion representing a motion of an entire processing screen to be processed for each of a plurality of unit images constituting a moving image, and the calculated Determining whether or not the block is a moving object region based on the whole screen movement and an activity that is an index representing the complexity of the luminance change in the block obtained by dividing the processing screen. A program to be executed by a computer that controls an information processing method or information processing apparatus.

本発明の一側面においては、動画像を構成する複数の単位画像のそれぞれについて、処理対象とする処理画面の全体についての動きを表す全画面動きが算出され、算出された前記全画面動きと、前記処理画面を分割することで得られるブロック内の輝度の変化の複雑度を表す指標であるアクティビティとに基づいて、前記ブロックが動物体領域か否かが判定される。 In one aspect of the present invention, for each of a plurality of unit images constituting a moving image, a full screen motion representing a motion of the entire processing screen to be processed is calculated, and the calculated full screen motion; Whether or not the block is a moving object region is determined based on an activity that is an index representing the complexity of the change in luminance in the block obtained by dividing the processing screen.

本発明の一側面によれば、例えば、動物体領域を抽出することができる。特に、例えば、より簡単かつ正確に動物体領域を抽出することができる。 According to one aspect of the present invention, for example, a moving object region can be extracted. In particular, for example, a moving object region can be extracted more easily and accurately.

以下、図面を参照して本発明を適用した実施の形態について説明する。 Embodiments to which the present invention is applied will be described below with reference to the drawings.

なお、本実施の形態では、動画像の処理単位として、フレームを採用する。ただし、処理単位は、フレームに限定されず、例えばフィールド等でも構わない。以下、このようなフィールドやフレームといった動画像の処理単位を、単位画像と称する。即ち、本発明は、任意の単位画像に対する処理に適用可能であり、以下に説明するフレームは単位画像の例示にしか過ぎない点留意すべきである。 In the present embodiment, a frame is adopted as a moving image processing unit. However, the processing unit is not limited to a frame, and may be a field, for example. Hereinafter, such a moving image processing unit such as a field or a frame is referred to as a unit image. That is, it should be noted that the present invention can be applied to processing on an arbitrary unit image, and the frame described below is merely an example of the unit image.

図１は、本発明を適用した動物体領域抽出装置１１の機能的構成例を示す図である。 FIG. 1 is a diagram showing a functional configuration example of a moving object region extraction device 11 to which the present invention is applied.

動物体領域抽出装置１１には、動画像を構成する各フレームが、入力画像として順次入力される。 Each frame constituting the moving image is sequentially input to the moving object region extracting device 11 as an input image.

動物体領域抽出装置１１は、入力画像から動物体領域を抽出し、その動物体領域を特定するための情報である動物体領域情報を外部に出力する。以下、このような動物体領域抽出装置１１の一連の処理を、抽出処理と称する。 The moving object region extraction device 11 extracts a moving object region from the input image and outputs moving object region information, which is information for specifying that moving object region, to the outside. Hereinafter, this series of processes performed by the moving object region extraction device 11 is referred to as the extraction process.

動物体領域抽出装置１１は、ブロックマッチング部２１、全画面動き算出部２２、評価値テーブル保持部２３、アクティビティ計算部２４、及び動物体領域抽出部２５から構成される。 The moving object region extraction device 11 includes a block matching unit 21, a full-screen motion calculation unit 22, an evaluation value table holding unit 23, an activity calculation unit 24, and a moving object region extraction unit 25.

ブロックマッチング部２１は、入力画像を、格子状に任意のサイズのブロックに分割する。なお、このブロックは、動物体領域の抽出単位となる。すなわち、動物体領域の抽出精度は、このブロックのブロックサイズで決まる。したがって、ブロックサイズは任意で良いが、動物体領域の抽出精度を考慮して決める必要がある点に留意する。 The block matching unit 21 divides the input image into blocks of an arbitrary size in a lattice shape. This block is a unit for extracting the moving object region. That is, the extraction accuracy of the moving object region is determined by the block size of this block. Therefore, the block size may be arbitrary, but it should be noted that it needs to be determined in consideration of the extraction accuracy of the moving object region.

ブロックマッチング部２１は、分割後の各ブロックに対してブロックマッチングを行い、後述する評価値テーブルを生成する。 The block matching unit 21 performs block matching on each divided block and generates an evaluation value table to be described later.

なお、このブロックマッチングに用いるブロックサイズには、上記分割による分割後のブロックサイズをそのまま用いるのが理想である。もっとも、ブロックマッチングに用いるブロックサイズを異なるブロックサイズに変えても構わない。 It is ideal to use the block size after division by the division as it is as the block size used for this block matching. However, the block size used for block matching may be changed to a different block size.

また、ブロックマッチングは、全ブロックに対して同時並列に行われることが理想である。もっとも、回路規模の制約によっては、１つのマッチング回路を使い回してシリアルに処理を行うことにより、１ブロックずつブロックマッチングを行うようにしても構わない。 Ideally, block matching is performed on all blocks simultaneously in parallel. However, depending on constraints on circuit scale, a single matching circuit may be reused to process the blocks serially, so that block matching is performed one block at a time.

ブロックマッチングでは、各ブロック毎に、隣接するフレーム上の探索ベクトル先との差分についての評価値がそれぞれ演算される。なお、探索ベクトル先が表す探索先の範囲（探索範囲）は任意の範囲で良い。上記探索ベクトル先との差分の評価値の一例としての評価値Dは、式（１）で表される。 In block matching, an evaluation value for a difference from a search vector destination on an adjacent frame is calculated for each block. The search destination range (search range) indicated by the search vector destination may be an arbitrary range. An evaluation value D as an example of an evaluation value of a difference from the search vector destination is expressed by Expression (1).

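$$D(S_x, S_y) = \sum_{(x, y) \in \mathrm{BLOCK}} \left| I_{t+1}(x + S_x, y + S_y) - I_t(x, y) \right| \quad \cdots (1)$$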

式（１）（及び後述する式（２））において、x、yは、それぞれ、入力画像におけるx座標、y座標を示している。(Sx, Sy)は、探索ベクトルを示している。Sx、Syは、それぞれ、探索ベクトル(Sx, Sy)のx成分、y成分を示している。BLOCKは、上記分割後のブロック領域を示している。 In Expression (1) (and Expression (2) described later), x and y indicate the x coordinate and y coordinate in the input image, respectively. (Sx, Sy) indicates a search vector. Sx and Sy indicate the x component and the y component of the search vector (Sx, Sy), respectively. BLOCK indicates the block area after the division.

また、I_t(x, y)は、時刻tの第tフレーム上の点(x, y)の画素値を示している。（x+Sx, y+Sy）は、点(x, y)に対する探索ベクトル先を示している。従って、I_{t+1}(x+Sx, y+Sy)は、第tフレームより１フレーム後の時刻t+1の第t+1フレーム上の探索ベクトル先（x+Sx, y+Sy）の画素値を示している。 In addition, I_t(x, y) indicates the pixel value of the point (x, y) on the t-th frame at time t. (x+Sx, y+Sy) indicates the search vector destination for the point (x, y). Therefore, I_{t+1}(x+Sx, y+Sy) indicates the pixel value of the search vector destination (x+Sx, y+Sy) on the (t+1)-th frame at time t+1, one frame after the t-th frame.

すなわち、評価値D(Sx, Sy)とは、次のような値となる。第tフレーム上の点(x, y)の画素値と、第t+1フレーム上の探索ベクトル先（x+Sx, y+Sy）の画素値とについて、ブロック領域BLOCK内の全ての点(x, y)に対する差分絶対値和をとった値が、評価値D(Sx, Sy)である。 That is, the evaluation value D(Sx, Sy) is obtained as follows: for every point (x, y) in the block region BLOCK, the absolute difference between the pixel value of the point (x, y) on the t-th frame and the pixel value of the search vector destination (x+Sx, y+Sy) on the (t+1)-th frame is taken, and the sum of these absolute differences is the evaluation value D(Sx, Sy).

なお、式（１）における差分絶対値和に代えて、差分自乗和、差分和等とすることは勿論可能である。 Of course, instead of the sum of absolute differences in equation (1), a sum of squares of differences, a sum of differences, or the like can be used.
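The evaluation value of Expression (1) can be illustrated with a minimal Python sketch. The function name `sad`, the `[y][x]` list layout of the frames, and the `(x0, y0, width, height)` block tuple are assumptions made for illustration only and do not appear in the specification. The sketch does not handle search destinations that fall outside the frame.

```python
def sad(frame_t, frame_t1, block, sx, sy):
    """Sum of absolute differences D(Sx, Sy) for one block (Expression (1)).

    frame_t, frame_t1: 2-D lists of luminance values indexed [y][x].
    block: (x0, y0, width, height) of the block in frame_t (hypothetical layout).
    The caller must keep the search destination inside frame_t1.
    """
    x0, y0, w, h = block
    total = 0
    for y in range(y0, y0 + h):
        for x in range(x0, x0 + w):
            total += abs(frame_t1[y + sy][x + sx] - frame_t[y][x])
    return total


# A 2x2 bright patch that moves one pixel to the right between frames.
frame_t = [[0, 0, 0, 0],
           [0, 9, 9, 0],
           [0, 9, 9, 0],
           [0, 0, 0, 0]]
frame_t1 = [[0, 0, 0, 0],
            [0, 0, 9, 9],
            [0, 0, 9, 9],
            [0, 0, 0, 0]]
block = (1, 1, 2, 2)
d_zero = sad(frame_t, frame_t1, block, 0, 0)   # no displacement
d_shift = sad(frame_t, frame_t1, block, 1, 0)  # displacement matching the motion
```

In this toy example the search vector (1, 0) that matches the true motion yields a smaller evaluation value than the zero vector.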

ブロックマッチング部２１は、各探索ベクトル(Sx, Sy)毎に、探索ベクトル(Sx, Sy)と探索ベクトル先との差分の評価値D(Sx, Sy)とを対応付けた評価値テーブルを生成する。 The block matching unit 21 generates an evaluation value table that associates each search vector (Sx, Sy) with the evaluation value D(Sx, Sy) of the difference from the corresponding search vector destination.

そして、ブロックマッチング部２１は、ブロック毎に、そのブロックの評価値テーブルに含まれる探索ベクトルの中から、最小評価値を与える探索ベクトルを検出し、検出した探索ベクトルを、そのブロックの動きを表す動きベクトルとする。なお、この動きベクトルは、後述する全画面の動きに比べて局所的な動きを表しているので、以下、局所動きベクトルと称する。ブロックマッチング部２１は、このようにして得られた局所動きベクトルを、全画面動き算出部２２に供給する。 Then, for each block, the block matching unit 21 detects, from among the search vectors included in the evaluation value table of that block, the search vector that gives the minimum evaluation value, and uses the detected search vector as the motion vector representing the motion of that block. Since this motion vector represents motion that is local compared with the full-screen motion described later, it is hereinafter referred to as a local motion vector. The block matching unit 21 supplies the local motion vector thus obtained to the full-screen motion calculation unit 22.
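The table generation and minimum search described above can be sketched as follows. This is only an illustration under assumptions: the evaluation value table is modeled as a Python dictionary keyed by the search vector, the search range is a square of half-width `search_range`, and destinations outside the frame are skipped. None of these choices is prescribed by the specification.

```python
def build_evaluation_table(frame_t, frame_t1, block, search_range):
    """Map each search vector (sx, sy) to its evaluation value D(sx, sy)."""
    x0, y0, w, h = block
    frame_h, frame_w = len(frame_t1), len(frame_t1[0])
    table = {}
    for sy in range(-search_range, search_range + 1):
        for sx in range(-search_range, search_range + 1):
            # Assumption: skip destinations that leave the frame.
            if x0 + sx < 0 or x0 + w + sx > frame_w:
                continue
            if y0 + sy < 0 or y0 + h + sy > frame_h:
                continue
            table[(sx, sy)] = sum(
                abs(frame_t1[y + sy][x + sx] - frame_t[y][x])
                for y in range(y0, y0 + h)
                for x in range(x0, x0 + w)
            )
    return table


def local_motion_vector(table):
    """Return the search vector with the minimum evaluation value."""
    return min(table, key=table.get)


frame_t = [[0, 0, 0, 0],
           [0, 9, 9, 0],
           [0, 9, 9, 0],
           [0, 0, 0, 0]]
frame_t1 = [[0, 0, 0, 0],
            [0, 0, 9, 9],
            [0, 0, 9, 9],
            [0, 0, 0, 0]]
table = build_evaluation_table(frame_t, frame_t1, (1, 1, 2, 2), 1)
vector = local_motion_vector(table)
```

When several search vectors share the minimum value, `min` returns the first one encountered. The specification does not state a tie-breaking rule, so this behavior is incidental to the sketch.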

ブロックマッチング部２１は、局所動きベクトルの検出を終えた評価値テーブルを、評価値テーブル保持部２３に供給する。 The block matching unit 21 supplies the evaluation value table for which the local motion vector has been detected to the evaluation value table holding unit 23.

全画面動き算出部２２は、ブロックマッチング部２１から供給される全ブロックの局所動きベクトルに対してフレーム毎に統計処理を施すことにより、入力画像内の処理対象とする処理画面の全体の動きを表す全画面動きベクトルを算出し、評価値テーブル保持部２３に供給する。 The full screen motion calculation unit 22 performs a statistical process for each frame on the local motion vectors of all the blocks supplied from the block matching unit 21, thereby calculating the overall motion of the processing screen to be processed in the input image. A full screen motion vector to be expressed is calculated and supplied to the evaluation value table holding unit 23.

なお、全画面動き算出部２２に採用する統計処理の処理手法は、特に限定されない。例えば、簡単には、全ブロックについて検出された局所的な動きベクトルをヒストグラム化し、そのヒストグラムにおける最頻ベクトルを全画面動きベクトルとする等の手法が考えられる。そこで、本実施の形態では、この統計処理を採用するとする。 The statistical processing method employed by the full-screen motion calculation unit 22 is not particularly limited. For example, one simple method is to build a histogram of the local motion vectors detected for all blocks and to take the most frequent vector in that histogram as the full-screen motion vector. The present embodiment is assumed to adopt this statistical processing.

全画面動き算出部２２は、統計処理の結果得られるベクトル、例えば上述した最頻ベクトルを全画面動きベクトルとして算出し、評価値テーブル保持部２３に供給する。 The full screen motion calculation unit 22 calculates a vector obtained as a result of the statistical processing, for example, the above-described mode vector as a full screen motion vector, and supplies it to the evaluation value table holding unit 23.
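The histogram-based statistical processing can be sketched with the standard library's `collections.Counter`. The function name `full_screen_motion` and the sample vectors are hypothetical.

```python
from collections import Counter


def full_screen_motion(local_vectors):
    """Return the most frequent local motion vector (the mode of the histogram)."""
    histogram = Counter(local_vectors)
    vector, _count = histogram.most_common(1)[0]
    return vector


# Hypothetical local motion vectors for six blocks: most blocks follow a
# camera pan of (2, 0), while two blocks move differently.
local_vectors = [(2, 0), (2, 0), (2, 0), (0, 0), (5, -1), (2, 0)]
global_vector = full_screen_motion(local_vectors)
```

Blocks whose local motion differs from the mode, such as the ones with vectors (0, 0) and (5, -1) here, have no influence on the result as long as they remain a minority.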

評価値テーブル保持部２３は、ブロックマッチング部２１から供給される評価値テーブルをブロック毎に保持する。 The evaluation value table holding unit 23 holds the evaluation value table supplied from the block matching unit 21 for each block.

また、評価値テーブル保持部２３は、全画面動き算出部２２から全画面動きベクトルが供給されると、所定のブロックを処理対象として注目すべき注目ブロックに順次設定する。評価値テーブル保持部２３は、注目ブロックについて、保持している評価値テーブルを参照し、全画面動きベクトル先の評価値を算出する。なお、この全画面動きベクトル先の評価値が、注目ブロックにおける全画面動きベクトル先との差分についての評価値になる。そこで、以下、全画面動きベクトル先の評価値を、全画面動きベクトル先との差分と適宜称する。そして、評価値テーブル保持部２３は、全画面動きベクトル先との差分を、動物体領域抽出部２５に供給する。 Further, when the full-screen motion vector is supplied from the full-screen motion calculation unit 22, the evaluation value table holding unit 23 sequentially sets a predetermined block as a target block to be noted as a processing target. The evaluation value table holding unit 23 refers to the evaluation value table held for the block of interest, and calculates an evaluation value of the full screen motion vector destination. Note that the evaluation value of the full-screen motion vector destination is an evaluation value for the difference from the full-screen motion vector destination in the target block. Therefore, hereinafter, the evaluation value of the full screen motion vector destination is appropriately referred to as a difference from the full screen motion vector destination. Then, the evaluation value table holding unit 23 supplies the difference from the full screen motion vector destination to the moving object region extraction unit 25.

なお、以下、全画面動きベクトル先との差分を、Dbと記述する。全画面動きベクトル先との差分Dbは、全画面動きベクトルを(Wx, Wy)とすると、その全画面動きベクトル(Wx, Wy)を式（１）に代入することで求まるD(Wx, Wy)である（すなわち、Db=D(Wx, Wy)）。なお、全画面動きベクトル先は、(x+Wx, y+Wy)と表される。 Hereinafter, the difference from the full-screen motion vector destination is written as Db. When the full-screen motion vector is (Wx, Wy), the difference Db is the value D(Wx, Wy) obtained by substituting the full-screen motion vector (Wx, Wy) into Expression (1) (that is, Db = D(Wx, Wy)). The full-screen motion vector destination is expressed as (x+Wx, y+Wy).
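Because each block's evaluation value table already holds D(Sx, Sy) for every search vector, Db can be read from the table without recomputing Expression (1). The sketch below assumes a dictionary-based table and uses made-up evaluation values. It also assumes the full-screen motion vector is present in every block's table.

```python
def difference_at_full_screen_vector(table, full_screen_vector):
    """Return Db = D(Wx, Wy) by looking it up in a block's evaluation value table."""
    return table[full_screen_vector]


# Hypothetical tables for two blocks when the full-screen motion is (1, 0).
background_table = {(0, 0): 40, (1, 0): 3, (2, 0): 55}   # follows the camera
object_table = {(0, 0): 5, (1, 0): 210, (2, 0): 190}     # moves independently
full_screen_vector = (1, 0)
db_background = difference_at_full_screen_vector(background_table, full_screen_vector)
db_object = difference_at_full_screen_vector(object_table, full_screen_vector)
```

A block that follows the full-screen motion matches well at (Wx, Wy) and so has a small Db, whereas a block containing an independently moving object tends to have a large Db.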

アクティビティ計算部２４は、各ブロックのアクティビティを計算し、動物体領域抽出部２５に供給する。ここで、アクティビティとは、ブロック内の輝度変化の複雑度を表す指標をいう。アクティビティの一例としてのアクティビティAは、式（２）で表される。 The activity calculation unit 24 calculates the activity of each block and supplies it to the moving object region extraction unit 25. Here, the activity is an index representing the complexity of the luminance change in the block. Activity A as an example of activity is represented by Expression (2).

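$$A = \sum_{(x, y) \in \mathrm{BLOCK}} \sum_{i=-1}^{1} \sum_{j=-1}^{1} \left| I_t(x + j, y + i) - I_t(x, y) \right| \quad \cdots (2)$$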

式（２）において、I_t(x, y)は、時刻tの第tフレーム上の点(x, y)の画素値を示している。i、jは、それぞれ、-1から1の整数の変数である。したがって、I_t(x+j, y+i)は、時刻tの第tフレーム上の点(x+j, y+i)の画素値を示している。 In Expression (2), I_t(x, y) represents the pixel value of the point (x, y) on the t-th frame at time t. i and j are integer variables each ranging from -1 to 1. Therefore, I_t(x+j, y+i) represents the pixel value of the point (x+j, y+i) on the t-th frame at time t.

すなわち、アクティビティ計算部２４は、ブロック内の画素のそれぞれに対して、隣接８画素との差分絶対値和を求め、その差分絶対値和を全画素分積算することにより、アクティビティAを計算する。 In other words, the activity calculation unit 24 calculates the activity A by calculating the sum of absolute differences from the adjacent eight pixels for each pixel in the block and integrating the sum of absolute differences for all pixels.

但し、アクティビティの評価値の表現は式（２）に限定されない。例えば、式（２）のような周囲８近傍との差分ではなく、例えば上下左右の４近傍のみの差分としたり、あるいは、絶対値和ではなく二乗和にする等、用途に応じて様々な表現が考えられる。なお、以降においては、式（２）のアクティビティの評価値を使った場合について説明する。 However, the expression for the activity evaluation value is not limited to Expression (2). Various expressions are conceivable depending on the application: for example, using differences from only the four neighbors above, below, left, and right instead of the eight surrounding neighbors of Expression (2), or using a sum of squares instead of a sum of absolute values. The following description assumes the activity evaluation value of Expression (2).
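A minimal sketch of the activity of Expression (2) follows. The specification does not say how neighbors outside the frame are treated, so skipping them here is an assumption. The function name and the `(x0, y0, width, height)` block layout are likewise illustrative.

```python
def activity(frame, block):
    """Activity A of Expression (2): for every pixel in the block, sum the
    absolute differences from its 8 neighbors, then accumulate over the block.

    frame: 2-D list of luminance values indexed [y][x].
    block: (x0, y0, width, height) (hypothetical layout).
    Assumption: neighbors that fall outside the frame are skipped.
    """
    x0, y0, w, h = block
    frame_h, frame_w = len(frame), len(frame[0])
    total = 0
    for y in range(y0, y0 + h):
        for x in range(x0, x0 + w):
            for i in (-1, 0, 1):
                for j in (-1, 0, 1):
                    ny, nx = y + i, x + j
                    if 0 <= ny < frame_h and 0 <= nx < frame_w:
                        # The (i, j) = (0, 0) term contributes zero.
                        total += abs(frame[ny][nx] - frame[y][x])
    return total


flat = [[5, 5, 5],
        [5, 5, 5],
        [5, 5, 5]]
checker = [[0, 1, 0],
           [1, 0, 1],
           [0, 1, 0]]
a_flat = activity(flat, (1, 1, 1, 1))        # single centre pixel, flat area
a_checker = activity(checker, (1, 1, 1, 1))  # single centre pixel, textured area
```

A flat region gives zero activity, while a textured region gives a positive value: in the checkerboard, the centre pixel differs from its four edge neighbors by 1 each.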

アクティビティ計算部２４によるアクティビティの計算は、ブロックマッチング部２１による差分計算時に図示せぬ画像メモリに展開している画素を利用し、さらにブロックマッチング部２１によるブロックマッチングと同時並行に行われるのがアーキテクチャ的には好ましい。但し、アクティビティの計算をブロックマッチングと同時並行に行うのが必ずしも必要ではない点に留意する。 The activity calculation by the activity calculation unit 24 uses pixels developed in an image memory (not shown) at the time of difference calculation by the block matching unit 21, and is further performed in parallel with the block matching by the block matching unit 21 in the architecture. It is preferable. However, it should be noted that it is not always necessary to calculate the activity in parallel with the block matching.

動物体領域抽出部２５は、注目ブロックについての、評価値テーブル保持部２３から供給される全画面動きベクトル先との差分、及びアクティビティ計算部２４から供給されるアクティビティから、注目ブロックが背景かどうかを判定する。 The moving object region extraction unit 25 determines whether the target block is the background from the difference from the full screen motion vector destination supplied from the evaluation value table holding unit 23 and the activity supplied from the activity calculation unit 24 for the target block. Determine.

具体的には、例えば、動物体領域抽出部２５は、注目ブロックについての、全画面動きベクトル先との差分Dbと、アクティビティ計算部２４から供給されるアクティビティAとを用いた、式（３）に示す評価式を演算し、その演算値の正負により注目ブロックが背景か否かを判定する。 Specifically, for example, the moving object region extraction unit 25 uses the difference Db with respect to the full-screen motion vector destination for the block of interest and the activity A supplied from the activity calculation unit 24 (3) Is evaluated, and it is determined whether or not the block of interest is the background based on whether the calculated value is positive or negative.

α×A＋β×N−Db ・・・・・（３） α × A + β × N−Db (3)

式（３）において、Nは、ブロック内の画素数を示している。α、βは、それぞれ、アクティビティA、全画面動きベクトル先との差分Dbの係数である。 In Expression (3), N indicates the number of pixels in the block. α and β are coefficients for the activity A and the difference Db from the full-screen motion vector destination, respectively.

式（３）の評価式の演算値が負（＜0）であれば、注目ブロックは動物体領域である（すなわち、注目ブロックは背景でない）と判定される。逆に、式（３）の評価式の演算値が正（≧0）であれば、注目ブロックは背景である（すなわち、注目ブロックは動物体領域でない）と判定される。 If the calculated value of the evaluation formula of Expression (3) is negative (<0), it is determined that the block of interest is a moving object region (that is, the block of interest is not the background). On the other hand, if the calculated value of the evaluation expression of Expression (3) is positive (≧ 0), it is determined that the block of interest is the background (that is, the block of interest is not a moving object region).

なお、係数α及びβは、動物体領域の抽出精度を変えるためのパラメータであり、アプリケーションや用途に応じて値を変えることができる。また、動き量が既知の動画像から逆算し、係数α及びβを統計的に求めることもできる。すなわち、例えば、動き量が既知の動画像について、アクティビティA、全画面動きベクトル先との差分Dbを求め、動物体領域と判定されたブロックと背景と判定されたブロックのそれぞれについて、統計処理を行うことにより、係数α、βを求めることができる。 The coefficients α and β are parameters for adjusting the extraction accuracy of the moving object region, and their values can be changed according to the application and intended use. The coefficients α and β can also be obtained statistically by working backward from a moving image whose amount of motion is known. That is, for example, for a moving image whose amount of motion is known, the activity A and the difference Db from the full-screen motion vector destination are obtained, and statistical processing is performed separately on the blocks determined to be moving object regions and the blocks determined to be background, whereby the coefficients α and β can be obtained.

一般的な動画像から動物体領域を抽出する場合、例えば、α=0.8、β=2程度が好ましい。 When extracting a moving object region from a general moving image, for example, α = 0.8 and β = 2 are preferable.
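The sign test on Expression (3) can be sketched as below. The function name is hypothetical, and the default coefficients are simply the example values α = 0.8 and β = 2 mentioned above. Note that in Expression (3) as written, β multiplies the pixel count N, so α×A + β×N acts as a per-block threshold against which Db is compared.

```python
def is_moving_object_block(activity, db, num_pixels, alpha=0.8, beta=2.0):
    """Evaluate Expression (3): alpha*A + beta*N - Db.

    A negative value means the block is judged to be a moving object region;
    zero or a positive value means it is judged to be background.
    """
    return alpha * activity + beta * num_pixels - db < 0


# Hypothetical 8x8 block (N = 64) with activity A = 100:
# the threshold is 0.8 * 100 + 2 * 64 = 208.
moving = is_moving_object_block(activity=100, db=300, num_pixels=64)
background = is_moving_object_block(activity=100, db=150, num_pixels=64)
```

Because the threshold grows with the activity A, a highly textured block is allowed a larger matching error before it is classified as a moving object.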

その後、動物体領域抽出部２５は、全てのブロックについて背景かどうかを判定し終えると、動物体領域と判定されたブロックのうちの連続しているブロックの集合のそれぞれを動物体領域として抽出する。そして、動物体領域抽出部２５は、抽出した動物体領域を特定する情報である動物体領域情報を生成し、外部に出力する。 Thereafter, when the moving object region extraction unit 25 determines whether all blocks are backgrounds, the moving object region extraction unit 25 extracts each of a set of consecutive blocks among the blocks determined as moving object regions as moving object regions. . Then, the moving object region extracting unit 25 generates moving object region information that is information for specifying the extracted moving object region, and outputs the generated moving object region information to the outside.
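Grouping contiguous moving-object blocks into regions can be sketched as a connected-component search over the grid of per-block decisions. The specification says only that contiguous blocks form a region. Treating blocks as contiguous when they share an edge (4-connectivity), and the function name `extract_regions`, are assumptions of this sketch.

```python
from collections import deque


def extract_regions(moving):
    """Group moving-object blocks into regions of edge-adjacent blocks.

    moving: 2-D list of truthy/falsy flags indexed [row][column].
    Returns a list of regions, each a sorted list of (row, column) positions.
    """
    rows, cols = len(moving), len(moving[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if not moving[r][c] or seen[r][c]:
                continue
            # Breadth-first search from an unvisited moving block.
            queue = deque([(r, c)])
            seen[r][c] = True
            region = []
            while queue:
                cr, cc = queue.popleft()
                region.append((cr, cc))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and moving[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            regions.append(sorted(region))
    return regions


mask = [[1, 1, 0, 0],
        [0, 1, 0, 1],
        [0, 0, 0, 1]]
regions = extract_regions(mask)
```

Here the mask yields two regions: one of three blocks in the upper left and one of two blocks on the right edge. The resulting lists of block positions are one possible form of the moving object region information.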

次に、図２及び３のフローチャートを参照して、動物体領域抽出装置１１の抽出処理の一例について説明する。 Next, an example of the extraction process of the moving object region extraction device 11 will be described with reference to the flowcharts of FIGS. 2 and 3.

動物体領域抽出装置１１のブロックマッチング部２１及びアクティビティ計算部２４には、動画像を構成する各フレームが、入力画像として順次入力される。 Each frame constituting the moving image is sequentially input as an input image to the block matching unit 21 and the activity calculation unit 24 of the moving object region extraction device 11.

ステップＳ１において、ブロックマッチング部２１は、入力画像を、格子状に任意のサイズのブロックに分割する。 In step S1, the block matching unit 21 divides the input image into blocks of an arbitrary size in a grid pattern.

具体的には、ブロックマッチング部２１は、例えば、図４に示すような、入力画像の処理画面を所定のサイズのブロックに分割することにより、縦横７×１０の７０のブロックを生成する。なお、入力画像全体や他の大きさの画面を処理画面とすることも勿論可能である。 Specifically, the block matching unit 21 divides the processing screen of the input image into blocks of a predetermined size, as shown in FIG. 4 for example, thereby generating 70 blocks arranged in 7 rows by 10 columns. Of course, the entire input image or a screen of another size can also be used as the processing screen.
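The grid division of step S1 can be sketched as follows. The processing screen size of 160×112 pixels and the block size of 16×16 pixels are hypothetical values chosen only so that the result is 7 rows by 10 columns. The specification does not give these dimensions.

```python
def divide_into_blocks(width, height, block_w, block_h):
    """Divide a processing screen into a grid of (x0, y0, width, height) blocks.

    Assumption: partial blocks at the right and bottom edges are discarded.
    """
    return [
        (x, y, block_w, block_h)
        for y in range(0, height - block_h + 1, block_h)
        for x in range(0, width - block_w + 1, block_w)
    ]


# Hypothetical 160x112 processing screen with 16x16 blocks.
blocks = divide_into_blocks(160, 112, 16, 16)
rows = len({y for _x, y, _w, _h in blocks})
cols = len({x for x, _y, _w, _h in blocks})
```

The blocks are listed in raster order, left to right and then top to bottom, which matches the order in which the loop of steps S2 to S7 would visit them if it proceeds block by block.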

ステップＳ２において、ブロックマッチング部２１は、分割後のブロックのうちの所定のブロックを注目ブロックに設定する。その後、処理はステップＳ３に進む。 In step S2, the block matching unit 21 sets a predetermined block among the divided blocks as a target block. Thereafter, the process proceeds to step S3.

ステップＳ３において、ブロックマッチング部２１は、注目ブロックについてブロックマッチングを行い、各探索ベクトル毎に、探索ベクトルと探索ベクトル先との差分の評価値とを対応付けた評価値テーブルを生成する。その後、処理はステップＳ４に進む。 In step S3, the block matching unit 21 performs block matching on the block of interest, and generates an evaluation value table in which the search vector and the evaluation value of the difference between the search vector destinations are associated with each search vector. Thereafter, the process proceeds to step S4.

ステップＳ４において、ブロックマッチング部２１は、評価値テーブルに含まれる探索ベクトルの中から、最小評価値を与える探索ベクトルを局所動きベクトルとして検出し、全画面動き算出部２２に供給する。その後、処理はステップＳ５に進む。 In step S 4, the block matching unit 21 detects a search vector that gives the minimum evaluation value from among the search vectors included in the evaluation value table as a local motion vector, and supplies it to the full-screen motion calculation unit 22. Thereafter, the process proceeds to step S5.

ステップＳ５において、ブロックマッチング部２１は、注目ブロックについての評価値テーブルを、評価値テーブル保持部２３に供給する。その後、処理はステップＳ６に進む。 In step S 5, the block matching unit 21 supplies the evaluation value table for the block of interest to the evaluation value table holding unit 23. Thereafter, the process proceeds to step S6.

ステップＳ６において、評価値テーブル保持部２３は、ブロックマッチング部２１から供給される評価値テーブルをブロック毎に保持する。その後、処理はステップＳ７に進む。 In step S 6, the evaluation value table holding unit 23 holds the evaluation value table supplied from the block matching unit 21 for each block. Thereafter, the process proceeds to step S7.

ステップＳ７において、ブロックマッチング部２１は、注目ブロックは入力画像中の最後のブロックであるか否かを判定する。 In step S7, the block matching unit 21 determines whether or not the target block is the last block in the input image.

注目ブロックは入力画像中の最後のブロックでない場合、ステップＳ７においてＮＯであると判定されて、処理はステップＳ２に戻され、それ以降の処理が繰り返される。 If the target block is not the last block in the input image, it is determined as NO in step S7, the process returns to step S2, and the subsequent processes are repeated.

即ち、入力画像を構成する各ブロック毎に、ステップＳ２乃至Ｓ７のループ処理が繰り返し実行される。そして、入力画像中の最後のブロックについて、ステップＳ６の処理で評価値テーブルが保持されると、ステップＳ７の処理でＹＥＳであると判定されて、処理はステップＳ８に進む。 That is, the loop process of steps S2 to S7 is repeatedly executed for each block constituting the input image. If the evaluation value table is held in the process of step S6 for the last block in the input image, it is determined YES in the process of step S7, and the process proceeds to step S8.

ステップＳ８において、全画面動き算出部２２は、ブロックマッチング部２１から供給される全ブロックの局所動きベクトルに対して統計処理を施すことにより、全画面動きベクトルを算出し、評価値テーブル保持部２３に供給する。その後、処理はステップＳ９に進む。 In step S 8, the full screen motion calculation unit 22 calculates a full screen motion vector by performing statistical processing on the local motion vectors of all blocks supplied from the block matching unit 21, and the evaluation value table holding unit 23. To supply. Thereafter, the process proceeds to step S9.

ステップＳ９において、評価値テーブル保持部２３は、入力画像中の所定のブロックを注目ブロックに設定する。その後、処理はステップＳ１０に進む。 In step S9, the evaluation value table holding unit 23 sets a predetermined block in the input image as a target block. Thereafter, the process proceeds to step S10.

ステップＳ１０において、評価値テーブル保持部２３は、注目ブロックについての評価値テーブルを参照し、全画面動きベクトル先との差分を求める。そして、評価値テーブル保持部２３は、求めた全画面動きベクトル先との差分を、動物体領域抽出部２５に供給する。その後、処理はステップＳ１１に進む。 In step S10, the evaluation value table holding unit 23 refers to the evaluation value table for the block of interest and obtains a difference from the full screen motion vector destination. Then, the evaluation value table holding unit 23 supplies the obtained difference from the full-screen motion vector destination to the moving object region extraction unit 25. Thereafter, the process proceeds to step S11.

ステップＳ１１において、アクティビティ計算部２４は、注目ブロックのアクティビティを計算し、動物体領域抽出部２５に供給する。 In step S 11, the activity calculation unit 24 calculates the activity of the block of interest and supplies it to the moving object region extraction unit 25.

具体的には、例えば、アクティビティ計算部２４は、注目ブロック内の画素のそれぞれに対して、隣接８画素との差分絶対値和を求め、その差分絶対値和を注目ブロック内の全画素分積算する。 Specifically, for example, the activity calculation unit 24 obtains the sum of absolute differences from the adjacent 8 pixels for each pixel in the target block, and integrates the sum of absolute differences for all the pixels in the target block. To do.

すなわち、図５に示すように、アクティビティ計算部２４は、注目ブロック内の所定の画素を注目画素に設定する。なお、図５の例では、ブロックは縦M横NのM×Nの画素から構成されるとする。 That is, as shown in FIG. 5, the activity calculation unit 24 sets a predetermined pixel in the target block as the target pixel. In the example of FIG. 5, it is assumed that the block is composed of M × N pixels of vertical M and horizontal N.

まず、アクティビティ計算部２４は、例えば、注目ブロック内の第１行第１列の画素を注目画素に設定する。アクティビティ計算部２４は、図５左上の点線の丸の内部に示すように、注目画素の隣接８画素、すなわち、注目画素の左上、上、右上、左、右、左下、下、右下の画素のそれぞれとの差分絶対値和を求める。次に、アクティビティ計算部２４は、例えば、注目画素の右隣の画素、すなわち、第１行第２列の画素を新たな注目画素に設定し、隣接８画素との差分絶対値和を求める。以下、同様にして、注目画素（第１行第j列の画素）の右隣の画素（第１行第j+1(3≦j≦N)列の画素）を新たな注目画素に順次設定し、隣接８画素との差分絶対値和を求めていく。その後、第１行の画素に対して演算を終えると、第１行の下の行、すなわち第２行の画素に対して演算を行う。同様に、第i(2≦i≦M)行の画素に対して演算を終えると、第i行の下の行、すなわち第i+1行の画素に対して演算を行う。このようにして、アクティビティ計算部２４は、注目ブロック内の全画素に対して、隣接８画素との差分絶対値和を求め、その差分絶対値和を注目ブロック内の全画素分積算する。 First, the activity calculation unit 24 sets, for example, the pixel in the first row and first column of the target block as the target pixel. As shown inside the dotted circle at the upper left of FIG. 5, the activity calculation unit 24 obtains the sum of absolute differences between the target pixel and each of its eight adjacent pixels, that is, the pixels to its upper left, top, upper right, left, right, lower left, bottom, and lower right. Next, the activity calculation unit 24 sets, for example, the pixel immediately to the right of the target pixel, that is, the pixel in the first row and second column, as the new target pixel and obtains the sum of absolute differences with its eight adjacent pixels. In the same manner, the pixel to the right of the target pixel (the pixel in the first row and j-th column), that is, the pixel in the first row and (j+1)-th column (3 ≦ j ≦ N), is successively set as the new target pixel, and the sum of absolute differences with its eight adjacent pixels is obtained. When the calculation has been completed for the pixels in the first row, it is performed for the pixels in the row below, that is, the second row. Similarly, when the calculation for the pixels in the i-th row (2 ≦ i ≦ M) has been completed, it is performed for the pixels in the row below, that is, the (i+1)-th row. In this way, the activity calculation unit 24 obtains the sum of absolute differences with the eight adjacent pixels for every pixel in the target block and accumulates these sums over all the pixels in the target block.
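The activity calculation described above can be sketched in Python as follows. This is a minimal illustration rather than the implementation of the activity calculation unit 24: the function name `block_activity` and its parameters are hypothetical, the image is assumed to be a list of rows of luminance values, and neighbours that fall outside the image are simply skipped, which is an assumption because the text does not specify the handling at the image border.

```python
def block_activity(image, top, left, m, n):
    """Accumulate, over an M x N block, each pixel's sum of absolute
    differences with its eight adjacent pixels.

    image     -- list of rows of luminance values (hypothetical layout)
    top, left -- position of the block's first pixel in the image
    m, n      -- block height (rows) and width (columns)
    """
    height, width = len(image), len(image[0])
    # Upper left, top, upper right, left, right, lower left, bottom, lower right.
    offsets = [(-1, -1), (-1, 0), (-1, 1),
               (0, -1),           (0, 1),
               (1, -1),  (1, 0),  (1, 1)]
    total = 0
    for i in range(top, top + m):        # rows, top to bottom
        for j in range(left, left + n):  # columns, left to right
            for di, dj in offsets:
                y, x = i + di, j + dj
                # Assumption: neighbours outside the image are skipped.
                if 0 <= y < height and 0 <= x < width:
                    total += abs(image[i][j] - image[y][x])
    return total
```

A flat block yields an activity of 0, while a block containing fine detail yields a large value, which is why the activity serves as an index of the complexity of the luminance change within the block.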

ステップＳ１２において、動物体領域抽出部２５は、評価値テーブル保持部２３から供給される注目ブロックの全画面動きベクトル先との差分と、アクティビティ計算部２４から供給される注目ブロックのアクティビティとから、注目ブロックが動物体領域か否かを判定する。 In step S12, the moving object region extraction unit 25 determines whether the target block is a moving object region from the target block's difference from the full-screen motion vector destination, supplied from the evaluation value table holding unit 23, and the target block's activity, supplied from the activity calculation unit 24.

具体的には、動物体領域抽出部２５は、評価値テーブル保持部２３からの注目ブロックの全画面動きベクトル先との差分Db、及びアクティビティ計算部２４から供給される注目ブロックのアクティビティAとを用いた、式（３）に示した評価式を演算する。そして、動物体領域抽出部２５は、この評価式の演算値が負の場合に注目ブロックが動物体領域と判定し、この評価式の演算値が正の場合に注目ブロックが動物体領域でないと判定する。 Specifically, the moving object region extraction unit 25 computes the evaluation formula shown in Formula (3), using the target block's difference Db from the full-screen motion vector destination, supplied from the evaluation value table holding unit 23, and the target block's activity A, supplied from the activity calculation unit 24. The moving object region extraction unit 25 then determines that the target block is a moving object region when the calculated value of the evaluation formula is negative, and that the target block is not a moving object region when the calculated value is positive.

ステップＳ１３において、評価値テーブル保持部２３は、注目ブロックは入力画像中の最後のブロックか否かを判定する。 In step S 13, the evaluation value table holding unit 23 determines whether the target block is the last block in the input image.

注目ブロックは入力画像中の最後のブロックでない場合、ステップＳ１３においてＮＯと判定されて、処理はステップＳ１０に戻され、それ以降の処理が繰り返される。 If the target block is not the last block in the input image, it is determined as NO in step S13, the process returns to step S10, and the subsequent processes are repeated.

即ち、入力画像中の各ブロック毎に、ステップＳ９乃至Ｓ１３のループ処理が繰り返し実行される。そして、入力画像中の最後のブロックについて、ステップＳ１２の処理で動物体領域か否かが判定されると、ステップＳ１３の処理でＹＥＳであると判定されて、ステップＳ１４に進む。 That is, the loop of steps S9 to S13 is repeated for each block in the input image. Once step S12 has determined whether the last block in the input image is a moving object region, the determination in step S13 is YES, and the process proceeds to step S14.
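The per-block determination repeated in this loop can be sketched in Python as follows. This is an illustrative sketch only. Formula (3) is not reproduced in this passage, so the evaluation formula is passed in as a caller-supplied function `evaluate(db, a)`; that name, `classify_blocks`, and the grid layout are all hypothetical. Treating a calculated value of exactly zero as "not a moving object region" is also an assumption, since the text specifies only the negative and positive cases.

```python
def classify_blocks(db_grid, activity_grid, evaluate):
    """Return a grid of booleans: True where a block is judged to be a
    moving object region.

    db_grid[r][c]       -- difference Db from the full-screen motion
                           vector destination for block (r, c)
    activity_grid[r][c] -- activity A of block (r, c)
    evaluate            -- hypothetical stand-in for the evaluation
                           formula of Formula (3), called as evaluate(db, a)
    """
    moving = []
    for db_row, a_row in zip(db_grid, activity_grid):
        # A negative calculated value means "moving object region".
        # Zero is treated as "not moving" here, which is an assumption.
        moving.append([evaluate(db, a) < 0 for db, a in zip(db_row, a_row)])
    return moving
```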

ステップＳ１４において、動物体領域抽出部２５は、動物体領域と判定されたブロックのうちの連続しているブロックの集合のそれぞれを動物体領域として抽出する。その後、処理はステップＳ１５に進む。 In step S14, the moving object region extraction unit 25 extracts, as a moving object region, each set of contiguous blocks among the blocks determined to be moving object regions. The process then proceeds to step S15.
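Grouping contiguous blocks as in step S14 can be sketched as a connected-component search over the grid of per-block determinations. The following Python sketch is one possible way to do this under stated assumptions, not the method of the moving object region extraction unit 25: the function name `extract_regions` is hypothetical, and the text does not say whether diagonally adjacent blocks count as contiguous, so the connectivity (4 or 8 neighbours) is left as a parameter.

```python
from collections import deque


def extract_regions(moving, connectivity=4):
    """Group blocks flagged as moving into sets of contiguous blocks.

    moving       -- grid (list of rows) of booleans, True = moving block
    connectivity -- 4 or 8; which neighbours count as contiguous is an
                    assumption, since the text does not specify it
    Returns a list of regions, each a sorted list of (row, col) blocks.
    """
    rows = len(moving)
    cols = len(moving[0]) if rows else 0
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if connectivity == 8:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if not moving[r][c] or seen[r][c]:
                continue
            # Breadth-first search from an unvisited moving block.
            seen[r][c] = True
            queue = deque([(r, c)])
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy, dx in steps:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and moving[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(sorted(region))
    return regions
```

Each returned region corresponds to one extracted moving object region, and the list of block coordinates could serve as the basis for the moving object region information output in step S15.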

ステップＳ１５において、動物体領域抽出部２５は、抽出した動物体領域を特定するための情報である動物体領域情報を生成して外部に出力する。そして、抽出処理は終了となる。 In step S15, the moving object region extracting unit 25 generates moving object region information, which is information for specifying the extracted moving object region, and outputs it to the outside. Then, the extraction process ends.

以上のように、動物体領域抽出装置１１は、大きな特徴の一つとして、動き検出時に得られる評価値テーブル上の評価値といった特徴量を動物体領域の判定に流用している。その結果、動物体領域の抽出を実現する上で、一般的な動き検出回路に対して付加すべき付加要素を回路規模・処理工数の点で共に極めて少なくすることができる。 As described above, one major feature of the moving object region extraction device 11 is that it reuses feature quantities obtained during motion detection, such as the evaluation values in the evaluation value table, for determining moving object regions. As a result, the elements that must be added to a general motion detection circuit to realize moving object region extraction can be kept extremely small in terms of both circuit scale and processing effort.

従って、動物体領域抽出装置１１は、元々動き検出を想定したアプリケーション、例えば、トラッキング、圧縮、あるいは時空間解像度創造等といったアプリケーションと相性が非常に良いと言える。 Therefore, it can be said that the moving object region extraction device 11 is very compatible with an application that originally assumed motion detection, for example, an application such as tracking, compression, or creation of spatiotemporal resolution.

そのようなアプリケーションの一例としては、例えば、ユーザが指示した対象をトラッキングする監視カメラシステムが特開2007-272732で提案されている。 As an example of such an application, for example, a surveillance camera system that tracks an object designated by a user is proposed in Japanese Patent Laid-Open No. 2007-272732.

具体的には、この監視カメラシステムでは、動きベクトルを特徴量として、その統計処理により全画面動き等を求め、トラッキングに使用している。 Specifically, in this surveillance camera system, a motion vector is used as a feature amount, and a full screen motion or the like is obtained by statistical processing and used for tracking.

この監視カメラシステムでは、全画面動きベクトルや評価値テーブルが求められており、また、アクティビティについても、動物体領域抽出装置１１が計算するアクティビティと同様のものを出力する機構が兼ね備わっている。 In this surveillance camera system, the full-screen motion vector and the evaluation value table are obtained, and the system also has a mechanism that outputs an activity similar to the one calculated by the moving object region extraction device 11.

従って、この監視カメラシステムを用いて動物体領域抽出装置１１を構成する場合、監視カメラシステムからの全画面動きベクトル、評価値テーブル上の評価値、及びアクティビティといった特徴量は付加回路なく得ることができる。これらの全画面動きベクトルと評価値テーブル上の評価値から、全画面動きベクトル先との差分と、アクティビティという出力結果が得られる。そして、この出力結果を用いた式（３）の評価式の演算と、その演算値と閾値（本実施の形態では0）との比較とを行うための極めて小規模な回路を付け加えるだけで、動物体検出が可能となる。 Therefore, when the moving object region extraction device 11 is built on this surveillance camera system, feature quantities such as the full-screen motion vector, the evaluation values in the evaluation value table, and the activity can be obtained from the surveillance camera system without additional circuitry. From the full-screen motion vector and the evaluation values in the evaluation value table, the output results, namely the difference from the full-screen motion vector destination and the activity, are obtained. Moving object detection then becomes possible merely by adding an extremely small circuit that computes the evaluation formula of Formula (3) from these output results and compares the calculated value with a threshold (0 in this embodiment).

さらに、動物体検出で得られた動物体領域情報は、この監視カメラシステムによるトラッキングにとっても非常に有効な特徴量の一つとなる。 Furthermore, the moving object region information obtained by moving object detection is one of the feature quantities that are very effective for tracking by this surveillance camera system.

また、動物体領域抽出装置１１は、もう一つの大きな特徴として、アクティビティを定義し、アクティビティを特徴量として動物体領域を判定している。アクティビティの計算は、極めて小規模の回路で実現できるので、動物体領域抽出装置１１は、アクティビティを有効に活用することで、簡易ながらロバストな動物体領域抽出を実現できる。 In addition, the moving object region extracting apparatus 11 defines an activity as another major feature, and determines the moving object region using the activity as a feature amount. Since the activity calculation can be realized with an extremely small circuit, the moving object region extracting apparatus 11 can realize a simple yet robust moving object region extraction by effectively utilizing the activity.

次に、図６及び７を参照して、具体的に、アクティビティを考慮することによる動物体領域の抽出結果への効果について説明する。 Next, with reference to FIGS. 6 and 7, the effect on the extraction result of the moving object region by considering the activity will be specifically described.

図６Ａ（及び後述する図７Ａ）は、アクティビティを考慮した場合の動物体領域抽出装置１１による動物体領域の抽出結果を示している。図６Ｂ（及び後述する図７Ｂ）は、対比較としての、アクティビティを考慮しなかった場合の動物体領域抽出装置１１による動物体領域の抽出結果を示している。 FIG. 6A (and FIG. 7A described later) shows the moving object regions extracted by the moving object region extraction device 11 when the activity is taken into account. FIG. 6B (and FIG. 7B described later) shows, for comparison, the moving object regions extracted by the moving object region extraction device 11 when the activity is not taken into account.

図６Ａ及び図６Ｂでは、動物体領域と判定された領域が、入力画像P1に対して、縁取りされたグレー領域として重畳してある。 In FIG. 6A and FIG. 6B, the area determined as the moving object area is superimposed on the input image P1 as an edged gray area.

入力画像P1は、一般的なTV(television)映像信号の１フレームであり、パン動きが存在し、また、スケートリンクやフェンス等、平坦で動き検出が難しい領域が多々存在する画像である。それにもかかわらず、図６Ａ、すなわち、アクティビティを考慮した場合の動物体領域の抽出結果では、動物体領域と判定されるべき動物体であるスケータが綺麗に抽出されているのが分かる。それに対して、図６Ｂ、すなわち、アクティビティを考慮しなかった場合の動物体領域の抽出結果では、動物体領域と判定されるべき部分以外の部分、例えば、スケータの奥に位置する「Smart Ones」という文字や観客席等のディテールの細かい部分に関して、誤判定が目立つのが見てとれる。 The input image P1 is one frame of a general TV (television) video signal. It contains panning motion and many flat regions in which motion detection is difficult, such as the skating rink and the fence. Nevertheless, in FIG. 6A, that is, in the extraction result obtained when the activity is taken into account, the skater, who is the moving object that should be determined as a moving object region, is cleanly extracted. In contrast, in FIG. 6B, that is, in the extraction result obtained when the activity is not taken into account, erroneous determinations are conspicuous in portions other than those that should be determined as moving object regions, for example in finely detailed portions such as the lettering "Smart Ones" behind the skater and the spectator seats.

図７Ａ及び図７Ｂでは、動物体領域と判定された領域が、入力画像P2に対して、縁取りされたグレー領域として重畳してある。 In FIG. 7A and FIG. 7B, the region determined to be a moving object region is superimposed on the input image P2 as an edged gray region.

入力画像P2は、上記TV映像信号とは別の動画像の１フレームである。入力画像P2は、動物体として検出されるべき対象物が複数存在したり、または、背景に強いディテールがあるような画像である。それにもかかわらず、図７Ａ、すなわち、アクティビティを考慮した場合の動物体領域の抽出結果では、動物体領域と判定されるべき動物体である図中やや左中央に位置する子供と右やや上に位置する犬とが、問題なく抽出できているのが見てとれる。それに対して、図７Ｂ、すなわち、アクティビティを考慮しなかった場合の動物体領域の抽出結果では、動物体領域と判定されるべき部分だけでなく画面上の大部分が動物体領域と誤判定されてしまい、動物体領域を全く判定できていない。 The input image P2 is one frame of a moving image different from the TV video signal above. The input image P2 contains a plurality of objects that should be detected as moving objects and has strong detail in the background. Nevertheless, in FIG. 7A, that is, in the extraction result obtained when the activity is taken into account, the moving objects that should be determined as moving object regions, namely the child located slightly left of center and the dog located slightly above on the right, are extracted without problems. In contrast, in FIG. 7B, that is, in the extraction result obtained when the activity is not taken into account, not only the portions that should be determined as moving object regions but most of the screen is erroneously determined as a moving object region, so the moving object regions cannot be identified at all.

以上のように、動物体領域抽出装置１１は、動き検出に用いる特徴量の利用と、極めて小規模の付加回路の付加とによって、精度の高い動物体領域の抽出が実現できる。 As described above, the moving object region extracting apparatus 11 can realize highly accurate moving object region extraction by using the feature amount used for motion detection and adding an extremely small additional circuit.

また、動物体領域抽出装置１１は、動きベクトルの検出とその比較による従来手法にありがちな、動きベクトルの検出が難しい平坦部や繰り返しパターン部での抽出結果の破綻といった問題を生じ難くすることができる。 In addition, the moving object region extraction device 11 is less prone to the problems typical of conventional methods based on detecting and comparing motion vectors, such as breakdown of the extraction result in flat portions and repetitive-pattern portions where motion vectors are difficult to detect.

さらに、動物体領域抽出装置１１は、パン・チルト等の平行カメラ動きがあるような、背景が動いている動画像の場合でも問題なく動物体領域の抽出を実現できる。 Furthermore, the moving object region extraction device 11 can extract moving object regions without difficulty even from moving images in which the background moves because of translational camera motion such as panning or tilting.

すなわち、動物体領域抽出装置１１は、従来技術の問題点を解決し、簡易な処理・アーキテクチャでロバストな動物体領域の抽出することができる。 That is, the moving object region extraction device 11 resolves the problems of the conventional techniques and can extract moving object regions robustly with simple processing and a simple architecture.

上述した一連の処理は、ハードウエアにより実行することもできるし、ソフトウエアにより実行することもできる。一連の処理をソフトウエアにより実行する場合には、そのソフトウエアを構成するプログラムが、専用のハードウエアに組み込まれているコンピュータ、または、各種のプログラムをインストールすることで、各種の機能を実行することが可能な、例えば汎用のパーソナルコンピュータなどに、プログラム記録媒体からインストールされる。 The series of processes described above can be executed by hardware or by software. When the series of processes is executed by software, the program constituting the software is installed from a program recording medium onto a computer built into dedicated hardware or onto, for example, a general-purpose personal computer that can execute various functions when various programs are installed on it.

図８は、上述した一連の処理をプログラムにより実行するコンピュータのハードウェアの構成例を示すブロック図である。 FIG. 8 is a block diagram illustrating an example of a hardware configuration of a computer that executes the above-described series of processes using a program.

コンピュータにおいて、CPU４１，ROM（Read Only Memory）４２，RAM（Random Access Memory）４３は、バス４４により相互に接続されている。 In the computer, a CPU 41, a ROM (Read Only Memory) 42, and a RAM (Random Access Memory) 43 are connected to each other by a bus 44.

バス４４には、さらに、入出力インタフェース４５が接続されている。入出力インタフェース４５には、キーボード、マウス、マイクロホンなどよりなる入力部４６、ディスプレイ、スピーカなどよりなる出力部４７、ハードディスクや不揮発性のメモリなどよりなる記憶部４８、ネットワークインタフェースなどよりなる通信部４９、磁気ディスク、光ディスク、光磁気ディスク、或いは半導体メモリなどのリムーバブルメディア５１を駆動するドライブ５０が接続されている。 An input / output interface 45 is further connected to the bus 44. The input / output interface 45 includes an input unit 46 including a keyboard, a mouse, and a microphone, an output unit 47 including a display and a speaker, a storage unit 48 including a hard disk and a nonvolatile memory, and a communication unit 49 including a network interface. A drive 50 for driving a removable medium 51 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory is connected.

以上のように構成されるコンピュータでは、CPU４１が、例えば、記憶部４８に記憶されているプログラムを、入出力インタフェース４５及びバス４４を介して、RAM４３にロードして実行することにより、上述した一連の処理が行われる。 In the computer configured as described above, the series of processes described above is performed when the CPU 41 loads, for example, a program stored in the storage unit 48 into the RAM 43 via the input/output interface 45 and the bus 44 and executes it.

コンピュータ（CPU４１）が実行するプログラムは、例えば、磁気ディスク（フレキシブルディスクを含む）、光ディスク（CD−ROM(Compact Disc−Read Only Memory),DVD(Digital Versatile Disc)等）、光磁気ディスク、もしくは半導体メモリなどよりなるパッケージメディアであるリムーバブルメディア５１に記録して、あるいは、ローカルエリアネットワーク、インターネット、ディジタル衛星放送といった、有線または無線の伝送媒体を介して提供される。 The program executed by the computer (CPU 41) is provided either recorded on the removable medium 51, which is a package medium such as a magnetic disk (including a flexible disk), an optical disc (CD-ROM (Compact Disc-Read Only Memory), DVD (Digital Versatile Disc), etc.), a magneto-optical disk, or a semiconductor memory, or via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting.

そして、プログラムは、リムーバブルメディア５１をドライブ５０に装着することにより、入出力インタフェース４５を介して、記憶部４８にインストールすることができる。また、プログラムは、有線または無線の伝送媒体を介して、通信部４９で受信し、記憶部４８にインストールすることができる。その他、プログラムは、ROM４２や記憶部４８に、あらかじめインストールしておくことができる。 The program can be installed in the storage unit 48 via the input / output interface 45 by attaching the removable medium 51 to the drive 50. The program can be received by the communication unit 49 via a wired or wireless transmission medium and installed in the storage unit 48. In addition, the program can be installed in the ROM 42 or the storage unit 48 in advance.

なお、コンピュータが実行するプログラムは、本明細書で説明する順序に沿って時系列に処理が行われるプログラムであっても良いし、並列に、あるいは呼び出しが行われたとき等の必要なタイミングで処理が行われるプログラムであっても良い。 Note that the program executed by the computer may be a program whose processing is performed in time series in the order described in this specification, or a program whose processing is performed in parallel or at necessary timing, such as when it is called.

なお、本発明の実施の形態は、上述した実施の形態に限定されるものではなく、本発明の要旨を逸脱しない範囲において種々の変更が可能である。 The embodiment of the present invention is not limited to the above-described embodiment, and various modifications can be made without departing from the gist of the present invention.

動物体領域抽出装置１１の機能的構成例を示す図である。 FIG. 1 is a diagram showing an example of the functional configuration of the moving object region extraction device 11.
動物体領域抽出装置１１の抽出処理の一例を説明するフローチャートである。 FIG. 2 is a flowchart explaining an example of the extraction process of the moving object region extraction device 11.
動物体領域抽出装置１１の抽出処理の一例を説明するフローチャートである。 FIG. 3 is a flowchart explaining an example of the extraction process of the moving object region extraction device 11.
ブロックの一例を示す図である。 FIG. 4 is a diagram showing an example of a block.
アクティビティの計算を説明するための図である。 FIG. 5 is a diagram for explaining the calculation of the activity.
動物体領域の抽出結果の一例を示す図である。 FIG. 6 is a diagram showing an example of the extraction result of a moving object region.
動物体領域の抽出結果の一例を示す図である。 FIG. 7 is a diagram showing an example of the extraction result of a moving object region.
本発明を適用したコンピュータのハードウェアの構成例を示すブロック図である。 FIG. 8 is a block diagram showing an example of the hardware configuration of a computer to which the present invention is applied.

Explanation of symbols

１１動物体領域抽出装置，２１ブロックマッチング部，２２全画面動き算出部，２３評価値テーブル保持部，２４アクティビティ計算部，２５動物体領域抽出部，４１ CPU，４２ ROM，４３ RAM，４４バス，４５入出力インタフェース，４６入力部，４７出力部，４８記憶部，４９通信部，５０ドライブ，５１リムーバブルメディア 11 moving object region extraction device, 21 block matching unit, 22 full-screen motion calculation unit, 23 evaluation value table holding unit, 24 activity calculation unit, 25 moving object region extraction unit, 41 CPU, 42 ROM, 43 RAM, 44 bus, 45 input/output interface, 46 input unit, 47 output unit, 48 storage unit, 49 communication unit, 50 drive, 51 removable medium

Claims

An information processing apparatus comprising:
full-screen motion calculating means for calculating, for each of a plurality of unit images constituting a moving image, a full-screen motion representing the motion of the entire processing screen to be processed; and
determination means for determining whether a block obtained by dividing the processing screen is a moving object region, based on the full-screen motion calculated by the full-screen motion calculating means and an activity that is an index representing the complexity of luminance change within the block.

The information processing apparatus according to claim 1, wherein the determination means determines whether the block is a moving object region based on a difference between the block of a first unit image and a full-screen motion vector destination, based on the full-screen motion, on a second unit image.

The information processing apparatus according to claim 2, wherein the determination means computes a predetermined evaluation formula using the activity of the block of the first unit image and the difference from the full-screen motion vector destination on the second unit image, and determines from the calculated value whether the block is a moving object region.

The information processing apparatus according to claim 2, further comprising motion detection means for calculating, for each block of the first unit image, an evaluation value for the difference from each search destination on the second unit image, generating an evaluation value table in which the evaluation values are associated with the search destinations, detecting a search destination from among those included in the evaluation value table based on the evaluation values included in the evaluation value table, and taking the detected search destination as the motion of the block,
wherein the full-screen motion calculating means calculates the full-screen motion from the motions of the blocks detected by the motion detection means, and
the determination means uses, for each block, the evaluation value corresponding to the full-screen motion vector destination in the evaluation value table as the difference from the full-screen motion vector destination.

The information processing apparatus according to claim 1, wherein the activity is an integrated value in the block of a difference sum between each pixel in the block and its adjacent pixels.

The information processing apparatus according to claim 1, further comprising activity calculation means for calculating the activity.

An information processing method comprising the steps of:
calculating, for each of a plurality of unit images constituting a moving image, a full-screen motion representing the motion of the entire processing screen to be processed; and
determining whether a block obtained by dividing the processing screen is a moving object region, based on the calculated full-screen motion and an activity that is an index representing the complexity of luminance change within the block.

A program to be executed by a computer that controls an information processing apparatus, the program comprising the steps of:
calculating, for each of a plurality of unit images constituting a moving image, a full-screen motion representing the motion of the entire processing screen to be processed; and
determining whether a block obtained by dividing the processing screen is a moving object region, based on the calculated full-screen motion and an activity that is an index representing the complexity of luminance change within the block.