JP4441354B2

JP4441354B2 - Video object extraction device, video object trajectory synthesis device, method and program thereof

Info

Publication number: JP4441354B2
Application number: JP2004222890A
Authority: JP
Inventors: 正樹高橋; 清一合志; 俊彦三須
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2004-07-30
Filing date: 2004-07-30
Publication date: 2010-03-31
Anticipated expiration: 2024-07-30
Also published as: JP2006042229A

Description

The present invention relates to a video object extraction device that extracts a video object contained in an input video, to a video object trajectory synthesis device that composites the movement trajectory of the video object onto the video, and to a method and a program therefor.

Various video object extraction techniques for extracting and tracking an object contained in a video (a video object) already exist (see, for example, Patent Document 1).
Attempts have been made to broaden the scope of video production by using the position information of an extracted video object to give the video special effects. One example, used in baseball broadcasts and the like, tracks the ball (the video object) thrown by the pitcher and then composites the ball's movement trajectory onto the pitching video using CG (computer graphics). This lets viewers better appreciate the path of the ball from the moment the pitcher releases it until the catcher catches it, for example the movement of a breaking ball.

The video object tracking device of Patent Document 1 generates a silhouette image by binarizing the input video signal into a foreground image and a background image, and predicts the motion of the video object from the video object regions of the past two frames. This conventional device extracts and tracks a video object by performing luminance difference extraction on two consecutive images, and it processes video that has first been stored. In other words, it does not synthesize the movement trajectory of the video object in real time from the extracted position information.
For this reason, the present applicant previously filed an application for a video object trajectory synthesis device that synthesizes the movement trajectory of a video object in real time (Japanese Patent Application No. 2003-355620). When that device tracks a moving video object, it must predict the object's position coordinates in the next image and set the search area around the predicted point. The previously filed device currently uses linear prediction for this position prediction.
As one example of linear prediction, there is also a method that calculates motion vectors using block matching or optical flow (see, for example, Patent Document 2).
Patent Document 1: JP 2003-44860 A (paragraphs 0018 to 0027, FIG. 1)
Patent Document 2: JP 2002-150291 A (paragraphs 0050 to 0060, FIG. 3)

However, because conventional video object extraction performs luminance difference extraction on two consecutive images, it produces a large amount of noise other than the target video object. In a baseball broadcast, for example, erroneous extraction is common when other moving objects are present around the video object to be extracted (the ball). Specifically, the hands of the pitcher and the catcher are extracted in addition to the ball, and their afterimages also appear in the composite video. Furthermore, because of camera placement, the shot in a baseball broadcast that looks from the pitcher toward the batter is taken from the right, so when a left-handed batter stands in the batter's box, the batter lies on the ball's trajectory. Compared with the ball, the left-handed batter acts as a large moving object, so a large amount of noise appears in the drawn image and extraction of the ball often fails. As a result, composite video with the ball trajectory overlaid (ball trajectory display) has hardly ever been used for left-handed batters. In addition, because the trajectory of a pitched ball is not a straight line, the predicted ball position near the batter cannot be obtained accurately by linear prediction alone. The search area for the ball therefore has to be made wide, and the amount of computation needed to extract the ball cannot be kept down.

The present invention has been made in view of the above problems, and its object is to provide a video object extraction device, a video object trajectory synthesis device, and a method and program therefor that can stably extract and track a video object even when noise is present in the background image.

The present invention was devised to achieve this object. First, the video object extraction device according to claim 1 extracts the position of a video object contained in an input video and comprises object candidate image generation means and object extraction means, the object candidate image generation means comprising first image storage means, second image storage means, first difference image generation means, second difference image generation means, and candidate image generation means.

With this configuration, of the consecutive first image (past), second image (present), and third image (future) input in time series, the object candidate image generation means stores the first image in the first image storage means and the second image in the second image storage means. The first difference image generation means generates a first difference image by, for example, subtracting the luminance of the newly input third image from the luminance of the second image stored in the second image storage means. The second difference image generation means generates a second difference image by subtracting the luminance of the second image stored in the second image storage means from the luminance of the first image stored in the first image storage means. The first and second difference images are input to the candidate image generation means, which refers to the pixel values at the same position in both difference images, identifies the regions where the sign of the pixel value in the first difference image differs from the sign in the second difference image, and generates an object candidate image.

If the video object in the input image moves at high speed and is brighter than the background, the candidate image generation means extracts as the video object candidate image the region where the pixel value of the first difference image is positive and the pixel value of the second difference image is negative. That is, the video object candidate image is the image of the region where the signs of the pixel values of the two difference images differ. The same criterion, opposite signs in the first and second difference images, also applies when the video object is darker than the background or when the difference images are generated in a different way.
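The sign test can be illustrated with a minimal Python sketch. The function name `opposite_sign_mask` and the representation of frames as nested lists of luminance values are assumptions made for illustration; the patent does not prescribe a data layout.

```python
def opposite_sign_mask(past, present, future):
    """Mark pixels whose two frame differences have opposite signs.

    Hypothetical helper: frames are equally sized nested lists of luminance.
    """
    mask = []
    for past_row, present_row, future_row in zip(past, present, future):
        row = []
        for p, c, f in zip(past_row, present_row, future_row):
            d1 = c - f  # first difference image: present minus future
            d2 = p - c  # second difference image: past minus present
            # d1 * d2 < 0 exactly when the signs differ (zero has no sign)
            row.append(1 if d1 * d2 < 0 else 0)
        mask.append(row)
    return mask


# Background luminance is 10. A bright ball (200) moves one pixel per frame,
# while a slower object (100) enters pixel 0 and then stays there.
past    = [[10, 200, 10, 10, 10]]
present = [[100, 10, 200, 10, 10]]
future  = [[100, 10, 10, 200, 10]]
mask = opposite_sign_mask(past, present, future)
```

In this toy input only the pixel that the ball occupies in the present frame is marked. Pixel 0 changes once and then stays the same, so its first difference is zero and it is rejected, which is the behaviour the text relies on to suppress slower moving objects.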

The object extraction means then refers to extraction conditions to extract, from the object candidate image, the position of the video object to be extracted. For example, storing the extraction conditions for the target video object in storage means in advance makes it possible to identify a single video object among multiple candidates. This determines the position of the target video object within the image. The centroid coordinates of the video object, for example, can be used as its position.

The video object extraction device according to claim 2 is the video object extraction device according to claim 1, further comprising object position prediction means that predicts, from the position of the video object extracted by the object extraction means, the position of the video object in the next input image and outputs the predicted position information to the object candidate image generation means.

With this configuration, the object position prediction means sets the search area for the video object in the next input field image to a region near the video object, based on the position extracted by the object extraction means.
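As a rough illustration of limiting the search to the neighbourhood of the object, the sketch below builds a square window around a predicted position and clips it to the image. The function name, the square shape, and the `half_size` parameter are all assumptions; the patent only states that the search area is set near the video object.

```python
def search_area(center, half_size, image_size):
    """Return (x0, y0, x1, y1), half-open, clipped to the image bounds.

    Hypothetical helper: center is (x, y), image_size is (width, height).
    """
    cx, cy = center
    width, height = image_size
    x0 = max(0, cx - half_size)
    y0 = max(0, cy - half_size)
    x1 = min(width, cx + half_size + 1)
    y1 = min(height, cy + half_size + 1)
    return (x0, y0, x1, y1)


# A window of half-size 4 around a point near the top-left corner
area = search_area((5, 3), 4, (720, 240))
```

Only the pixels inside the returned rectangle need to be examined in the next field, which is where the reduction in computation comes from.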

The video object extraction device according to claim 3 is the video object extraction device according to claim 2, wherein the object position prediction means comprises linear prediction means that predicts the trajectory of the video object to be a straight line, curve prediction means that predicts the trajectory to be a curve, and switching means that switches between linear prediction by the linear prediction means and curve prediction by the curve prediction means based on the number of previously extracted position coordinates of the video object or on the equation of the curve.

With this configuration, the object position prediction means performs linear prediction when, for example, fewer than a predetermined number of position coordinates have been extracted for the video object so far, and performs curve prediction when that number or more have been extracted. Also, for example, if the equation of the curve is updated beyond an allowable range while curve prediction is running, the means switches to linear prediction. The linear prediction means can use, for example, a Kalman filter to estimate the position of the successively moving video object and thereby limit the search area. The curve prediction means predicts a curve as the predicted trajectory of the video object using, for example, the method of least squares.
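The switching rule based on the number of extracted positions can be sketched as follows. This is a simplified illustration under several assumptions: the linear branch uses plain constant-velocity extrapolation rather than a Kalman filter, the curve is taken to be a quadratic y = a·x² + b·x + c fitted by least squares, the threshold `min_points_for_curve` is an invented parameter, and the fallback to linear prediction when the curve equation changes too much is omitted.

```python
def fit_quadratic(points):
    """Least-squares fit of y = a*x**2 + b*x + c; returns (a, b, c).

    Assumes at least three points with distinct x values.
    """
    s = [sum(x ** k for x, _ in points) for k in range(5)]      # sums of x^k
    t = [sum(y * x ** k for x, y in points) for k in range(3)]  # sums of y*x^k
    # Augmented normal equations for the unknowns (a, b, c)
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting
    for i in range(3):
        pivot = max(range(i, 3), key=lambda r: abs(m[r][i]))
        m[i], m[pivot] = m[pivot], m[i]
        for r in range(3):
            if r != i:
                factor = m[r][i] / m[i][i]
                m[r] = [v - factor * u for v, u in zip(m[r], m[i])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def predict_next(points, min_points_for_curve=4):
    """Predict the next (x, y) and report which predictor was used."""
    (x1, y1), (x2, y2) = points[-2], points[-1]
    next_x = 2 * x2 - x1  # x is extrapolated linearly in both modes
    if len(points) < min_points_for_curve:
        return (next_x, 2 * y2 - y1), "linear"
    a, b, c = fit_quadratic(points)
    return (next_x, a * next_x ** 2 + b * next_x + c), "curve"


few_points = predict_next([(0, 0), (1, 1), (2, 4)])
many_points = predict_next([(0, 0), (1, 1), (2, 4), (3, 9)])
```

With three points on the parabola y = x² the linear branch predicts (3, 7), while with four points the curve branch predicts a y value close to 16 at x = 4, which shows why curve prediction allows a tighter search area on a curved path.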

The video object trajectory synthesis device according to claim 4 comprises: the video object extraction device according to any one of claims 1 to 3; drawing means that, based on the position of the video object extracted by the object extraction means, draws an image representing the video object in the region corresponding to the images input in time series, thereby generating a trajectory image in which the movement trajectory of the video object is extracted; and image synthesis means that combines the trajectory image with the images input in time series.

With this configuration, the drawing means generates, from the positions of the video object extracted by the object extraction means, a trajectory image in which only the movement trajectory of the video object is drawn. The trajectory image can be generated by successively overwriting, in time order, the video object extracted from each input image. The image synthesis means then combines the trajectory image with each successive input image, producing a composite video in which the movement trajectory of the target video object is overlaid on the video.
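The overwrite-then-composite idea might look like the following sketch. The single-pixel marker, the convention that 0 in the trajectory image means "transparent", and the function names `draw_marker` and `composite` are assumptions for illustration only.

```python
def draw_marker(trajectory, position, value=255):
    """Overwrite one pixel of the trajectory image at the extracted position."""
    x, y = position
    if 0 <= y < len(trajectory) and 0 <= x < len(trajectory[0]):
        trajectory[y][x] = value


def composite(frame, trajectory):
    """Overlay the trajectory image on a frame; 0 is treated as transparent."""
    return [
        [t if t else f for f, t in zip(frame_row, traj_row)]
        for frame_row, traj_row in zip(frame, trajectory)
    ]


# The trajectory image persists across frames and accumulates markers.
trajectory = [[0, 0, 0, 0]]
for position in [(0, 0), (2, 0)]:
    draw_marker(trajectory, position)

output = composite([[10, 20, 30, 40]], trajectory)
```

Because the trajectory image is kept between frames and only ever overwritten, each new composite shows every position extracted so far.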

The video object trajectory synthesis method according to claim 5 includes a difference image generation step, an object candidate image generation step, an object extraction step, a drawing step, and an image synthesis step.

In this procedure, the difference image generation step generates the first difference image and the second difference image from consecutive images input in time series. The object candidate image generation step generates an object candidate image by extracting, as video object candidates, the regions where the sign of the pixel value in the first difference image differs from the sign in the second difference image. The object extraction step extracts the position of the target video object from the object candidate image. At least one of the position, area, luminance, color, and circularity of the target video object is used as the extraction condition. The drawing step generates a trajectory image in which only the movement trajectory of the video object is drawn, and the image synthesis step combines this trajectory image with each successive input image, producing a composite video in which the movement trajectory of the target video object is overlaid on the video.

Furthermore, the video object trajectory synthesis program according to claim 6 causes a computer to function as difference image generation means, object candidate image generation means, object extraction means, drawing means, and image synthesis means.

With this configuration, the difference image generation means generates the first difference image and the second difference image from consecutive images input in time series. The object candidate image generation means generates an object candidate image by extracting, as video object candidates, the regions where the sign of the pixel value in the first difference image differs from the sign in the second difference image. The object extraction means extracts the position of the target video object from the object candidate image. The drawing means generates a trajectory image in which only the movement trajectory of the video object is drawn, and the image synthesis means combines this trajectory image with each successive field image, producing a composite video in which the movement trajectory of the target video object is overlaid on the video.

According to the invention of claim 1, the object candidate image generation means extracts as the object candidate image the regions where the pixel values of the two difference images have opposite signs. Therefore, even if other moving objects that act as noise are present near the target video object, only the fast-moving video object can be extracted stably.

According to the invention of claim 2, the position of the successively moving video object can be estimated within the input image and the search area can be set near the video object based on that position, which reduces the amount of computation needed to extract the video object.

According to the invention of claim 3, switching between linear prediction and curve prediction enables highly accurate position prediction with fewer false detections, so the video object can be tracked stably.

According to the invention of claim 4, claim 5, or claim 6, the target video object can be extracted from the video and tracked, and its movement trajectory can be composited onto the video without manual intervention. This makes it possible to provide video with new special effects.

Next, embodiments of the present invention will be described in detail with reference to the drawings as appropriate.
First, the configuration of the video object trajectory synthesis device according to the present invention will be described with reference to FIG. 1. FIG. 1 is a block diagram showing the configuration of the video object trajectory synthesis device.
As shown in FIG. 1, the video object trajectory synthesis device 1 extracts and tracks a moving object to be tracked (a video object) from the input video and composites the movement trajectory of that video object onto the video. The video object to be tracked is assumed here to be, for example, a ball. The video object trajectory synthesis device 1 comprises a video object extraction device 10, video delay means 20, and drawing/image synthesis means 30.

The video object extraction device 10 extracts the position coordinates and feature quantities of the ball from the input video and outputs them to the drawing/image synthesis means 30. The video object extraction device 10 may instead be configured to extract and output only the position coordinates.
The video delay means 20 delays the input video and outputs it to the drawing/image synthesis means 30.
The drawing/image synthesis means 30 comprises drawing means 31, which draws the trajectory of the ball by CG (computer graphics) at the location corresponding to the ball position extracted by the video object extraction device 10, and image synthesis means 32, which combines the drawn trajectory image with the video delayed by the video delay means 20 and outputs the composite video. The drawing/image synthesis means 30 also includes interpolation means (not shown) that fills in the ball trajectory by interpolation when the video object extraction device 10 fails to extract the ball.

As shown in FIG. 1, the video object extraction device 10 comprises object candidate image generation means 11, ball selection means 12, extraction condition storage means 13, position prediction means 14, and search area setting means 15.
The object candidate image generation means 11 cuts out a search area from each field image making up the input video and generates an object candidate image in which candidates for the video object to be tracked are extracted. The video consists of, for example, 60 field images per second. The object candidate image generation means 11 extracts video object candidates from each field image and binarizes the result, generating an image that consists only of those candidates (the object candidate image). The object candidate image contains multiple extracted video objects that resemble the video object to be tracked; for example, it is a rough extraction of candidate objects such as video objects that are in motion.

The object candidate image generation means 11 comprises image storage units 111 and 112, difference image generation means 113 and 114, and candidate image generation means 115.
The image storage units 111 and 112 are memories that record, for example, the video signal as digital data in units of one field for use in various kinds of signal and image processing.
The image storage unit 111 (second image storage means) stores the middle (present) input image (odd field and even field) of three consecutive input images. The present input image stored here is output to the ball selection means 12.
The image storage unit 112 (first image storage means) stores the first (past) input image (odd field and even field) of the three consecutive input images.

The difference image generation means 113 (first difference image generation means) delays the present input image, generates difference image 1 (the first difference image) by subtracting the luminance of the newly input (future) image from the luminance of the delayed present input image, and outputs difference image 1 to the candidate image generation means 115.
The difference image generation means 114 (second difference image generation means) delays the past input image, generates difference image 2 (the second difference image) by subtracting the luminance of the present input image from the luminance of the delayed past input image, and outputs difference image 2 to the candidate image generation means 115.

The candidate image generation means 115 compares difference image 1 and difference image 2 with a predetermined luminance threshold for every pixel in the search area and generates a binarized image by setting the pixel value to “1 (white)” when, for example, the pixel values satisfy a predetermined condition and to “0 (black)” otherwise. In this way the candidate image generation means 115 can extract, for example, the regions whose pixel value is “1 (white)” as video object candidates. The binarized image generated here is output to the ball selection means 12 as the object candidate image in which the video object candidates are extracted.
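A minimal sketch of this binarization step is given below. The text leaves the "predetermined condition" open, so the condition used here (difference image 1 above +T and difference image 2 below −T, which suits an object brighter than the background) is an assumption, as are the function name and the half-open `(x0, y0, x1, y1)` form of the search area.

```python
def binarize_candidates(diff1, diff2, search_area, threshold):
    """Return a 0/1 image; only pixels inside the search area are tested.

    Assumed condition: diff1 > threshold and diff2 < -threshold.
    """
    x0, y0, x1, y1 = search_area
    height, width = len(diff1), len(diff1[0])
    binary = [[0] * width for _ in range(height)]
    for y in range(max(0, y0), min(height, y1)):
        for x in range(max(0, x0), min(width, x1)):
            if diff1[y][x] > threshold and diff2[y][x] < -threshold:
                binary[y][x] = 1  # "1 (white)": video object candidate
    return binary


diff1 = [[190, 5, 190]]
diff2 = [[-190, -5, -190]]
# The search area covers only the first two columns.
binary = binarize_candidates(diff1, diff2, (0, 0, 2, 1), threshold=20)
```

The middle pixel is rejected because its differences are below the threshold, and the last pixel is rejected because it lies outside the search area even though its values would qualify.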

The ball selection means 12 (object extraction means) selects the video object (the ball) to be extracted and tracked from the object candidate image generated by the object candidate image generation means 11, based on the extraction conditions stored in the extraction condition storage means 13, and extracts the position of that video object and the feature quantities that characterize it. Here, the ball selection means 12 comprises a labeling unit 121, a feature quantity analysis unit 122, a filter processing unit 123, and an object selection unit 124.

The labeling unit 121 assigns a number (label) to each region that is a video object candidate in the object candidate image (binarized image) generated by the object candidate image generation means 11. That is, the labeling unit 121 assigns one number to each connected region of pixels with the value “1 (white)”, which marks a video object region. The video object candidates in the object candidate image are thereby numbered.
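Labeling of connected regions can be sketched with a breadth-first flood fill. The patent does not state which connectivity is used, so the 4-connectivity below is an assumption, and `label_regions` is a hypothetical name.

```python
from collections import deque


def label_regions(binary):
    """Give each 4-connected region of 1-pixels its own label (1, 2, ...).

    Returns (labels, count); background pixels keep the label 0.
    """
    height, width = len(binary), len(binary[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or labels[y][x]:
                continue
            count += 1
            labels[y][x] = count
            queue = deque([(x, y)])
            while queue:
                cx, cy = queue.popleft()
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and binary[ny][nx] == 1 and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((nx, ny))
    return labels, count


binary = [
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
labels, count = label_regions(binary)
```

Each label then identifies one candidate whose position and feature quantities can be computed separately.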

For each video object candidate numbered by the labeling unit 121, the feature quantity analysis unit 122 calculates the candidate's position coordinates and the values of feature quantities (parameters) such as its area, luminance, color, and circularity.
Here, “position” is, for example, the centroid of the video object. “Area” is, for example, the number of pixels in the video object. “Luminance” is the average luminance of the pixels in the video object. “Color” is the average RGB value of the pixels in the video object. “Circularity” indicates how circular the video object is and takes a larger value the closer the shape is to a circle. For example, when the video object to be extracted has a circular shape, as a ball does, the circularity approaches 1 as the shape approaches a circle. With S the area of the video object and L its perimeter, the circularity e is given by equation (1) below.

e = 4πS / L² … (1)
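Equation (1) can be checked numerically with a short Python sketch; the function name `circularity` is chosen here for illustration. An ideal circle of radius r has S = πr² and L = 2πr, so e = 1, while a square of side a has S = a² and L = 4a, so e = π/4 (about 0.785).

```python
import math


def circularity(area, perimeter):
    """Circularity e = 4*pi*S / L**2 from equation (1)."""
    return 4 * math.pi * area / perimeter ** 2


circle_e = circularity(math.pi * 5 ** 2, 2 * math.pi * 5)  # circle, r = 5
square_e = circularity(10 * 10, 4 * 10)                    # square, side 10
```

On a pixel grid the measured area and perimeter are only approximations, so a real ball will score somewhat away from 1 in practice and the extraction condition would normally be a range rather than an exact value.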

The filter processing unit 123 uses the parameter values calculated by the feature quantity analysis unit 122 to judge whether each candidate matches the extraction condition information (parameter target values) stored in the extraction condition storage means 13, thereby narrowing down the video objects to be extracted (tracked). That is, for each video object candidate, the filter processing unit 123 applies filters (a position filter, area filter, luminance filter, color filter, and circularity filter) to the feature quantities analyzed by the feature quantity analysis unit 122, based on the extraction conditions stored in the extraction condition storage means 13 (for example, area, luminance, color, and circularity) and on the position predicted by the position prediction means 14 described later, and selects the video objects that satisfy the extraction conditions as the video objects to be extracted.
As noted above, the present field image (odd field and even field) stored in the image storage unit 111 is output to the ball selection means 12, and the filter processing unit 123 performs the filter processing at the “present” timing, using as its reference image the input image delayed by one frame (two fields).

The object selection unit 124 selects as the ball, from among the objects that have passed all the filters, the candidate closest to the predicted position coordinates of the ball. The position of the video object extracted here is written to the extraction condition storage means 13 as the current position information of the video object. The position of the video object can be expressed by, for example, the centroid coordinates of the video object, the vertex coordinates of a polygonal approximation, or the control point coordinates of a spline curve. If the object selection unit 124 cannot select a video object that satisfies the extraction conditions, it notifies the drawing/image composition means 30 of the extraction failure.

抽出条件記憶手段１３は、抽出（追跡）対象となる映像オブジェクトを選択するための条件を記憶するもので、一般的なハードディスク等の記憶媒体である。この抽出条件記憶手段１３は、種々の抽出条件を示す抽出条件情報（パラメータ目標値）と、映像オブジェクトの位置を示す位置情報とを記憶している。 The extraction condition storage means 13 stores conditions for selecting a video object to be extracted (tracked), and is a general storage medium such as a hard disk. The extraction condition storage unit 13 stores extraction condition information (parameter target value) indicating various extraction conditions and position information indicating the position of the video object.

The extraction condition information describes the conditions for extracting the target video object, for example at least one of area, luminance, color, and circularity. The ball selection means 12 uses this information as the conditions of the filters (area filter, luminance filter, color filter, and circularity filter) with which it selects the video object to be extracted from the object candidate image generated by the object candidate image generation means 11.
For each of the area, luminance, color, and circularity filters, the extraction condition information stores a predetermined initial value and a threshold indicating its allowable range. Each filter can thus exclude from the candidates any video object whose feature value falls outside the threshold.
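
The following is a minimal sketch of how an initial value and an allowable range could be used to reject candidates. It assumes that the threshold is a symmetric tolerance around the initial value and that every feature is a single number (a color would in practice be a vector); the function name `passes_filters` and the dictionary layout are hypothetical.

```python
def passes_filters(features, conditions):
    """Return True if every feature lies within its allowed range.

    `features` maps a feature name to its measured value.
    `conditions` maps a feature name to (target, tolerance); the symmetric
    tolerance is an assumption, not something the description specifies.
    """
    for name, (target, tolerance) in conditions.items():
        if abs(features[name] - target) > tolerance:
            return False  # outside the allowable range: drop the candidate
    return True
```

For example, with conditions of area 20 ± 10 pixels and circularity 1.0 ± 0.3, a candidate with area 25 and circularity 0.85 is kept, while one with area 45 is excluded.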

The position information stored in the extraction condition storage means 13 indicates the position of the video object currently being tracked. This position information is, for example, the centroid coordinates of the video object, which are calculated by the feature analysis unit 122 of the ball selection means 12. When several video objects match the extraction condition information, the position information also serves as an extraction condition: the video object closest to the coordinates it indicates is chosen as the video object to be extracted.

The position prediction means 14 estimates the predicted position of the ball from the position (centroid coordinates, etc.) of the ball selected by the ball selection means 12, and outputs the predicted position information for the next input field image to the search area setting means 15.
The position prediction means 14 comprises a linear prediction means 141, a curve prediction means 142, and a switching means 143.

The linear prediction means 141 uses a motion vector and predicts that the trajectory of the ball will be a straight line. For example, the linear prediction means 141 applies a Kalman filter to the centroid coordinates to predict the position of the ball in the next field image (next frame) and to estimate the search area. A Kalman filter estimates a state that changes from moment to moment by applying recurrence equations that perform “filtering”, which estimates the current state from observations made in time series, and “prediction”, which estimates a future state.

The curve prediction means 142 predicts that the trajectory of the ball will follow a quadratic curve obtained by the least squares method. Each time a ball is extracted, the curve prediction means 142 updates the quadratic curve (y = ax² + bx + c) used for curve prediction. It also monitors the coefficients a and b of the quadratic curve, determines whether their signs have changed or their values have exceeded a predetermined value, and outputs the determination result to the switching means 143.
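
A least-squares fit of y = ax² + bx + c to the extracted positions can be sketched as follows. This version solves the 3×3 normal equations with Cramer's rule using only the standard library; the function names `det3` and `fit_quadratic` are hypothetical, and the description does not say which numerical method the curve prediction means 142 uses.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_quadratic(points):
    """Least-squares fit of y = a*x**2 + b*x + c; returns (a, b, c)."""
    if len(points) < 3:
        raise ValueError("need at least three points")
    # Power sums of x and moment sums of x**k * y for the normal equations.
    s = [sum(x ** k for x, _ in points) for k in range(5)]
    t = [sum((x ** k) * y for x, y in points) for k in range(3)]
    matrix = [[s[4], s[3], s[2]],
              [s[3], s[2], s[1]],
              [s[2], s[1], s[0]]]
    rhs = [t[2], t[1], t[0]]
    d = det3(matrix)
    if abs(d) < 1e-12:
        # A (near-)zero determinant means fewer than three distinct x values.
        raise ValueError("need at least three distinct x values")
    coeffs = []
    for col in range(3):
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        coeffs.append(det3(replaced) / d)  # Cramer's rule
    return tuple(coeffs)
```

Refitting after every extraction, as the text describes, simply means calling `fit_quadratic` again on the updated list of positions and comparing the new a and b with the previous ones.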

The switching means 143 switches between linear prediction by the linear prediction means 141 and curve prediction by the curve prediction means 142, based on the number of ball position coordinates extracted so far or on the equation of the quadratic curve. For example, the switching means 143 selects curve prediction when the number of previously extracted position coordinates of the video object is at least a predetermined number, and selects linear prediction when it is smaller. The predetermined number is, for example, three; five or more is preferable when a particularly accurate curve is required. In addition, if the signs of the coefficients a and b of the quadratic curve (y = ax² + bx + c) change or their values exceed a predetermined value while curve prediction is being performed, the switching means 143 switches from curve prediction to linear prediction.
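
The switching rule can be sketched as below. Several points are assumptions made for illustration: a sign change is detected by comparing the newest fit with the previous one, "exceeds a predetermined value" is read as an absolute-value limit, a problem with either coefficient triggers the fallback, and the names `choose_predictor`, `min_points`, and `coeff_limit` (with its default of 1.0) are hypothetical.

```python
def choose_predictor(num_points, prev_coeffs, curr_coeffs,
                     min_points=3, coeff_limit=1.0):
    """Return "curve" or "linear" following the switching rule above.

    `prev_coeffs` and `curr_coeffs` are (a, b, c) tuples from successive
    quadratic fits, or None when no fit is available yet.
    """
    if num_points < min_points or curr_coeffs is None:
        return "linear"  # not enough history for a quadratic fit
    a, b = curr_coeffs[0], curr_coeffs[1]
    if abs(a) > coeff_limit or abs(b) > coeff_limit:
        return "linear"  # coefficient outside the allowed magnitude
    if prev_coeffs is not None:
        prev_a, prev_b = prev_coeffs[0], prev_coeffs[1]
        if a * prev_a < 0 or b * prev_b < 0:
            return "linear"  # sign of a or b flipped since the last fit
    return "curve"
```

Setting `min_points` to 5 corresponds to the stricter setting that the text recommends when a particularly accurate curve is required.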

探索領域設定手段１５は、位置予測手段１４によって算出された次フィールド（次フレーム）でのボールの予測位置を利用して、フィールド画像中に所定範囲のボールの探索領域を設定するものである。この探索領域設定手段１５は、設定された探索領域の位置及び大きさを、探索領域情報としてオブジェクト候補画像生成手段１１へ出力する。探索領域の初期値は、図示していないマウス、キーボード等の入力手段によって入力される。探索領域設定手段１５は、初期設定された探索領域を、ボールの動きに合わせて逐次移動する。 The search area setting means 15 uses the predicted position of the ball in the next field (next frame) calculated by the position prediction means 14 to set a search area for the ball within a predetermined range in the field image. The search area setting means 15 outputs the set position and size of the search area to the object candidate image generation means 11 as search area information. The initial value of the search area is input by input means such as a mouse or a keyboard (not shown). The search area setting means 15 sequentially moves the initially set search area in accordance with the movement of the ball.
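
As a minimal sketch of moving a fixed-size search area to follow the predicted position, the rectangle below is centred on the prediction and shifted back inside the image where necessary. Centring on the prediction, keeping the size constant, and the name `search_area` are assumptions made for illustration; the description only states that the position and size are output as search area information.

```python
def search_area(predicted, size, image_size):
    """Return (left, top, right, bottom) of the search rectangle.

    `predicted` is the predicted (x, y) position, `size` is (width, height)
    of the search area, and `image_size` is (width, height) of the field
    image. `right` and `bottom` are exclusive bounds.
    """
    px, py = predicted
    w, h = size
    img_w, img_h = image_size
    left = int(round(px - w / 2))
    top = int(round(py - h / 2))
    # Shift the rectangle back inside the image instead of shrinking it.
    left = max(0, min(left, img_w - w))
    top = max(0, min(top, img_h - h))
    return left, top, min(left + w, img_w), min(top + h, img_h)
```

Restricting the later differencing and labeling steps to this rectangle is what keeps the per-field computation small.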

以上説明した映像オブジェクト軌跡合成装置１は、一般的なコンピュータにプログラムを実行させ、コンピュータ内の演算装置や記憶装置を動作させることにより実現することができる。このプログラム（映像オブジェクト軌跡合成プログラム）は、通信回線を介して配布することも可能であるし、ＣＤ−ＲＯＭ等の記録媒体に書き込んで配布することも可能である。 The video object locus synthesizing apparatus 1 described above can be realized by causing a general computer to execute a program and operating an arithmetic device or a storage device in the computer. This program (video object trajectory synthesis program) can be distributed via a communication line, or can be distributed on a recording medium such as a CD-ROM.

Next, the operation of the video object trajectory synthesis apparatus 1 will be described with reference to FIGS. 2 to 5. FIG. 2 is a flowchart showing the overall operation of the video object trajectory synthesis apparatus 1. FIG. 3 is a flowchart showing the operation of the ball extraction process performed by the video object extraction device 10. FIG. 4 is a flowchart showing the operation of the object candidate image extraction process in the object candidate image generation means 11. FIG. 5 is a flowchart showing the operation of the ball selection process in the ball selection means 12.
First, the operation of the video object trajectory synthesis apparatus 1 will be described with reference to FIG. 2 (and FIG. 1 as appropriate).

映像オブジェクト軌跡合成装置１は、映像オブジェクト抽出装置１０によって、入力された映像から抽出（追跡）対象とするボールを抽出する処理を行う（ステップＳ２０１）。なお、このステップＳ２０１におけるボール抽出処理の詳細な動作については、図３を参照して後で説明することとする。 The video object locus synthesizing device 1 performs a process of extracting a ball to be extracted (tracked) from the input video by the video object extracting device 10 (step S201). The detailed operation of the ball extraction process in step S201 will be described later with reference to FIG.

映像オブジェクト軌跡合成装置１は、作画手段３１によって、映像オブジェクトの特徴量を示す画像を作画する（ステップＳ２０２）。このステップＳ２０２では、ステップＳ２０１で抽出したボールの位置（例えば、重心座標）に、特徴量（例えば、テクスチャ）を作画する。 The video object locus synthesizing apparatus 1 uses the drawing means 31 to draw an image indicating the feature amount of the video object (step S202). In step S202, a feature amount (for example, texture) is drawn at the position (for example, barycentric coordinates) of the ball extracted in step S201.

そして、映像オブジェクト軌跡合成装置１は、画像合成手段３２によって、軌跡画像と、映像遅延手段２０により遅延された映像を構成する入力画像とを合成し、合成画像を生成する（ステップＳ２０３）。なお、画像合成手段３２が、この合成画像を連続して出力することで、映像オブジェクトの移動軌跡を映像に合成した合成映像が生成されることになる。 Then, the video object trajectory synthesizing apparatus 1 synthesizes the trajectory image and the input image constituting the video delayed by the video delay means 20 by the image synthesizing unit 32 to generate a synthesized image (step S203). The image composition means 32 continuously outputs the composite image, thereby generating a composite video in which the moving trajectory of the video object is combined with the video.

Following step S203, the video object trajectory synthesis apparatus 1 determines whether there is an image to be processed next (a next input image) (step S204). If there is a next input image (step S204; Yes), the apparatus uses the position prediction means 14 to estimate the search area of the ball in the next input image from the position (centroid coordinates, etc.) of the ball extracted in step S201 (step S205), and returns to step S201. As described later, the least squares method and a Kalman filter can be used for this estimation. Because the apparatus extracts the ball while updating the search area as needed, the amount of computation required to extract and update the ball can be kept low. If there is no next input image in step S204 (step S204; No), the operation ends.
Through the above operation, the video object trajectory synthesis apparatus 1 successively extracts and tracks the target video object (ball) from the field images input in time series as video, and can output a composite video in which the movement trajectory of the video object is combined with the video.

Next, the operation of the ball extraction process (step S201 in FIG. 2) in the video object extraction device 10 of the video object trajectory synthesis apparatus 1 will be described with reference to FIG. 3 (and FIG. 1 as appropriate).
First, a search area is input through an input means (not shown) such as a mouse or a keyboard, and the video object extraction device 10 uses the search area setting means 15 to set, within part of an image constituting the input video, a search area in which to search for the video object (step S301). The position of this search area is updated as needed to follow the movement of the ball.

次に、映像オブジェクト抽出装置１０は、オブジェクト候補画像生成手段１１によって、探索領域からボールの候補画像（オブジェクト候補画像）を抽出する処理を行う（ステップＳ３０２）。なお、このステップＳ３０２におけるボールの候補画像の抽出処理の詳細な動作については、図４を参照して後で説明することとする。 Next, the video object extracting device 10 performs a process of extracting a ball candidate image (object candidate image) from the search area by the object candidate image generating unit 11 (step S302). The detailed operation of the ball candidate image extraction process in step S302 will be described later with reference to FIG.

Once the object candidate image generation means 11 has extracted the ball candidate image, the video object extraction device 10 uses the ball selection means 12 to narrow down the ball candidates and to extract the position coordinates and feature quantities of the ball (step S303). The detailed operation of the ball selection process in step S303 will be described later with reference to FIG. 5.
Following step S303, the video object extraction device 10 uses the ball selection means 12 to output the position coordinates and feature quantities of the ball selected from the object candidate image in step S303 to the drawing/image composition means 30 (step S304).

次に、図４を参照（適宜図１参照）して、映像オブジェクト抽出装置１０のオブジェクト候補画像生成手段１１における、オブジェクト候補画像の抽出処理（図３のステップＳ３０２）の動作について説明する。
オブジェクト候補画像生成手段１１には、３枚の連続した入力画像（過去、現在、未来）が入力される。 Next, the operation of the object candidate image extraction process (step S302 in FIG. 3) in the object candidate image generation unit 11 of the video object extraction device 10 will be described with reference to FIG. 4 (refer to FIG. 1 as appropriate).
Three consecutive input images (past, present and future) are input to the object candidate image generating unit 11.

The object candidate image generation means 11 uses the difference image generation means 113 to generate a difference image Image1 (difference image 1) whose pixel values are the luminance differences between the current input image (the image stored in the image storage unit 111; for example, an odd field) and the future input image (the latest input image; for example, an odd field) (step S401).
The object candidate image generation means 11 also uses the difference image generation means 114 to generate a difference image Image2 (difference image 2) whose pixel values are the luminance differences between the current input image (the image stored in the image storage unit 111; for example, an odd field) and the past input image (the image stored in the image storage unit 112; for example, an odd field) (step S402).

The object candidate image generation means 11 uses the candidate image generation means 115 to identify the regions in which the difference images Image1 and Image2 satisfy the following conditions, and extracts them as the object candidate image (step S403). In this embodiment, the video object to be extracted (tracked) is a ball, whose luminance is generally higher than that of the background image, so the condition is that expressions (2) and (3) are satisfied simultaneously.

Image1(x, y) > T … (2)
Image2(x, y) < −T … (3)

In expressions (2) and (3), Image1(x, y) is the luminance value of the pixel at coordinates (x, y) in the difference image Image1, Image2(x, y) is the luminance value of the pixel at coordinates (x, y) in the difference image Image2, and T is a luminance threshold.
The reason why an object (ball) moving at high speed can be extracted when the difference images Image1 and Image2 satisfy conditional expressions (2) and (3) is explained below with reference to FIG. 6.

FIG. 6 is an explanatory diagram for conditional expressions (2) and (3) of the difference images. FIG. 6(a) shows three consecutive input images (each, for example, an odd field image), and FIG. 6(b) shows the two difference images. As shown in FIG. 6(a), the three consecutive field images 601, 602, and 603, representing the past, present, and future, show one ball at its past, present, and future positions (M1, M2, M3). The ball moves fast enough that its positions M1, M2, and M3 do not overlap between field images, and its luminance is higher than that of the background image.
As shown in FIG. 6(b), the difference image Image1 is generated using as pixel values the differences obtained by subtracting the luminance of the field image 603 from the luminance of the field image 602. The difference image Image2 is generated using as pixel values the differences obtained by subtracting the luminance of the field image 602 from the luminance of the field image 601.

Region R2 in the difference image Image1 consists of luminance difference values obtained by subtracting the luminance of an area containing no object from the luminance of the ball M2, so its pixel values are positive (for example, white).
Region R3 in the difference image Image1 consists of luminance difference values obtained by subtracting the luminance of the ball M3 from the luminance of an area containing no object, so its pixel values are negative (for example, black).
The other regions of the difference image Image1, including region R1, consist of luminance difference values obtained by subtracting the luminances of areas containing no object from each other, so their pixel values are 0.

Region R1 in the difference image Image2 consists of luminance difference values obtained by subtracting the luminance of an area containing no object from the luminance of the ball M1, so its pixel values are positive (for example, white).
Region R2 in the difference image Image2 consists of luminance difference values obtained by subtracting the luminance of the ball M2 from the luminance of an area containing no object, so its pixel values are negative (for example, black).
The other regions of the difference image Image2, including region R3, consist of luminance difference values obtained by subtracting the luminances of areas containing no object from each other, so their pixel values are 0.

Comparing the difference images Image1 and Image2, the region whose pixel values are positive in Image1 (expression (2)) and negative in Image2 (expression (3)) is region R2, that is, the ball M2 at its current position. The video object extraction device 10 can therefore use conditional expressions (2) and (3) to extract a video object (ball) moving at high speed as the object candidate image.
When the luminance of the target video object is lower than that of the background image, the condition is instead that expressions (4) and (5) are satisfied simultaneously.

Image1(x, y) < −T … (4)
Image2(x, y) > T … (5)

この場合、差分画像Ｉｍａｇｅ１において画素値の符号がマイナスであり（式（４））、且つ、差分画像Ｉｍａｇｅ２において画素値の符号がプラスである（式（５））領域が映像オブジェクトとして抽出される。 In this case, a region in which the sign of the pixel value is negative in the difference image Image1 (formula (4)) and the sign of the pixel value is positive in the difference image Image2 (formula (5)) is extracted as a video object. .
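
The three-field differencing and conditions (2) to (5) can be put together in a minimal sketch. Images are represented here as nested lists of luminance values, Image1 is taken as current minus future and Image2 as past minus current, following the description of FIG. 6; the function name `candidate_mask` and the `object_is_brighter` flag are hypothetical.

```python
def candidate_mask(past, current, future, threshold, object_is_brighter=True):
    """Return a binary mask (1 = candidate) from three consecutive fields.

    Image1 = current - future and Image2 = past - current, per pixel.
    A bright object must satisfy (2) and (3); a dark object (4) and (5).
    """
    height, width = len(current), len(current[0])
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            image1 = current[y][x] - future[y][x]
            image2 = past[y][x] - current[y][x]
            if object_is_brighter:
                hit = image1 > threshold and image2 < -threshold  # (2), (3)
            else:
                hit = image1 < -threshold and image2 > threshold  # (4), (5)
            mask[y][x] = 1 if hit else 0
    return mask
```

Only the pixels occupied by the object in the current field satisfy both conditions, so the past and future positions are suppressed and the resulting mask can be passed directly to labeling.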

図１の映像オブジェクト軌跡合成装置１の動作について説明を続ける。
図５を参照（適宜図１参照）して、映像オブジェクト抽出装置１０のボール選定手段１２における、ボール選定処理（図３のステップＳ３０３）の動作について説明する。
ボール選定手段１２は、ラベリング部１２１によって、オブジェクト候補画像の中で、映像オブジェクトの候補となる領域に対して番号（ラベル）を付す（ステップＳ５０１）。なお、以降の動作は、映像オブジェクトの候補に付された番号に基づいて、映像オブジェクトの単位で処理される。 The description of the operation of the video object trajectory synthesis apparatus 1 in FIG. 1 will be continued.
The operation of the ball selection process (step S303 in FIG. 3) in the ball selection means 12 of the video object extraction device 10 will be described with reference to FIG. 5 (refer to FIG. 1 as appropriate).
The ball selection unit 12 uses the labeling unit 121 to assign a number (label) to a region that is a candidate for a video object in the object candidate image (step S501). The subsequent operations are processed in units of video objects based on the numbers assigned to the video object candidates.

Using the feature analysis unit 122, the ball selection means 12 selects, for each video object numbered in step S501, the video object to be extracted (tracked) from the object candidate image based on the extraction conditions and other information stored in the extraction condition storage means 13, then analyzes (calculates) and extracts the position and feature quantities of the selected video object (step S502). The position of the video object is, for example, its centroid coordinates. The feature quantities of the video object include its area, luminance, color, and circularity.

Using the filter processing unit 123, the ball selection means 12 selects a video object from the object candidate image by its number (label) and performs filter processing (step S503). The filter processing unit 123 first determines whether the “position” of the selected video object falls within the range predicted by the position prediction means 14. If the position condition is met, it determines whether the “area” of the selected video object satisfies the extraction condition stored in the extraction condition storage means 13. If the area condition is met, it determines whether the “color” of the video object satisfies the extraction condition. If the color condition is met, it determines whether the “luminance” of the video object satisfies the extraction condition. If the luminance condition is met, it determines whether the “circularity” of the video object satisfies the extraction condition. If the circularity condition is also met, that is, if all the extraction conditions are satisfied, the filter processing unit 123 selects the video object as a video object to be tracked.

ステップＳ５０３に続いて、ボール選定手段１２は、オブジェクト選択部１２４によって、予測位置座標に最も近い映像オブジェクトを追跡対象のボールとして選択する処理を行う（ステップＳ５０４）。これにより、すべてのフィルタを通過した映像オブジェクトが複数存在する場合にもボールを選定することができる。
以上の動作によって、映像オブジェクト抽出装置１０は、ボール選定手段１２によって、オブジェクト候補画像の中から抽出条件に適した映像オブジェクト（ボール）を選択することができる。 Subsequent to step S503, the ball selecting unit 12 performs a process of selecting the video object closest to the predicted position coordinates as the tracking target ball by the object selecting unit 124 (step S504). As a result, a ball can be selected even when there are a plurality of video objects that have passed through all the filters.
With the above operation, the video object extraction device 10 can select a video object (ball) suitable for the extraction condition from the object candidate images by the ball selection unit 12.
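
The final choice in step S504 can be sketched as a nearest-neighbour selection. The sketch assumes that each candidate is represented by its centroid and that distance is Euclidean; the function name `select_ball` is hypothetical, and returning `None` stands in for the extraction-failure notification.

```python
import math


def select_ball(candidates, predicted):
    """Pick the candidate centroid closest to the predicted position.

    `candidates` is a list of (x, y) centroids that passed every filter.
    Returns None when the list is empty (extraction failure).
    """
    if not candidates:
        return None
    return min(candidates,
               key=lambda c: math.hypot(c[0] - predicted[0],
                                        c[1] - predicted[1]))
```

With candidates at (10, 10), (52, 48), and (90, 20) and a predicted position of (50, 50), the candidate at (52, 48) is chosen.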

Next, the operation of the position prediction process (step S205) by the position prediction means 14 of the video object extraction device 10 will be described with reference to FIG. 7. FIG. 7 is an explanatory diagram of the position prediction process.
Assume that balls M1 to M4 at different times have been obtained by past object extraction processing, and that the position coordinates of the ball M5 (double circle in the figure) in the current field image have been extracted. Using the linear prediction means 141, the position prediction means 14 obtains the motion vector indicated by arrow V1 from the extracted position coordinates in the current field (ball M5) and those in the previous field (previous frame) (ball M4).

Based on the motion vector V1, the linear prediction means 141 then adds a motion vector V2, equal to V1, to the extracted position coordinates in the current field (ball M5) to provisionally obtain the linear predicted position coordinates Y1 (triangle in the figure) as the predicted position of the ball (linear prediction). Meanwhile, using the curve prediction means 142, the position prediction means 14 applies the least squares method to obtain the quadratic curve K1, shown as a broken line, that passes on average through the ball trajectory from the past to the present (balls M1 to M5). The position prediction means 14 then corrects the linear predicted position coordinates Y1 onto the quadratic curve K1 to obtain the curve predicted position coordinates Y2, shown as a square (curve prediction). The curve predicted position coordinates Y2 are used as the search position in the next field image (next frame).
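
The two prediction steps can be sketched as follows. The linear step repeats the last motion vector, as described. How Y1 is "corrected onto" K1 is not specified in the text, so the sketch assumes the simplest reading: keep the x coordinate of Y1 and take y from the curve. The function names `predict_linear` and `correct_onto_curve` are hypothetical.

```python
def predict_linear(previous, current):
    """Y1 = M5 + V2, where V2 equals the last motion vector V1 = M5 - M4."""
    vx = current[0] - previous[0]
    vy = current[1] - previous[1]
    return current[0] + vx, current[1] + vy


def correct_onto_curve(linear_prediction, coeffs):
    """Move Y1 onto the curve y = a*x**2 + b*x + c to obtain Y2.

    Keeping x and replacing only y is an assumption made for this sketch.
    """
    a, b, c = coeffs
    x = linear_prediction[0]
    return x, a * x * x + b * x + c
```

For example, with M4 = (30, 18) and M5 = (40, 25), the linear prediction Y1 is (50, 32); if the fitted curve were y = 0.01x² + 10, the corrected position Y2 would have the same x = 50 and y = 35.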

However, using the switching means 143, the position prediction means 14 switches between linear prediction by the linear prediction means 141 and curve prediction by the curve prediction means 142, based on the number of ball position coordinates extracted so far or on the equation of the quadratic curve. That is, when there are not enough past ball position coordinates to perform curve prediction (three may suffice, but five or more is preferable), the position prediction means 14 performs only linear prediction. In addition, if the signs of the coefficients a and b of the quadratic curve (y = ax² + bx + c) used for curve prediction change, or their values exceed a predetermined value, while the curve prediction means 142 is performing curve prediction, the position prediction means 14 switches to linear prediction.

Next, an example in which the video object trajectory synthesis apparatus 1 synthesizes the trajectory of a video object is described with reference to FIG. 8. FIG. 8 is an explanatory diagram illustrating an example in which the movement trajectory of a video object (ball) is synthesized onto the video. Here, the description assumes that a baseball is tracked and its movement trajectory is synthesized.

The position coordinates of the ball thrown by the pitcher PT toward the catcher CT are extracted at predetermined timings (10 points in the figure), and the ball passes through the positions of balls M1 to M10 before landing in the mitt of the catcher CT.
At this time, the video object trajectory synthesis apparatus 1 initially predicts the position coordinates of the ball by linear prediction, using the position prediction means 14. Once enough trajectory points for curve prediction (for example, five) have been extracted, the position prediction means 14 switches from linear prediction to curve prediction. In FIG. 8, the left-handed batter BT stands between ball M6 and ball M7. When the ball overlaps the uniform of the left-handed batter BT in the image, the difference between the luminance of the ball and that of the uniform is so small that the ball is blinded, and extraction of the ball may fail. Even in this case, by performing curve prediction, the video object trajectory synthesis apparatus 1 can predict the position of ball M7 from the movement trajectory of the ball over the several points preceding ball M6. Consequently, even when the ball is blinded by a moving object (the left-handed batter BT) and its extraction fails, the predicted position of the ball after it passes the batter connects smoothly with the previous field (previous frame) and is more accurate than with linear prediction alone. The video object trajectory synthesis apparatus 1 can therefore narrow the search area, which keeps the amount of computation required for ball extraction low.
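To give a feel for why a tighter prediction reduces computation, the following sketch builds a square search window around a predicted position and clips it to the image. The function name, the window half-size of 8 pixels, and the 720 x 240 field size are all hypothetical; the text does not specify how the search area setting means 15 sizes its region.

```python
def search_window(predicted, half_size, width, height):
    """Return (left, top, right, bottom) around the predicted position.

    right and bottom are exclusive, and the window is clipped to the image.
    """
    cx, cy = round(predicted[0]), round(predicted[1])
    left = max(0, cx - half_size)
    top = max(0, cy - half_size)
    right = min(width, cx + half_size + 1)
    bottom = min(height, cy + half_size + 1)
    return left, top, right, bottom


# Hypothetical field size and predicted position.
WIDTH, HEIGHT = 720, 240
left, top, right, bottom = search_window((100.4, 50.6), 8, WIDTH, HEIGHT)
window_pixels = (right - left) * (bottom - top)
print("window:", (left, top, right, bottom))
print("pixels searched:", window_pixels, "of", WIDTH * HEIGHT)
```

The more accurate the predicted position, the smaller `half_size` can be while still containing the ball, and the number of pixels examined shrinks with the square of that size.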

In addition, as described above, when extraction of the ball fails, the ball selection means 12 of the video object trajectory synthesis apparatus 1 notifies the drawing and image synthesis means 30 of the failure. In this case, the drawing and image synthesis means 30 uses interpolation means (not shown) to calculate the ball position on the trajectory image by interpolating between the positions (for example, the centroids) of ball M6 and ball M7, so that the drawing means 31 can draw ball N1.
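A minimal sketch of this interpolation step follows. The text only says "interpolation", so linear interpolation between the two centroids is an assumption, and the function name and coordinates are hypothetical.

```python
def interpolate_positions(p_before, p_after, missing=1):
    """Linearly interpolate `missing` evenly spaced positions between two centroids."""
    (x0, y0), (x1, y1) = p_before, p_after
    steps = missing + 1
    return [
        (x0 + (x1 - x0) * k / steps, y0 + (y1 - y0) * k / steps)
        for k in range(1, steps)
    ]


# Made-up centroids for M6 and M7, with one field lost between them (ball N1).
m6, m7 = (10, 20), (30, 40)
print(interpolate_positions(m6, m7))  # [(20.0, 30.0)]
```

If several consecutive fields are lost, `missing` can be raised so that one position is produced per lost field.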

An embodiment of the present invention has been described above, but the present invention is not limited to this embodiment. For example, although the video object has been described as a baseball, it may be a ball used in another sport, such as a soccer ball, tennis ball, or golf ball. Furthermore, the video object is not limited to a ball, provided that it moves fast enough that its images in successive field images do not overlap. Also, although this embodiment uses the field as the time unit of the input image, another time unit, such as the frame, may be used.

FIG. 1 is a block diagram showing the configuration of the video object trajectory synthesis apparatus according to the present invention.
FIG. 2 is a flowchart showing the overall operation of the video object trajectory synthesis apparatus according to the present invention.
FIG. 3 is a flowchart showing the operation of the ball extraction processing by the video object extraction apparatus according to the present invention.
FIG. 4 is a flowchart showing the operation of the object candidate image extraction processing in the object candidate image generation means.
FIG. 5 is a flowchart showing the operation of the ball selection processing in the ball selection means.
FIG. 6 is an explanatory diagram for explaining the conditional expression for the difference images; (a) shows three consecutive field images, and (b) shows two difference images.
FIG. 7 is an explanatory diagram for explaining the position prediction processing.
FIG. 8 is an explanatory diagram for explaining an example in which the movement trajectory of a video object is synthesized onto the video.

Explanation of symbols

1 Video object trajectory synthesis apparatus
10 Video object extraction apparatus
11 Object candidate image generation means
12 Ball selection means (object extraction means)
13 Extraction condition storage means
14 Position prediction means
15 Search area setting means
20 Video delay means
30 Drawing and image synthesis means
31 Drawing means
32 Image synthesis means
111 Image storage unit (second image storage means)
112 Image storage unit (first image storage means)
113 Difference image generation means (first difference image generation means)
114 Difference image generation means (second difference image generation means)
115 Candidate image generation means
121 Labeling unit
122 Feature analysis unit
123 Filter processing unit
124 Object selection unit
141 Linear prediction means
142 Curve prediction means
143 Switching means

Claims

A video object extraction device that extracts the position of a video object included in an input video, comprising:
object candidate image generation means for generating, from consecutive first, second, and third images input in time series, an object candidate image in which candidates for the video object are extracted; and
object extraction means for extracting, from the generated object candidate image, the position of the video object to be extracted, based on a predetermined extraction condition,
wherein the object candidate image generation means includes:
first image storage means for storing the first image;
second image storage means for storing the second image;
first difference image generation means for generating a first difference image whose pixel values are the luminance differences between the third image and the second image;
second difference image generation means for generating a second difference image whose pixel values are the luminance differences between the second image and the first image; and
candidate image generation means for generating the object candidate image by referring to the pixel values at common positions in the first difference image and the second difference image and identifying regions in which the sign of the pixel value of the first difference image differs from the sign of the pixel value of the second difference image.

The video object extraction device according to claim 1, further comprising object position prediction means for predicting the position of the video object in the next input image based on the position of the video object extracted by the object extraction means, and for outputting the predicted position information of the video object to the object candidate image generation means.

The video object extraction device according to claim 2, wherein the object position prediction means includes:
linear prediction means for performing prediction on the assumption that the trajectory of the video object is a straight line;
curve prediction means for performing prediction on the assumption that the trajectory of the video object is a curve; and
switching means for switching between linear prediction by the linear prediction means and curve prediction by the curve prediction means, based on the number of previously extracted position coordinates of the video object or on the equation of the curve.

A video object trajectory synthesis device comprising:
the video object extraction device according to any one of claims 1 to 3;
drawing means for generating a trajectory image in which the movement trajectory of the video object is extracted, by drawing an image representing the video object in a region corresponding to the images input in time series, based on the position of the video object extracted by the object extraction means; and
image synthesis means for synthesizing the trajectory image with the images input in time series.

A video object trajectory synthesis method that extracts and tracks a video object included in an input video and synthesizes the movement trajectory of the video object onto the video, comprising:
a difference image generation step of generating, from a first image, a second image, and a third image input in time series, a first difference image whose pixel values are the luminance differences between the third image and the second image, and a second difference image whose pixel values are the luminance differences between the second image and the first image;
an object candidate image generation step of generating an object candidate image by referring to the pixel values at common positions in the first difference image and the second difference image generated in the difference image generation step, and extracting, as candidates for the video object, regions in which the sign of the pixel value of the first difference image differs from the sign of the pixel value of the second difference image;
an object extraction step of extracting the position of the video object from the object candidate image generated in the object candidate image generation step, based on at least one extraction condition among position, area, luminance, color, and circularity;
a drawing step of generating a trajectory image in which the movement trajectory of the video object is extracted, by drawing an image representing the video object in a region corresponding to the images input in time series, based on the position of the video object extracted in the object extraction step; and
an image synthesis step of synthesizing the trajectory image generated in the drawing step with the images input in time series.

A video object trajectory synthesis program that, in order to extract and track a video object included in an input video and to synthesize the movement trajectory of the video object onto the video, causes a computer to function as:
difference image generation means for generating, from a first image, a second image, and a third image input in time series, a first difference image whose pixel values are the luminance differences between the third image and the second image, and a second difference image whose pixel values are the luminance differences between the second image and the first image;
object candidate image generation means for generating an object candidate image by referring to the pixel values at common positions in the first difference image and the second difference image generated by the difference image generation means, and extracting, as candidates for the video object, regions in which the sign of the pixel value of the first difference image differs from the sign of the pixel value of the second difference image;
object extraction means for extracting the position of the video object from the object candidate image generated by the object candidate image generation means, based on at least one extraction condition among position, area, luminance, color, and circularity;
drawing means for generating a trajectory image in which the movement trajectory of the video object is extracted, by drawing an image representing the video object in a region corresponding to the images input in time series, based on the position of the video object extracted by the object extraction means; and
image synthesis means for synthesizing the trajectory image generated by the drawing means with the images input in time series.
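
The sign test on the two difference images that the claims recite can be illustrated with a short Python sketch. It is not the patented implementation: the function name, the list-of-rows image representation, the optional noise threshold, and the sample luminance values are all assumptions made for illustration.

```python
def candidate_mask(img1, img2, img3, threshold=0):
    """Mark pixels where the two difference images have opposite signs.

    img1, img2, img3 are consecutive luminance images (oldest first), each a
    list of rows. threshold is a hypothetical noise margin; 0 gives the plain
    sign comparison.
    """
    mask = []
    for row1, row2, row3 in zip(img1, img2, img3):
        out = []
        for p1, p2, p3 in zip(row1, row2, row3):
            d1 = p3 - p2  # first difference image: third minus second
            d2 = p2 - p1  # second difference image: second minus first
            opposite = (d1 > threshold and d2 < -threshold) or \
                       (d1 < -threshold and d2 > threshold)
            out.append(1 if opposite else 0)
        mask.append(out)
    return mask


# A bright one-pixel "ball" moving right by one pixel per image (made-up data).
img1 = [[200, 10, 10, 10, 10]]
img2 = [[10, 200, 10, 10, 10]]
img3 = [[10, 10, 200, 10, 10]]
print(candidate_mask(img1, img2, img3))  # [[0, 1, 0, 0, 0]]
```

In this example only the pixel occupied by the ball in the second image is marked: it brightens from the first image to the second and darkens from the second to the third, so the two differences have opposite signs, while pixels that change only once have a zero difference on one side.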