
JP5192940B2 - Image conversion apparatus, image conversion program, and image conversion method

Info

Publication number: JP5192940B2
Application number: JP2008211535A
Authority: JP
Inventors: 俊枝三須; 真人藤井; 伸行八木
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2008-08-20
Filing date: 2008-08-20
Publication date: 2013-05-08
Anticipated expiration: 2028-08-20
Also published as: JP2010050601A

Description

The present invention relates to an image conversion apparatus, an image conversion program, and an image conversion method that convert an image so as to selectively exaggerate a specific video object in the image.

Conventionally, when a high-resolution image is displayed on a low-resolution or small display, the mainstream approach has been to convert the resolution or display size at a constant magnification so that the entire high-resolution image fits within the display area.

In addition, Patent Document 1, for example, proposes a trimming technique that estimates and cuts out an important region of an image and converts the cut-out region so that it fits within the display area.
Patent Document 1: Japanese Unexamined Patent Application Publication No. 2007-6111 (paragraphs 0020 to 0023, FIG. 1)

However, when the entire high-resolution image is converted to fit within the display area, and particularly when the main subject was shot small, reducing the image degrades the visibility of that subject. As a result, viewers become fatigued or find the image content difficult to understand.

For example, consider viewing a high-resolution image, shot with a high-definition camera, of a relatively wide area around the ball in footage of a soccer game. On a high-definition display, the faces of the players in the image can be identified easily. When the entire image is reduced for display on a small low-resolution display, such as a television for One-Seg broadcasting, each player's face becomes so small that it can be difficult to tell who is playing.

The technique of Patent Document 1 is effective when the main subjects are concentrated in one part of the image. When the main subjects are spread across the whole image, however, it is difficult to set an effective cut-out range, and a wide range ends up being reduced for display, which causes the same problem as reducing the entire image.

In view of these problems, an object of the present invention is to provide an image conversion apparatus, an image conversion program, and an image conversion method that apply different magnifications to the main subject region and to the remaining region, and convert the input into a composite image in which the main subject is relatively enlarged and thus exaggerated.

To achieve the above object, the image conversion apparatus according to claim 1 converts an input image into an image in which a specific video object in the input image is relatively enlarged. The apparatus comprises video object detection means, video object selection means, partial image cutout means, image scaling means, and image synthesis means, and the video object selection means comprises a plurality of decision logic means and target selection means.

With this configuration, the video object detection means detects video objects of one or more predetermined types in the input image and outputs, for each detected video object, type information including the type of that video object and position information indicating where it was detected. The video object selection means then selects, from the detected video objects, a video object that meets predetermined conditions as the specific video object, based on the type information and position information. Next, the partial image cutout means cuts out from the input image a partial image containing the selected specific video object, and the image scaling means scales at least one of the input image and the partial image to generate an image pair consisting of a whole image and a partial image, in which the partial image is enlarged relative to the whole area of the input image. The image synthesis means then combines the whole image and the partial image of the pair.
In this process, each of the plurality of decision logic means of the video object selection means determines, based on the type information and position information of the detected video objects, the video objects that meet its own predetermined condition as candidates for the specific video object. The target selection means of the video object selection means then selects the specific video object from among these candidates. The image scaling means scales the partial image of the selected specific video object so that it is enlarged relative to the input image, or relative to the scaled whole image when whole image scaling means is provided, and thereby creates a scaled partial image. Finally, the image synthesis means combines the whole image of the pair (the input image or the scaled whole image) with the scaled partial image.

Thus, when converting an input image into an image that is exaggerated by selectively enlarging the image of a specific video object, the image conversion apparatus can produce an image in which the specific video object, selected from among the video objects that meet a plurality of conditions, is exaggerated.
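
The two-stage selection described above (several decision logics that each propose candidates, followed by a target selection) might be sketched as follows. This is only an illustration: the `Detection` record, the two example rules `ball_rule` and `faces_near_ball_rule`, the 300-pixel distance threshold, and the union policy in `select_targets` are all hypothetical and are not taken from the claims.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Detection:
    kind: str   # type information, e.g. "face" or "ball" (hypothetical labels)
    x: float    # position information: representative point X
    y: float    # position information: representative point Y


def ball_rule(detections):
    """Hypothetical decision logic: every detected ball is a candidate."""
    return [d for d in detections if d.kind == "ball"]


def faces_near_ball_rule(detections, max_distance=300.0):
    """Hypothetical decision logic: faces within max_distance pixels of a ball."""
    balls = [d for d in detections if d.kind == "ball"]
    return [
        d for d in detections
        if d.kind == "face"
        and any(hypot(d.x - b.x, d.y - b.y) <= max_distance for b in balls)
    ]


def select_targets(detections, rules):
    """Target selection (assumed policy): union of all candidates, no duplicates."""
    selected = []
    for rule in rules:
        for candidate in rule(detections):
            if candidate not in selected:
                selected.append(candidate)
    return selected
```

Calling `select_targets(detections, [ball_rule, faces_near_ball_rule])` returns the ball and any face close to it. Other selection policies, such as taking only the intersection of the candidate sets, would fit the same structure.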

The image conversion apparatus according to claim 2 is the image conversion apparatus according to claim 1, in which the image scaling means includes whole image scaling means.

With this configuration, the whole image scaling means of the image scaling means creates a scaled whole image by reducing the entire input image. The image synthesis means then combines this scaled whole image with its counterpart in the image pair, namely the partial image of the specific video object cut out by the video object partial image cutout means.
Thus, the image conversion apparatus can convert the input image into an image that can be displayed on an image display device whose resolution is lower than that of the input image and in which the specific video object is exaggerated.

The image conversion apparatus according to claim 3 is the image conversion apparatus according to claim 1 or claim 2, in which the image scaling means includes partial image scaling means.

With this configuration, the partial image scaling means of the image scaling means scales the partial image of the specific video object so that it is enlarged relative to the input image, or relative to the scaled whole image when whole image scaling means is provided, and thereby creates a scaled partial image. The image synthesis means then combines the whole image of the pair (the input image or the scaled whole image) with this scaled partial image.
Thus, the image conversion apparatus can convert the input image into an image in which the specific video object is exaggerated. Furthermore, when whole image scaling means is provided, the input image can be converted into an image that suits an image display device of a desired resolution and in which the specific video object is exaggerated.

The image conversion apparatus according to claim 4 is the image conversion apparatus according to claim 3, in which the partial image scaling means performs nonlinear scaling and the image synthesis means performs the synthesis using the junction between video objects as a reference point.

With this configuration, the partial image scaling means creates a scaled partial image by nonlinearly scaling the partial image of the selected specific video object. At the junction with another video object in the input image (one that is not the specific video object and with which the partial image is combined by the image synthesis means), the magnification equals that of the other video object, and the magnification increases with distance from the junction. The image synthesis means then combines the nonlinearly scaled partial image with the whole image forming its image pair, using the junction as the reference point.
Thus, the image conversion apparatus can convert the input image into an image in which the specific video object is exaggerated by nonlinear scaling.
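
One way to read this condition is as a one-dimensional coordinate mapping along the axis leading away from the junction. The sketch below assumes, purely for illustration, that the local magnification grows linearly with the source distance d from the junction, m(d) = m0 + c*d, so that the output distance is its integral, u(d) = m0*d + c*d^2/2. The linear growth law and the names `base_scale` and `growth` are assumptions; the text above only requires that the magnification match at the junction and increase with distance from it.

```python
def local_magnification(d, base_scale, growth):
    """Assumed local magnification at source distance d from the junction."""
    return base_scale + growth * d


def warped_distance(d, base_scale, growth):
    """Output distance from the junction: the integral of local_magnification."""
    return base_scale * d + 0.5 * growth * d * d
```

With `base_scale` set to the magnification of the surrounding whole image, the partial image meets the other video object seamlessly at d = 0 and is stretched progressively more the farther a pixel lies from the junction.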

The image conversion program according to claim 5 causes a computer to function as the image conversion apparatus according to any one of claims 1 to 4.

The image conversion method according to claim 6 converts an input image into an image in which a specific video object in the input image is relatively enlarged. The method includes a video object detection step, a video object selection step, a video object partial image cutout step, an image scaling step, and an image synthesis step, and the video object selection step includes a plurality of decision logic steps and a target selection step.

According to this method, in the video object detection step, the video object detection means detects video objects of one or more predetermined types in the input image and outputs, for each detected video object, type information including its type and position information indicating where it was detected. In the video object selection step, the video object selection means selects, from the video objects detected in the detection step, a video object that meets predetermined conditions as the specific video object, based on the type information and position information. In the partial image cutout step, the partial image cutout means cuts out from the input image a partial image containing the selected specific video object. In the image scaling step, the image scaling means scales at least one of the input image and the partial image to generate an image pair consisting of a whole image and a partial image, in which the partial image is enlarged relative to the whole area of the input image. In the image synthesis step, the image synthesis means combines the whole image and the partial image of the pair.
In this process, each of the plurality of decision logic steps of the video object selection step determines, based on the type information and position information of the video objects detected by the video object detection means, the detected video objects that meet its own predetermined condition as candidates for the specific video object. The target selection step of the video object selection step then selects the specific video object from among these candidates. The image scaling step scales the partial image of the selected specific video object so that it is enlarged relative to the input image, or relative to the scaled whole image when a whole image scaling step is included, and thereby creates a scaled partial image. Finally, the image synthesis step combines the whole image of the pair (the input image or the scaled whole image) with the scaled partial image.
Thus, when converting an input image into an image that is exaggerated by selectively enlarging the image of a specific video object, the method can produce an image in which the specific video object, selected from among the video objects that meet a plurality of conditions, is exaggerated.

According to the present invention, specific portions that meet a plurality of conditions are selectively exaggerated. Therefore, particularly when there are several portions that deserve attention, the input can be converted into a highly visible image in which all of those portions are exaggerated without omission.

In the claims, a video object means the image of a subject (object), such as a person or a ball, as it appears in the input image in which that subject was shot.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings as appropriate. The description takes as an example a case where the present invention is applied to footage of a soccer game.

<First Embodiment>
[Configuration of Image Conversion Apparatus]
First, the configuration of the image conversion apparatus 1 according to the first embodiment of the present invention will be described with reference to FIG. 1. FIG. 1 is a block diagram showing the configuration of the image conversion apparatus according to the first embodiment of the present invention.

The image conversion apparatus 1 shown in FIG. 1 comprises image scaling means 100, which includes whole image scaling means 10 and partial image scaling means 50, together with video object detection means 20, exaggeration target video object selection means 30, video object region cutout means 40, and image synthesis means 60.
The image conversion apparatus 1 receives an input image H supplied from an image input device 2, creates a composite image G, and outputs it to an image display device 3.

Next, the configuration of each part of the image conversion apparatus 1 will be described.
The image scaling means 100 includes the whole image scaling means 10 and the partial image scaling means 50. These two means work together to generate, from the input image H (the whole image) and a partial image T of a specific video object, an image pair consisting of a whole image (scaled whole image L) and a partial image (scaled partial image U) in which the partial image T is relatively enlarged.

The whole image scaling means 10 of the image scaling means 100 receives the input image H, which is the whole image supplied from the image input device 2, scales it to create a scaled whole image L at a resolution suited to the image display device 3, and outputs the scaled whole image L to the image synthesis means 60.

The image input device 2 that supplies the input image H can be an imaging device such as a high-definition camera, or an image storage device such as a hard disk or an optical disc. The whole image scaling means 10 receives, as the input image H, an image shot by such an imaging device or read from such a storage device, either directly or via a communication line, a broadcast receiver, or the like.

The input image H will now be described with reference to FIG. 2. FIG. 2 shows an example of an input image.
In the input image H, image coordinates take the upper-left corner as the origin (0, 0). The first element of the two-dimensional coordinate is the horizontal coordinate X (positive to the right) and the second is the vertical coordinate Y (positive downward), each counted in pixels. The same convention applies to image coordinates in the scaled whole image L described later.

The input image H shown in FIG. 2 depicts a scene from a soccer game: on the right side of the frame is a video object M_1 of a person holding a video object O_3 of a ball, and on the left side is a video object M_2 of another person. As described later, when this embodiment detects "human face (head)" and "ball" as video object types, the video objects O_1 and O_2 of the human faces and the video object O_3 of the ball are detected. If the centroid is used as the representative point of each video object, their image coordinates can be written as (f_1, g_1), (f_2, g_2), and (f_3, g_3), respectively.

Returning to FIG. 1, the description continues.
The whole image scaling means 10 scales (reduces or enlarges) the whole area of the input image H by a factor suited to the resolution of the image display device 3 to create the scaled whole image L. Preferably, the factor is chosen so that the result matches the resolution of the image display device 3. The scaled whole image L is output to the image synthesis means 60.

Image scaling obtains each pixel value of the scaled whole image L at the desired resolution by interpolating from the pixel values of the input image H and resampling. Any interpolation method may be used, for example nearest-neighbor interpolation, bilinear interpolation, or cubic interpolation (cubic convolution or bicubic interpolation).
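
As a rough illustration of resampling by interpolation, the following sketch implements nearest-neighbor and bilinear interpolation for a grayscale image stored as a list of rows. The function name `resample`, the pixel-center alignment, and the edge clamping are assumptions made for this example; cubic interpolation is omitted for brevity.

```python
from math import floor


def _clamp(value, low, high):
    return max(low, min(value, high))


def resample(image, out_w, out_h, method="bilinear"):
    """Resample a grayscale image (list of rows) to out_w x out_h pixels."""
    in_h, in_w = len(image), len(image[0])
    result = []
    for y in range(out_h):
        # Map the centre of each output pixel back into input coordinates.
        src_y = (y + 0.5) * in_h / out_h - 0.5
        row = []
        for x in range(out_w):
            src_x = (x + 0.5) * in_w / out_w - 0.5
            if method == "nearest":
                # Take the input pixel whose centre is closest.
                xi = _clamp(floor(src_x + 0.5), 0, in_w - 1)
                yi = _clamp(floor(src_y + 0.5), 0, in_h - 1)
                row.append(image[yi][xi])
            elif method == "bilinear":
                # Blend the four surrounding input pixels by distance.
                x0, y0 = floor(src_x), floor(src_y)
                tx, ty = src_x - x0, src_y - y0
                xa, xb = _clamp(x0, 0, in_w - 1), _clamp(x0 + 1, 0, in_w - 1)
                ya, yb = _clamp(y0, 0, in_h - 1), _clamp(y0 + 1, 0, in_h - 1)
                top = image[ya][xa] * (1 - tx) + image[ya][xb] * tx
                bottom = image[yb][xa] * (1 - tx) + image[yb][xb] * tx
                row.append(top * (1 - ty) + bottom * ty)
            else:
                raise ValueError("unknown method: " + method)
        result.append(row)
    return result
```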

For example, to convert an input image H that is a high-definition image 1920 pixels wide and 1080 pixels high into a low-resolution image 320 pixels wide and 180 pixels high, the number of pixels is reduced to one sixth both horizontally and vertically. Let the input image H be written H(X, Y) and the scaled whole image L be written L(x, y), where X and Y are the horizontal and vertical image coordinates in the input image H, and x and y are the horizontal and vertical image coordinates in the scaled whole image L. H(X, Y) denotes the pixel value (luminance value, chromaticity value, and so on) of the input image H at image coordinates (X, Y), and L(x, y) denotes the pixel value of the scaled whole image L at image coordinates (x, y).

When the input image H is a high-definition image 1920 pixels wide and 1080 pixels high, X ∈ {0, 1, ..., 1919} and Y ∈ {0, 1, ..., 1079}. When the scaled whole image L is a low-resolution image 320 pixels wide and 180 pixels high, x ∈ {0, 1, ..., 319} and y ∈ {0, 1, ..., 179}.

To obtain a scaled whole image L that is the input image H scaled to one sixth, the whole image scaling means 10 can, for example, divide the input image H into blocks 6 pixels high and 6 pixels wide and take the average pixel value within each block as the value of one pixel of the scaled whole image L, as in Equation (1). This yields a one-sixth scaling (reduction) both vertically and horizontally.
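
Following the verbal description of Equation (1), the block average can be written as L(x, y) = (1/36) * sum over i, j in {0, ..., 5} of H(6x + i, 6y + j). A minimal sketch of this reduction for a grayscale image stored as a list of rows is shown below; the function name `block_average_downscale` and the requirement that the image size be a multiple of the factor are assumptions for the example.

```python
def block_average_downscale(image, factor=6):
    """Shrink a grayscale image by averaging factor x factor pixel blocks."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image size must be a multiple of the factor")
    out = []
    for y in range(height // factor):
        row = []
        for x in range(width // factor):
            total = 0
            # Sum H(factor*x + i, factor*y + j) over the block.
            for j in range(factor):
                for i in range(factor):
                    total += image[y * factor + j][x * factor + i]
            row.append(total / (factor * factor))
        out.append(row)
    return out
```

Applied to a 1920 x 1080 image with the default factor of 6, this produces a 320 x 180 image, matching the example above.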

The video object detection means 20 receives the input image H, which is the whole image supplied from the image input device 2, detects video objects of predetermined types in it, and outputs type information Q and position information P of the detected video objects to the exaggeration target video object selection means 30.

Here, the type information Q is an identifier that specifies the type of a video object, or the individual object or person. One item of type information Q and position information P is output for each of the K video objects detected in the input image H.

An upper limit may also be placed on the number of items of type information Q and position information P that are output. When the detection method in use can compute, for each detected video object, a confidence that the object was detected correctly (as the method of Viola et al. described later can), the type information Q and position information P may be output to the exaggeration target video object selection means 30 for up to a predetermined maximum number of video objects, in descending order of confidence.
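
Limiting the output to the most reliable detections amounts to a sort followed by truncation. In the sketch below, each detection is a dictionary with hypothetical keys `type`, `position`, and `confidence`; these names are assumptions for illustration only.

```python
def keep_most_reliable(detections, max_count):
    """Return at most max_count detections, highest confidence first."""
    ranked = sorted(detections, key=lambda d: d["confidence"], reverse=True)
    return ranked[:max_count]
```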

If Q_k denotes the type information for the k-th video object contained in the type information Q output by the video object detection means 20, the type information Q can be expressed as in Equation (2).

The type information Q_k is an identifier that indicates a type such as "human face", "human leg", or "ball", or that identifies an individual object or person, such as a person's name or uniform number. It may be expressed as a character string or as a numerical value.
Similarly, if P_k denotes the position information for the k-th video object contained in the position information P output by the video object detection means 20, the position information P can be expressed as in Equation (3).

The position information P_k of each video object may be the image coordinates (f_k, g_k) of a representative point for the area occupied by the video object, or it may be a set of values that express both the position and the areal extent. For example, the centroid of the video object region can be used as the representative point.

For example, when the k-th video object occupies the region D_k in the input image H, the centroid can be calculated by Equation (4). That is, the mean of the horizontal image coordinates and the mean of the vertical image coordinates of all pixels belonging to the region D_k are calculated and used as the elements f_k and g_k of the position information P_k, respectively.
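
The centroid computation described for Equation (4) can be sketched as follows, with the region D_k represented as a collection of (X, Y) pixel coordinates. The function name `centroid` and this representation of the region are assumptions for the example.

```python
def centroid(region):
    """Return (f_k, g_k): the mean X and mean Y over all pixels of the region."""
    pixels = list(region)
    if not pixels:
        raise ValueError("region must contain at least one pixel")
    count = len(pixels)
    f_k = sum(x for x, _ in pixels) / count
    g_k = sum(y for _, y in pixels) / count
    return f_k, g_k
```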

Alternatively, the bounding box of the region D_k (the rectangle circumscribing D_k) can be used as the position information P_k. With the upper-left corner of the bounding box at (p_k, q_k) and the lower-right corner at (r_k, s_k), the position information P_k can be obtained as in Equation (5). That is, the minimum horizontal and vertical image coordinates and the maximum horizontal and vertical image coordinates over all pixels belonging to the region D_k are found and used as the elements p_k, q_k, r_k, and s_k of the position information P_k, respectively.
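
The bounding box described for Equation (5) reduces to four minimum and maximum operations over the pixel coordinates of the region. The sketch below again represents D_k as a collection of (X, Y) pixels; the function name `bounding_box` is an assumption for the example.

```python
def bounding_box(region):
    """Return (p_k, q_k, r_k, s_k): min X, min Y, max X, max Y of the region."""
    pixels = list(region)
    if not pixels:
        raise ValueError("region must contain at least one pixel")
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return min(xs), min(ys), max(xs), max(ys)
```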

The bounding box may be represented in any way: by its upper-left and lower-right corners as in Equation (5), by its upper-left corner, width, and height, by its center point, width, and height, and so on.
Furthermore, the bitmap of the region D_k itself may be used as the position information P_k, as in Equation (6).
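
The alternative representations are interchangeable, as the following sketch suggests. It assumes that both corners are inclusive pixel coordinates, so the width is r - p + 1 and the height is s - q + 1; that convention and the function names are assumptions for illustration.

```python
def corners_to_ltwh(p, q, r, s):
    """Inclusive corners -> (left, top, width, height)."""
    return p, q, r - p + 1, s - q + 1


def corners_to_center(p, q, r, s):
    """Inclusive corners -> (centre X, centre Y, width, height)."""
    return (p + r) / 2, (q + s) / 2, r - p + 1, s - q + 1


def region_to_bitmap(region, width, height):
    """Region D_k as a height x width bitmap: 1 inside the region, 0 outside."""
    bitmap = [[0] * width for _ in range(height)]
    for x, y in region:
        bitmap[y][x] = 1
    return bitmap
```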

Video object detection by the video object detection means 20 can be implemented with, for example, the method described in: Paul Viola and Michael Jones, "Rapid Object Detection using a Boosted Cascade of Simple Features", Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 511-518 (2001).

This method uses a plurality of simple classifiers, called weak classifiers, that are created by statistical learning to match the features appearing in the target object to be detected. A classifier called a strong classifier computes a weighted sum of the weak classifiers' decisions and, based on that sum, determines whether the object is the desired one. Because the magnitude of the sum computed by the strong classifier can be used to define a confidence that the desired object has been detected, the method is well suited to assigning priorities to detected video objects, as described above.
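The weighted-sum decision of the strong classifier can be sketched as follows. This is a simplified single-stage illustration under stated assumptions, not the cascade of the cited paper: weak classifier outputs are taken to be 0 or 1, and the names `strong_classify`, `weights`, and `threshold` are hypothetical.

```python
def strong_classify(weak_outputs, weights, threshold):
    """Combine weak classifier decisions into a strong classifier decision.

    weak_outputs: 0/1 decisions of the weak classifiers (assumed encoding).
    weights:      one weight per weak classifier, obtained by prior learning.
    threshold:    hypothetical decision threshold on the weighted sum.
    Returns (is_target, score); the score can serve as a detection confidence.
    """
    score = sum(w * h for w, h in zip(weights, weak_outputs))
    return score >= threshold, score


detected, confidence = strong_classify([1, 0, 1], [0.5, 0.3, 0.4], threshold=0.6)
print(detected, confidence)
```

Sorting detections by the returned score in descending order is one way to obtain the priority order mentioned in the text.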

By training the classifiers used in this method in advance on learning data consisting of face images and non-face images, with human faces as the target object, faces can be detected in the input image H and their positions and sizes output.
Similarly, by training the method in advance on learning data consisting of, for example, ball images and non-ball images, balls can be detected in the input image H and their positions and sizes output.

The chroma key method can also be used to detect video objects. In the chroma key method, pixels belonging to a color range specified in advance are extracted as the non-video-object region and all other pixels as the video object region, yielding a silhouette image of the video objects. The silhouette image is then divided into connected regions, each connected region is taken as a region D_k described above, and position information can be output using, for example, the representations of Equations (4) to (6) described above.
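The chroma key extraction followed by splitting into connected regions can be sketched as follows. This is a minimal sketch under assumptions: the image is a list of rows of pixel values, the key color range is given by a hypothetical predicate `is_key_color`, and 4-connectivity is used, none of which is specified in the text.

```python
from collections import deque


def chroma_key_regions(image, is_key_color):
    """Split the non-key-colored pixels of an image into 4-connected regions.

    image:        list of rows, each a list of pixel values (assumed format).
    is_key_color: predicate returning True for pixels in the key color range.
    Returns a list of regions, each a list of (x, y) pixel coordinates.
    """
    height, width = len(image), len(image[0])
    # Silhouette image: True where the pixel belongs to a video object.
    silhouette = [[not is_key_color(image[y][x]) for x in range(width)]
                  for y in range(height)]
    seen = [[False] * width for _ in range(height)]
    regions = []
    for y in range(height):
        for x in range(width):
            if not silhouette[y][x] or seen[y][x]:
                continue
            # Breadth-first search collects one connected region D_k.
            queue, region = deque([(x, y)]), []
            seen[y][x] = True
            while queue:
                cx, cy = queue.popleft()
                region.append((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and silhouette[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            regions.append(region)
    return regions


# "G" stands for the key color (for example, a green background).
img = [list("GRGG"), list("GRGB"), list("GGGB")]
print(chroma_key_regions(img, lambda c: c == "G"))
```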

Furthermore, for example, the average color of each video object region obtained from the silhouette image can be computed, and by determining which of several preset representative colors the average is closest to, the region can be classified by, for instance, the color of the clothing worn by the person in that region. The result can be used as the type information Q.
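The classification by average color can be sketched as follows. The RGB tuples, the squared Euclidean distance in RGB space, and the representative color table are assumptions for illustration; the text does not fix a color space or a distance measure.

```python
def mean_color(pixels):
    """Average an iterable of (r, g, b) tuples channel by channel."""
    pixels = list(pixels)
    return tuple(sum(p[c] for p in pixels) / len(pixels) for c in range(3))


def nearest_representative(color, representatives):
    """Return the name of the representative color closest to `color`.

    representatives: dict mapping a label to an (r, g, b) tuple (hypothetical).
    Closeness is measured by squared Euclidean distance in RGB (an assumption).
    """
    return min(
        representatives,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(color, representatives[name])),
    )


team_colors = {"red": (255, 0, 0), "blue": (0, 0, 255), "white": (255, 255, 255)}
region_pixels = [(250, 10, 10), (230, 30, 20)]
print(nearest_representative(mean_color(region_pixels), team_colors))
```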

The exaggeration target video object selection means (video object selection means) 30 receives the type information Q and position information P of the K video objects detected by the video object detection means 20. Based on this information, it selects, from among these video objects, those that satisfy a predetermined condition as the video objects to be exaggerated, and sequentially outputs the position information Π of the selected video objects (W in number) to the video object region cutout means 40 and the image composition means 60.

For example, when a video object of a particular type satisfies a predetermined positional relationship with another video object (or group of video objects), the exaggeration target video object selection means 30 outputs the position information Π of that video object. Specifically, when a video object of type "person" comes within a predetermined distance of a video object of type "ball" (for example, when the distance between their centroids is 10 pixels or less), it outputs the position information Π of the video object of type "person".

Next, the detailed configurations of the video object detection means 20 and the exaggeration target video object selection means 30 are described with reference to FIG. 3 (and to FIG. 1 as appropriate). FIG. 3 is a block diagram showing the configurations of the video object detection means and the exaggeration target video object selection means of the image conversion apparatus according to the first embodiment of the present invention.

The video object detection means 20 comprises a total of M specific video object detection means (where M is a natural number): the first specific video object detection means 21_1, the second specific video object detection means 21_2, and so on through the M-th specific video object detection means 21_M.

Each m-th specific video object detection means 21_m (m ∈ {1, 2, ..., M}) receives the input image H supplied from the image input device 2, detects video objects of its own predetermined specific type in the input image H, and outputs the type information Q^(m) and position information P^(m) of the detected video objects to the first decision logic 31_1 through the N-th decision logic 31_N of the exaggeration target video object selection means 30. Here, Q^(m) and P^(m) denote the type information and position information output by the m-th specific video object detection means 21_m, respectively.

The m-th specific video object detection means 21_m sequentially detects video objects of the predetermined specific type Q^(m) in the input image H and outputs their position information P^(m). Let K^(m) be the number of video objects detected by the m-th specific video object detection means 21_m. When K^(m) is 1 or more, let P_k^(m) denote the position information of the k-th video object (k ∈ {0, 1, ..., K^(m)-1}) contained in P^(m); when K^(m) = 0, let P^(m) = φ (where φ is the empty set). The position information P^(m) can then be expressed as in Equation (7).

Examples of the predetermined specific video object types include "person's face", "ball", "person's foot", "goal", "corner", "uniform number", and "uniform". Specifically, for example, the first specific video object detection means 21_1 detects video objects of type "person's face", outputs the type "person's face" as the type information Q^(1), and outputs their position information P^(1). Likewise, the second specific video object detection means 21_2 detects video objects of type "ball", outputs the type "ball" as the type information Q^(2), and outputs their position information P^(2). When multiple video objects belonging to a single type, such as "person's face", are detected, the position information P^(1) consists of multiple elements, one for the position information of each detected video object, as shown in Equation (7).

The exaggeration target video object selection means 30 comprises a total of N decision logics (where N is a natural number), namely the first decision logic 31_1, the second decision logic 31_2, and so on through the N-th decision logic 31_N, together with the target selection means 32.

Each n-th decision logic (decision logic means) 31_n (n ∈ {1, 2, ..., N}) receives the type information Q^(1) through Q^(M) and the position information P^(1) through P^(M) output from the first specific video object detection means 21_1 through the M-th specific video object detection means 21_M of the video object detection means 20, determines the video objects that are candidates for exaggeration according to its own predetermined decision logic, and outputs the position information Π^(n) of the determined video objects to the target selection means 32.

For example, based on whether video objects of one or more particular types are present and, if they are, on their positions and the positional relationships among video objects, the n-th decision logic 31_n decides whether any of the video objects corresponding to the input position information P^(1) to P^(M) should be output as a candidate for exaggeration. If so, it outputs the position information Π^(n) corresponding to those video objects to the target selection means 32.

The number of video objects that the n-th decision logic 31_n determines to be candidates for exaggeration is an integer of 0 or more, denoted J^(n). When J^(n) is 1 or more, let Π_j^(n) denote the position information of the j-th video object (j ∈ {0, 1, ..., J^(n)-1}) contained in the output position information Π^(n); when J^(n) = 0, let Π^(n) = φ (where φ is the empty set). The position information Π^(n) can then be expressed as in Equation (8).

Each of the N decision logics, the first decision logic 31_1 through the N-th decision logic 31_N, has a predetermined logic for determining which video objects should be candidates for exaggeration based on the input type information Q^(1) to Q^(M) and position information P^(1) to P^(M). For example, the first decision logic 31_1 refers to the position information P^(1) of the video objects whose type information Q^(1) is "person's face", detected by the first specific video object detection means 21_1, and to the position information P^(2) of the video objects whose type information Q^(2) is "ball", detected by the second specific video object detection means 21_2. It decides whether each "person's face" video object is to be a candidate for exaggeration and, if so, outputs the position information Π^(1) of that "person's face" video object to the target selection means 32.

In more detail, the first decision logic 31_1 first computes, by Equation (9), the distance D(k, κ) between a person's face and a ball for every combination of k and κ, where P_k^(1) is the k-th position information contained in the position information P^(1) of the video objects whose type information Q^(1) is "person's face" and P_κ^(2) is the κ-th position information contained in the position information P^(2) of the video objects whose type information Q^(2) is "ball". However, if the number of elements K^(1) in P^(1), the number of elements K^(2) in P^(2), or both are 0, the first decision logic 31_1 does not compute D(k, κ) and performs no subsequent computation or output. In a ball game played with a single ball, such as soccer, the number of ball elements K^(2) is preferably 0 or 1, but K^(2) may be assumed to take an integer greater than 1 because of, for example, false detections.

Here, the function dist(P, P') denotes the distance between two video objects having position information P and P'. For example, it can output the Euclidean distance between the centroids of the two video objects.

Next, as shown in Equation (10), for example, the first decision logic 31_1 finds, for each index κ pointing to an element of the position information P^(2) of the video objects of type "ball", the index h(κ) pointing to the element of the position information P^(1) of the video objects of type "person's face" that minimizes the distance D(k, κ) of Equation (9).
In this way, the first decision logic 31_1 can determine, for each ball, the video object of the person's face closest to that ball as a candidate for exaggeration.

Finally, the first decision logic 31_1 outputs the position information P_h(κ)^(1) of the "person's face" video objects obtained by Equation (10) to the target selection means 32 as the elements of the position information Π^(1), as shown in Equation (11).
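The three steps of the first decision logic (pairwise distances, the per-ball minimizing index h(κ), and the resulting output list) can be sketched as follows. This is a minimal sketch assuming that each piece of position information is a centroid (x, y) and that dist is the Euclidean distance; the function names are hypothetical.

```python
import math


def dist(p, q):
    """Euclidean distance between two centroids (one possible choice of dist)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def nearest_face_per_ball(faces, balls):
    """For each ball, pick the face that minimizes D(k, kappa) = dist(face_k, ball_kappa).

    faces: centroids of "person's face" objects, playing the role of P^(1).
    balls: centroids of "ball" objects, playing the role of P^(2).
    Returns the selected face positions, one per ball; returns an empty list
    when either input is empty, mirroring the "no computation or output" case.
    """
    if not faces or not balls:
        return []
    selected = []
    for ball in balls:
        h = min(range(len(faces)), key=lambda k: dist(faces[k], ball))
        selected.append(faces[h])
    return selected


print(nearest_face_per_ball([(0, 0), (10, 0), (20, 0)], [(12, 1)]))
```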

As another example, the second decision logic 31_2 refers to the position information P^(1) of the video objects whose type information Q^(1) is "person's face", detected by the first specific video object detection means 21_1, determines a candidate for exaggeration, and outputs the position information Π^(2) of the determined "person's face" video object to the target selection means 32.

In more detail, the second decision logic 31_2 first computes, by Equation (12), the distance E(k, κ) between the video object having the k-th position information P_k^(1) and the video object having the κ-th position information P_κ^(1), both contained in the position information P^(1) of the video objects whose type information Q^(1) is "person's face". However, if the number of elements K^(1) in P^(1) is 0, the second decision logic 31_2 does not compute E(k, κ) and performs no subsequent computation or output.

Here, as in Equation (9), the function dist(P, P') is the distance between two video objects having position information P and P'; for example, it outputs the Euclidean distance between the centroids of the two video objects.

Next, as shown in Equation (13), for example, the second decision logic 31_2 finds, among the indices κ pointing to elements of the position information P^(1) of the video objects of type "person's face", the index f that minimizes the sum over k of the distance E(k, κ) of Equation (12).
In this way, the video object located where the video objects are most densely clustered can be selected from among the detected "person's face" video objects.

Finally, the second decision logic 31_2 outputs the position information P_f^(1) of the "person's face" video object obtained by Equation (13) to the target selection means 32 as the element of the position information Π^(2), as shown in Equation (14).
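The second decision logic can be sketched as follows: for each face, sum its distances to all faces, and select the face with the smallest sum. As before, centroids as (x, y) tuples, the Euclidean dist, and the function names are assumptions made for illustration.

```python
import math


def dist(p, q):
    """Euclidean distance between two centroids (one possible choice of dist)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def most_crowded_face(faces):
    """Select the face whose summed distance to all faces is smallest.

    faces: centroids of "person's face" objects, playing the role of P^(1).
    Returns a one-element list, or an empty list when no face was detected.
    Ties are resolved in favor of the lowest index (an arbitrary choice).
    """
    if not faces:
        return []
    f = min(
        range(len(faces)),
        key=lambda kappa: sum(dist(faces[k], faces[kappa]) for k in range(len(faces))),
    )
    return [faces[f]]


# Four faces near x = 5 and one outlier; the face at (5, 0) has the smallest sum.
print(most_crowded_face([(0, 0), (4, 0), (5, 0), (6, 0), (50, 0)]))
```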

In this way, each n-th decision logic 31_n decides whether detected video objects of a particular type are to be candidates for exaggeration according to the positional relationships among video objects of different or identical types, such as "person's face" and "ball", and outputs the position information Π^(n) of the determined video objects to the target selection means 32.

The target selection means 32 receives the position information Π^(1) to Π^(N) output by the first decision logic 31_1 through the N-th decision logic 31_N, narrows down the elements of position information of the video objects determined as candidates for exaggeration contained in Π^(1) to Π^(N), and sequentially outputs each remaining element as the position information Π to the video object region cutout means 40 and the image composition means 60 (see FIG. 1).

For example, the target selection means 32 assigns a priority to each of the first decision logic 31_1 through the N-th decision logic 31_N and sets an upper limit W on the number of pieces of position information Π to be output. It then selects and outputs elements in order, starting with those contained in the position information Π^(n) input from the n-th decision logic 31_n of highest priority, and continues outputting position information Π sequentially until the total number of output elements reaches W or all elements of the input position information have been output.

For example, suppose that the second decision logic 31_2 is given a higher priority than the first decision logic 31_1, that the position information Π^(1) of Equation (11) is input to the target selection means 32 from the first decision logic 31_1, and that the position information Π^(2) of Equation (14) is input from the second decision logic 31_2. Processing then proceeds as follows.

When the upper limit W on the number of pieces of position information Π to be output is 1, only the element P_f^(1) of the position information Π^(2) of Equation (14) is output if K^(2) ≥ 1, and the empty set φ is output if K^(2) = 0. In this way, whenever at least one "person's face" video object is detected, the video object located where the video objects are most densely clustered among all detected "person's face" video objects can be selected as the video object most in need of exaggeration.

When the upper limit W on the number of pieces of position information Π to be output is 2 or more and K^(1) or less, the element P_f^(1) of the position information Π^(2) of Equation (14) and at most W-1 of the elements of the position information Π^(1) of Equation (11) are output. The W-1 elements may be chosen, for example, by sequentially outputting as position information Π the first W-1 elements of the position information Π^(1) of Equation (11) (or all the elements of Equation (11) if K^(1) is less than W-1). In this way, in addition to the video object located where the video objects are most densely clustered among the detected "person's face" video objects, the video objects of the persons' faces closest to each ball can be selected, up to the upper limit W. For example, with a single ball as in soccer, K^(2) is at most 1, so when the ball is detected, the video object of the person's face located where faces are most densely clustered and the video object of the person's face closest to the ball are selected as the targets of exaggeration, and the corresponding position information is output as Π.
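The priority-ordered selection with an upper limit W can be sketched as follows. The dictionary keyed by decision logic number, the `priority` list, and the function name are hypothetical; the sketch also does not remove an object that is proposed by more than one decision logic, because the text does not say how such duplicates are handled.

```python
def select_targets(candidates_by_logic, priority, limit):
    """Select at most `limit` candidates, visiting decision logics by priority.

    candidates_by_logic: dict mapping decision logic number n to its candidate
                         list, playing the role of the position information Π^(n).
    priority:            decision logic numbers, highest priority first.
    limit:               the upper limit W on the number of outputs.
    """
    selected = []
    for n in priority:
        for position in candidates_by_logic.get(n, []):
            if len(selected) >= limit:
                return selected
            selected.append(position)
    return selected


# Decision logic 2 (most crowded face) takes priority over decision logic 1.
candidates = {1: [(10, 0), (30, 5)], 2: [(5, 0)]}
print(select_targets(candidates, priority=[2, 1], limit=2))
```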

Returning to FIG. 1, the description of the image conversion apparatus 1 continues.
The video object region cutout means (partial image cutout means) 40 receives the input image H supplied from the image input device 2 and the W pieces of position information Π sequentially output from the exaggeration target video object selection means 30, and sequentially outputs the W partial images T corresponding to the respective pieces of position information Π to the partial image scaling means 50.

The video object region cutout means 40 cuts out from the input image H a local region surrounding and containing the video object identified by the position information Π of the video object selected for exaggeration, and outputs it as the partial image T of that video object.

As one way of cutting out the partial image T, when the position information Π expresses the position of the video object by a representative point (for example, the centroid), a region of the input image H in the neighborhood of the representative point can be cut out and used as the partial image T of the video object. Such a neighborhood can be, for example, the interior of a circle of radius R (R in pixels) centered on the representative point.
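Cutting out the interior of a circle of radius R around a representative point can be sketched as follows. The image as a list of rows, integer center coordinates, and the returned dictionary of pixels are assumptions made for illustration; pixels outside the image are simply skipped.

```python
def crop_circle(image, center, radius):
    """Collect the pixels of `image` within `radius` pixels of `center`.

    image:  list of rows, each a list of pixel values (assumed format).
    center: (x, y) representative point with integer coordinates (assumption).
    radius: R, in pixels.
    Returns a dict mapping (x, y) to the pixel value for each pixel in the circle.
    """
    cx, cy = center
    height, width = len(image), len(image[0])
    pixels = {}
    for y in range(max(0, cy - radius), min(height, cy + radius + 1)):
        for x in range(max(0, cx - radius), min(width, cx + radius + 1)):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                pixels[(x, y)] = image[y][x]
    return pixels


# A 5x5 test image whose pixel value encodes its position as 10*y + x.
image = [[10 * y + x for x in range(5)] for y in range(5)]
print(sorted(crop_circle(image, center=(2, 2), radius=1)))
```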

An example of cutting out a partial image is now described with reference to FIG. 4 (and to FIG. 1 as appropriate). FIG. 4 is a diagram for explaining how the image conversion apparatus according to the first embodiment of the present invention converts an image.
In the example shown in FIG. 4, the exaggeration target video object selection means 30 has selected the video object O_1, whose type information Q is "person's face", as the target of exaggeration in the input image H, and the object has been detected at the position indicated by the representative point (f_1, g_1). The video object region cutout means 40 then cuts out the image in the vicinity of the point (f_1, g_1) in the input image H (a circle in this example) and outputs it to the partial image scaling means 50 as the partial image T_1 of the video object O_1.

The shape of the partial image T of a video object is not limited to a circle as in this example; it may be a rectangle, a polygon, or any arbitrary shape. For example, when the position information P is expressed by the bounding box of Equation (5), the partial image can be cut out to match the position and size of the bounding box. Likewise, when the position information P is expressed by the arbitrarily shaped region of Equation (6), the partial image can be cut out to match the shape of that region.

Returning to FIG. 1, the description continues.
The partial image scaling means 50 sequentially receives the W partial images T output by the video object region cutout means 40, scales each input partial image T to create a scaled partial image U, and sequentially outputs the created scaled partial images U to the image composition means 60.

The partial image scaling means 50 scales the partial image T of the video object cut out by the video object region cutout means 40 to convert its resolution. In doing so, it applies a scaling factor larger than the one the whole image scaling means 10 applies to the input image H. The scaled partial image U obtained by the partial image scaling means 50 is therefore enlarged relative to the scaled whole image L obtained by the whole image scaling means 10.

The scaling applied to the partial image T is not limited to enlargement. When the reduction ratio of the scaled whole image L is small, the partial image T may be kept at its original size or reduced by a reduction ratio larger than that of the scaled whole image L. For example, when the reduction ratio of the scaled whole image L is 1/6, the partial image T can be reduced to 1/2 and output as the scaled partial image U. In this case, the scaled partial image U is an image enlarged by a factor of 3 relative to the scaled whole image L.
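The arithmetic of this example can be checked with a short calculation: the relative magnification is the partial image's scaling factor divided by the whole image's scaling factor. The function name is hypothetical.

```python
from fractions import Fraction


def relative_magnification(partial_scale, whole_scale):
    """Magnification of the scaled partial image U relative to the scaled whole image L."""
    return Fraction(partial_scale) / Fraction(whole_scale)


# Whole image reduced to 1/6, partial image reduced to 1/2.
print(relative_magnification(Fraction(1, 2), Fraction(1, 6)))
```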

In this way, the image scaling means 100, which consists of the whole image scaling means 10 and the partial image scaling means 50, can generate an image pair consisting of a whole image (the scaled whole image L) and a partial image (the scaled partial image U) in which the partial image is enlarged relative to the whole image.

The input image H need not necessarily be scaled; it may be output to the image composition means 60 as the scaled whole image L at its original resolution. When the whole image scaling means 10 does not scale the input image H, the partial image scaling means 50 enlarges the partial image T to create the scaled partial image U, which in effect generates an image pair consisting of the input image H (the whole image) and the scaled partial image U.

Furthermore, when the whole image scaling means 10 reduces the input image H to create the scaled whole image L, the partial image scaling means 50 may output the partial image T to the image composition means 60 as the scaled partial image U at its original resolution, without scaling it. In this case, an image pair consisting of the scaled whole image L and, in effect, the partial image T can be generated.

In this way, the apparatus may be configured so that the scaling factors of the whole image scaling means 10 and the partial image scaling means 50 can be set arbitrarily according to the relationship between the resolution of the input image H and the resolution of the image display device 3 on which the image is to be displayed.

The partial image scaling means 50 can obtain the scaled partial image U by interpolating and resampling the partial image T of the video object. When there are multiple partial images T, a scaled partial image U is created for each. In that case, the same scaling factor may be used for all video objects, or several different scaling factors may be mixed. When scaling factors are mixed, the factor can, for example, be varied for each partial image T so that the resulting scaled partial images U all have the same size (for example, the same width, height, or area).

Any interpolation method may be used; for example, nearest-neighbor interpolation, bilinear interpolation, or cubic interpolation (cubic convolution or bicubic interpolation).
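For reference, two of the interpolation methods named above can be sketched in plain Python. The image representation (a list of rows of gray values) and the pixel-center sampling convention are assumptions of this sketch, not details taken from this description.

```python
import math


def resample(image, out_w, out_h, method="bilinear"):
    """Resample a grayscale image (a list of rows of numbers) to out_w x out_h."""
    in_h, in_w = len(image), len(image[0])
    result = []
    for y in range(out_h):
        # Map the center of the output pixel back into source coordinates,
        # then clamp so that border pixels are repeated.
        sy = (y + 0.5) * in_h / out_h - 0.5
        sy = min(max(sy, 0.0), in_h - 1.0)
        row = []
        for x in range(out_w):
            sx = (x + 0.5) * in_w / out_w - 0.5
            sx = min(max(sx, 0.0), in_w - 1.0)
            if method == "nearest":
                # Nearest neighbor: take the closest source pixel.
                row.append(image[int(math.floor(sy + 0.5))][int(math.floor(sx + 0.5))])
            else:
                # Bilinear: weight the four surrounding source pixels.
                x0, y0 = int(math.floor(sx)), int(math.floor(sy))
                x1, y1 = min(x0 + 1, in_w - 1), min(y0 + 1, in_h - 1)
                fx, fy = sx - x0, sy - y0
                top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
                bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
                row.append(top * (1 - fy) + bottom * fy)
        result.append(row)
    return result


partial_t = [[0, 10], [20, 30]]
scaled_u = resample(partial_t, 4, 4, method="nearest")
```

Nearest-neighbor interpolation keeps hard edges and is the cheapest; bilinear interpolation gives smoother results at slightly higher cost.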

The image composition means 60 receives the scaled whole image L output from the whole image scaling means 10, the W pieces of position information Π output sequentially from the exaggeration target video object selection means 30, and the W scaled partial images U output sequentially from the partial image scaling means 50. It creates a composite image G and outputs it to the image display device 3.

The image composition means 60 creates the composite image G by overwriting each scaled partial image U output from the partial image scaling means 50 onto the scaled whole image L output from the whole image scaling means 10, at the position on L determined by the position information Π that the exaggeration target video object selection means 30 outputs for that U. When multiple scaled partial images U are input, they are overwritten onto the scaled whole image L one after another, each according to its corresponding position information Π.

Next, a method of overwriting the scaled partial image U onto the scaled whole image L will be described.
First, let Π_k be the position information of the k-th video object output from the exaggeration target video object selection means 30, T_k the partial image corresponding to Π_k, and U_k the scaled partial image obtained by scaling T_k. To composite the k-th scaled partial image U_k at the position on the scaled whole image L that corresponds to the location indicated by Π_k on the input image H, one can, for example, take the centroid as the point representing Π_k, scale its image coordinates by the scaling factor of the whole image scaling means 10, and place U_k so that its centroid coincides with the scaled image coordinates.

Here, with reference to FIG. 4 (and FIG. 1 as appropriate), the overwriting of the scaled partial image U onto the scaled whole image L will be described. In the example of FIG. 4, the exaggeration target video object selection means 30 is assumed to have selected the "person's face" video object O_1 as the exaggeration target. Let (f_1, g_1) be the image coordinates, on the input image H, of the centroid of the partial image T_1 obtained from the position information Π_1 of the selected video object O_1.

Next, the image coordinates (f_1, g_1) are scaled by the scaling factor of the whole image scaling means 10. For example, if the horizontal scaling factor is M_x and the vertical scaling factor is M_y, the scaled image coordinates, that is, the image coordinates on the scaled whole image L, are (M_x f_1, M_y g_1).
The composite image G is then created by placing the scaled partial image U_1 on the scaled whole image L so that its centroid coincides with the image coordinates (M_x f_1, M_y g_1).
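The placement just described can be sketched as below. This is a minimal illustration under two assumptions that this description does not state: the centroid of U is approximated by the center of its bounding rectangle, and pixels falling outside L are simply clipped. The function name is hypothetical.

```python
def overwrite_at_centroid(whole, part, centroid_in_input, scale_xy):
    """Overwrite `part` (scaled partial image U) onto `whole` (scaled whole
    image L) in place, so that the center of U lands on (M_x * f, M_y * g).

    whole, part: lists of rows of pixel values.
    centroid_in_input: (f, g), centroid of the partial image T on input image H.
    scale_xy: (M_x, M_y), scaling factors of the whole image.
    """
    f, g = centroid_in_input
    m_x, m_y = scale_xy
    cx, cy = m_x * f, m_y * g                 # centroid mapped onto L
    part_h, part_w = len(part), len(part[0])
    left = round(cx - part_w / 2)             # top-left corner of U on L
    top = round(cy - part_h / 2)
    for j in range(part_h):
        for i in range(part_w):
            x, y = left + i, top + j
            # Clip to the bounds of L (assumption of this sketch).
            if 0 <= y < len(whole) and 0 <= x < len(whole[0]):
                whole[y][x] = part[j][i]
    return whole


scaled_whole = [[0] * 6 for _ in range(6)]    # stands in for L (H was 12 x 12)
scaled_part = [[9, 9], [9, 9]]                # stands in for U_1
overwrite_at_centroid(scaled_whole, scaled_part, (6, 6), (0.5, 0.5))
```

In this toy example the centroid (6, 6) on H maps to (3, 3) on L, so the 2 x 2 patch occupies rows 2 to 3 and columns 2 to 3 of L.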

Although this example describes the case of a single video object to be exaggerated, there may be multiple video objects subject to the exaggeration process. When the exaggeration target video object selection means 30 outputs multiple pieces of position information Π, each piece of position information Π is scaled, and the corresponding scaled partial image U is overwritten onto the scaled whole image L at the respective position. Conversely, when the exaggeration target video object selection means 30 outputs no position information Π, that is, when the number of video objects to be exaggerated is zero, no overwriting is performed. Even in that case, the scaled whole image L output from the image composition means 60 is formally referred to as the composite image G.

In the example of FIG. 4, the partial image T is scaled by a constant factor over its entire area to create the scaled partial image U, which is then placed so that its centroid coincides with that of the corresponding partial image on the scaled whole image L to create the composite image G. However, the placement of the scaled partial image U is not limited to aligning the centroids.

For example, the scaled partial image U may be placed using as a reference point its boundary with the image of another video object in the scaled whole image L that it adjoins, and the partial image T may be scaled non-linearly to create the scaled partial image U so that the scaled whole image L and the scaled partial image U join smoothly at that reference point.

With reference to FIGS. 5 and 6 (and FIG. 4 as appropriate), an example in which the partial image T is scaled non-linearly and composited with the scaled whole image L will be described. FIG. 5 illustrates an example of non-linear scaling of a partial image: (a) shows the partial image T, and (b) shows the non-linearly scaled partial image U. FIG. 6 shows examples of composite images: (a) is a composite of the scaled whole image and a linearly scaled partial image, and (b) is a composite of the scaled whole image and a non-linearly scaled partial image.

First, FIG. 5(a) shows the partial image T cut out from the input image H. Here, the scaled whole image L is assumed to be a reduced image. The reference point of the partial image T for composition is the point marked with a circle (○) at the center of its bottom edge.

In this case, as shown in FIG. 5(b), the horizontal scaling factor of the scaled partial image U along the bottom edge containing the reference point (○) is made equal to the scaling factor of the scaled whole image L relative to the input image H. A trapezoidal scaled partial image U is then created by non-linear scaling in which the horizontal scaling factor increases with vertical distance from the reference point. By placing such a scaled partial image U on the scaled whole image L with its junction with the other video object as the reference point, the partial image T of the video object can be enlarged and presented in exaggerated form while the apparent continuity along the bottom edge of U is preserved in the composite image G.
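The trapezoidal scaling can be sketched as follows. This description only requires that the horizontal factor grow with vertical distance from the reference point; the linear profile, the `top_scale` parameter, the unchanged vertical size, and the nearest-neighbor sampling are all assumptions made to keep the example short.

```python
def horizontal_scale(row, height, base_scale, top_scale):
    """Horizontal factor for `row` (0 = top row, height - 1 = bottom edge)."""
    if height == 1:
        return base_scale
    t = (height - 1 - row) / (height - 1)     # 0 at the bottom edge, 1 at the top
    return base_scale + t * (top_scale - base_scale)


def trapezoid_scale(part, base_scale, top_scale):
    """Scale each row of the partial image T horizontally by its own factor.

    Returns a list of rows of different widths; centered on the reference
    column they form a trapezoid that is narrowest at the bottom edge.
    """
    height, width = len(part), len(part[0])
    rows = []
    for r, src in enumerate(part):
        s = horizontal_scale(r, height, base_scale, top_scale)
        out_w = max(1, round(width * s))
        # Nearest-neighbor sampling along the row.
        rows.append([src[min(width - 1, int((x + 0.5) * width / out_w))]
                     for x in range(out_w)])
    return rows


partial_t = [[1, 2, 3, 4],
             [5, 6, 7, 8],
             [9, 10, 11, 12]]
# Bottom edge matches a whole-image factor of 0.5; the top is enlarged to 1.5.
scaled_u = trapezoid_scale(partial_t, base_scale=0.5, top_scale=1.5)
row_widths = [len(row) for row in scaled_u]
```

Because the bottom row uses the same factor as the whole image, its width matches the width the same region occupies in L, which is what keeps the junction visually continuous.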

FIG. 6 shows examples of the composite image G created by exaggerating the head video object O_2 of the person video object M_2 in the input image H of FIG. 4 and compositing it with the scaled whole image L.

The composite image G in FIG. 6(a) is obtained, in the same way as in the example of FIG. 4, by linearly scaling the partial image of the head video object O_2 to create the scaled partial image U_2 and compositing it so that its centroid coincides with the corresponding centroid position (M_x f_2, M_y g_2) on the scaled whole image.

In contrast, the composite image G in FIG. 6(b) applies the present method: the partial image of the head video object O_2 is scaled non-linearly to create the scaled partial image U_2', which is composited using as the reference point the original junction between the head video object O_2 and the person video object M_2, that is, the neck position (image coordinates (M_x f_2', M_y g_2')) that forms the boundary between the head and the video objects other than the head. Here, f_2' and g_2' are the horizontal and vertical coordinates, respectively, of the neck of the person video object M_2 in the input image H.

As shown in FIG. 6(b), in exaggerating the head video object O_2, the person's neck, which corresponds to the bottom edge of the trapezoidal scaled partial image U shown in FIG. 5(b), is used as the reference point, and the horizontal scaling factor of the scaled partial image U at the reference point is matched to the scaling factor of the scaled whole image L. This preserves the continuity of the neck in the composite image G and yields a composite image G with a more natural appearance.

In the first embodiment, a "person's face (head)" has been used as the example of a video object to be exaggerated, but the "uniform number" or "name" printed on a uniform may be exaggerated instead. The input image is likewise not limited to a soccer match; the technique can also be applied to other sports, surveillance camera footage, and the like.

The configuration of the image conversion apparatus 1 according to the first embodiment has been described above. The image conversion apparatus 1 may be implemented partly or wholly in dedicated hardware, or it may be realized by running a computer program that operates the arithmetic unit, storage device, input device, image display device, and so on of a computer. This program (the image conversion program) can be provided over a communication line or distributed on a recording medium such as a CD-ROM.

[Operation of Image Conversion Apparatus]
Next, the operation of the image conversion apparatus 1 according to the first embodiment of the present invention will be described with reference to FIG. 7 (and FIG. 1 as appropriate). FIG. 7 is a flowchart showing the processing flow of the image conversion apparatus according to the first embodiment of the present invention shown in FIG. 1.

First, the image conversion apparatus 1 uses the video object detection means 20 to detect video objects of a predetermined specific type from the input image H supplied from the image input device 2, and outputs the type information Q and position information P of the detected video objects to the exaggeration target video object selection means 30 (step S11).

Next, the image conversion apparatus 1 uses the exaggeration target video object selection means 30 to select, from the video objects detected by the video object detection means 20 in step S11, the video objects to be exaggerated, and outputs the position information Π of the selected video objects to the video object region cutout means 40 and the image composition means 60 (step S12).

Next, the image conversion apparatus 1 uses the video object region cutout means 40 to cut out, from the input image H supplied from the image input device 2, the region containing the video object as a partial image T, referring to the position information Π of the video object selected by the exaggeration target video object selection means 30 in step S12, and outputs it to the partial image scaling means 50 of the image scaling means 100 (step S13).

Next, the image conversion apparatus 1 uses the partial image scaling means 50 of the image scaling means 100 to scale the partial image T cut out by the video object region cutout means 40 in step S13, creating a scaled partial image U that is enlarged relative to the scaled whole image L described below, and outputs it to the image composition means 60 (step S14).

While performing steps S11 to S14 described above, the image conversion apparatus 1 also uses the whole image scaling means 10 of the image scaling means 100 to scale the entire input image H supplied from the image input device 2, creating the scaled whole image L at a scaling factor suited to the resolution of the image display device 3, and outputs it to the image composition means 60 (step S15).

When both the partial image scaling of step S14 and the whole image scaling of step S15 are complete, the image conversion apparatus 1 uses the image composition means 60 to overwrite the scaled partial image U created by the partial image scaling means 50 in step S14 onto the scaled whole image L created by the whole image scaling means 10 in step S15, referring to the position information Π selected by the exaggeration target video object selection means 30 in step S12. The resulting composite image G is output to the image display device 3 (step S16).

Here, steps S11 to S14 and step S15 are processed in parallel, but the image composition of step S16 starts only after both the partial image scaling of step S14 and the whole image scaling of step S15 have completed. Steps S11 to S14 and step S15 may therefore be executed sequentially instead. In that case, either steps S11 to S14 or step S15 may be executed first, and step S15 may also be inserted partway through steps S11 to S14.

When there are multiple video objects to be exaggerated, steps S13, S14, and S16 are repeated each time the exaggeration target video object selection means 30 outputs, in step S12, the position information Π corresponding to a video object to be exaggerated. Because the scaled whole image L does not need to be created more than once, step S15 is executed only once. In step S16, when the scaled partial image U of the second or a subsequent exaggeration target video object is input to the image composition means 60, the image conversion apparatus 1 uses the image composition means 60 to overwrite each scaled partial image U onto the composite image G in turn.
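The control flow of steps S11 to S16 in its sequential form can be summarized by the sketch below. The six callables are hypothetical stand-ins for the means 20, 30, 40, 50, 10, and 60; their names and signatures are assumptions, and the toy stand-ins only record the call order.

```python
def convert(image_h, detect, select, crop, scale_part, scale_whole, overwrite):
    """Run steps S11 to S16 sequentially on one input image H."""
    detections = detect(image_h)            # S11: type info Q and position info P
    positions = select(detections)          # S12: position info, at most W items
    composite = scale_whole(image_h)        # S15: executed exactly once
    for position in positions:              # zero iterations leaves G equal to L
        part_t = crop(image_h, position)    # S13: cut out partial image T
        part_u = scale_part(part_t)         # S14: create scaled partial image U
        composite = overwrite(composite, part_u, position)  # S16: overwrite
    return composite


# Toy stand-ins (hypothetical) that make the call order visible.
calls = []


def toy_scale_whole(image_h):
    calls.append("S15")
    return ["L"]


def toy_overwrite(composite, part_u, position):
    calls.append("S16")
    return composite + [part_u]


result = convert(
    "H",
    detect=lambda h: ["face_a", "face_b"],
    select=lambda detections: detections,
    crop=lambda h, pos: "T:" + pos,
    scale_part=lambda t: "U:" + t,
    scale_whole=toy_scale_whole,
    overwrite=toy_overwrite,
)
```

Step S15 is placed before the loop here, but as noted above it could equally run first, last before S16, or in parallel with S11 to S14.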

In this way, the image conversion apparatus 1 converts the input image H supplied from the image input device 2 into a composite image G whose resolution suits the image display device 3 and outputs it. The image displayed on the image display device 3 therefore presents specific video objects in exaggerated form so that their visibility is not impaired.

Next, the detailed operation of the video object detection means 20 and the exaggeration target video object selection means 30 of the image conversion apparatus 1 according to the first embodiment of the present invention will be described with reference to FIG. 8 (and FIGS. 3, 1, and 7 as appropriate). FIG. 8 is a flowchart showing the processing flow of the video object detection means and the exaggeration target video object selection means of the image conversion apparatus according to the first embodiment of the present invention shown in FIG. 3.
The flowchart shown in FIG. 8 corresponds to steps S11 and S12 of the flowchart shown in FIG. 7.

The video object detection means 20 uses the first specific video object detection means 21_1 to detect video objects of a predetermined specific type from the input image H supplied from the image input device 2, and outputs the type information Q^(1) and position information P^(1) of the detected video objects to the first decision logic 31_1 through the N-th decision logic 31_N of the exaggeration target video object selection means 30 (step S21_1).

Similarly, in parallel with step S21_1, the video object detection means 20 uses each of the other m-th specific video object detection means 21_m (m ∈ {1, 2, ..., M}) to detect video objects of its own predetermined specific type from the input image H supplied from the image input device 2, and outputs the type information Q^(m) and position information P^(m) of the detected video objects to the first decision logic 31_1 through the N-th decision logic 31_N of the exaggeration target video object selection means 30 (step S21_m (steps S21_2 to S21_M)).

Here, steps S21_1 to S21_M are processed in parallel, but they may instead be executed sequentially in any order.

When all of steps S21_1 to S21_M are complete, the exaggeration target video object selection means 30 uses the first decision logic 31_1 to determine, from the detected video objects, the video objects that should be candidates for the exaggeration process, referring to the type information Q^(1) to Q^(M) and the position information P^(1) to P^(M) output by the first specific video object detection means 21_1 through the M-th specific video object detection means 21_M in steps S21_1 to S21_M. It outputs the position information Π^(1) of the determined video objects to the target selection means 32 (step S22_1).

Similarly, in parallel with step S22_1, the exaggeration target video object selection means 30 uses each of the other n-th decision logic 31_n (n ∈ {1, 2, ..., N}) to determine, from the detected video objects, the video objects that should be candidates for the exaggeration process, referring to the type information Q^(1) to Q^(M) and the position information P^(1) to P^(M) output by the first specific video object detection means 21_1 through the M-th specific video object detection means 21_M in steps S21_1 to S21_M. It outputs the position information Π^(n) of the determined video objects to the target selection means 32 (steps S22_2 to S22_N).

Here, steps S22_1 to S22_N are processed in parallel, but they may instead be executed sequentially in any order.

When all of steps S22_1 to S22_N are complete, the exaggeration target video object selection means 30 uses the target selection means 32 to select, from the video objects determined as candidates for the exaggeration process by the first decision logic 31_1 through the N-th decision logic 31_N, the video objects to be exaggerated, up to the upper limit of W objects. It outputs the position information Π of the selected video objects sequentially to the video object region cutout means 40 and the image composition means 60.
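The two-stage selection (N decision logics followed by a cap of W) can be sketched as below. How the target selection means 32 chooses among candidates when more than W remain is not specified in this passage; taking the first W candidates in decision-logic order is an assumption, as are the example logics, the "ball" object type, and the `radius` parameter.

```python
import math


def select_targets(detections, decision_logics, w_max):
    """detections: list of (kind, position) pairs, i.e. type info Q and position info P.
    decision_logics: callables mapping detections to candidate positions.
    Returns at most w_max positions.
    """
    candidates = []
    for logic in decision_logics:               # steps S22_1 ... S22_N
        for position in logic(detections):
            if position not in candidates:      # drop duplicates (assumption)
                candidates.append(position)
    return candidates[:w_max]                   # cap at the upper limit W


def faces_near_ball(detections, radius=10.0):
    """Hypothetical logic based on the positional relationship between objects."""
    balls = [p for kind, p in detections if kind == "ball"]
    faces = [p for kind, p in detections if kind == "face"]
    return [f for f in faces if any(math.dist(f, b) <= radius for b in balls)]


def all_faces(detections):
    """Hypothetical fallback logic: every detected face is a candidate."""
    return [p for kind, p in detections if kind == "face"]


detections = [("face", (10, 10)), ("face", (50, 50)), ("ball", (48, 52))]
top_one = select_targets(detections, [faces_near_ball, all_faces], w_max=1)
top_five = select_targets(detections, [faces_near_ball, all_faces], w_max=5)
```

Ordering the decision logics by priority is one simple way to make the cap meaningful: with W = 1 only the face near the ball is kept, while a larger W also admits the remaining face.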

In this way, the video object detection means 20 and the exaggeration target video object selection means 30 work together to select, from video objects of many types, those that satisfy the specific conditions for being subjected to the exaggeration process.

As described above, the image conversion apparatus 1 according to the first embodiment of the present invention detects specific video objects of many types, or many specific video objects of the same type, from the input image H, selects the video objects to be exaggerated on the basis of the positional relationships among the detected video objects, and applies an exaggeration process such as image enlargement to the selected video objects. The input image H can therefore be converted automatically into a composite image G in which the video objects of interest are selectively presented in exaggerated form.

In particular, when the input image H, a high-resolution whole image, is reduced to a low-resolution scaled whole image L suited to a low-resolution image display device 3, it can be converted into a highly visible composite image G in which the specific video objects of interest are selectively exaggerated, while the low-resolution image display device 3 still displays the entire image just as the high-resolution original does.

Furthermore, as a method of exaggerating a video object, the partial image T of the video object to be exaggerated is scaled non-linearly, with its junction with another video object as the reference point and with the scaling factor increasing with distance from that reference point. This converts the image into one in which the video object is presented in exaggerated form with a more natural appearance.

<Second Embodiment>
[Configuration of Image Conversion Apparatus]
Next, the configuration of the image conversion apparatus 1a according to the second embodiment of the present invention will be described with reference to FIG. 9. FIG. 9 is a block diagram showing the configuration of the image conversion apparatus according to the second embodiment of the present invention.

The image conversion apparatus 1a shown in FIG. 9 comprises the video object detection means 20, the exaggeration target video object selection means 30, the video object region cutout means 40, the image scaling means 100a including the partial image scaling means 50, and the image composition means 60.

The image conversion apparatus 1a receives the input image H supplied from the image input device 2, creates a composite image G, and outputs it to the image display device 3.

Next, the configuration of each part of the image conversion apparatus 1a will be described.
In the image conversion apparatus 1a shown in FIG. 9, components bearing the same reference numerals as those of the image conversion apparatus 1 shown in FIG. 1 perform the same functions, so detailed description of them is omitted as appropriate.
The image conversion apparatus 1a has the configuration of the image conversion apparatus 1 of the first embodiment shown in FIG. 1 with the whole image scaling means 10 removed from the image scaling means 100.

Because the image conversion apparatus 1a of the second embodiment does not scale the input image H as a whole and composites onto the input image H as is, the partial image scaling means 50 of the image scaling means 100a performs only enlargement as the exaggeration process for the partial image T.

As a result, the scaled partial image U is enlarged relative to the input image H, and the image scaling means 100a, by means of the partial image scaling means 50, generates an image pair consisting of a whole image (the input image H) and a partial image (the scaled partial image U) in which the partial image is relatively enlarged.

The image composition means 60 receives the input image H supplied from the image input device 2, the W pieces of position information Π output sequentially from the exaggeration target video object selection means 30, and the W scaled partial images U output sequentially from the partial image scaling means 50. It creates a composite image G and outputs it to the image display device 3.

The image composition means 60 obtains the composite image G by overwriting the scaled partial image U output from the partial image scaling means 50 of the image scaling means 100a onto the input image H supplied from the image input device 2, at the position on the input image H corresponding to the position information Π output from the exaggeration target video object selection means 30. When multiple scaled partial images U are input, they are overwritten onto the input image H one after another, each according to its corresponding position information Π.

[Operation of Image Conversion Apparatus]
Next, the operation of the image conversion apparatus 1a according to the second embodiment of the present invention will be described with reference to FIG. 10 (and FIG. 9 as appropriate). FIG. 10 is a flowchart showing the processing flow of the image conversion apparatus according to the second embodiment of the present invention shown in FIG. 9.

The image conversion apparatus 1a first uses the video object detection means 20 to detect video objects of predetermined specific types in the input image H supplied from the image input device 2, and outputs the type information Q and position information P of each detected video object to the exaggeration target video object selection means 30 (step S31).

Next, the image conversion apparatus 1a uses the exaggeration target video object selection means 30 to select, from among the video objects detected by the video object detection means 20 in step S31, the video objects to be exaggerated, and outputs the position information Π of each selected video object to the video object region cutout means 40 and the image composition means 60 (step S32).

Next, using the video object region cutout means 40, the image conversion apparatus 1a refers to the position information Π of the video object selected by the exaggeration target video object selection means 30 in step S32, cuts out the region containing that video object from the input image H supplied from the image input device 2 as a partial image T, and outputs it to the partial image scaling means 50 of the image scaling means 100a (step S33).

Next, using the partial image scaling means 50 of the image scaling means 100a, the image conversion apparatus 1a scales (enlarges) the partial image T cut out by the video object region cutout means 40 in step S33 to create a scaled partial image U, and outputs it to the image composition means 60 (step S34).

Finally, using the image composition means 60, the image conversion apparatus 1a refers to the position information Π selected by the exaggeration target video object selection means 30 in step S32, overwrites the input image H supplied from the image input device 2 with the scaled partial image U created by the partial image scaling means 50 in step S34 to create a composite image G, and outputs it to the image display device 3 (step S35).

When there are multiple video objects to be exaggerated, steps S33 to S35 are repeated each time the exaggeration target video object selection means 30 outputs, in step S32, the position information Π for a video object to be exaggerated. In step S35, when the scaled partial image U of the second or a subsequent exaggeration target is input to the image composition means 60, the image conversion apparatus 1a uses the image composition means 60 to overwrite the composite image G with each such scaled partial image U in turn.
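The repeated execution of steps S33 to S35 can be sketched as a simple loop. The sketch makes several assumptions that are not in the original text: each piece of position information Π is represented as a bounding box (top, left, height, width), enlargement is nearest-neighbour by an integer factor `k`, and the enlarged patch is pasted centred on the original box. All function names are hypothetical.

```python
def crop(image, box):
    """Step S33: cut out the region box = (top, left, height, width)."""
    top, left, h, w = box
    return [row[left:left + w] for row in image[top:top + h]]


def enlarge(patch, k):
    """Step S34: nearest-neighbour enlargement by an integer factor k."""
    return [[v for v in row for _ in range(k)] for row in patch for _ in range(k)]


def paste_centered(canvas, patch, center):
    """Step S35: overwrite `patch` onto `canvas` in place, centred on `center`."""
    ph, pw = len(patch), len(patch[0])
    top, left = center[0] - ph // 2, center[1] - pw // 2
    for r in range(ph):
        for c in range(pw):
            y, x = top + r, left + c
            if 0 <= y < len(canvas) and 0 <= x < len(canvas[0]):
                canvas[y][x] = patch[r][c]


def exaggerate(image, boxes, k=2):
    """Run S33-S35 once per selected object; the whole image is never rescaled."""
    g = [row[:] for row in image]           # composite image G starts as a copy of H
    for box in boxes:
        top, left, h, w = box
        t = crop(image, box)                # partial image T, always cut from H
        u = enlarge(t, k)                   # scaled partial image U
        paste_centered(g, u, (top + h // 2, left + w // 2))
    return g
```

For instance, a 2x2 object in a 6x6 image with `k=2` ends up covering a 4x4 area of the composite image while the rest of the image keeps its original resolution.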

As described above, the image conversion apparatus 1a of the second embodiment of the present invention does not convert the resolution of the input image H as a whole. Even so, when, for example, the input image H is a wide-area shot in which the video objects appear small and are hard to see, the apparatus can automatically convert it into an image that selectively exaggerates the specific video objects that deserve attention.

This embodiment is particularly suited to applications that do not need to convert the overall resolution of the input image H. Because the means for scaling the whole image can be omitted, it is economically advantageous. When the apparatus is realized by having a computer execute a program, the scaling of the entire input image H can be omitted, which reduces the processing load on the computer.
The nonlinear scaling technique shown in FIG. 5 may also be used as the method of exaggerating the video object.

<Third Embodiment>
[Configuration of the image conversion apparatus]
Next, the configuration of the image conversion apparatus 1b in the third embodiment of the present invention is described with reference to FIG. 11. FIG. 11 is a block diagram showing the configuration of the image conversion apparatus in the third embodiment of the present invention.

The image conversion apparatus 1b shown in FIG. 11 comprises an image scaling means 100b that includes a whole image scaling means 10, a video object detection means 20, an exaggeration target video object selection means 30, a video object region cutout means 40, and an image composition means 60.
The image conversion apparatus 1b receives the input image H supplied from the image input device 2, creates a composite image G, and outputs it to the image display device 3.

Next, the configuration of each part of the image conversion apparatus 1b is described.
In the image conversion apparatus 1b shown in FIG. 11, components bearing the same reference numerals as in the image conversion apparatus 1 shown in FIG. 1 perform the same functions, so their detailed description is omitted where appropriate.
The image conversion apparatus 1b has the configuration obtained by removing the partial image scaling means 50 from the image scaling means 100 of the image conversion apparatus 1 in the first embodiment shown in FIG. 1.

The whole image scaling means 10 of the image scaling means 100b receives the whole image supplied from the image input device 2 as the input image H and outputs a scaled whole image L to the image composition means 60. Here, the whole image scaling means 10 performs only reduction, so that the input image H fits the low-resolution image display device 3.

In the image conversion apparatus 1b of the third embodiment, as described above, the whole image scaling means 10 only reduces the input image H to create a scaled whole image L that fits the low-resolution image display device 3. Consequently, the partial image T is already enlarged relative to the scaled whole image L, even though the image conversion apparatus 1b does not scale (enlarge) the partial image T with a partial image scaling means 50 as the image conversion apparatus 1 of the first embodiment shown in FIG. 1 does.

Accordingly, through the whole image scaling means 10, the image scaling means 100b can generate an image pair consisting of a whole image (the scaled whole image L) and a partial image (the partial image T) in which the partial image is relatively enlarged.
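The relative enlargement can be made explicit with a little arithmetic. The symbols s_L, s_T and m below are introduced here for illustration only and do not appear in the original text, and the pixel counts in the example are hypothetical.

```latex
% s_L : scaling factor applied to the whole image (assumed symbol)
% s_T : scaling factor applied to the partial image (assumed symbol)
% m   : magnification of the partial image relative to the whole image
m = \frac{s_T}{s_L}

% Second embodiment: the whole image is not scaled (s_L = 1) and the
% partial image is enlarged (s_T > 1), so m = s_T > 1.
% Third embodiment: the partial image is not scaled (s_T = 1) and the
% whole image is only reduced (s_L < 1), so
m = \frac{1}{s_L} > 1

% Hypothetical example: reducing the image width from 1920 to 480 pixels
% gives s_L = 1/4, so the unscaled partial image appears 4 times larger
% than its surroundings in the composite image.
```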

The video object region cutout means (partial image cutout means) 40 receives the input image H supplied from the image input device 2 and the W pieces of position information Π output sequentially from the exaggeration target video object selection means 30, and sequentially outputs the W partial images T corresponding to the respective pieces of position information Π to the image composition means 60.

The image composition means 60 receives the scaled whole image L output from the whole image scaling means 10, the W pieces of position information Π output sequentially from the exaggeration target video object selection means 30, and the W partial images T output sequentially from the video object region cutout means 40. It creates a composite image G and outputs it to the image display device 3.

The image composition means 60 obtains the composite image G by overwriting the scaled whole image L output from the whole image scaling means 10 with the partial image T output from the video object region cutout means 40, at the position in the scaled whole image L that corresponds to the position information Π output from the exaggeration target video object selection means 30. When multiple partial images T are input, they are overwritten onto the scaled whole image L one after another, each according to its corresponding position information Π.

Next, the operation of the image conversion apparatus 1b in the third embodiment is described with reference to FIG. 12 (and FIG. 11 as appropriate). FIG. 12 is a flowchart showing the processing flow of the image conversion apparatus in the third embodiment of the present invention shown in FIG. 11.

[Operation of the image conversion apparatus]
The image conversion apparatus 1b first uses the video object detection means 20 to detect video objects of predetermined specific types in the input image H supplied from the image input device 2, and outputs the type information Q and position information P of each detected video object to the exaggeration target video object selection means 30 (step S41).

Next, the image conversion apparatus 1b uses the exaggeration target video object selection means 30 to select, from among the video objects detected by the video object detection means 20 in step S41, the video objects to be exaggerated, and sequentially outputs the position information Π of each selected video object to the video object region cutout means 40 and the image composition means 60 (step S42).

Next, using the video object region cutout means 40, the image conversion apparatus 1b refers to the position information Π of the video object selected by the exaggeration target video object selection means 30 in step S42, cuts out the region containing that video object from the input image H supplied from the image input device 2 as a partial image T, and outputs it to the image composition means 60 (step S43).

While performing the processing of steps S41 to S43 described above, the image conversion apparatus 1b also uses the whole image scaling means 10 of the image scaling means 100b to scale (reduce) the entire input image H supplied from the image input device 2, creating a scaled whole image L whose scaling factor matches the resolution of the low-resolution image display device 3, and outputs it to the image composition means 60 (step S44).

When both the video object region cutout processing of step S43 and the whole image scaling processing of step S44 have completed, the image conversion apparatus 1b uses the image composition means 60 to refer to the position information Π selected by the exaggeration target video object selection means 30 in step S42, overwrites the scaled whole image L created by the whole image scaling means 10 in step S44 with the partial image T cut out by the video object region cutout means 40 in step S43 to create a composite image G, and outputs it to the image display device 3 (step S45).

Here, the processing of steps S41 to S43 and the processing of step S44 are performed in parallel. However, the image composition of step S45 starts only after both the video object region cutout processing of step S43 and the whole image scaling processing of step S44 have completed. The processing of steps S41 to S43 and that of step S44 may therefore also be executed sequentially. In that case, either may be executed first, and step S44 may also be inserted partway through steps S41 to S43.
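The equivalence of parallel and sequential execution can be illustrated with Python's standard `concurrent.futures` module. This is only a sketch: `cut_out_partial_images` and `reduce_whole` are hypothetical stand-ins for steps S41 to S43 and step S44, the selected objects are assumed to be given as boxes (top, left, height, width), and reduction is assumed to be simple decimation by an integer factor `n`.

```python
from concurrent.futures import ThreadPoolExecutor


def cut_out_partial_images(image, boxes):
    """Stand-in for steps S41-S43: crop each selected box (top, left, h, w)."""
    return [[row[l:l + w] for row in image[t:t + h]] for (t, l, h, w) in boxes]


def reduce_whole(image, n):
    """Stand-in for step S44: keep every n-th pixel in both directions."""
    return [row[::n] for row in image[::n]]


def run_parallel(image, boxes, n):
    """Run S41-S43 and S44 concurrently; return (partial images, whole image)."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        parts = pool.submit(cut_out_partial_images, image, boxes)
        whole = pool.submit(reduce_whole, image, n)
        # result() blocks, so composition (S45) can only start once both are done.
        return parts.result(), whole.result()


def run_sequential(image, boxes, n):
    """Run S44 first and S41-S43 afterwards; the outputs are the same."""
    whole = reduce_whole(image, n)
    parts = cut_out_partial_images(image, boxes)
    return parts, whole
```

Both branches only read the input image H, which is why their order does not affect the inputs handed to step S45.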

When there are multiple video objects to be exaggerated, steps S43 and S45 are repeated each time the exaggeration target video object selection means 30 outputs, in step S42, the position information Π for a video object to be exaggerated. Since the scaled whole image L does not need to be created more than once, step S44 is executed only once. In step S45, when the partial image T of the second or a subsequent exaggeration target is input to the image composition means 60, the image conversion apparatus 1b uses the image composition means 60 to overwrite the composite image G with each such partial image T in turn.
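The third-embodiment flow, with step S44 executed once and steps S43 and S45 repeated per object, can be sketched as below. The assumptions are labelled in the code: reduction is decimation by an integer factor `n`, each piece of position information Π is a box (top, left, height, width) in input-image coordinates, and the unscaled patch is pasted centred on the object's position in the scaled whole image L. The function name is hypothetical.

```python
def convert_third_embodiment(image, boxes, n):
    """Reduce the whole image once, then overwrite unscaled partial images."""
    # Step S44 (executed once): scaled whole image L by decimation (assumed method).
    g = [row[::n] for row in image[::n]]
    for (t, l, h, w) in boxes:
        # Step S43: partial image T is cut from the input image H and not scaled.
        patch = [row[l:l + w] for row in image[t:t + h]]
        # Object centre mapped into the coordinates of L (assumed mapping).
        cy, cx = (t + h // 2) // n, (l + w // 2) // n
        top, left = cy - h // 2, cx - w // 2
        # Step S45: overwrite, dropping pixels that fall outside L.
        for r in range(h):
            for c in range(w):
                y, x = top + r, left + c
                if 0 <= y < len(g) and 0 <= x < len(g[0]):
                    g[y][x] = patch[r][c]
    return g
```

With `n = 2`, a 2x2 object that plain reduction would shrink to a single pixel keeps its 2x2 size in the composite image, which is the relative enlargement the text describes.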

As described above, according to the image conversion apparatus 1b of the third embodiment of the present invention, when a high-resolution input image H is converted, by reduction only, into a scaled whole image L that fits the low-resolution image display device 3, the partial image T cut out from the input image H is already enlarged relative to the scaled whole image L without any enlargement processing. Combining the partial image T with such a scaled whole image L therefore yields a composite image G in which the video object corresponding to the partial image T is exaggerated.

In addition, because the means for scaling the partial image T can be omitted, this embodiment is economically advantageous. When the apparatus is realized by having a computer execute a program, the scaling of the partial image T can be omitted, which reduces the processing load on the computer.

FIG. 1 is a block diagram showing the configuration of the image conversion apparatus in the first embodiment of the present invention.
FIG. 2 is a diagram showing an example of an input image in the present invention.
FIG. 3 is a block diagram showing the configuration of the video object detection means and the exaggeration target video object selection means of the image conversion apparatus in the first embodiment of the present invention.
FIG. 4 is a diagram for explaining how the image conversion apparatus in the first embodiment of the present invention converts an image.
FIG. 5 is a diagram for explaining an example of nonlinear scaling of a partial image, in which (a) shows the partial image and (b) shows the scaled partial image after nonlinear scaling.
FIG. 6 is a diagram showing examples of composite images, in which (a) shows a composite image combining the scaled whole image with a linearly scaled partial image and (b) shows a composite image combining the scaled whole image with a nonlinearly scaled partial image.
FIG. 7 is a flowchart showing the processing flow of the image conversion apparatus in the first embodiment of the present invention.
FIG. 8 is a flowchart showing the processing flow of the video object detection means and the exaggeration target video object selection means of the image conversion apparatus in the first embodiment of the present invention.
FIG. 9 is a block diagram showing the configuration of the image conversion apparatus in the second embodiment of the present invention.
FIG. 10 is a flowchart showing the processing flow of the image conversion apparatus in the second embodiment of the present invention.
FIG. 11 is a block diagram showing the configuration of the image conversion apparatus in the third embodiment of the present invention.
FIG. 12 is a flowchart showing the processing flow of the image conversion apparatus in the third embodiment of the present invention.

Explanation of symbols

1, 1a, 1b  Image conversion apparatus
2  Image input device
3  Image display device
10  Whole image scaling means
20  Video object detection means
21_1  First specific video object detection means
21_2  Second specific video object detection means
21_M  M-th specific video object detection means
30  Exaggeration target video object selection means (video object selection means)
31_1  First decision logic (decision logic means)
31_2  Second decision logic (decision logic means)
31_N  N-th decision logic (decision logic means)
32  Target selection means
40  Video object region cutout means (partial image cutout means)
50  Partial image scaling means
60  Image composition means
100, 100a, 100b  Image scaling means
G  Composite image
H  Input image
L  Scaled whole image
M_1, M_2  Video objects
O_1, O_2, O_3  Video objects
P, P^(1), ..., P^(M)  Position information
Q, Q^(1), ..., Q^(M)  Type information
T, T_1, T_2  Partial images
U, U_1, U_2, U_2'  Scaled partial images
Π, Π^(1), ..., Π^(N)  Position information

Claims

An image conversion apparatus that converts an input image into an image in which a specific video object in the input image is relatively enlarged, the apparatus comprising:
video object detection means for detecting video objects of one or more predetermined types from the input image and outputting, for each detected video object, type information including the type of the detected video object and position information indicating the detected position;
video object selection means for selecting, as the specific video object, a video object that meets a predetermined condition from among the detected video objects, based on the type information and position information of the video objects detected by the video object detection means;
partial image cutout means for cutting out a partial image including the specific video object from the input image;
image scaling means for scaling at least one of the input image and the partial image to generate an image pair consisting of a whole image and the partial image, in which the partial image is enlarged relative to the whole image of the entire area of the input image; and
image composition means for combining the whole image and the partial image that form the image pair,
wherein the video object selection means comprises: a plurality of decision logic means, each of which determines, based on the type information and position information of the video objects detected by the video object detection means, a video object that meets a predetermined condition from among the detected video objects as a candidate for the specific video object; and target selection means for selecting the specific video object from among the candidates determined by the plurality of decision logic means.

The image conversion apparatus according to claim 1, wherein the image scaling unit includes an entire image scaling unit that reduces the entire input image.

The image conversion apparatus according to claim 1, wherein the image scaling unit includes a partial image scaling unit that scales the partial image.

The image conversion apparatus according to claim 3, wherein the partial image scaling means creates a scaled partial image by nonlinearly scaling the partial image such that, at a junction with another video object in the input image combined by the image composition means that is not the specific video object, the scaling factor equals that of the other video object, and the scaling factor increases with distance from the junction; and the image composition means combines the whole image that forms the image pair with the scaled partial image and the scaled partial image itself, using the junction as a reference point.

An image conversion program for causing a computer to function as the image conversion apparatus according to any one of claims 1 to 4.

An image conversion method for converting an input image into an image in which a specific video object in the input image is relatively enlarged, the method comprising:
a video object detection step of detecting, by video object detection means, video objects of one or more predetermined types from the input image and outputting, for each detected video object, type information including the type of the detected video object and position information indicating the detected position;
a video object selection step of selecting, by video object selection means, as the specific video object, a video object that meets a predetermined condition from among the detected video objects, based on the type information and position information of the video objects detected in the video object detection step;
a partial image cutout step of cutting out, by partial image cutout means, a partial image including the specific video object from the input image;
an image scaling step of generating, by image scaling means, an image pair consisting of a whole image and the partial image, in which the partial image is enlarged relative to the whole image of the entire area of the input image, by scaling at least one of the input image and the partial image; and
an image combining step of combining, by image composition means, the input image and the partial image that form the image pair,
wherein the video object selection step includes: a plurality of decision logic steps, each of which determines, based on the type information and position information of the video objects detected in the video object detection step, a video object that meets a predetermined condition from among the detected video objects as a candidate for the specific video object; and a target selection step of selecting the specific video object from among the candidates determined in the plurality of decision logic steps.