JP5047381B1

JP5047381B1 - VIDEO GENERATION DEVICE, VIDEO DISPLAY DEVICE, TELEVISION RECEIVER, VIDEO GENERATION METHOD, AND COMPUTER PROGRAM

Info

Publication number: JP5047381B1
Application number: JP2011130600A
Authority: JP
Inventors: 健明末永; 正宏塩井; 健史筑波; 敦稔〆野
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2011-06-10
Filing date: 2011-06-10
Publication date: 2012-10-10
Anticipated expiration: 2031-06-10
Also published as: JP2013004989A; WO2012169217A1

Abstract

PROBLEM: To provide a video generation device, a video display device, a television receiver, a video generation method, and a computer program that generate a natural three-dimensional (3D) video from a two-dimensional (2D) video.
SOLUTION: A video generation device that generates a 3D video by calculating, for each pixel of a 2D video composed of a plurality of pixels, a depth value indicating the depth corresponding to that pixel includes: a first difference calculation unit that calculates the difference between the pixel value of each pixel of the 2D video and the pixel value of a pixel adjacent to that pixel; a first extraction unit that extracts pixels for which the difference calculated by the first difference calculation unit is equal to or greater than a predetermined first threshold; a second difference calculation unit that calculates, for each pixel of the 2D video, the difference between its pixel value in one frame and its pixel value in another frame; a second extraction unit that extracts pixels for which the difference calculated by the second difference calculation unit is equal to or greater than a predetermined second threshold; and a depth calculation unit that calculates the depth value corresponding to each pixel of the 2D video based on the extraction results of the first extraction unit and the second extraction unit.
[Selected drawing] FIG. 9

Description

The present invention relates to a video generation device, a video display device, a television receiver, a video generation method, and a computer program that generate a 3D video from a 2D video.

Display devices that display 3D video are becoming widespread, but the amount of 3D content is still small, and there is also demand for viewing video recorded in 2D as 3D video.

One approach is to generate a pseudo 3D video from a 2D video, but this requires giving the 2D video appropriate depth information. For example, Patent Document 1 describes a video generation device that calculates, from the composition of the entire 2D video, a ratio for combining three predetermined basic depth models defined for the entire screen, assigns 3D depth values to the 2D video based on the combined basic depth model, and generates a 3D video.

Patent Document 1: JP 2009-44722 A

However, the video generation device of Patent Document 1 merely fits a basic depth model preset for the entire screen to the 2D video, so the depth values may be inappropriate and the resulting 3D video may be unnatural.

The present invention has been made in view of these circumstances, and provides a video generation device, a video display device, a television receiver, a video generation method, and a computer program that aim to generate natural 3D video.

A video generation device according to the present invention generates a 3D video by calculating, for each pixel of a 2D video composed of a plurality of pixels, a depth value indicating the depth corresponding to that pixel. The device comprises: a first difference calculation unit that calculates the difference between the pixel value of each pixel of the 2D video and the pixel value of a pixel adjacent to that pixel; a first extraction unit that extracts, together with their position information, pixels for which the difference calculated by the first difference calculation unit is equal to or greater than a predetermined first threshold; a second difference calculation unit that calculates, for each pixel of the 2D video, the difference between its pixel value in one frame and its pixel value in another frame; a second extraction unit that extracts, together with their position information, pixels for which the difference calculated by the second difference calculation unit is equal to or greater than a predetermined second threshold; a determination unit that determines, based on the number or position information of the pixels extracted by the first extraction unit and the second extraction unit, which of a plurality of predetermined types the composition of the 2D video belongs to; and a depth calculation unit that calculates the depth value corresponding to each pixel of the 2D video based on the calculation result of the first difference calculation unit or the second difference calculation unit and on the composition type determined by the determination unit.

According to the present invention, some of the pixels of the 2D video are extracted and the depth values are calculated based on the extraction result, so a natural 3D video can be generated.

The video generation device according to the present invention further comprises a third extraction unit that extracts pixels for which the difference calculated by the first difference calculation unit is equal to or greater than a predetermined third threshold that is larger than the first threshold, and a variance calculation unit that calculates a positional variance indicating how widely the positions of the pixels extracted by the third extraction unit are dispersed. The depth calculation unit is configured to calculate the depth values based on the positional variance calculated by the variance calculation unit and on the total number of pixels extracted by the first extraction unit or the second extraction unit.

According to the present invention, the depth values are calculated based on the variance of the pixel positions, so a natural 3D video that is better matched to the 2D video can be generated.

The video generation device according to the present invention further comprises a second variance calculation unit that calculates the variance of the differences calculated by the first difference calculation unit. The depth calculation unit is configured to calculate the depth values based on the variance of the per-pixel differences calculated by the second variance calculation unit and on the total number of pixels extracted by the first extraction unit or the second extraction unit.

According to the present invention, the depth values are calculated based on the variance of the per-pixel differences, so a natural 3D video that is better matched to the 2D video can be generated.

A video display device according to the present invention comprises the video generation device described above and a display unit that displays the 3D video generated by the video generation device.

According to the present invention, the effects described above can be achieved in a video display device.

A television receiver according to the present invention comprises a tuner unit that receives a television broadcast wave containing a 2D video, the video generation device according to any one of claims 1 to 3, which generates a 3D video based on the 2D video received by the tuner unit, and a display unit that displays the 3D video generated by the video generation device.

According to the present invention, the effects described above can be achieved in a television receiver.

A video generation method according to the present invention generates a 3D video by calculating, for each pixel of a 2D video composed of a plurality of pixels, a depth value indicating the depth corresponding to that pixel. The method performs processing including: a first difference calculation step of calculating the difference between the pixel value of each pixel of the 2D video and the pixel value of a pixel adjacent to that pixel; a first extraction step of extracting, together with their position information, pixels for which the difference calculated in the first difference calculation step is equal to or greater than a predetermined first threshold; a second difference calculation step of calculating, for each pixel of the 2D video, the difference between its pixel value in one frame and its pixel value in another frame; a second extraction step of extracting, together with their position information, pixels for which the difference calculated in the second difference calculation step is equal to or greater than a predetermined second threshold; a determination step of determining, based on the number or position information of the pixels extracted in the first extraction step and the second extraction step, which of a plurality of predetermined types the composition of the 2D video belongs to; and a depth calculation step of calculating the depth value corresponding to each pixel of the 2D video based on the calculation result of the first difference calculation step or the second difference calculation step and on the composition type determined in the determination step.

A computer program according to the present invention causes a computer to generate a 3D video by calculating, for each pixel of a 2D video composed of a plurality of pixels, a depth value indicating the depth corresponding to that pixel. The program causes the computer to execute processing including: a first difference calculation step of calculating the difference between the pixel value of each pixel of the 2D video and the pixel value of a pixel adjacent to that pixel; a first extraction step of extracting, together with their position information, pixels for which the difference calculated in the first difference calculation step is equal to or greater than a predetermined first threshold; a second difference calculation step of calculating, for each pixel of the 2D video, the difference between its pixel value in one frame and its pixel value in another frame; a second extraction step of extracting, together with their position information, pixels for which the difference calculated in the second difference calculation step is equal to or greater than a predetermined second threshold; a determination step of determining, based on the number or position information of the pixels extracted in the first extraction step and the second extraction step, which of a plurality of predetermined types the composition of the 2D video belongs to; and a depth calculation step of calculating the depth value corresponding to each pixel of the 2D video based on the calculation result of the first difference calculation step or the second difference calculation step and on the composition type determined in the determination step.

According to the present invention, pixels whose calculated edge strength is equal to or greater than a predetermined threshold are extracted from the 2D video, pixels for which the calculated difference between the pixel value in one frame and the pixel value in another frame is equal to or greater than a predetermined threshold are also extracted, and the depth values are calculated based on these extraction results, so a natural 3D video can be provided.

An explanatory diagram for describing the video generated by the video generation device in the first embodiment.
A block diagram showing a configuration example of the television receiver in the first embodiment.
A conceptual diagram showing the 2D video processed by the control unit 1.
An explanatory diagram showing the Laplacian filter.
A schematic diagram showing the edge strength average and the strong edge pixel information.
A schematic diagram showing binary video based on the high difference information.
A schematic diagram showing the video of the nth frame divided into regions.
An explanatory diagram showing morphological processing.
An explanatory diagram showing the foreground and background in each composition.
A schematic diagram showing examples of the predetermined format for displaying 3D video on the display unit.
A flowchart showing the processing procedure of the control unit in the first embodiment.
A flowchart showing the procedure by which the control unit calculates the strong edge pixel information.
A flowchart showing the procedure by which the control unit calculates the moving object constituent pixel information.
A flowchart showing the procedure by which the control unit determines the composition of the 2D video.
A flowchart showing the procedure by which the control unit resets the depth values.
A flowchart showing the procedure by which the control unit generates the 3D video.
A flowchart showing the processing procedure of the control unit in the second embodiment.
A flowchart showing the procedure by which the control unit in the second embodiment sets the depth values.
A flowchart showing the procedure by which the control unit in the third embodiment determines the composition.
A block diagram showing the hardware components of the television receiver in the fourth embodiment.

First Embodiment
A first embodiment is described below with reference to the drawings. FIG. 1 is an explanatory diagram for describing the video generated by the video generation device in the first embodiment.

In generating a 3D video from a 2D video composed of subjects such as a bird, a tree, the sun, the sky, clouds, and the ground, as shown in FIG. 1A, the video generation device in this embodiment first separates the foreground from the remaining background based on the edge strength and the magnitude of motion in the 2D video. Here, the foreground is defined as a region of the video with high values of features that readily attract a viewer's attention, and the background is defined as the region of the input video other than the foreground.

In this embodiment, numerical values indicating edge strength and the magnitude of motion are used as the features that readily attract attention, and a region is judged to be foreground or background according to whether those values are equal to or greater than a fixed value. As a result of this judgment, as shown in FIG. 1B, the bird, which is a region with strong edges or large motion, and the tree, which is a region with strong edges, become the foreground and are separated from the background.

Next, the depth values of the 2D video are set. First, a basic depth value calculated by a predetermined calculation is set for every pixel of the 2D video. Then the depth values of the foreground are calculated separately, based on a predetermined depth pattern chosen according to the shape, size, motion, or the like of the subject, and these calculated values replace the basic depth values. Based on the set and reset depth values, a left-eye video L and a right-eye video R, which together form a 3D video with parallax, are generated.

Also, as shown in FIG. 1C, the foreground representing the bird, which is a region with large motion, is separated from the other regions, and the depth values are reset for the region with large motion to generate the 3D video.

The method used to reset the depth values is selected according to, for example, the variance of the positions of pixels with high edge strength and the number of pixels making up the foreground.

The configurations of the video generation device, the video display device, and the television receiver in this embodiment are described next. The television receiver and the video display device in this embodiment each include the video generation device. FIG. 2 is a block diagram showing a configuration example of the television receiver in the first embodiment.

The control unit 1 is composed of a CPU (Central Processing Unit), an MPU (Micro Processing Unit), or the like, includes an arithmetic circuit capable of processing video data, and reads a control program stored in advance in the ROM 2 into the RAM 3 as needed and executes it. The control unit 1 also controls the operation of the ROM 2, the RAM 3, the video storage unit 4, the input unit 5, and the output unit 6. These components are connected to one another via a bus.

The ROM 2 is composed of a writable and erasable EPROM (Erasable Programmable ROM), a flash memory, or the like, and stores in advance the various control programs required for the video generation device to operate.

The RAM 3 is an SRAM, a flash memory, or the like, and temporarily stores various data generated while the control unit 1 executes a control program.

When a 2D video carried on a broadcast wave is input from a broadcast receiving antenna (not shown) via the input unit 5, which is connected to a tuner 7 for receiving broadcast waves, the control unit 1 stores the 2D video data in the video storage unit 4, which is composed of a RAM, a flash memory, or the like. The 2D video may instead be a video input from a playback device that plays back a recording medium, a video carried on a communication wave input from a communication device, or the like. The video may be in a compressed format such as MPEG-2 (Moving Picture Experts Group phase 2), MPEG-4 (Moving Picture Experts Group phase 4), or H.264, or in an uncompressed format.

The control unit 1 sequentially reads the 2D video data stored in the video storage unit 4. If the 2D video comes from terrestrial digital broadcasting, BS digital broadcasting, or the like, the control unit 1 decrypts the MULTI2 encryption, which is the standard cipher of the conditional access system (B-CAS: BS Conditional Access System). If the 2D video is compressed, the control unit 1 decompresses it; if it is analog, the control unit 1 converts it into digital video, for example MPEG-2 TS.

The control unit 1 generates a 3D video by performing the processing described later. It also converts the 3D video into a predetermined format for display on the display unit 10 and outputs it via the output unit 6 to the display unit 10, which displays the video.

The display unit 10 is, for example, a television, a monitor of an information processing terminal, or the display of a mobile phone, PDA, book reader, game machine, or music player, and displays 3D video including video generated by the processing described later. The display unit 10 can display 2D video as well as 3D video; in that case, it includes a video switching unit (not shown) that switches between displaying 2D video and 3D video.

The control unit 1 is described next. FIG. 3 is a conceptual diagram showing the 2D video processed by the control unit 1; it includes the videos of the (n-1)th, nth, and (n+1)th frames in time series.

Each frame has, for example, 1920 × 1080 pixels. With the upper-left point of the frame as the origin, the pixel at column x and row y is denoted (x, y). From the pixels of the 2D video, the control unit 1 extracts strong edge pixels, that is, pixels at which the intensity changes abruptly relative to adjacent pixels.

Of the RGB components of the pixel value of pixel (x, y) in the nth frame, let the R component be SnR(x, y), the G component be SnG(x, y), and the B component be SnB(x, y). As shown in equation (1), the control unit 1 calculates a grayscale value Gn(x, y) as a weighted average of SnR(x, y), SnG(x, y), and SnB(x, y), using the same weighting as the luminance signal of the ISDB-T system in digital broadcasting.
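The weighted average described for equation (1) can be sketched as follows. This is only an illustration under stated assumptions: the coefficients of equation (1) are not given in the text, so the ITU-R BT.601 luma weights (0.299, 0.587, 0.114) are assumed here, and the function name `to_grayscale` is hypothetical.

```python
# Assumed luma weights (ITU-R BT.601); equation (1) may use different values.
W_R, W_G, W_B = 0.299, 0.587, 0.114


def to_grayscale(frame, weights=(W_R, W_G, W_B)):
    """Return Gn as a 2D list; frame[y][x] is an (R, G, B) tuple."""
    w_r, w_g, w_b = weights
    total = w_r + w_g + w_b  # normalise so the result is a true weighted average
    return [
        [(w_r * r + w_g * g + w_b * b) / total for (r, g, b) in row]
        for row in frame
    ]
```

Passing a different `weights` tuple is enough to switch to another luminance definition without touching the rest of the pipeline.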

The control unit 1 applies a Laplacian filter to Gn(x, y). FIG. 4 is an explanatory diagram showing the Laplacian filter. The Laplacian filter sharpens the edges of the video by adding to the grayscale value Gn(x, y) the value G''n(x, y) obtained by taking the second derivative of Gn(x, y).

As shown in FIG. 4A, consider for example a 3 × 3 block consisting of a pixel of interest and its eight neighbors, and let the grayscale value Gn(x, y) of the pixel of interest be α0 and the values of the eight neighboring pixels be α1 to α8. When these pixels are processed with the eight-direction Laplacian filter shown in FIG. 4B, the second derivative G''n(x, y) is given by equation (2).
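An eight-direction Laplacian over the 3 × 3 neighborhood can be sketched as below. The sign convention (8·α0 minus the sum of α1 to α8) is an assumption, chosen so that adding the result to the grayscale value sharpens edges as the text describes; equation (2) and FIG. 4B may define it differently. Leaving border pixels at zero and the names `laplacian8` and `sharpen` are also assumptions.

```python
def laplacian8(g):
    """Second-derivative estimate for each interior pixel of g[y][x]."""
    h, w = len(g), len(g[0])
    out = [[0.0] * w for _ in range(h)]  # border pixels stay 0 (assumption)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            neighbours = sum(
                g[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
                if (dx, dy) != (0, 0)
            )
            # Assumed sign: centre weight +8, each neighbour weight -1.
            out[y][x] = 8 * g[y][x] - neighbours
    return out


def sharpen(g):
    """Add the second derivative to the grayscale value, pixel by pixel."""
    lap = laplacian8(g)
    return [[g[y][x] + lap[y][x] for x in range(len(g[0]))] for y in range(len(g))]
```

On a flat region the filter returns zero, so only pixels that differ from their neighbors are changed by `sharpen`.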

After applying the Laplacian filter, the control unit 1 calculates the differences between adjacent pixels in the row direction and in the column direction, and then calculates the square root of the sum of the squares of these differences. This value, the edge strength En(x, y), is calculated for every pixel.
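The edge strength computation can be sketched as follows. The text does not say which neighbor is used for each difference, so this sketch assumes forward differences (right and lower neighbor) and treats a missing neighbor at the frame border as a zero difference; the function name `edge_strength` is hypothetical.

```python
import math


def edge_strength(g):
    """Return En as a 2D list from the filtered grayscale image g[y][x]."""
    h, w = len(g), len(g[0])
    e = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Forward differences; zero where no neighbour exists (assumption).
            dx = g[y][x + 1] - g[y][x] if x + 1 < w else 0
            dy = g[y + 1][x] - g[y][x] if y + 1 < h else 0
            e[y][x] = math.sqrt(dx * dx + dy * dy)
    return e
```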

When pixel (x, y) belongs to the 5 × 5 block that is jth in the column direction and kth in the row direction, the average of the edge strengths En(x, y) of the 25 pixels making up the block is denoted Eave(j, k). Because x lies in columns 5j-4 to 5j and y lies in rows 5k-4 to 5k, equations (4) and (5) hold between x and j and between y and k. Here, j is an integer from 0 to 384 and k is an integer from 0 to 216.

The control unit 1 divides the entire frame into blocks of 5 × 5 pixels and calculates Eave(j, k) for each block. Dividing the frame into 5 × 5 blocks simplifies the processing and reduces noise.
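The block averaging can be sketched as below. For simplicity the sketch uses 0-based block indices (block j covers columns 5j to 5j+4), which differs from the 1-based column ranges in the text, and it assumes the frame dimensions are multiples of the block size, as they are for 1920 × 1080. The function name `block_average` is hypothetical.

```python
def block_average(e, block=5):
    """Return Eave as eave[k][j], the mean edge strength of each block."""
    h, w = len(e), len(e[0])
    rows, cols = h // block, w // block
    eave = [[0.0] * cols for _ in range(rows)]
    for k in range(rows):
        for j in range(cols):
            total = sum(
                e[y][x]
                for y in range(k * block, (k + 1) * block)
                for x in range(j * block, (j + 1) * block)
            )
            eave[k][j] = total / (block * block)
    return eave
```

For a 1920 × 1080 frame this yields 384 × 216 block averages, so later per-block decisions touch 25 times fewer values than per-pixel ones.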

FIG. 5 is a schematic diagram showing the edge strength average Eave(j, k) and the strong edge pixel information Hn(x, y). FIG. 5A is a schematic diagram showing a 2D video divided into blocks. Block a1 is part of the image of the bird; the differences in pixel value between adjacent pixels are large, so its edge strength average Eave(j, k) is large. Block a2, by contrast, is part of the sky; the differences in pixel value between adjacent pixels are small, so its edge strength average Eave(j, k) is small.

As shown in equations (6) and (7), the control unit 1 sets the strong edge pixel information Hn(x, y) to 1 for pixels whose edge strength average Eave(j, k) is equal to or greater than a preset threshold Th0, and to 0 for pixels whose edge strength average is less than Th0.
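The thresholding rule can be sketched as follows: every pixel inherits the decision made for the block that contains it. The function name `strong_edge_info`, the 0-based layout `eave[k][j]`, and the example threshold are assumptions for illustration only.

```python
def strong_edge_info(eave, th0, block=5):
    """Expand per-block averages eave[k][j] into per-pixel flags Hn[y][x]."""
    h_info = []
    for block_row in eave:
        pixel_row = []
        for value in block_row:
            # 1 when the block average reaches Th0, otherwise 0.
            pixel_row.extend([1 if value >= th0 else 0] * block)
        # Each block row covers `block` pixel rows with identical flags.
        h_info.extend([list(pixel_row) for _ in range(block)])
    return h_info
```

Note that the comparison is inclusive, matching "equal to or greater than Th0" in the text.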

The strong edge pixel information Hn(x, y) is used to separate the foreground from the background, as described later. The threshold Th0 is therefore set to a value suited to dividing the 2D video into foreground and background in terms of edge strength.

In this way, strong edge pixels, that is, pixels with strong edges, can be extracted from the values of the strong edge pixel information Hn(x, y). In FIG. 5B, pixels whose strong edge pixel information Hn(x, y) is 1 are shown in white and pixels whose value is 0 are shown in black.

The procedure for calculating the strong edge pixel information Hn(x, y) is not limited to the one described above. For example, Hn(x, y) may be calculated according to whether, among all the pixels in a block, the number of pixels with an edge strength equal to or greater than the threshold Th0 is equal to or greater than a predetermined number.
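The count-based alternative can be sketched as below. The parameter `min_count` stands for the "predetermined number" in the text; its value, the function name `strong_edge_info_by_count`, and the choice to ignore partial blocks at the frame border are assumptions.

```python
def strong_edge_info_by_count(e, th0, min_count, block=5):
    """Flag every pixel of a block when enough of its pixels reach Th0."""
    h, w = len(e), len(e[0])
    info = [[0] * w for _ in range(h)]
    for top in range(0, h - block + 1, block):
        for left in range(0, w - block + 1, block):
            count = sum(
                1
                for y in range(top, top + block)
                for x in range(left, left + block)
                if e[y][x] >= th0
            )
            if count >= min_count:
                for y in range(top, top + block):
                    for x in range(left, left + block):
                        info[y][x] = 1
    return info
```

Compared with averaging, this variant is less sensitive to a single very strong edge pixel dominating an otherwise flat block.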

Next, the control unit 1 extracts moving object constituent pixels, which are pixels belonging to parts of the 2D video with large motion. Parts of the video with large motion are expected to show large changes in pixel value. Accordingly, a pixel whose value changes greatly compared with its values in the preceding and following frames is regarded as a pixel with large motion.

The procedure for calculating the moving object constituent pixel information Mn(x, y) is as follows. First, as shown in equation (8), the control unit 1 calculates absolute difference information Dn(x, y), which is the absolute value of the difference between the already calculated grayscale value Gn(x, y) and Gn-1(x, y), the grayscale-converted pixel value of the same pixel in the (n-1)th frame.

続いて絶対差分情報Ｄｎ（ｘ，ｙ）が予め設定された閾値Ｔｈ１以上か否かを判定し、差分の絶対値が大きい画素であるか否かを示す高差分情報Ｂｎ（ｘ，ｙ）を算出する。（９）及び（１０）式に示すように閾値Ｔｈ１以上である画素は高差分情報Ｂｎ（ｘ，ｙ）を１とし、閾値Ｔｈ１未満である画素は高差分情報Ｂｎ（ｘ，ｙ）を０とする。高差分情報Ｂｎ（ｘ，ｙ）は後述する前景と背景との区分において用いられる。従って閾値Ｔｈ１は２次元映像における動きの大きさの観点から前景と背景とに区分するのに適した値に設定してある。 Subsequently, it is determined whether or not the absolute difference information Dn (x, y) is greater than or equal to a preset threshold value Th1, and high difference information Bn (x, y) indicating whether or not the pixel has a large absolute value of the difference. calculate. As shown in the equations (9) and (10), the pixel having the threshold value Th1 or more sets the high difference information Bn (x, y) to 1, and the pixel having the threshold value Th1 less than 0 sets the high difference information Bn (x, y) to 0. And The high difference information Bn (x, y) is used in the foreground and background classification described later. Therefore, the threshold value Th1 is set to a value suitable for distinguishing between the foreground and the background from the viewpoint of the magnitude of motion in the two-dimensional video.

FIG. 6 is a schematic diagram showing binary images based on the high difference information Bn(x, y) and Bn+1(x, y). In FIG. 6A, pixels whose high difference information Bn(x, y) is 1 are shown in white and pixels whose value is 0 are shown in black.

Next, the same calculations as in equations (8), (9), and (10) are performed for the (n+1)-th frame to obtain the absolute difference information Dn+1(x, y), from which the high difference information Bn+1(x, y) is calculated. In FIG. 6B, pixels whose high difference information Bn+1(x, y) is 1 are shown in white and pixels whose value is 0 are shown in black.

Finally, as shown in equation (11), the moving object constituent pixel information Mn(x, y) is calculated as the logical AND of the high difference information Bn(x, y) and Bn+1(x, y).

A pixel whose moving object constituent pixel information Mn(x, y) is 1 has a large change in pixel value, and a pixel whose Mn(x, y) is 0 has a small change. Moving object constituent pixels can therefore be extracted from the value of Mn(x, y).

In FIG. 6C, regions where the moving object constituent pixel information Mn(x, y) is 1 are shown in white and regions where it is 0 are shown in black.
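The thresholding and AND steps described for equations (8) to (11) can be sketched as follows. This is a minimal illustration, not the patented implementation: frames are assumed to be lists of rows of grayscale values, and the function names `high_difference` and `moving_object_mask` are hypothetical.

```python
def high_difference(gray_a, gray_b, th1):
    """B(x, y): 1 where the absolute frame difference D(x, y) >= th1, else 0."""
    return [[1 if abs(a - b) >= th1 else 0 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(gray_a, gray_b)]


def moving_object_mask(g_prev, g_cur, g_next, th1):
    """M_n = B_n AND B_{n+1}, computed per pixel."""
    b_n = high_difference(g_cur, g_prev, th1)      # from D_n
    b_next = high_difference(g_next, g_cur, th1)   # from D_{n+1}
    return [[p & q for p, q in zip(r1, r2)] for r1, r2 in zip(b_n, b_next)]


# One-row example: only the middle pixel changes in both frame pairs.
g_prev = [[10, 10, 10]]
g_cur = [[10, 90, 10]]
g_next = [[10, 10, 90]]
print(moving_object_mask(g_prev, g_cur, g_next, th1=30))  # [[0, 1, 0]]
```

Requiring a large difference in both frame pairs keeps the pixel that differs at frame n itself and drops the pixel that only changes at frame n+1.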

Pixels with large motion or strong edges in the two-dimensional video are considered to represent subjects that attract the viewer's attention and to constitute the foreground. The remaining pixels are considered to constitute the background.

After obtaining the moving object constituent pixel information Mn(x, y), the control unit 1 performs labeling in order to separate the foreground from the background and extract the foreground. The control unit 1 assigns the same label to each group of connected pixels whose strong edge pixel information Hn(x, y) is 1, and does the same for each group of connected pixels whose moving object constituent pixel information Mn(x, y) is 1. In this way the pixels constituting the foreground can be extracted as regions.

FIG. 7 is a schematic diagram showing the n-th frame divided into regions. As shown in FIG. 7A, the control unit 1 divides the frame into regions b1 to b4, where the strong edge pixel information Hn(x, y) is 1, and region b5, where Hn(x, y) is 0. As shown in FIG. 7B, it divides the frame into region b6, where the moving object constituent pixel information Mn(x, y) is 1, and region b7, where Mn(x, y) is 0.

Next, as shown in equation (12), the foreground information Pn(x, y) is calculated as the logical OR of the strong edge pixel information Hn(x, y) and the moving object constituent pixel information Mn(x, y).

Pixels whose foreground information Pn(x, y) is 1 are the pixels of regions b1 to b4, where Hn(x, y) is 1, or of region b6, where Mn(x, y) is 1. In FIG. 7C, regions where the foreground information Pn(x, y) is 1 are shown in white and regions where it is 0 are shown in black.
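As a rough sketch of the labeling and OR steps, the code below labels connected groups of 1-pixels and combines two binary masks. The text does not state which connectivity the labeling uses, so 4-connectivity is an assumption here, and the names `foreground` and `label_regions` are hypothetical.

```python
from collections import deque


def foreground(h, m):
    """P_n = H_n OR M_n, computed per pixel."""
    return [[a | b for a, b in zip(rh, rm)] for rh, rm in zip(h, m)]


def label_regions(mask):
    """Give each 4-connected group of 1-pixels its own label (1, 2, ...)."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for y in range(rows):
        for x in range(cols):
            if mask[y][x] != 1 or labels[y][x] != 0:
                continue
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:  # breadth-first flood fill of one region
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and labels[ny][nx] == 0):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


h = [[1, 1, 0, 0], [0, 0, 0, 0]]
m = [[0, 0, 0, 0], [0, 0, 0, 1]]
p = foreground(h, m)
print(label_regions(p))  # ([[1, 1, 0, 0], [0, 0, 0, 2]], 2)
```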

Note that the above processing may fail to extract all of the desired foreground. For example, in a close-up in which the subject, a person's face, is in focus, the region that draws the viewer's attention is the entire in-focus face, and it is desirable to extract the whole face as foreground. However, while contours with sharp edges are readily detected as foreground, areas with relatively weak edges, such as the cheeks and forehead, tend to be judged as background. As a result, regions judged to be background may remain like holes among pixels of the face that should have been judged as foreground.

To deal with such cases, the control unit 1 performs morphological operations as needed. FIG. 8 is an explanatory diagram of the morphological processing. For explanation, consider a 7 × 7 block of pixels, each with a binary value of 0 or 1.

When a pixel with value 0 lies among pixels with value 1, as in FIG. 8A, a dilation is first performed, as in FIG. 8B, that changes to 1 the value of every pixel in the 4-neighborhood of a pixel with value 1. This changes the pixel that lay inside the group of 1-valued pixels from 0 to 1. Then, as shown in FIG. 8C, an erosion is performed that changes from 1 to 0 the pixels forming the periphery of the dilated group of 1-valued pixels.

This morphological processing turns locations that were judged to be background and remain like holes into foreground. When such a location is large, it can be turned into foreground by performing the dilation and the erosion multiple times.
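The dilation followed by erosion can be sketched on the 7 × 7 example as below. This is an illustrative sketch only: the function names are hypothetical, and treating positions outside the block as 0 during erosion is an assumption not stated in the text.

```python
NEIGHBORS_4 = ((-1, 0), (1, 0), (0, -1), (0, 1))


def dilate(img):
    """Set to 1 every pixel in the 4-neighborhood of a 1-pixel."""
    rows, cols = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(rows):
        for x in range(cols):
            if img[y][x] == 1:
                for dy, dx in NEIGHBORS_4:
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < rows and 0 <= nx < cols:
                        out[ny][nx] = 1
    return out


def erode(img):
    """Keep a 1 only if all four neighbors are 1 (outside the block counts as 0)."""
    rows, cols = len(img), len(img[0])

    def at(y, x):
        return img[y][x] if 0 <= y < rows and 0 <= x < cols else 0

    return [[1 if img[y][x] == 1 and all(at(y + dy, x + dx) == 1 for dy, dx in NEIGHBORS_4)
             else 0 for x in range(cols)] for y in range(rows)]


def fill_holes(img, times=1):
    """Dilate `times` times, then erode `times` times."""
    for _ in range(times):
        img = dilate(img)
    for _ in range(times):
        img = erode(img)
    return img


# 7 x 7 block: a 3 x 3 group of 1-pixels with a hole in its center.
block = [[0] * 7 for _ in range(7)]
for y in range(2, 5):
    for x in range(2, 5):
        block[y][x] = 1
block[3][3] = 0
filled = fill_holes(block)
print(filled[3][3])  # 1
```

After one pass the hole is filled and the outline of the group returns to its original 3 × 3 extent.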

Next, the control unit 1 determines the composition of the two-dimensional video. A region made up of pixels with strong edges or large motion can be regarded as a region showing a subject that attracts the viewer's attention. From the number or arrangement of such pixels, the control unit 1 determines the size, motion, or arrangement of each subject in the two-dimensional video and determines which of several types the composition belongs to.

In this embodiment, three composition types are assumed: the "overhead" type, which captures an entire landscape as the subject from a bird's-eye view; the "up" type, which is a close-up of a specific subject; and the "balance" type, which lies between "overhead" and "up" and contains foreground and background in good balance.

The "balance" type refers to video in which a main subject such as a person, an animal, or a building forms the foreground and the remaining background is present to a similar extent, so the number of foreground pixels is relatively small. FIG. 9 is an explanatory diagram showing the foreground and background in each composition. FIG. 9A shows a "balance" type image, with pixels whose foreground information Pn(x, y) is 1 in white and pixels whose value is 0 in black.

In the "up" type, by contrast, the subject (for example, a close-up of a person's or animal's face) occupies a very large proportion of the image, so the number of foreground pixels is relatively large. FIG. 9B shows an "up" type image, with pixels whose foreground information Pn(x, y) is 1 in white and pixels whose value is 0 in black.

In the "overhead" type, the number of foreground pixels (for example, scenery such as a cityscape) is likewise judged to be relatively large, as in the "up" type. FIG. 9C shows an "overhead" type image, with pixels whose foreground information Pn(x, y) is 1 in white and pixels whose value is 0 in black.

When the composition is a close-up of a subject, pixels with strong edges are concentrated in the region where the subject appears. When the composition is an overhead view, many subjects are scattered across the scene, so pixels with strong edges are scattered over the entire frame. The positions of strong-edge pixels therefore tend to be more dispersed in the "overhead" type than in the "up" type.

The procedure by which the control unit 1 determines the composition is as follows. The control unit 1 counts the foreground pixels. If the count is less than a preset threshold Th2, the composition is determined to be the "balance" type. The threshold Th2 is set to a value suitable for distinguishing, by the total number of foreground pixels, the "up" and "overhead" types on the one hand from the "balance" type on the other.

If the number of foreground pixels is equal to or greater than the threshold Th2, pixels whose edge strength En(x, y) is equal to or greater than a preset threshold Th3 are extracted. The threshold Th3 (> Th0) is set to a value suitable for distinguishing the pixels that make up a characteristic subject of the video from the other pixels.

A variance value σ1 is then calculated for the coordinate positions of the extracted pixels whose edge strength En(x, y) is equal to or greater than the threshold Th3.
When there are m such pixels, (x1, y1), (x2, y2), ..., (xm, ym), the variance value σ1 is calculated by equation (13) below.

The control unit 1 determines whether the variance value σ1 is equal to or greater than a preset threshold Th4. If σ1 is equal to or greater than Th4, the composition is determined to be the "up" type. If σ1 is less than Th4, the composition is determined to be the "overhead" type.

The threshold Th4 is set to a value appropriate for distinguishing whether the composition of the video is the "up" type or the "overhead" type.
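The decision procedure using Th2, Th3, and Th4 can be sketched as follows. Equation (13) is not reproduced in this text, so `coordinate_variance` uses an assumed stand-in (the mean squared distance of the pixel coordinates from their centroid). The mapping of σ1 to "up" and "overhead" follows the threshold rule exactly as written in the paragraph above, and treating σ1 as 0 when no pixel reaches Th3 is a further assumption. All function names are hypothetical.

```python
def coordinate_variance(points):
    """Assumed stand-in for equation (13): mean squared distance from the centroid."""
    m = len(points)
    mean_x = sum(x for x, _ in points) / m
    mean_y = sum(y for _, y in points) / m
    return sum((x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in points) / m


def classify_composition(foreground, edge, th2, th3, th4):
    """Return "balance", "up", or "overhead" for one frame."""
    if sum(map(sum, foreground)) < th2:       # few foreground pixels
        return "balance"
    strong = [(x, y) for y, row in enumerate(edge)
              for x, e in enumerate(row) if e >= th3]
    # Assumption: with no pixel at or above Th3, sigma1 is treated as 0.
    sigma1 = coordinate_variance(strong) if strong else 0.0
    return "up" if sigma1 >= th4 else "overhead"


fg = [[1, 1], [1, 1]]
edge = [[9, 0], [0, 9]]
print(classify_composition(fg, edge, th2=2, th3=5, th4=0.4))
```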

After determining the composition of the two-dimensional video, the control unit 1 calculates the vanishing point and vanishing lines and then, based on them, calculates the basic depth values of the two-dimensional video.

First, the procedure for calculating the vanishing point and vanishing lines is described. The control unit 1 extracts pixels with large edge strength En(x, y). These are the same pixels as those described above whose edge strength En(x, y) is equal to or greater than the threshold Th3.

The control unit 1 selects an arbitrary pixel from among the pixels with large edge strength En(x, y) and then selects a pixel with large edge strength in its 8-neighborhood. If there is a pixel with large edge strength in the 8-neighborhood of the most recently selected pixel, that pixel is selected next. If there are several such pixels in the 8-neighborhood, the one farthest from the first selected pixel is selected. This procedure is repeated.

If the pixels selected by this procedure form a group arranged like a line segment, the control unit 1 calculates the regression line of the group. If there are two or more groups, regression lines are calculated for all of them.

If, among the calculated regression lines, there is a pair whose mutual angle is equal to or less than a predetermined angle, those two regression lines are vanishing lines, and the intersection of the calculated regression lines is the vanishing point. Once the vanishing point and vanishing lines have been calculated, the control unit 1 calculates the basic depth values. The basic depth value ranges from 0 to d, and the depth value of the pixel at the vanishing point is d.
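The regression-line and intersection step might look like the sketch below. It assumes an ordinary least-squares fit of y on x, which cannot represent vertical lines, and the names `fit_line` and `vanishing_point` are hypothetical. The angle test mirrors the condition in the text (angle between the two lines at most a predetermined angle).

```python
import math


def fit_line(points):
    """Least-squares line y = a*x + b through (x, y) points (non-vertical only)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def vanishing_point(line1, line2, max_angle_deg):
    """Intersection of two lines if the angle between them is small enough, else None."""
    (a1, b1), (a2, b2) = line1, line2
    angle = abs(math.degrees(math.atan(a1) - math.atan(a2)))
    angle = min(angle, 180.0 - angle)         # acute angle between the lines
    if a1 == a2 or angle > max_angle_deg:     # parallel, or angle too large
        return None
    x = (b2 - b1) / (a1 - a2)
    return x, a1 * x + b1


line_a = fit_line([(0, 0), (1, 1), (2, 2)])   # y = x
line_b = fit_line([(0, 1), (2, 2), (4, 3)])   # y = 0.5x + 1
print(vanishing_point(line_a, line_b, max_angle_deg=30))
```

The returned point may lie outside the frame, which matches the remark in the text that the vanishing point need not be inside it.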

The vanishing point is not necessarily inside the frame and may lie outside it. The pixel (x, y) farthest from the vanishing point has the smallest depth value. To obtain the basic depth values, the distance of each pixel (x, y) from the vanishing point is calculated, and the basic depth value is set so as to be inversely proportional to that distance.
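One way to realize a depth that is d at the vanishing point and smallest at the farthest pixel is sketched below. The exact mapping is not given in the text. A literal inverse proportion is undefined at distance 0, so this sketch assumes a linear fall-off from d to 0 instead. The function name `basic_depth_map` is hypothetical.

```python
import math


def basic_depth_map(width, height, vanishing_point, d):
    """Depth d at the vanishing point, decreasing linearly to 0 at the farthest pixel."""
    vx, vy = vanishing_point
    dist = [[math.hypot(x - vx, y - vy) for x in range(width)]
            for y in range(height)]
    max_dist = max(max(row) for row in dist)
    if max_dist == 0:                          # 1 x 1 frame on the vanishing point
        return [[float(d)] * width for _ in range(height)]
    return [[d * (1 - v / max_dist) for v in row] for row in dist]


print(basic_depth_map(3, 1, (0, 0), 255))  # [[255.0, 127.5, 0.0]]
```

When the vanishing point lies outside the frame, no pixel reaches d under this mapping, but the farthest pixel still receives 0.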

The basic depth values may also be calculated from the composition, the grayscale values Gn(x, y), and so on. For example, for video showing sea and sky divided by the horizon, it is desirable to calculate depth values with a different setting method for each of the spaces making up the sea and the sky, whereas for indoor video a single setting method for the entire image suffices. It is therefore desirable to use different methods of setting the basic depth values depending on the video.

In this case, to calculate the depth values of pixels (x, y) other than the vanishing point, the control unit 1 reads templates stored in advance in the ROM 2. Each template sets the depth value of the vanishing point to d and specifies the depth value at every other pixel (x, y) of the whole two-dimensional image. Each template is also associated with a composition, a histogram of the grayscale values Gn(x, y) of the pixels of the two-dimensional video, and the like.

From the templates read from the ROM 2, the control unit 1 selects the most suitable one using the composition, the grayscale values Gn(x, y), and so on, aligns the selected template with the position of the vanishing point, and calculates the depth values of the pixels (x, y) other than the vanishing point.

After setting the basic depth values, the control unit 1 resets the depth values according to the composition. The procedure for resetting the depth values is described below.
If the composition is determined to be the "balance" type, then for each of the foregrounds b1 to b4 and b6 the control unit 1 resets the depth values of the foreground according to a preset depth value pattern and the basic depth value of the one pixel with the largest y among the pixels (x, y) making up that foreground.

The ROM 2 stores data for a plurality of patterns, each of which assigns depth values to a two-dimensional shape. The pattern shapes include shapes modeled on birds, trees, clouds, people, and the like, as well as two-dimensional projections of spheres, rectangular cuboids, and the like, and each has depth values suited to its shape.

For the reset, the control unit 1 reads the patterns from the ROM 2, matches each of the foregrounds b1 to b4 and b6 against the patterns on the basis of its shape, size, motion, or the like, and selects an appropriate pattern.

Because the depth values in a pattern express only relative depth within the pattern, resetting depth values in the two-dimensional video requires a reference depth value in the video for each foreground. Therefore, for each of the foregrounds b1 to b4 and b6, the basic depth value of the one pixel with the largest y among the pixels (x, y) making up that foreground is taken as the reference value, and the depth values of the pixels (x, y) making up the foreground are reset using the depth values of the pattern.

If the composition is determined to be the "up" type, the control unit 1 resets the depth value of the pixels (x, y) constituting the foreground to a first depth value DPf and the depth value of the pixels (x, y) constituting the background to a second depth value DPg.

The first depth value DPf is the average of the basic depth values of the pixels constituting the foreground, and the second depth value DPg is the average of the basic depth values of the pixels constituting the background.
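The "up" type reset with plain averages can be sketched as follows. The function name `reset_depth_up` is hypothetical, and returning `None` for an empty foreground or background is an assumption made only to keep the sketch total.

```python
def reset_depth_up(basic_depth, foreground):
    """Replace foreground depths by their mean (DPf) and background depths by theirs (DPg)."""
    fg = [d for drow, frow in zip(basic_depth, foreground)
          for d, f in zip(drow, frow) if f == 1]
    bg = [d for drow, frow in zip(basic_depth, foreground)
          for d, f in zip(drow, frow) if f == 0]
    dpf = sum(fg) / len(fg) if fg else None    # first depth value DPf
    dpg = sum(bg) / len(bg) if bg else None    # second depth value DPg
    return [[dpf if f == 1 else dpg for f in frow] for frow in foreground]


basic = [[10, 20], [30, 40]]
mask = [[1, 1], [0, 0]]
print(reset_depth_up(basic, mask))  # [[15.0, 15.0], [35.0, 35.0]]
```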

The first depth value DPf and the second depth value DPg may instead be weighted averages of the basic depth values, weighted according to the number of pixels in each foreground or the variance value σ1. In this case, when the number of foreground pixels is large, the foreground is taken to be on the near side overall and DPf is set to a small value. When the variance value σ1 is large, the subject is taken to have been shot from a distance and the difference between DPf and DPg is set to be small.

In "up" type video, the foreground is usually in focus and the background out of focus. To generate video that conveys this sense of depth and solidity, a large difference must be given between the foreground and background depth values, which is why the depth values are set differently from the "balance" type.

If the composition is determined to be the "overhead" type, the control unit 1 takes as a reference value the basic depth value of, for example, the one pixel with the largest y in the foreground b6, which consists of moving object constituent pixels, and resets the depth values of that foreground according to the reference value and a preset depth value pattern.

Much "overhead" type video is so-called all-in-focus video, in which the whole frame is in focus. In this case, subjects such as trees and buildings are considered to have depth values not very different from the background even when they form part of the foreground. It therefore suffices to reset the depth values only for the moving object constituent pixels in the foreground.

When resetting foreground depth values, since fast-moving regions in three-dimensional video are considered to lie nearer to the viewer, the depth values of the moving object constituent pixels may be calculated and reset so that faster-moving regions receive nearer depth values.

For regions where several foregrounds overlap, the nearest of the depth values of the overlapping foregrounds may be used for the reset in order to avoid complicated processing.

On the basis of the set and reset depth values, the control unit 1 generates three-dimensional video consisting of a left-eye image L and a right-eye image R with parallax.

The control unit 1 calculates a shift amount SFTn(x, y), which indicates the magnitude of the parallax, from the depth value calculated for each pixel of the n-th frame of the two-dimensional video. With the shift amount denoted SFTn(x, y) and the depth value denoted DPn(x, y), the control unit 1 calculates the shift amount by equation (14), which expresses the relationship between the depth value and the shift amount. The shift amount SFTn(x, y) takes values from −s to s.

For explanation, a linear relationship such as equation (14) is used here to map depth values to shift amounts, but a nonlinear expression may of course be used instead. Alternatively, a lookup table listing the shift amount for each depth value from 0 to d may be prepared in advance and used.

Next, the left-eye image L and the right-eye image R are generated according to the calculated shift amount SFTn(x, y). For the left-eye image L, pixels are shifted to the right when SFTn(x, y) is positive and to the left when it is negative. For the right-eye image R, pixels are shifted to the left when SFTn(x, y) is positive and to the right when it is negative.
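The shift rules for L and R can be sketched for a single image row as below. Equation (14) is not reproduced in this text, so `depth_to_shift` assumes a linear map in which depth 0 (nearest) gives +s and depth d (farthest) gives −s. Letting the pixel with the larger shift win when two pixels land on the same position, and filling empty positions with the nearest valid pixel in the row, are also assumptions. All names are hypothetical.

```python
def depth_to_shift(dp, d, s):
    """Assumed linear form of equation (14): depth 0 -> +s, depth d -> -s."""
    return round(s * (1 - 2 * dp / d))


def shift_row(pixels, shifts, direction):
    """Shift one row; direction=+1 builds the left-eye row, -1 the right-eye row."""
    width = len(pixels)
    out = [None] * width
    best = [None] * width                      # shift of the pixel stored at each position
    for x, (p, sft) in enumerate(zip(pixels, shifts)):
        tx = x + direction * sft
        if 0 <= tx < width and (best[tx] is None or sft > best[tx]):
            out[tx], best[tx] = p, sft         # nearer pixel wins on collision
    valid = [i for i, v in enumerate(out) if v is not None]
    if not valid:
        return list(pixels)
    for x in range(width):                     # fill holes from the nearest valid pixel
        if out[x] is None:
            out[x] = out[min(valid, key=lambda i: abs(i - x))]
    return out


row = [1, 2, 3, 4, 5]
shifts = [0, 0, 1, 0, 0]
print(shift_row(row, shifts, +1))  # left eye:  [1, 2, 2, 3, 5]
print(shift_row(row, shifts, -1))  # right eye: [1, 3, 3, 4, 5]
```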

Depending on the video, pixels for which no valid pixel value is obtained may arise in part of the image. For such pixels, valid pixel values are set by interpolating from the surrounding pixels.

The control unit 1 converts the three-dimensional video into a predetermined format for display on the display unit 10 and outputs it to the display unit 10 via the output unit 6.

FIG. 10 is a schematic diagram showing examples of the predetermined formats for displaying three-dimensional video on the display unit 10. FIG. 10A is a schematic diagram showing the left-eye image L and the right-eye image R. The predetermined formats include the top-and-bottom format, in which the left-eye image L and the right-eye image R, each with its resolution halved in the row direction, are packed into one frame as shown in FIG. 10B; the side-by-side format, in which the left-eye image L and the right-eye image R, each with its resolution halved in the column direction, are packed into one frame as shown in FIG. 10C; and the frame-sequential format, in which the left-eye image L and the right-eye image R are interleaved along the time axis as shown in FIG. 10D.
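The three packing formats can be illustrated with the short sketch below. Halving the resolution by dropping every other column or row is an assumption (the text does not say how the resolution is reduced), and the function names are hypothetical.

```python
def side_by_side(left, right):
    """Halve the horizontal resolution and place L on the left, R on the right."""
    return [l_row[::2] + r_row[::2] for l_row, r_row in zip(left, right)]


def top_and_bottom(left, right):
    """Halve the vertical resolution and place L on top, R below."""
    return left[::2] + right[::2]


def frame_sequential(left_frames, right_frames):
    """Alternate L and R frames along the time axis."""
    out = []
    for l_frame, r_frame in zip(left_frames, right_frames):
        out.extend([l_frame, r_frame])
    return out


left = [[1, 2], [3, 4]]
right = [[5, 6], [7, 8]]
print(side_by_side(left, right))    # [[1, 5], [3, 7]]
print(top_and_bottom(left, right))  # [[1, 2], [5, 6]]
```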

The processing procedure in the first embodiment is described next. FIG. 11 is a flowchart showing the processing procedure of the control unit 1 in the first embodiment.

When two-dimensional video is input from a broadcast wave receiving antenna (not shown) (step S10), the control unit 1 stores the two-dimensional video data in the video storage unit 4 (step S12). The control unit 1 then reads the stored two-dimensional video data sequentially (step S14).

The control unit 1 extracts strong edge pixels from the pixels of the two-dimensional video (step S16) and then extracts moving object constituent pixels (step S18).

The control unit 1 performs labeling on the foreground of the two-dimensional video and separates the foreground from the background (step S20).

The control unit 1 determines the composition of the two-dimensional video (step S22) and calculates and sets the basic depth value of each pixel (x, y) (step S24). The control unit 1 then resets the depth values according to the composition (step S26).

On the basis of the depth values, the control unit 1 generates three-dimensional video consisting of the left-eye image L and the right-eye image R (step S28).

The control unit 1 converts the generated left-eye image L and right-eye image R into the predetermined format (step S30) and outputs them to the display unit 10 via the output unit 6 (step S32).

Some of these processing procedures of the control unit 1 are described in more detail below. First, the procedure by which the control unit 1 extracts strong edge pixels (step S16) is described. FIG. 12 is a flowchart showing the procedure by which the control unit 1 calculates the strong edge pixel information Hn(x, y).

The control unit 1 calculates the grayscale value Gn(x, y) as a weighted average of the R component SnR(x, y), the G component SnG(x, y), and the B component SnB(x, y) of the pixel (x, y), using the fixed weights shown in equation (1) (step S161).

It then applies the Laplacian filter of equation (2) (step S162) to emphasize the edges of the two-dimensional video. The control unit 1 calculates the edge strength En(x, y) given by equation (3) for all pixels (step S163).
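Steps S161 to S163 can be sketched as follows. Equations (1) to (3) are not reproduced in this text, so everything specific here is an assumption: the grayscale weights are the common ITU-R BT.601 luma coefficients, the Laplacian uses the 4-neighbor kernel, the edge strength is taken as the absolute filter response, and border pixels are left at 0. The function names are hypothetical.

```python
def to_gray(rgb):
    """Weighted average of R, G, B (assumed BT.601 weights, not equation (1) itself)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row] for row in rgb]


def edge_strength(gray):
    """Absolute response of a 4-neighbor Laplacian; border pixels stay 0 (assumption)."""
    rows, cols = len(gray), len(gray[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1]
                   + gray[y][x + 1] - 4 * gray[y][x])
            out[y][x] = abs(lap)
    return out


gray = [[0, 0, 0], [0, 10, 0], [0, 0, 0]]
print(edge_strength(gray)[1][1])  # 40
```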

The control unit 1 divides the whole frame into a predetermined number of blocks (step S164) and calculates Eave(j, k), the average of the edge strengths En(x, y) in the block to which the pixel (x, y) belongs (step S165).

The control unit 1 determines whether the block average edge strength Eave(j, k) is equal to or greater than the preset threshold Th0, calculates the strong edge pixel information Hn(x, y), and extracts the strong edge pixels, for which Hn(x, y) is 1 (step S166).
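Steps S164 to S166 can be sketched as below: the frame is split into square blocks, each block's average edge strength is compared with Th0, and every pixel of a qualifying block is marked as a strong edge pixel. The square block shape, the `block` size parameter, and the function name are assumptions for illustration.

```python
def strong_edge_mask(edge, block, th0):
    """H(x, y) = 1 for every pixel in a block whose average edge strength >= th0."""
    rows, cols = len(edge), len(edge[0])
    h = [[0] * cols for _ in range(rows)]
    for by in range(0, rows, block):
        for bx in range(0, cols, block):
            cells = [(y, x)
                     for y in range(by, min(by + block, rows))
                     for x in range(bx, min(bx + block, cols))]
            e_ave = sum(edge[y][x] for y, x in cells) / len(cells)
            if e_ave >= th0:
                for y, x in cells:
                    h[y][x] = 1
    return h


edge = [[8, 8, 0, 0],
        [8, 8, 0, 4]]
print(strong_edge_mask(edge, block=2, th0=5))  # [[1, 1, 0, 0], [1, 1, 0, 0]]
```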

Next, the procedure by which the control unit 1 calculates the moving object constituent pixel information Mn(x, y) (step S18) is described. FIG. 13 is a flowchart showing this procedure. As shown in equation (8), the control unit 1 takes the absolute value of the difference between the grayscale values Gn(x, y) and Gn−1(x, y) to calculate the absolute difference information Dn(x, y) (step S181).

It then determines whether the absolute difference information Dn(x, y) is equal to or greater than a preset threshold Th1 and calculates the high difference information Bn(x, y), which indicates whether the pixel has a large absolute difference (step S182).

The control unit 1 performs the same operations to calculate the absolute difference information Dn+1(x, y) (step S183) and the high difference information Bn+1(x, y) (step S184). It then calculates the moving object constituent pixel information Mn(x, y) as the logical AND of the high difference regions Bn(x, y) and Bn+1(x, y), and extracts as moving object constituent pixels those pixels for which Mn(x, y) is 1 (step S185).
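The frame-difference steps S181 to S185 can be sketched as follows. The sketch assumes that "the same operations" means Dn+1(x, y) = |Gn+1(x, y) − Gn(x, y)|. The function names are hypothetical.

```python
def high_difference(gray_a, gray_b, th1):
    """B(x, y): 1 where |gray_a - gray_b| >= th1, else 0."""
    return [[1 if abs(a - b) >= th1 else 0 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(gray_a, gray_b)]


def moving_object_map(g_prev, g_cur, g_next, th1):
    """Mn(x, y) = Bn(x, y) AND Bn+1(x, y)."""
    b_n = high_difference(g_cur, g_prev, th1)
    # Assumed: D(n+1) is the difference between frames n+1 and n.
    b_n1 = high_difference(g_next, g_cur, th1)
    return [[p & q for p, q in zip(row_p, row_q)]
            for row_p, row_q in zip(b_n, b_n1)]
```

A pixel is marked only when it changes both between frames n−1 and n and between frames n and n+1.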

Next, the procedure by which the control unit 1 determines the composition of the two-dimensional video (step S22) is described. FIG. 14 is a flowchart showing this procedure.

The control unit 1 counts the pixels constituting the foreground (step S221) and determines whether the number of pixels is equal to or greater than a preset threshold Th2 (step S222). If the number of foreground pixels is less than the threshold Th2 (NO in step S222), it determines that the composition of the n-th frame is the "balance" type (step S223).

If the number of pixels constituting the foreground is equal to or greater than the threshold Th2 (YES in step S222), the control unit 1 extracts the pixels whose edge strength En(x, y) is equal to or greater than a preset threshold Th3 (step S224) and calculates the variance σ1 of the positions of those pixels, given by equation (13) (step S225).

The control unit 1 determines whether the variance σ1 is equal to or greater than a threshold Th4 (step S226). If the variance σ1 is less than the threshold Th4 (NO in step S226), it determines that the composition is the "up" type (step S227). If the variance σ1 is equal to or greater than the threshold Th4 (YES in step S226), it determines that the composition is the "overhead" type (step S228).
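The decision flow of FIG. 14 (steps S221 to S228) can be sketched as below. Equation (13) is not reproduced in this passage, so the sketch assumes σ1 is the variance of the x coordinates plus the variance of the y coordinates of the extracted pixels, and treats an empty set of strong pixels as zero variance. All function names are hypothetical.

```python
def position_variance(points):
    """Assumed form of sigma1: variance of x plus variance of y."""
    if not points:
        return 0.0  # assumed handling when no pixel reaches Th3
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    return sum((x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in points) / n


def classify_composition(foreground, edge, th2, th3, th4):
    """Return 'balance', 'up', or 'overhead' for one frame."""
    n_foreground = sum(sum(row) for row in foreground)
    if n_foreground < th2:
        return "balance"
    strong = [(x, y) for y, row in enumerate(edge)
              for x, e in enumerate(row) if e >= th3]
    return "up" if position_variance(strong) < th4 else "overhead"
```

Strong edges clustered in one region give a small σ1 and the "up" result, while strong edges spread across the frame give a large σ1 and the "overhead" result.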

Next, the procedure by which the control unit 1 resets the depth values (step S26) is described. FIG. 15 is a flowchart showing this procedure.

The control unit 1 determines whether the composition determined by the above procedure is the "balance" type (step S261). If it is (YES in step S261), the control unit 1 selects, for each of the foreground regions b1 to b4 and b6, the one pixel with the largest y value (step S262) and, using the basic depth value of the selected pixel as a reference, resets the depth values of the pixels (x, y) constituting each of the foreground regions b1 to b4 and b6 (step S263).

If the composition is not the "balance" type (NO in step S261), the control unit 1 determines whether the composition is the "up" type (step S264).

If the composition is the "up" type (YES in step S264), the depth value of the pixels constituting the foreground is reset to the first depth value DPf (step S265), and the depth value of the pixels constituting the background is reset to the second depth value DPg (step S266).

If the composition is not the "up" type (NO in step S264), it is the "overhead" type. In this case, the control unit 1 selects, from the moving object constituent pixels constituting b6, the one pixel with the largest y value (step S267), and resets the depth values of the moving object constituent pixels according to the basic depth value of the selected pixel and a preset depth value pattern (step S268).
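Two of the branches of FIG. 15 can be sketched as follows. The text says only that the selected pixel's basic depth value is used "as a reference", so copying that value to every pixel of the region is an assumption made for illustration. The "overhead" branch is omitted because the preset depth value pattern is not specified here. Function names and the data layout (`depth[y][x]`, regions as lists of `(x, y)`) are hypothetical.

```python
def reset_region_depth(depth, region):
    """'Balance' type, one foreground region (steps S262 to S263).

    Assumed interpretation: every pixel of the region takes the basic depth
    value of the region's pixel with the largest y.
    """
    ref_x, ref_y = max(region, key=lambda p: p[1])
    ref_depth = depth[ref_y][ref_x]
    for x, y in region:
        depth[y][x] = ref_depth


def reset_up_depth(depth, foreground, dp_f, dp_g):
    """'Up' type (steps S265 to S266): DPf for foreground, DPg for background."""
    for y, row in enumerate(foreground):
        for x, is_foreground in enumerate(row):
            depth[y][x] = dp_f if is_foreground else dp_g
```

Both functions modify `depth` in place, so a caller that still needs the basic depth values should pass a copy.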

The procedure by which the control unit 1 generates the three-dimensional video (step S28) is described next. FIG. 16 is a flowchart showing this procedure.

As shown in equation (14), the control unit 1 calculates the shift amount SFTn(x, y) of the two-dimensional video corresponding to the depth value DPn(x, y) (step S281). Next, the control unit 1 shifts the two-dimensional video according to the calculated shift amount SFTn(x, y) (step S282). Pixels that are left without a valid pixel value in part of the video as a result of the shift are interpolated from the surrounding pixels (step S283) so that valid pixel values are set.
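Steps S281 to S283 can be sketched for a single image row as below. Equation (14) is not reproduced here, so a linear shift SFT = round(gain × depth) is assumed. Filling holes from the nearest valid neighbour is also an assumed interpolation, and using opposite signs of `gain` for the left-eye and right-eye views is an assumption as well. The name `shift_row` is hypothetical.

```python
def shift_row(pixels, depths, gain):
    """Shift one row horizontally by depth and fill the resulting holes."""
    w = len(pixels)
    out = [None] * w
    for x in range(w):
        # Assumed linear form of equation (14).
        target = x + round(gain * depths[x])
        if 0 <= target < w:
            out[target] = pixels[x]  # later writes overwrite earlier ones
    # Fill holes from the nearest valid pixel on the left.
    last = None
    for x in range(w):
        if out[x] is None:
            out[x] = last
        else:
            last = out[x]
    # Fill any leading holes from the nearest valid pixel on the right.
    nxt = None
    for x in range(w - 1, -1, -1):
        if out[x] is None:
            out[x] = nxt
        else:
            nxt = out[x]
    return out
```

For example, `shift_row(row, depths, +k)` and `shift_row(row, depths, -k)` would give one row each of a left-eye and a right-eye view under these assumptions.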

The video generation device of the present embodiment calculates depth values by taking into account factors such as the composition of the picture, the magnitude of motion, and whether a pixel belongs to the foreground or the background, and can therefore generate natural three-dimensional video.

Second Embodiment
A second embodiment will now be described. In the second embodiment, when the composition is the "up" type, the first depth value and the second depth value are each always set to a constant value.

In "up" type video, always using predetermined constant values for the depth value of the pixels constituting the foreground and for the depth value of the pixels constituting the background can, in some cases, produce video with a sense of depth and a stereoscopic effect. Always using constant depth values also simplifies the processing of the control unit 1.

Accordingly, when the composition is the "up" type, the control unit 1 of the present embodiment does not set basic depth values and instead sets the foreground and background depth values to respective constant values.

The control unit 1 of the present embodiment is described below. When a two-dimensional video is input from the broadcast wave receiving antenna, the control unit 1 stores the two-dimensional video data in the video storage unit 4. The control unit 1 then sequentially reads out the stored two-dimensional video data.

The control unit 1 extracts strong edge pixels and moving object constituent pixels, performs labeling on the foreground of the two-dimensional video, and separates the foreground from the background. The control unit 1 sets depth values based on the foreground information Pn(x, y) and the edge strength En(x, y).

The control unit 1 counts the foreground pixels and determines whether the number of pixels is equal to or greater than a preset threshold Th2. If the number of foreground pixels is less than the threshold Th2, the composition is the "balance" type, so the control unit 1 generates basic depth values and then resets the foreground depth values.

If the number of foreground pixels is equal to or greater than the threshold Th2, the control unit 1 extracts the pixels whose edge strength En(x, y) is equal to or greater than a threshold Th3 and calculates the variance σ1 of the positions of those pixels.

The control unit 1 determines whether the variance σ1 is equal to or greater than a threshold Th4. If the variance σ1 is equal to or greater than the threshold Th4, the composition is the "up" type, so the control unit 1 sets the first depth value DPf, a constant, for the pixels constituting the foreground and the second depth value DPg, also a constant, for the pixels constituting the background.

If the variance σ1 is less than the threshold Th4, the composition is the "overhead" type, so the control unit 1 calculates and sets basic depth values and then resets the depth values of the moving object constituent pixels.

Based on the depth values DPn(x, y), the control unit 1 generates three-dimensional video consisting of the left-eye video L and the right-eye video R, converts the generated left-eye video L and right-eye video R into an appropriate format, and sends them to the display unit 10 via the output unit 6. The display unit 10 displays the three-dimensional video.

The processing procedure of the control unit 1 in the present embodiment is described below. FIG. 17 is a flowchart showing the processing procedure of the control unit 1 in the second embodiment. When a two-dimensional video is input from the broadcast wave receiving antenna (step S40), the control unit 1 stores the two-dimensional video data in the video storage unit 4 (step S42). The control unit 1 sequentially reads out the stored two-dimensional video data (step S44).

The control unit 1 extracts strong edge pixels from the pixels of each frame of the two-dimensional video (step S46). It also extracts moving object constituent pixels (step S48).

The control unit 1 performs labeling on the foreground of the two-dimensional video. It also calculates the foreground information Pn(x, y) as the logical OR of the strong edge region information Hn(x, y) and the moving object constituent pixel information Mn(x, y), and separates the foreground from the background (step S50).
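Step S50 can be sketched as below. The logical OR follows the text directly. The labeling method is not specified in this passage, so the 4-connected flood fill is an assumption, and the function names are hypothetical.

```python
def foreground_map(hn, mn):
    """Pn(x, y) = Hn(x, y) OR Mn(x, y)."""
    return [[h | m for h, m in zip(row_h, row_m)]
            for row_h, row_m in zip(hn, mn)]


def label_regions(pn):
    """Label 4-connected foreground regions (assumed connectivity).

    Returns a map with 0 for background and 1, 2, ... for each region.
    """
    h, w = len(pn), len(pn[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if pn[sy][sx] != 1 or labels[sy][sx]:
                continue
            next_label += 1
            labels[sy][sx] = next_label
            stack = [(sx, sy)]
            while stack:
                x, y = stack.pop()
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < w and 0 <= ny < h
                            and pn[ny][nx] == 1 and not labels[ny][nx]):
                        labels[ny][nx] = next_label
                        stack.append((nx, ny))
    return labels
```

Each label in the resulting map corresponds to one candidate foreground region whose depth values can later be reset as a unit.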

The control unit 1 determines the composition, then calculates and sets depth values based on the determined composition (step S52).

Based on the depth values, the control unit 1 generates three-dimensional video consisting of the left-eye video L and the right-eye video R (step S54). The control unit 1 converts the generated left-eye video L and right-eye video R into a predetermined format (step S56) and outputs them to the display unit 10 via the output unit 6 (step S58).

The procedure by which the control unit 1 sets the depth values (step S52) is described in more detail below. FIG. 18 is a flowchart showing the procedure by which the control unit 1 sets the depth values in the second embodiment.

The control unit 1 counts the pixels constituting the foreground (step S521) and determines whether the number of pixels is equal to or greater than a preset threshold Th2 (step S522).

If the number of foreground pixels is less than the threshold Th2 (NO in step S522), the composition is the "balance" type, so the control unit 1 sets basic depth values (step S523), selects from the pixels (x, y) constituting the foreground the one pixel with the largest y value (step S524), and then resets the foreground depth values (step S525).

If the number of foreground pixels is equal to or greater than the threshold Th2 (YES in step S522), the control unit 1 extracts the pixels whose edge strength En(x, y) is equal to or greater than the threshold Th3 (step S526) and calculates the variance σ1 of the positions of those pixels (step S527).

The control unit 1 determines whether the variance σ1 is equal to or greater than the threshold Th4 (step S528). If the variance σ1 is equal to or greater than the threshold Th4 (YES in step S528), the composition is the "up" type, so the control unit 1 sets the depth value DPf for the foreground information (step S529) and the depth value DPg for the background information (step S530).

If the variance σ1 is less than the threshold Th4 (NO in step S528), the composition is the "overhead" type, so the control unit 1 sets basic depth values (step S531), selects from the moving object constituent pixels the one pixel with the largest y value (step S532), and then resets the depth values of the moving object constituent pixels (step S533).

In the present embodiment, using constant values for the foreground depth value and the background depth value in "up" type video makes it possible to generate video with a sense of depth and a stereoscopic effect, and also simplifies the processing of the control unit 1.

Third Embodiment
A third embodiment will now be described. The third embodiment differs from the first embodiment in the procedure by which the control unit 1 determines the composition (step S22). Because the present embodiment uses the variance of the edge strength of all pixels to determine the composition of the video, it can determine the composition accurately.

To determine the composition of the two-dimensional video, the control unit 1 of the present embodiment first counts the pixels constituting the foreground and determines whether the counted number of pixels is equal to or greater than a preset threshold Th2. If the number of foreground pixels is less than the threshold Th2, it determines that the composition is the "balance" type.

If the number of foreground pixels is equal to or greater than the threshold Th2, the control unit 1 calculates the variance σ2 of the edge strength En(x, y) of all pixels according to equation (15).

When the video is a close-up of a subject, pixels with strong edges are concentrated in the region where the subject appears. When the video is an overhead view, on the other hand, many subjects are scattered across the picture, so pixels with strong edges are scattered throughout the frame.

Accordingly, the variance σ2 of the edge strength En(x, y) is generally considered to be larger for a close-up of a subject than for an overhead view.

The control unit 1 determines whether the variance σ2 is equal to or greater than a threshold Th5. If the variance σ2 is equal to or greater than the threshold Th5, it determines that the composition is the "up" type. If the variance σ2 is less than the threshold Th5, it determines that the composition is the "overhead" type.
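The rule stated in the preceding paragraph can be sketched as follows. Equation (15) is not reproduced here, so the population variance over all pixels is assumed. The function names are hypothetical.

```python
def edge_strength_variance(edge):
    """Assumed form of sigma2: population variance of En(x, y) over the frame."""
    values = [e for row in edge for e in row]
    mean = sum(values) / len(values)
    return sum((e - mean) ** 2 for e in values) / len(values)


def classify_by_edge_variance(n_foreground, edge, th2, th5):
    """Return 'balance', 'up', or 'overhead' per the rule in the text above."""
    if n_foreground < th2:
        return "balance"
    return "up" if edge_strength_variance(edge) >= th5 else "overhead"
```

Unlike the position variance σ1, this statistic needs no thresholding at Th3 and no coordinates, only the edge strength values themselves.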

Next, the procedure by which the control unit 1 determines the composition in the present embodiment is described. FIG. 19 is a flowchart showing the procedure by which the control unit 1 determines the composition in the third embodiment.

The control unit 1 counts the foreground pixels (step S621) and determines whether the number of pixels is equal to or greater than a preset threshold Th2 (step S622). If the number of foreground pixels is less than the threshold Th2 (NO in step S622), it determines that the composition of the n-th frame is the "balance" type (step S623).

If the number of foreground pixels is equal to or greater than the threshold Th2 (YES in step S622), the control unit 1 calculates the variance σ2 of the edge strength En(x, y) of all pixels (step S624).

The control unit 1 determines whether the variance σ2 is equal to or greater than the threshold Th5 (step S625). If the variance σ2 is less than the threshold Th5 (NO in step S625), it determines that the composition is the "up" type (step S626). If the variance σ2 is equal to or greater than the threshold Th5 (YES in step S625), it determines that the composition is the "overhead" type (step S627).

According to the present embodiment, the variance of the edge strength of all pixels is used to determine the composition of the video, so the composition can be determined accurately.

Fourth Embodiment
FIG. 20 is a block diagram showing the hardware components of a television receiver according to a fourth embodiment. The program for operating the control unit 1 is stored in the ROM 2 by having a reading unit 9, such as a disk drive, read a portable recording medium 20A such as a CD-ROM, a DVD, or a USB memory.

Alternatively, the program may be provided by mounting in the control unit 1 a semiconductor memory 20B, such as a flash memory, in which the program is stored, or may be downloaded from another server computer (not shown) connected to the communication unit 8 via a communication network N such as the Internet.

The control unit 1 shown in FIG. 20 reads the program that executes the various software processes described above from the portable recording medium or the semiconductor memory, or downloads it from another server computer (not shown) via the communication network N. The program is installed as a control program, loaded into the ROM 2, and executed. The device thereby functions as the control unit 1 described above.

The embodiments disclosed herein are illustrative in all respects and should not be regarded as restrictive. The scope of the present invention is indicated not by the above description but by the claims, and is intended to include all modifications within the meaning and scope equivalent to the claims.

For example, the compositions in the present invention are not limited to the three types described above, and other compositions may be defined based on the size, movement, arrangement, or the like of the subject.

Alternatively, the control unit 1 need not determine the composition. In this case, the control unit 1 may extract the pixels constituting the foreground from the strong edge region information Hn(x, y) and the moving object constituent pixel information Mn(x, y), reset the depth values of the extracted pixels by the method described above, and generate the three-dimensional video.

The strong edge region information Hn(x, y) and the moving object constituent pixel information Mn(x, y) may also be calculated using one or more of the pixel values SnR(x, y), SnG(x, y), and SnB(x, y) in place of the grayscale value Gn(x, y), by determining whether that pixel value is equal to or greater than a predetermined threshold. The absolute difference information Dn(x, y) may be calculated based on the grayscale values of the pixel (x, y) in three or more frames, or using one or more of the R, G, and B component pixel values in three or more frames. Needless to say, the RGB color space information given by SnR(x, y), SnG(x, y), and SnB(x, y) may also be converted into another color space, such as HSV or YUV, before use.

1 Control unit
2 ROM
3 RAM
4 Video storage unit
5 Input unit
6 Output unit
7 Tuner
8 Communication unit
9 Reading unit
10 Display unit
20A Portable recording medium
20B Semiconductor memory

Claims

A video generation device that generates a three-dimensional video by calculating a depth value indicating the depth corresponding to each pixel of a two-dimensional video composed of a plurality of pixels, the device comprising:
a first difference calculation unit that calculates a difference between the pixel value of each pixel constituting the two-dimensional video and the pixel value of a pixel adjacent to that pixel;
a first extraction unit that extracts, together with position information of the pixel, each pixel for which the difference calculated by the first difference calculation unit is equal to or greater than a predetermined first threshold;
a second difference calculation unit that calculates, for each pixel constituting the two-dimensional video, a difference between its pixel value in one frame and its pixel value in another frame;
a second extraction unit that extracts, together with position information of the pixel, each pixel for which the difference calculated by the second difference calculation unit is equal to or greater than a predetermined second threshold;
a determination unit that determines, based on the number of pixels or the position information extracted by the first extraction unit and the second extraction unit, which of a plurality of predetermined types the composition type of the two-dimensional video is; and
a depth calculation unit that calculates the depth value corresponding to each pixel of the two-dimensional video based on a calculation result of the first difference calculation unit or the second difference calculation unit and on the composition type determined by the determination unit.

The video generation device according to claim 1, further comprising: a third extraction unit that extracts pixels for which the difference calculated by the first difference calculation unit is equal to or greater than a predetermined third threshold that is greater than the first threshold; and a variance value calculation unit that calculates a positional variance value indicating the degree of dispersion of the positions of the pixels extracted by the third extraction unit,
wherein the depth calculation unit calculates the depth value based on the positional variance value calculated by the variance value calculation unit and the total number of pixels extracted by the first extraction unit or the second extraction unit.

The video generation device according to claim 1, further comprising a second variance value calculation unit that calculates a variance value of the differences calculated by the first difference calculation unit,
wherein the depth calculation unit calculates the depth value based on the variance value of the differences of the pixels calculated by the second variance value calculation unit and the total number of pixels extracted by the first extraction unit or the second extraction unit.

A video display device comprising: the video generation device according to any one of claims 1 to 3; and a display unit that displays the three-dimensional video generated by the video generation device.

A television receiver comprising: a tuner unit that receives television broadcast waves including a two-dimensional video;
the video generation device according to any one of claims 1 to 3, which generates a three-dimensional video based on the two-dimensional video received by the tuner unit; and a display unit that displays the three-dimensional video generated by the video generation device.

A video generation method for generating a three-dimensional video by calculating a depth value indicating the depth corresponding to each pixel of a two-dimensional video composed of a plurality of pixels, the method comprising:
a first difference calculation step of calculating a difference between the pixel value of each pixel constituting the two-dimensional video and the pixel value of a pixel adjacent to that pixel;
a first extraction step of extracting, together with position information of the pixel, each pixel for which the difference calculated in the first difference calculation step is equal to or greater than a predetermined first threshold;
a second difference calculation step of calculating, for each pixel constituting the two-dimensional video, a difference between its pixel value in one frame and its pixel value in another frame;
a second extraction step of extracting, together with position information of the pixel, each pixel for which the difference calculated in the second difference calculation step is equal to or greater than a predetermined second threshold;
a determination step of determining, based on the number of pixels or the position information extracted in the first extraction step and the second extraction step, which of a plurality of predetermined types the composition type of the two-dimensional video is; and
a depth calculation step of calculating the depth value corresponding to each pixel of the two-dimensional video based on a calculation result of the first difference calculation step or the second difference calculation step and on the composition type determined in the determination step.

A computer program for generating a three-dimensional video by calculating a depth value indicating the depth corresponding to each pixel of a two-dimensional video composed of a plurality of pixels, the computer program causing a computer to execute processing comprising:
a first difference calculation step of calculating a difference between the pixel value of each pixel constituting the two-dimensional video and the pixel value of a pixel adjacent to that pixel;
a first extraction step of extracting, together with position information of the pixel, each pixel for which the difference calculated in the first difference calculation step is equal to or greater than a predetermined first threshold;
a second difference calculation step of calculating, for each pixel constituting the two-dimensional video, a difference between its pixel value in one frame and its pixel value in another frame;
a second extraction step of extracting, together with position information of the pixel, each pixel for which the difference calculated in the second difference calculation step is equal to or greater than a predetermined second threshold;
a determination step of determining, based on the number of pixels or the position information extracted in the first extraction step and the second extraction step, which of a plurality of predetermined types the composition type of the two-dimensional video is; and
a depth calculation step of calculating the depth value corresponding to each pixel of the two-dimensional video based on a calculation result of the first difference calculation step or the second difference calculation step and on the composition type determined in the determination step.