JP2012120143A

JP2012120143A - Stereoscopic image data transmission device, stereoscopic image data transmission method, stereoscopic image data reception device, and stereoscopic image data reception method

Info

Publication number: JP2012120143A
Application number: JP2010293675A
Authority: JP
Inventors: Ikuo Tsukagoshi; 郁夫塚越
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2010-11-10
Filing date: 2010-12-28
Publication date: 2012-06-21
Also published as: CN102714744A; RU2012127786A; BR112012016472A2; WO2012063421A1; EP2508006A1; AU2011327700A1; US20120256951A1; AR083685A1; KR20130132241A

Abstract

PROBLEM TO BE SOLVED: To reduce the data amount of disparity information when transmitting disparity information that is sequentially updated within a predetermined number of frame periods during which superimposition information is displayed.

SOLUTION: A segment including disparity information that is sequentially updated during a subtitle display period is transmitted. On the reception side, the disparity to be given between a left-eye subtitle and a right-eye subtitle can be changed dynamically in conjunction with changes in the image content. The disparity information consists of the disparity information of the first frame in the subtitle display period and the disparity information of subsequent frames at update frame intervals. Therefore, the amount of transmitted data can be reduced, and on the reception side, the memory capacity for holding the disparity information can be greatly reduced.

Description

The present invention relates to a stereoscopic image data transmission device, a stereoscopic image data transmission method, a stereoscopic image data reception device, and a stereoscopic image data reception method, and in particular to a stereoscopic image data transmission device and the like that transmit data of superimposition information, such as subtitles, together with stereoscopic image data.

For example, Patent Document 1 proposes a method of transmitting stereoscopic image data using television broadcast radio waves. In this transmission method, stereoscopic image data having left-eye image data and right-eye image data is transmitted, and stereoscopic image display using binocular parallax is performed.

FIG. 95 shows the relationship, in stereoscopic image display using binocular parallax, between the display positions of the left and right images of an object on the screen and the playback position of its stereoscopic image. For example, for object A, whose left image La is displayed shifted to the right and whose right image Ra is displayed shifted to the left on the screen as illustrated, the left and right lines of sight intersect in front of the screen surface, so the playback position of its stereoscopic image is in front of the screen surface. DPa represents the horizontal disparity vector for object A.

Also, for example, for object B, whose left image Lb and right image Rb are displayed at the same position on the screen as illustrated, the left and right lines of sight intersect on the screen surface, so the playback position of its stereoscopic image is on the screen surface. Further, for example, for object C, whose left image Lc is displayed shifted to the left and whose right image Rc is displayed shifted to the right on the screen as illustrated, the left and right lines of sight intersect behind the screen surface, so the playback position of its stereoscopic image is behind the screen surface. DPc represents the horizontal disparity vector for object C.
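The three cases described for objects A, B, and C can be summarized in a small sketch. The function name and the use of horizontal pixel coordinates (increasing to the right) are assumptions made for illustration; they do not come from the patent.

```python
def perceived_position(left_x: float, right_x: float) -> str:
    """Classify where a stereoscopic image appears relative to the screen.

    left_x and right_x are the horizontal screen positions of the left-eye
    and right-eye images of the same object (x grows to the right).
    """
    if left_x > right_x:
        # Left image shifted right, right image shifted left (object A):
        # the lines of sight cross in front of the screen.
        return "in front of screen"
    if left_x < right_x:
        # Left image shifted left, right image shifted right (object C):
        # the lines of sight cross behind the screen.
        return "behind screen"
    # Both images at the same position (object B).
    return "on screen"


print(perceived_position(110, 100))
print(perceived_position(100, 100))
print(perceived_position(90, 100))
```

A subtitle that should not conflict with the image would need to fall in the first category relative to the nearest object, which is the motivation developed in the following paragraphs.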

Patent Document 1: Japanese Patent Laid-Open No. 2005-6114

As described above, in stereoscopic image display, a viewer usually perceives the perspective of a stereoscopic image by means of binocular parallax. Superimposition information superimposed on an image, such as subtitles, is also expected to be rendered in conjunction with the stereoscopic image display, not only in two-dimensional space but also with a three-dimensional sense of depth. For example, when subtitles are superimposed on an image (overlay display), the viewer may perceive an inconsistency in perspective unless the subtitles are displayed in front of the object in the image that is nearest in terms of perspective.

It is therefore conceivable to transmit disparity information between the left-eye image and the right-eye image together with the superimposition information data, and to give disparity between the left-eye superimposition information and the right-eye superimposition information on the reception side. In this case, in order to dynamically change the disparity to be given between the left-eye superimposition information and the right-eye superimposition information in accordance with changes in the stereoscopic image, it is necessary to send disparity information that is sequentially updated within the predetermined number of frame periods during which the superimposition information is displayed.

An object of the present invention is to reduce the data amount of disparity information when sending disparity information that is sequentially updated within a predetermined number of frame periods during which superimposition information is displayed.

The concept of the present invention resides in a stereoscopic image data transmission device comprising:
an image data output unit that outputs stereoscopic image data having left-eye image data and right-eye image data;
a superimposition information data output unit that outputs data of superimposition information to be superimposed on an image based on the left-eye image data and the right-eye image data;
a disparity information output unit that outputs disparity information for giving disparity by shifting the superimposition information to be superimposed on the image based on the left-eye image data and the right-eye image data; and
a data transmission unit that transmits the stereoscopic image data output from the image data output unit, the superimposition information data output from the superimposition information data output unit, and the disparity information output from the disparity information output unit,
wherein the disparity information is disparity information that is sequentially updated within a predetermined number of frame periods during which the superimposition information is displayed, and consists of the disparity information of the first frame of the predetermined number of frame periods and the disparity information of the frames at each subsequent update frame interval.

In the present invention, the image data output unit outputs stereoscopic image data in a predetermined transmission format having left-eye image data and right-eye image data. For example, the transmission format of the stereoscopic image data is the side-by-side (Side By Side) method, the top-and-bottom (Top & Bottom) method, or the like.

The superimposition information data output unit outputs data of the superimposition information to be superimposed on the image based on the left-eye image data and the right-eye image data. Here, the superimposition information is information such as subtitles, graphics, and text superimposed on the image.

The disparity information output unit outputs disparity information for giving disparity by shifting the superimposition information to be superimposed on the image based on the left-eye image data and the right-eye image data. This disparity information is sequentially updated within the predetermined number of frame periods during which the superimposition information is displayed, and consists of the disparity information of the first frame of the predetermined number of frame periods and the disparity information of the frames at each subsequent update frame interval. For example, this disparity information is disparity information corresponding to specific superimposition information displayed on the same screen and/or disparity information corresponding in common to a plurality of pieces of superimposition information displayed on the same screen.

The data transmission unit then transmits the stereoscopic image data, superimposition information data, and disparity information described above. For example, the superimposition information data is DVB subtitle data, and the data transmission unit transmits the disparity information by including it in the subtitle data stream that contains the subtitle data. For example, the disparity information is disparity information in units of regions, or in units of subregions included in a region. Also, for example, the disparity information is disparity information in units of pages, each page including all regions.

Also, for example, the superimposition information data is ARIB caption data, and the data transmission unit transmits the disparity information by including it in the caption data stream that contains the caption data. Also, for example, the superimposition information data is CEA closed caption data, and the data transmission unit transmits the disparity information by including it in the user data area of the video data stream that contains the closed caption data.

Thus, in the present invention, disparity information corresponding to the stereoscopic image data is transmitted together with the stereoscopic image data and the superimposition information data. This disparity information is sequentially updated within the predetermined number of frame periods during which the superimposition information is displayed, and consists of the disparity information of the first frame of the predetermined number of frame periods and the disparity information of the frames at each subsequent update frame interval. Therefore, on the reception side, the disparity to be given between the left-eye superimposition information and the right-eye superimposition information can be changed dynamically in accordance with changes in the stereoscopic image. Because the disparity information of every frame is not transmitted, the data amount of the disparity information is reduced.

In the present invention, information on a unit period and information on the number of unit periods may be added to the disparity information as the information on each update frame interval. Adding information on each update frame interval to the disparity information means that the update frame interval need not be fixed and can be set according to the disparity information curve. Furthermore, because the update frame interval information is given as unit period information and the number of unit periods, each update frame interval can be obtained easily by calculating "unit period * number".
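The "unit period * number" calculation can be illustrated with a minimal sketch. The function name, the tick-based unit, and the sample values are hypothetical and chosen only for illustration; the patent text here only states the multiplication itself.

```python
def update_interval_ticks(unit_period_ticks: int, unit_count: int) -> int:
    """Return one update frame interval as unit period * number of units.

    Both arguments are in the same (assumed) clock-tick unit.
    """
    if unit_period_ticks <= 0:
        raise ValueError("unit period must be positive")
    if unit_count < 0:
        raise ValueError("unit count must not be negative")
    return unit_period_ticks * unit_count


# Hypothetical signalling: one unit period of 3000 ticks, with a different
# number of unit periods for each of three successive update intervals.
intervals = [update_interval_ticks(3000, count) for count in (2, 5, 1)]
print(intervals)
```

Because only the count changes per interval, a sender could space updates closely where the disparity curve changes quickly and sparsely where it is flat.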

For example, the unit period information is a 24-bit value obtained by measuring the unit period with a 90 kHz clock. The PTS inserted in the PES header is 33 bits long, whereas this value is 24 bits long, for the following reasons. A 33-bit length can express a time exceeding 24 hours, which is unnecessarily long for the display period of superimposition information such as subtitles. Using 24 bits also reduces the data size and allows compact transmission. In addition, 24 bits is 8 × 3 bits, which makes byte alignment easy.
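The bit-length argument can be checked with simple arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and the big-endian byte order used for the 3-byte encoding is an assumption, since the byte order is not stated in this passage.

```python
CLOCK_HZ = 90_000  # 90 kHz clock named in the text


def max_duration_seconds(bits: int, clock_hz: int = CLOCK_HZ) -> float:
    """Longest time span a counter of the given width can express."""
    return (2 ** bits) / clock_hz


def encode_unit_period(ticks: int) -> bytes:
    """Pack a unit period into exactly 3 bytes (big-endian is assumed)."""
    return ticks.to_bytes(3, "big")


pts_hours = max_duration_seconds(33) / 3600  # 33-bit PTS range in hours
unit_seconds = max_duration_seconds(24)      # 24-bit unit period range in seconds

print(f"33 bits: about {pts_hours:.1f} hours")
print(f"24 bits: about {unit_seconds:.1f} seconds")
print(encode_unit_period(3000).hex())
```

The 33-bit range comes out above 24 hours, consistent with the text, while the 24-bit range is on the order of a few minutes per unit period and fits in three whole bytes.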

Also, in the present invention, for example, flag information indicating whether or not the disparity information is updated may be added to the disparity information for each of the frames at every update frame interval. In this case, when a period continues in which the disparity information changes in the same way in the time direction, the flag information can be used to omit transmission of the disparity information within that period, which keeps the data amount of the disparity information down.
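One way to picture the update flag is the following sketch. The entry layout (a boolean flag plus an optional value) and the function name are assumptions for illustration, not the syntax defined by the patent; the idea shown is only that entries flagged as "not updated" carry no disparity value.

```python
from typing import List, Optional, Tuple

# (update_flag, disparity value or None when the update is skipped)
Entry = Tuple[bool, Optional[int]]


def transmitted_points(entries: List[Entry]) -> List[Tuple[int, int]]:
    """Return (interval index, disparity) for entries that carry a value."""
    points = []
    for index, (updated, disparity) in enumerate(entries):
        if updated:
            if disparity is None:
                raise ValueError("an updated entry must carry a disparity value")
            points.append((index, disparity))
    return points


# Disparity rises by 4 per interval (0, 4, 8, 12, 16). Since the change is
# the same throughout, the two middle values are flagged as not updated.
entries = [(True, 0), (True, 4), (False, None), (False, None), (True, 16)]
points = transmitted_points(entries)
print(points)
```

In this hypothetical case three values are sent instead of five, and a receiver that interpolates between the remaining points would recover the same ramp.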

Also, in the present invention, for example, information for adjusting the update frame interval may be added to the disparity information for each of the frames at every update frame interval. In this case, the update frame interval can be adjusted freely, either shorter or longer, on the basis of this adjustment information, so that changes in the disparity information in the time direction can be conveyed to the reception side more accurately.

Also, in the present invention, for example, information specifying the frame period may be inserted into the disparity information. This allows the update frame interval of the disparity information intended by the transmission side to be conveyed correctly to the reception side. When this information is not added, the reception side refers to, for example, the frame period of the video.

Also, in the present invention, for example, information indicating the level of support for the disparity information that is required when the superimposition information is displayed may be inserted into the disparity information. In this case, this information makes it possible to control how the reception side handles the disparity information.

Another concept of the present invention resides in a stereoscopic image data reception device comprising:
a data receiving unit that receives stereoscopic image data including left-eye image data and right-eye image data, data of superimposition information to be superimposed on an image based on the left-eye image data, and disparity information for giving disparity by shifting the superimposition information to be superimposed on the image based on the left-eye image data and the right-eye image data,
wherein the disparity information received by the data receiving unit is disparity information that is sequentially updated within a predetermined number of frame periods during which the superimposition information is displayed, and consists of the disparity information of the first frame of the predetermined number of frame periods and the disparity information of the frames at each subsequent update frame interval; and
an image data processing unit that uses the left-eye image data and the right-eye image data, the superimposition information data, and the disparity information received by the data receiving unit to give disparity to the same superimposition information to be superimposed on the left-eye image and the right-eye image, and obtains data of the left-eye image on which the superimposition information is superimposed and data of the right-eye image on which the superimposition information is superimposed.

In the present invention, the data receiving unit receives the superimposition information data and the disparity information together with the stereoscopic image data including the left-eye image data and the right-eye image data. The superimposition information data is data of superimposition information to be superimposed on the image based on the left-eye image data and the right-eye image data. Here, the superimposition information is information such as subtitles, graphics, and text superimposed on the image. The disparity information is for giving disparity by shifting the superimposition information to be superimposed on the image based on the left-eye image data and the right-eye image data. This disparity information is sequentially updated within the predetermined number of frame periods during which the superimposition information is displayed, and consists of the disparity information of the first frame of the predetermined number of frame periods and the disparity information of the frames at each subsequent update frame interval.

The image data processing unit then uses the left-eye image data and the right-eye image data, the superimposition information data, and the disparity information to give disparity to the same superimposition information to be superimposed on the left-eye image and the right-eye image, and obtains data of the left-eye image on which the superimposition information is superimposed and data of the right-eye image on which the superimposition information is superimposed.

Thus, in the present invention, disparity information corresponding to the stereoscopic image data is received together with the stereoscopic image data and the superimposition information data. This disparity information is sequentially updated within the predetermined number of frame periods during which the superimposition information is displayed, and consists of the disparity information of the first frame of the predetermined number of frame periods and the disparity information of the frames at each subsequent update frame interval. Therefore, the disparity to be given between the left-eye superimposition information and the right-eye superimposition information can be changed dynamically in accordance with changes in the stereoscopic image. In addition, because the disparity information of every frame is not sent, the memory capacity for holding the disparity information can be greatly reduced.

In the present invention, for example, the image data processing unit may perform interpolation processing on the disparity information of the plurality of frames that make up the disparity information sequentially updated within the predetermined number of frame periods, and may generate and use disparity information at an arbitrary frame interval within the predetermined number of frame periods. In this case, even when the disparity information is transmitted from the transmission side only at every update frame interval, the disparity given to the superimposition information can be controlled at fine intervals, for example, for every frame.
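As an illustration of generating per-frame disparity from values received only at update points, here is a minimal linear-interpolation sketch. The function name and the keyframe representation as (frame index, disparity) pairs are assumptions; the patent does not prescribe this data layout.

```python
from typing import List, Tuple


def interpolate_per_frame(keyframes: List[Tuple[int, float]]) -> List[float]:
    """Linearly interpolate disparity for every frame between update points.

    keyframes: (frame_index, disparity) pairs with strictly increasing
    frame_index; the first pair is the first frame of the display period.
    Returns one value per frame from the first to the last keyframe.
    """
    if not keyframes:
        raise ValueError("at least one keyframe is required")
    per_frame: List[float] = []
    for (f0, d0), (f1, d1) in zip(keyframes, keyframes[1:]):
        span = f1 - f0
        if span <= 0:
            raise ValueError("frame indices must be strictly increasing")
        for frame in range(f0, f1):
            per_frame.append(d0 + (d1 - d0) * (frame - f0) / span)
    per_frame.append(float(keyframes[-1][1]))
    return per_frame


# Three received update points expand to seven per-frame values.
print(interpolate_per_frame([(0, 0), (4, 8), (6, 4)]))
```

Only three values are held in memory from the stream, yet the overlay shift can still be adjusted on every frame.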

In this case, the interpolation processing may be linear interpolation, but it may also involve, for example, low-pass filtering in the time direction (frame direction). As a result, even when the disparity information is transmitted from the transmission side only at every update frame interval, the interpolated disparity information changes smoothly in the time direction, which suppresses the unnatural impression caused when the disparity given to the superimposition information changes discontinuously at every update frame interval.
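The effect of low-pass filtering in the frame direction can be sketched with a simple moving average over per-frame disparity values. The choice of a moving average, the window radius, and the edge handling are all assumptions made for illustration; the patent only states that a low-pass filter in the time direction may be involved.

```python
from typing import List, Sequence


def smooth(values: Sequence[float], radius: int = 1) -> List[float]:
    """Moving-average low-pass filter along the frame axis.

    Each output is the mean of the inputs within `radius` frames of it;
    the window is truncated at the start and end of the sequence.
    """
    if radius < 0:
        raise ValueError("radius must not be negative")
    count = len(values)
    smoothed = []
    for i in range(count):
        window = values[max(0, i - radius):min(count, i + radius + 1)]
        smoothed.append(sum(window) / len(window))
    return smoothed


# A step from 0 to 6 between two frames becomes a gradual ramp.
print(smooth([0, 0, 6, 6], radius=1))
```

A wider radius would smooth more strongly at the cost of reacting more slowly to genuine changes in the image depth.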

Also, in the present invention, for example, unit period information and information on the number of unit periods may be added to the disparity information as the update frame interval information, and the image data processing unit may obtain each update time of the disparity information, with the display start time of the superimposition information as the reference, on the basis of the unit period information and number information that constitute the information on each update frame interval.

In this case, the image data processing unit can obtain each update time in sequence, starting from the display start time of the superimposition information. For example, the update time following a given update time is easily obtained by adding to the given update time a duration of unit period × number, using the unit period information and number information that constitute the information on the next update frame interval. The display start time of the superimposition information is given by, for example, the PTS inserted in the header of the PES stream that contains the disparity information.
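The sequential derivation of update times can be sketched as a running sum starting at the display start time. The function name, the tuple layout, and the sample values are hypothetical; wrapping the result to 33 bits reflects the PTS width mentioned earlier in the text and is included here as an assumption about how a receiver might keep the times comparable with the PTS.

```python
from typing import List, Tuple

PTS_MODULUS = 1 << 33  # PTS is 33 bits long


def update_times(start_pts: int, intervals: List[Tuple[int, int]]) -> List[int]:
    """Return the display start time followed by each update time.

    intervals: (unit_period_ticks, unit_count) for each update frame
    interval, in order. All times are in the same clock ticks as start_pts.
    """
    times = [start_pts % PTS_MODULUS]
    for unit_period_ticks, unit_count in intervals:
        # next update time = previous update time + unit period * number
        next_time = times[-1] + unit_period_ticks * unit_count
        times.append(next_time % PTS_MODULUS)
    return times


# Hypothetical values: display starts at PTS 900000; two update intervals.
print(update_times(900_000, [(3000, 2), (3000, 5)]))
```

Each update time depends only on the previous one and the next interval's two fields, so the receiver never needs absolute timestamps for individual updates.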

According to the present invention, disparity information corresponding to the stereoscopic image data is transmitted together with the stereoscopic image data and the superimposition information data. This disparity information is sequentially updated within the predetermined number of frame periods during which the superimposition information is displayed, and consists of the disparity information of the first frame of the predetermined number of frame periods and the disparity information of the frames at each subsequent update frame interval. Therefore, on the reception side, the disparity to be given between the left-eye superimposition information and the right-eye superimposition information can be changed dynamically in accordance with changes in the stereoscopic image. On the transmission side, the disparity information of every frame is not transmitted, so the amount of transmitted data can be reduced. On the reception side, the memory capacity for holding the disparity information can be greatly reduced.

A block diagram showing a configuration example of an image transmission/reception system as an embodiment of the present invention.
A block diagram showing a configuration example of a transmission data generation unit in a broadcast station.
A diagram showing image data in a 1920 × 1080 pixel format.
A diagram for explaining the "Top & Bottom", "Side By Side", and "Frame Sequential" methods, which are transmission methods for stereoscopic image data (3D image data).
A diagram for explaining an example of detecting the disparity vector of a right-eye image with respect to a left-eye image.
A diagram for explaining how a disparity vector is obtained by block matching.
A diagram showing an example image in which the disparity vector value of each pixel is used as the luminance value of that pixel.
A diagram showing an example of disparity vectors for each block.
A diagram for explaining the downsizing process performed by the disparity information creation unit of the transmission data generation unit.
A diagram showing a configuration example of a transport stream (bit stream data) including a video elementary stream, a subtitle elementary stream, and an audio elementary stream.
A diagram showing the structure of the PCS (page_composition_segment) constituting subtitle data.
A diagram showing the correspondence between each value of "segment_type" and the segment types.
A diagram for explaining the newly defined information (Component_type=0x15, 0x25) indicating the format of 3D subtitles.
A diagram conceptually showing a method of creating subtitle data for stereoscopic images when the transmission format of the stereoscopic image data is the side-by-side method.
A diagram conceptually showing a method of creating subtitle data for stereoscopic images when the transmission format of the stereoscopic image data is the top-and-bottom method.
A diagram conceptually showing a method of creating subtitle data for stereoscopic images when the transmission format of the stereoscopic image data is the frame sequential method.
A diagram showing a structure example (syntax) of the SCS (Subregion composition segment).
A diagram showing a structure example (syntax) of "Subregion_payload()" included in the SCS.
A diagram showing the main data specification contents (semantics) of the SCS.
A diagram showing an example of updating disparity information for each base segment period (BSP).
A diagram showing a structure example (syntax) of "disparity_temporal_extension()".
A diagram showing the main data specification contents (semantics) in the structure example of "disparity_temporal_extension()".
A diagram showing an example of updating disparity information for each base segment period (BSP).
A diagram schematically showing the flow of stereoscopic image data and subtitle data (including display control information) from a broadcast station to a television receiver via a set-top box, or from a broadcast station directly to a television receiver.
A diagram schematically showing the flow of stereoscopic image data and subtitle data (including display control information) from a broadcast station to a television receiver via a set-top box, or from a broadcast station directly to a television receiver.
A diagram schematically showing the flow of stereoscopic image data and subtitle data (including display control information) from a broadcast station to a television receiver via a set-top box, or from a broadcast station directly to a television receiver.
A diagram showing a display example of subtitles (graphics information) on an image, and the perspective of the background, a foreground object, and the subtitles.
A diagram showing a display example of subtitles on an image, and the left-eye subtitle LGI and right-eye subtitle RGI for displaying the subtitles.
A block diagram showing a configuration example of a set-top box constituting the stereoscopic image display system.
A block diagram showing a configuration example of the bit stream processing unit constituting the set-top box.
A diagram showing an example of generating disparity information at an arbitrary frame interval (interpolated disparity information) by performing interpolation processing with low-pass filtering on the disparity information of the plurality of frames that make up the disparity information sequentially updated within the subtitle display period.
A block diagram showing a configuration example of a television receiver constituting the stereoscopic image display system.
A block diagram showing a configuration example of a transmission data generation unit in a broadcast station.
A diagram showing a configuration example of a caption data stream and a display example of caption units (captions).
A diagram showing a configuration example of a caption data stream generated by the caption encoder and an example of creating disparity vectors in that case.
A diagram showing another configuration example of a caption data stream generated by the caption encoder and an example of creating disparity vectors in that case.
A diagram showing a configuration example of the caption data stream generated by the caption encoder and an example of creating the disparity vectors in that case.
A diagram showing another configuration example of the caption data stream generated by the caption encoder and an example of creating the disparity vectors in that case.
A diagram for explaining the case of shifting the position of each caption unit superimposed on the first and second views.
A diagram showing the packet structure of the caption codes included in the PES stream of the caption text data group.
A diagram showing the packet structure of the control codes included in the PES stream of the caption management data group.
A diagram showing the structure of a data group in the caption data stream (PES stream).
A diagram schematically showing the structure of caption management data when a disparity vector (disparity information) is inserted into the PES stream of the caption management data group.
A diagram schematically showing the structure of caption data when a disparity vector (disparity information) is inserted into the PES stream of the caption management data group.
A diagram schematically showing the structure of caption data when a disparity vector (disparity information) is inserted into the PES stream of the caption text data group.
A diagram schematically showing the structure of caption management data when a disparity vector (disparity information) is inserted into the PES stream of the caption text data group.
A diagram showing the structure (syntax) of a data unit (data_unit) included in the caption data stream.
A diagram showing the types of data units, together with the data unit parameters and functions.
A diagram showing the structure (syntax) of the data unit (data_unit) for extended display control.
A diagram showing the structure (syntax) of "Advanced_Rendering_Control" in the extended display control data unit carried by the PES stream of the caption management data group.
A diagram showing the structure (syntax) of "Advanced_Rendering_Control" in the extended display control data unit carried by the PES stream of the caption text data group.
A diagram showing the main data definitions in the structure of "Advanced_Rendering_Control" and the structure of "disparity_information".
A diagram showing the structure (syntax) of "disparity_information" in "Advanced_Rendering_Control" within the extended display control data unit (data_unit) included in the caption text data group.
A diagram showing the structure of "disparity_information".
A diagram showing a configuration example of a general transport stream (multiplexed data stream) including a video elementary stream, an audio elementary stream, and a caption elementary stream.
A diagram showing a structural example (syntax) of the data content descriptor.
A diagram showing a structural example (syntax) of "arib_caption_info".
A diagram showing a configuration example of the transport stream (multiplexed data stream) when flag information is inserted under the PMT.
A diagram showing a structural example (syntax) of the data encoding method descriptor.
A diagram showing a structural example (syntax) of "additional_arib_caption_info".
A block diagram showing a configuration example of the bit stream processing unit of the set top box.
A block diagram showing a configuration example of the transmission data generation unit in the broadcasting station.
A diagram showing that a sequence header part containing sequence-level parameters is placed at the head of the video elementary stream.
A diagram schematically showing the CEA table.
A diagram showing a structural example of the 3-byte field "Byte1", "Byte2", "Byte3" constituting an extended command.
A diagram showing an example of updating disparity information for each base segment period (BSP).
A diagram schematically showing the CEA table.
A diagram showing a structural example of the 4-byte field "Header(Byte1)", "Byte2", "Byte3", "Byte4".
A diagram showing a structural example (syntax) of conventional closed caption data (CC data).
A diagram showing a structural example (syntax) of closed caption data (CC data) modified to support disparity information (disparity).
A diagram for explaining the 2-bit field "extended_control", which controls the two fields "cc_data_1" and "cc_data_2".
A diagram showing a structural example (syntax) of "caption_disparity_data()".
A diagram showing a structural example (syntax) of "disparity_temporal_extension()".
A diagram showing the main data definitions (semantics) in the structural example of "caption_disparity_data()".
A diagram showing a configuration example of a general transport stream (multiplexed data stream) including a video elementary stream, an audio elementary stream, and a caption elementary stream.
A block diagram showing a configuration example of the bit stream processing unit of the set top box.
A diagram showing another structural example (syntax) of "disparity_temporal_extension()".
A diagram showing the main data definitions (semantics) related to the structural example of "disparity_temporal_extension()".
A diagram showing an example of updating disparity information when the other structural example of "disparity_temporal_extension()" is used.
A diagram showing an example of updating disparity information when the other structural example of "disparity_temporal_extension()" is used.
A diagram showing a configuration example of a subtitle data stream.
A diagram showing an example of updating disparity information when SCS segments are transmitted sequentially.
A diagram showing an example of updating disparity information (disparity) in which the update frame interval is expressed as a multiple of an interval duration (ID) serving as a unit period.
A diagram showing a configuration example of a subtitle data stream that contains the DDS, PCS, RCS, CDS, ODS, DSS, and EOS segments as PES payload data.
A diagram showing a display example of a subtitle in which the page area (Area for Page_default) contains two regions (Region) as caption display areas.
A diagram showing an example of the disparity information curves of each region and of the page when the DSS segment contains, as disparity information (Disparity) sequentially updated during the caption display period, both region-level disparity information and page-level disparity information covering all regions.
A diagram showing the structure in which the disparity information of the page and of each region is sent.
A diagram (1/4) showing a structural example (syntax) of the DSS.
A diagram (2/4) showing the structural example of the DSS.
A diagram (3/4) showing the structural example of the DSS.
A diagram (4/4) showing the structural example of the DSS.
A diagram (1/2) showing the main data definitions (semantics) of the DSS.
A diagram (2/2) showing the main data definitions of the DSS.
A block diagram showing another configuration example of the image transmission/reception system.
A diagram for explaining, for stereoscopic image display using binocular parallax, the relationship between the display positions of the left and right images of an object on the screen and the reproduction position of its stereoscopic image.

Hereinafter, modes for carrying out the invention (hereinafter referred to as "embodiments") will be described. The description is given in the following order.
1. Embodiment
2. Modified examples

<1. Embodiment>
[Configuration example of image transmission/reception system]
FIG. 1 shows a configuration example of an image transmission/reception system 10 as an embodiment. The image transmission/reception system 10 includes a broadcasting station 100, a set top box (STB) 200, and a television receiver (TV) 300.

The set top box 200 and the television receiver 300 are connected through an HDMI (High Definition Multimedia Interface) digital interface, using an HDMI cable 400. The set top box 200 is provided with an HDMI terminal 202, and the television receiver 300 is provided with an HDMI terminal 302. One end of the HDMI cable 400 is connected to the HDMI terminal 202 of the set top box 200, and the other end is connected to the HDMI terminal 302 of the television receiver 300.

[Description of broadcasting station]
The broadcasting station 100 transmits bit stream data BSD on a broadcast wave. The broadcasting station 100 includes a transmission data generation unit 110 that generates the bit stream data BSD. The bit stream data BSD includes stereoscopic image data, audio data, superimposition information data, disparity information, and the like. The stereoscopic image data has a predetermined transmission format and contains left-eye image data and right-eye image data for displaying a stereoscopic image. Superimposition information is generally captions, graphics information, text information, and so on; in this embodiment it is captions.

"Configuration example of transmission data generation unit"
FIG. 2 shows a configuration example of the transmission data generation unit 110 in the broadcasting station 100. The transmission data generation unit 110 transmits disparity information (disparity vectors) in a data structure that is easily compatible with the DVB (Digital Video Broadcasting) system, one of the existing broadcasting standards. The transmission data generation unit 110 includes a data extraction unit (archive unit) 111, a video encoder 112, and an audio encoder 113. It also includes a subtitle generation unit 114, a disparity information creation unit 115, a subtitle processing unit 116, a subtitle encoder 118, and a multiplexer 119.

A data recording medium 111a is, for example, detachably attached to the data extraction unit 111. On this data recording medium 111a, audio data and disparity information are recorded in association with stereoscopic image data that includes left-eye image data and right-eye image data. The data extraction unit 111 extracts the stereoscopic image data, audio data, disparity information, and the like from the data recording medium 111a and outputs them. The data recording medium 111a is a disk-shaped recording medium, a semiconductor memory, or the like.

The stereoscopic image data recorded on the data recording medium 111a is stereoscopic image data of a predetermined transmission method. Examples of transmission methods for stereoscopic image data (3D image data) are described here. The following first to third transmission methods are given, but other transmission methods may also be used. The description assumes, as shown in FIG. 3, that the left-eye (L) and right-eye (R) image data are each image data of a fixed resolution, for example a 1920 × 1080 pixel format.

The first transmission method is the top-and-bottom (Top & Bottom) method. As shown in FIG. 4(a), the data of each line of the left-eye image data is transmitted in the first half in the vertical direction, and the data of each line of the right-eye image data is transmitted in the second half in the vertical direction. Because the lines of the left-eye image data and the right-eye image data are each thinned out to 1/2, the vertical resolution is half that of the original signal.

The second transmission method is the side-by-side (Side By Side) method. As shown in FIG. 4(b), the pixel data of the left-eye image data is transmitted in the first half in the horizontal direction, and the pixel data of the right-eye image data is transmitted in the second half in the horizontal direction. In this case, the pixel data of the left-eye image data and of the right-eye image data is each thinned out to 1/2 in the horizontal direction, so the horizontal resolution is half that of the original signal.

The third transmission method is the frame sequential (Frame Sequential) method. As shown in FIG. 4(c), the left-eye image data and the right-eye image data are transmitted by switching between them frame by frame. This frame sequential method is sometimes also called the full frame (Full Frame) method or the backward compatible (Backward Compatible) method.
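The three transmission methods above can be illustrated with a minimal sketch. It assumes each view is a list of rows of pixel values and that thinning to 1/2 simply keeps every second line or pixel; the description does not state which lines are kept or whether any filtering precedes the thinning, and the function names are hypothetical.

```python
def top_and_bottom(left, right):
    """Pack two equally sized views into one frame of the same size.

    Assumption: thinning keeps every second line of each view.
    The left view fills the top half, the right view the bottom half.
    """
    return left[::2] + right[::2]


def side_by_side(left, right):
    """Pack two views horizontally.

    Assumption: thinning keeps every second pixel of each row.
    The left view fills the left half, the right view the right half.
    """
    return [l_row[::2] + r_row[::2] for l_row, r_row in zip(left, right)]


def frame_sequential(left_frames, right_frames):
    """Alternate full-resolution left and right frames, left first."""
    out = []
    for left, right in zip(left_frames, right_frames):
        out.extend([left, right])
    return out
```

In the first two sketches the packed frame has the same pixel count as one original view, which is why one resolution axis is halved; the frame sequential sketch keeps full resolution but doubles the number of frames.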

The disparity information recorded on the data recording medium 111a is, for example, a disparity vector for each pixel constituting the image. A detection example of a disparity vector is described here, taking the case of detecting the disparity vector of the right-eye image with respect to the left-eye image. As shown in FIG. 5, the left-eye image is the detection image and the right-eye image is the reference image. In this example, the disparity vectors at the positions (xi, yi) and (xj, yj) are detected.

The case of detecting the disparity vector at the position (xi, yi) is described as an example. In this case, a pixel block (disparity detection block) Bi of, for example, 4 × 4, 8 × 8, or 16 × 16 pixels is set in the left-eye image, with the pixel at the position (xi, yi) at its upper left. A pixel block that matches the pixel block Bi is then searched for in the right-eye image.

In this case, a search range centered on the position (xi, yi) is set in the right-eye image. Each pixel in the search range is taken in turn as the pixel of interest, and comparison blocks of the same size as the pixel block Bi described above (for example 4 × 4, 8 × 8, or 16 × 16) are set one after another.

Between the pixel block Bi and each of the sequentially set comparison blocks, the sum of the absolute differences of the corresponding pixels is obtained. As shown in FIG. 6, when the pixel value of the pixel block Bi is L(x, y) and the pixel value of the comparison block is R(x, y), the sum of absolute differences between the pixel block Bi and a given comparison block is expressed as Σ|L(x, y) − R(x, y)|.

When the search range set in the right-eye image contains n pixels, n sums S1 to Sn are eventually obtained, and the minimum sum Smin is selected from among them. The position (xi′, yi′) of the upper-left pixel is then obtained from the comparison block that gave this sum Smin. The disparity vector at the position (xi, yi) is thus detected as (xi′ − xi, yi′ − yi). Although a detailed description is omitted, the disparity vector at the position (xj, yj) is detected by the same process, with a pixel block Bj of, for example, 4 × 4, 8 × 8, or 16 × 16 pixels set in the left-eye image with the pixel at the position (xj, yj) at its upper left.
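The block-matching procedure described above can be sketched as follows. This is an illustrative sketch, not the implementation of the embodiment: images are assumed to be lists of rows of numeric pixel values, each pixel of interest is assumed to be the upper-left pixel of its comparison block, comparison blocks that would extend past the image edge are skipped, and ties are resolved in favor of the first minimum found. The function names and the `search` parameter are hypothetical.

```python
def sad(left, right, lx, ly, rx, ry, size):
    """Sum of absolute differences between two size x size blocks.

    (lx, ly) is the upper-left pixel of the block in the left-eye image,
    (rx, ry) the upper-left pixel of the comparison block in the right-eye image.
    """
    return sum(
        abs(left[ly + dy][lx + dx] - right[ry + dy][rx + dx])
        for dy in range(size)
        for dx in range(size)
    )


def detect_disparity(left, right, xi, yi, size=4, search=2):
    """Return the disparity vector (xi' - xi, yi' - yi) for block Bi at (xi, yi).

    The search range covers +/- `search` pixels around (xi, yi); this extent
    is an assumption, since the description does not fix its size.
    """
    height, width = len(right), len(right[0])
    best = None  # (sum, x', y') of the best comparison block so far
    for yc in range(yi - search, yi + search + 1):
        for xc in range(xi - search, xi + search + 1):
            # Skip comparison blocks that would fall outside the image.
            if xc < 0 or yc < 0 or xc + size > width or yc + size > height:
                continue
            s = sad(left, right, xi, yi, xc, yc, size)
            if best is None or s < best[0]:
                best = (s, xc, yc)
    _, xp, yp = best
    return (xp - xi, yp - yi)
```

For example, if the right-eye image is the left-eye image shifted one pixel to the right, the block at (2, 2) is matched at (3, 2) and the detected vector is (1, 0).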

The video encoder 112 encodes the stereoscopic image data extracted by the data extraction unit 111 using MPEG4-AVC, MPEG2, VC-1, or the like, and generates a video data stream (video elementary stream). The audio encoder 113 encodes the audio data extracted by the data extraction unit 111 using AC3, AAC, or the like, and generates an audio data stream (audio elementary stream).

The subtitle generation unit 114 generates subtitle data, which is caption data of the DVB (Digital Video Broadcasting) system. This subtitle data is subtitle data for two-dimensional images. The subtitle generation unit 114 constitutes a superimposition information data output unit.

The disparity information creation unit 115 applies a downsizing process to the per-pixel disparity vectors (horizontal disparity vectors) extracted by the data extraction unit 111 and creates the disparity information (horizontal disparity vector) to be applied to the subtitle. The disparity information creation unit 115 constitutes a disparity information output unit. The disparity information applied to the subtitle can be assigned per page, per region, or per object. This disparity information does not necessarily have to be generated by the disparity information creation unit 115; a configuration in which it is supplied separately from outside is also possible.

FIG. 7 shows an example of relative depth-direction data given in the manner of a luminance value for each pixel. The relative depth-direction data can be handled as a per-pixel disparity vector through a predetermined conversion. In this example, the luminance value of the person portion is high. This means that the disparity vector of the person portion has a large value, and therefore that the person portion is perceived as standing out in stereoscopic image display. The luminance value of the background portion, in contrast, is low. This means that the disparity vector of the background portion has a small value, and therefore that the background portion is perceived as receding in stereoscopic image display.

FIG. 8 shows an example of the disparity vector for each block (Block). A block is the layer above the pixels, which are located in the lowest layer. The blocks are formed by dividing the image (picture) area into pieces of a predetermined size in the horizontal and vertical directions. The disparity vector of each block is obtained, for example, by selecting the disparity vector with the largest value from the disparity vectors of all the pixels in that block. In this example, the disparity vector of each block is drawn as an arrow whose length corresponds to the magnitude of the disparity vector.

FIG. 9 shows an example of the downsizing process performed by the disparity information creation unit 115. First, as shown in FIG. 9(a), the disparity information creation unit 115 obtains the disparity vector of each block from the per-pixel disparity vectors. As described above, a block is the layer above the pixels in the lowest layer and is formed by dividing the image (picture) area into pieces of a predetermined size in the horizontal and vertical directions. The disparity vector of each block is obtained, for example, by selecting the disparity vector with the largest value from the disparity vectors of all the pixels in that block.

Next, as shown in FIG. 9(b), the disparity information creation unit 115 obtains the disparity vector of each group (Group Of Block) from the per-block disparity vectors. A group is the layer above the blocks and is obtained by grouping a plurality of adjacent blocks together. In the example of FIG. 9(b), each group consists of four blocks enclosed by a broken-line frame. The disparity vector of each group is obtained, for example, by selecting the disparity vector with the largest value from the disparity vectors of all the blocks in that group.

Next, as shown in FIG. 9(c), the disparity information creation unit 115 obtains the disparity vector of each partition (Partition) from the per-group disparity vectors. A partition is the layer above the groups and is obtained by grouping a plurality of adjacent groups together. In the example of FIG. 9(c), each partition consists of two groups enclosed by a broken-line frame. The disparity vector of each partition is obtained, for example, by selecting the disparity vector with the largest value from the disparity vectors of all the groups in that partition.

Next, as shown in FIG. 9(d), the disparity information creation unit 115 obtains the disparity vector of the entire picture (entire image), which is located in the highest layer, from the per-partition disparity vectors. In the example of FIG. 9(d), the entire picture contains four partitions enclosed by a broken-line frame. The disparity vector of the entire picture is obtained, for example, by selecting the disparity vector with the largest value from the disparity vectors of all the partitions in the picture.

In this way, the disparity information creation unit 115 applies the downsizing process to the per-pixel disparity vectors in the lowest layer and can obtain the disparity vector of each area in each of the block, group, partition, and entire-picture layers. In the example of the downsizing process shown in FIG. 9, disparity vectors are ultimately obtained for four layers (block, group, partition, and entire picture) in addition to the pixel layer. However, the number of layers, the way each layer is divided into areas, and the number of areas are not limited to this example.
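The hierarchical downsizing process can be sketched as repeated max-pooling. The sketch below assumes that each disparity vector is represented by a single horizontal value and that "largest value" means the numeric maximum; the function names and the default merge factors are hypothetical, chosen so that each group has four blocks and each partition has two groups as in the FIG. 9 example.

```python
def downsize(values, factor_y, factor_x):
    """Merge factor_y x factor_x neighbouring areas, keeping the largest value."""
    rows, cols = len(values), len(values[0])
    return [
        [
            max(
                values[y][x]
                for y in range(by, min(by + factor_y, rows))
                for x in range(bx, min(bx + factor_x, cols))
            )
            for bx in range(0, cols, factor_x)
        ]
        for by in range(0, rows, factor_y)
    ]


def disparity_pyramid(pixel_disparity, block=(2, 2), group=(2, 2), partition=(1, 2)):
    """Return (blocks, groups, partitions, picture) from per-pixel disparities.

    Each tuple gives the (vertical, horizontal) number of lower-layer areas
    merged into one area of the next layer; these sizes are assumptions.
    """
    blocks = downsize(pixel_disparity, *block)
    groups = downsize(blocks, *group)
    partitions = downsize(groups, *partition)
    picture = max(max(row) for row in partitions)
    return blocks, groups, partitions, picture
```

With an 8 × 16 pixel map and the default factors, this yields 4 × 8 blocks, 2 × 4 groups, four partitions, and one value for the entire picture.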

Returning to FIG. 2, the subtitle processing unit 116 converts the subtitle data generated by the subtitle generation unit 114 into subtitle data for stereoscopic images (three-dimensional images) that matches the transmission format of the stereoscopic image data extracted by the data extraction unit 111. The subtitle processing unit 116 constitutes a superimposition information data processing unit, and the converted subtitle data for stereoscopic images constitutes transmission superimposition information data.

The subtitle data for stereoscopic images contains left-eye subtitle data and right-eye subtitle data. The left-eye subtitle data corresponds to the left-eye image data included in the stereoscopic image data described above and is used on the receiving side to generate the display data of the left-eye subtitle to be superimposed on the left-eye image data of the stereoscopic image data. The right-eye subtitle data corresponds to the right-eye image data included in the stereoscopic image data described above and is used on the receiving side to generate the display data of the right-eye subtitle to be superimposed on the right-eye image data of the stereoscopic image data.

In this case, the subtitle processing unit 116 can also shift at least one of the left-eye subtitle and the right-eye subtitle, based on the disparity information (horizontal disparity vector) to be applied to the subtitle from the disparity information creation unit 115, so as to give disparity between the left-eye subtitle and the right-eye subtitle. By giving disparity between the left-eye subtitle and the right-eye subtitle in this way, the consistency of perspective between the subtitle (caption) and each object in the image can be kept optimal when the subtitle is displayed, without the receiving side having to perform any processing to give disparity.
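The shifting step can be sketched as below. The description only says that at least one of the two subtitles is shifted; how the shift is divided between them is not specified, so the three modes here are assumptions, as is the sign convention (right-eye position minus left-eye position, following the detection example given earlier). The function name is hypothetical.

```python
def apply_disparity(x, disparity, mode="right_only"):
    """Return (left_x, right_x) with right_x - left_x == disparity.

    x is the horizontal position of the 2D subtitle in pixels.
    mode selects which subtitle is moved (an assumption of this sketch):
      "right_only": only the right-eye subtitle is shifted
      "left_only":  only the left-eye subtitle is shifted
      "split":      the shift is divided between both subtitles
    """
    if mode == "right_only":
        return x, x + disparity
    if mode == "left_only":
        return x - disparity, x
    if mode == "split":
        half = disparity // 2
        return x - half, x + (disparity - half)
    raise ValueError("unknown mode: %r" % (mode,))
```

Whichever mode is used, the difference between the two positions equals the disparity value, which is the quantity that determines the perceived depth of the subtitle.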

The subtitle processing unit 116 includes a display control information generation unit 117. The display control information generation unit 117 generates display control information related to subregions. A subregion is an area defined only within a region. There are two kinds of subregion: the left-eye subregion (left-eye SR) and the right-eye subregion (right-eye SR). Hereinafter, the left-eye subregion is referred to as the left-eye SR and the right-eye subregion as the right-eye SR, as appropriate.

The left-eye subregion is an area set, within the region that is the display area of the superimposition information data for transmission, in correspondence with the display position of the left-eye subtitle. The right-eye subregion is an area set, within the same region, in correspondence with the display position of the right-eye subtitle. For example, the left-eye subregion constitutes a first display area and the right-eye subregion constitutes a second display area. The left-eye SR and right-eye SR areas are set for each piece of subtitle data generated by the subtitle generation unit 114, for example based on a user operation or automatically. In this case, the left-eye SR and right-eye SR areas are set so that the left-eye subtitle in the left-eye SR and the right-eye subtitle in the right-eye SR correspond to each other.

The display control information includes the area information of the left-eye SR and the area information of the right-eye SR. It also includes information on the target frame in which the left-eye subtitle contained in the left-eye SR is displayed, and information on the target frame in which the right-eye subtitle contained in the right-eye SR is displayed. The target frame information for the left-eye subtitle indicates the frame of the left-eye image, and the target frame information for the right-eye subtitle indicates the frame of the right-eye image.

The display control information further includes disparity information (disparity) for shift-adjusting the display position of the left-eye subtitle contained in the left-eye SR, and disparity information for shift-adjusting the display position of the right-eye subtitle contained in the right-eye SR. These pieces of disparity information serve to give disparity between the left-eye subtitle contained in the left-eye SR and the right-eye subtitle contained in the right-eye SR.

In this case, the display control information generation unit 117 obtains the disparity information for shift adjustment to be included in the display control information, based on, for example, the disparity information (horizontal disparity vector) to be applied to the subtitle that is created by the disparity information creating unit 115. The disparity information "Disparity1" of the left-eye SR and the disparity information "Disparity2" of the right-eye SR are determined so that their absolute values are equal and their difference equals the value corresponding to the disparity information (Disparity) to be applied to the subtitle. For example, when the transmission format of the stereoscopic image data is the side-by-side format, the value corresponding to the disparity information (Disparity) is "Disparity/2". When the transmission format is the top-and-bottom format, the value corresponding to the disparity information (Disparity) is "Disparity".
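The determination of "Disparity1" and "Disparity2" described above can be illustrated with a short Python sketch. The function name `split_disparity`, the format strings, and the sign convention (positive for the left-eye SR, negative for the right-eye SR) are assumptions made for illustration; the description fixes only that the two values have equal absolute values and that their difference equals the format-dependent value.

```python
from fractions import Fraction

def split_disparity(disparity, transmission_format):
    """Return (Disparity1, Disparity2) for the left-eye SR and right-eye SR.

    Assumption: Disparity1 is positive and Disparity2 is negative. The text
    only requires |Disparity1| == |Disparity2| and a fixed difference.
    """
    if transmission_format == "side_by_side":
        target = Fraction(disparity, 2)   # value corresponding to Disparity
    elif transmission_format == "top_and_bottom":
        target = Fraction(disparity)      # value corresponding to Disparity
    else:
        raise ValueError("unsupported transmission format")
    disparity1 = target / 2               # left-eye SR
    disparity2 = -target / 2              # right-eye SR
    return disparity1, disparity2

d1, d2 = split_disparity(8, "side_by_side")
print(d1, d2, d1 - d2)
```

Exact fractions are used here only so that odd disparity values do not lose precision; how a real encoder rounds such values is not stated in this description.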

The subtitle data has segments such as DDS, PCS, RCS, CDS, and ODS. The DDS (display definition segment) specifies the display size for HDTV. The PCS (page composition segment) specifies the position of a region within a page. The RCS (region composition segment) specifies the size of a region and the encoding mode of an object, and also specifies the start position of the object. The CDS (CLUT definition segment) specifies the CLUT contents. The ODS (object data segment) contains the encoded pixel data.

In this embodiment, an SCS (Subregion composition segment) is newly defined. The display control information generated by the display control information generation unit 117 as described above is inserted into this SCS segment. The processing of the subtitle processing unit 116 is described in further detail later.

Returning to FIG. 2, the subtitle encoder 118 generates a subtitle data stream (subtitle elementary stream) that contains the subtitle data for stereoscopic images and the display control information output from the subtitle processing unit 116. The multiplexer 119 multiplexes the data streams from the video encoder 112, the audio encoder 113, and the subtitle encoder 118, and obtains a multiplexed data stream as bit stream data (transport stream) BSD.

In this embodiment, the multiplexer 119 inserts, into the subtitle data stream, identification information indicating that the stream contains subtitle data for stereoscopic images. Specifically, Stream_content ('0x03' = DVB subtitles) and Component_type (for 3D target) are described in the component descriptor (Component_Descriptor) inserted under the EIT (Event Information Table). Component_type (for 3D target) is newly defined to indicate subtitle data for stereoscopic images.

The operation of the transmission data generation unit 110 shown in FIG. 2 is briefly described below. The stereoscopic image data extracted by the data extraction unit 111 is supplied to the video encoder 112. The video encoder 112 encodes the stereoscopic image data using MPEG4-AVC, MPEG2, VC-1, or the like, and generates a video data stream containing the encoded video data. This video data stream is supplied to the multiplexer 119.

The audio data extracted by the data extraction unit 111 is supplied to the audio encoder 113. The audio encoder 113 encodes the audio data using MPEG-2 Audio AAC, MPEG-4 AAC, or the like, and generates an audio data stream containing the encoded audio data. This audio data stream is supplied to the multiplexer 119.

The subtitle generation unit 114 generates subtitle data (for two-dimensional images), which is DVB caption data. This subtitle data is supplied to the disparity information creating unit 115 and the subtitle processing unit 116.

The per-pixel disparity vectors extracted by the data extraction unit 111 are supplied to the disparity information creating unit 115. The disparity information creating unit 115 applies downsizing processing to the per-pixel disparity vectors and creates the disparity information (horizontal disparity vector = Disparity) to be applied to the subtitle. This disparity information is supplied to the subtitle processing unit 116.

The subtitle processing unit 116 converts the subtitle data for two-dimensional images generated by the subtitle generation unit 114 into subtitle data for stereoscopic images that matches the transmission format of the stereoscopic image data extracted by the data extraction unit 111. The subtitle data for stereoscopic images includes left-eye subtitle data and right-eye subtitle data. In this case, the subtitle processing unit 116 may also shift at least one of the left-eye subtitle and the right-eye subtitle, based on the disparity information to be applied to the subtitle supplied from the disparity information creating unit 115, so that disparity is given between the left-eye subtitle and the right-eye subtitle.

The display control information generation unit 117 of the subtitle processing unit 116 generates the display control information (area information, target frame information, and disparity information) related to the subregions. As described above, the subregions comprise the left-eye subregion (left-eye SR) and the right-eye subregion (right-eye SR). Area information, target frame information, and disparity information are therefore generated as display control information for each of the left-eye SR and the right-eye SR.

As described above, the left-eye SR is set, for example based on a user operation or automatically, within the region that is the display area of the superimposition information data for transmission, in correspondence with the display position of the left-eye subtitle. Similarly, the right-eye SR is set, for example based on a user operation or automatically, within that region in correspondence with the display position of the right-eye subtitle.

The subtitle data for stereoscopic images and the display control information obtained by the subtitle processing unit 116 are supplied to the subtitle encoder 118. The subtitle encoder 118 generates a subtitle data stream containing the subtitle data for stereoscopic images and the display control information. This subtitle data stream contains segments such as DDS, PCS, RCS, CDS, and ODS, into which the subtitle data for stereoscopic images is inserted, together with the newly defined SCS segment, which contains the display control information.

As described above, the data streams from the video encoder 112, the audio encoder 113, and the subtitle encoder 118 are supplied to the multiplexer 119. The multiplexer 119 packetizes and multiplexes the data streams, and obtains a multiplexed data stream as bit stream data (transport stream) BSD.

FIG. 10 shows a configuration example of the transport stream (bit stream data). The transport stream contains the PES packets obtained by packetizing each elementary stream. In this configuration example, it contains the PES packet "Video PES" of the video elementary stream, the PES packet "Audio PES" of the audio elementary stream, and the PES packet "Subtitle PES" of the subtitle elementary stream.

In this embodiment, the subtitle elementary stream (subtitle data stream) contains the subtitle data for stereoscopic images and the display control information. The stream contains conventionally known segments such as DDS, PCS, RCS, CDS, and ODS, together with the newly defined SCS segment, which contains the display control information.

FIG. 11 shows the structure of the PCS (page_composition_segment). As shown in FIG. 12, the segment type of the PCS is "0x10". "region_horizontal_address" and "region_vertical_address" indicate the start position of a region. The structures of the other segments, such as DDS, RCS, and ODS, are not illustrated. As shown in FIG. 12, the segment type of the DDS is "0x14", that of the RCS is "0x11", that of the CDS is "0x12", and that of the ODS is "0x13". The segment type of the SCS is set to, for example, "0x40", as shown in FIG. 12. The detailed structure of the SCS segment is described later.
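The segment type values listed above can be collected into a small lookup, sketched here in Python. The enum name `SegmentType` and the helper `carries_display_control_info` are hypothetical names chosen for illustration; only the numeric values come from the description, and the SCS value "0x40" is given there as an example.

```python
from enum import IntEnum

class SegmentType(IntEnum):
    """Segment type values as listed for FIG. 12."""
    PCS = 0x10  # page composition segment
    RCS = 0x11  # region composition segment
    CDS = 0x12  # CLUT definition segment
    ODS = 0x13  # object data segment
    DDS = 0x14  # display definition segment
    SCS = 0x40  # subregion composition segment (example value in the text)

def carries_display_control_info(segment_type: int) -> bool:
    """True for the newly defined SCS, which holds the display control information."""
    return segment_type == SegmentType.SCS

print(SegmentType(0x40).name, carries_display_control_info(0x10))
```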

Returning to FIG. 10, the transport stream also contains a PMT (Program Map Table) as PSI (Program Specific Information). The PSI is information describing which program each elementary stream contained in the transport stream belongs to. The transport stream further contains an EIT (Event Information Table) as SI (Service Information) for event-by-event management. Metadata for each program is described in the EIT.

The PMT contains a program descriptor (Program Descriptor) that describes information related to the program as a whole. The PMT also contains elementary loops carrying information related to each elementary stream. In this configuration example, there are a video elementary loop, an audio elementary loop, and a subtitle elementary loop. In each elementary loop, information such as the packet identifier (PID) is placed for each stream and, although not illustrated, descriptors describing information related to that elementary stream are placed as well.

A component descriptor (Component_Descriptor) is inserted under the EIT. In this embodiment, Stream_content ('0x03' = DVB subtitles) and Component_type (for 3D target) are described in this component descriptor. This makes it possible to identify that the subtitle data stream contains subtitle data for stereoscopic images. In this embodiment, as shown in FIG. 13, when "stream_content" of the "component_descriptor" indicating the delivered content indicates subtitles, information indicating the format of subtitles for 3D (Component_type = 0x15, 0x25) is newly defined.
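A receiver-side check based on this signalling might look like the following Python sketch. The function and constant names are hypothetical; the values 0x03, 0x15, and 0x25 are those given in the description, and the parsing of the component descriptor itself is assumed to have been done elsewhere.

```python
# Values taken from the description of the component descriptor.
STREAM_CONTENT_DVB_SUBTITLES = 0x03
COMPONENT_TYPES_3D_SUBTITLE = frozenset({0x15, 0x25})

def is_3d_subtitle_component(stream_content: int, component_type: int) -> bool:
    """Return True when the descriptor fields signal subtitles for 3D."""
    return (stream_content == STREAM_CONTENT_DVB_SUBTITLES
            and component_type in COMPONENT_TYPES_3D_SUBTITLE)

print(is_3d_subtitle_component(0x03, 0x15))
```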

[Processing of the subtitle processing unit]
The processing of the subtitle processing unit 116 of the transmission data generation unit 110 shown in FIG. 2 is described in detail below. As described above, the subtitle processing unit 116 converts subtitle data for two-dimensional images into subtitle data for stereoscopic images. Also as described above, the subtitle processing unit 116 generates, in the display control information generation unit 117, the display control information (including the area information, target frame information, and disparity information of the left-eye SR and the right-eye SR).

FIG. 14 conceptually shows how subtitle data for stereoscopic images is created when the transmission format of the stereoscopic image data is the side-by-side format. FIG. 14(a) shows a region defined by subtitle data for two-dimensional images. In this example, the region contains three objects.

First, the subtitle processing unit 116 converts the size of the region defined by the subtitle data for two-dimensional images into a size suitable for the side-by-side format, as shown in FIG. 14(b), and generates bitmap data of that size.

Next, as shown in FIG. 14(c), the subtitle processing unit 116 uses the size-converted bitmap data as a constituent element of the region in the subtitle data for stereoscopic images. That is, the size-converted bitmap data is used both as the object corresponding to the left-eye subtitle in the region and as the object corresponding to the right-eye subtitle in the region.
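The two steps above (size conversion, then use of the same bitmap for both eyes) can be sketched in Python as follows. This is only an illustration under stated assumptions: the description says the region is converted to "a size suitable for the side-by-side format" without giving the method, so halving the width by dropping every other column is an assumption, as are the function name and the representation of a bitmap as a list of pixel rows.

```python
def to_side_by_side_region(region_2d):
    """Build a side-by-side region from a 2D subtitle bitmap.

    region_2d: list of rows, each row a list of pixel values.
    Assumption: the width is halved by simple column decimation.
    """
    # Step 1: size conversion (horizontal squeeze to half width).
    squeezed = [row[::2] for row in region_2d]
    # Step 2: the same squeezed bitmap serves as the left-eye object
    # (left half) and as the right-eye object (right half).
    return [row + row for row in squeezed]

region = [[1, 2, 3, 4],
          [5, 6, 7, 8]]
print(to_side_by_side_region(region))
```

A real encoder would more likely use a filtered rescale than plain decimation; decimation is used here only to keep the sketch short.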

In the manner described above, the subtitle processing unit 116 converts the subtitle data for two-dimensional images into subtitle data for stereoscopic images, and creates segments such as DDS, PCS, RCS, CDS, and ODS corresponding to the subtitle data for stereoscopic images.

Next, based on a user operation or automatically, the subtitle processing unit 116 sets the left-eye SR and the right-eye SR within the area of the region in the subtitle data for stereoscopic images, as shown in FIG. 14(c). The left-eye SR is set to an area containing the object corresponding to the left-eye subtitle. The right-eye SR is set to an area containing the object corresponding to the right-eye subtitle.

The subtitle processing unit 116 creates an SCS segment containing the area information, target frame information, and disparity information of the left-eye SR and right-eye SR set as described above. For example, the subtitle processing unit 116 creates either a single SCS that contains the area information, target frame information, and disparity information of both the left-eye SR and the right-eye SR, or separate SCS segments that each contain the area information, target frame information, and disparity information of the left-eye SR or of the right-eye SR.

FIG. 15 conceptually shows how subtitle data for stereoscopic images is created when the transmission format of the stereoscopic image data is the top-and-bottom format. FIG. 15(a) shows a region defined by subtitle data for two-dimensional images. In this example, the region contains three objects.

First, the subtitle processing unit 116 converts the size of the region defined by the subtitle data for two-dimensional images into a size suitable for the top-and-bottom format, as shown in FIG. 15(b), and generates bitmap data of that size.

Next, as shown in FIG. 15(c), the subtitle processing unit 116 uses the size-converted bitmap data as a constituent element of the regions of the subtitle data for stereoscopic images. That is, the size-converted bitmap data is used both as the object of the region on the left-eye image (left view) side and as the object of the region on the right-eye image (right view) side.

In the manner described above, the subtitle processing unit 116 converts the subtitle data for two-dimensional images into subtitle data for stereoscopic images, and creates segments such as PCS, RCS, CDS, and ODS corresponding to the subtitle data for stereoscopic images.

Next, based on a user operation or automatically, the subtitle processing unit 116 sets the left-eye SR and the right-eye SR within the areas of the regions in the subtitle data for stereoscopic images, as shown in FIG. 15(c). The left-eye SR is set to an area containing the object in the region on the left-eye image side. The right-eye SR is set to an area containing the object in the region on the right-eye image side.

FIG. 16 conceptually shows how subtitle data for stereoscopic images is created when the transmission format of the stereoscopic image data is the frame-sequential format. FIG. 16(a) shows a region defined by subtitle data for two-dimensional images. In this example, the region contains one object. When the transmission format of the stereoscopic image data is the frame-sequential format, the subtitle data for two-dimensional images is used as is as the subtitle data for stereoscopic images. In this case, the segments such as DDS, PCS, RCS, and ODS corresponding to the subtitle data for two-dimensional images serve, unchanged, as the segments such as DDS, PCS, RCS, and ODS corresponding to the subtitle data for stereoscopic images.

Next, based on a user operation or automatically, the subtitle processing unit 116 sets the left-eye SR and the right-eye SR within the area of the region in the subtitle data for stereoscopic images, as shown in FIG. 16(b). The left-eye SR is set to an area containing the object corresponding to the left-eye subtitle. The right-eye SR is set to an area containing the object corresponding to the right-eye subtitle.

FIGS. 17 and 18 show a structure example (syntax) of the SCS (Subregion Composition segment). FIG. 19 shows the main data definitions (semantics) of the SCS. This structure contains the fields "Sync_byte", "segment_type", "page_id", and "segment_length". "segment_type" is 8-bit data indicating the segment type, and here is set to "0x40", which indicates the SCS (see FIG. 12). "segment_length" is 8-bit data indicating the length (size) of the segment.

FIG. 18 shows the part of the SCS that carries its substantive information. With this structure example, the display control information of the left-eye SR and the right-eye SR can be transmitted, that is, their area information, target frame information, disparity information, and display on/off command information. This structure example can hold display control information for any number of subregions.

"region_id" is 8-bit information indicating the identifier of a region. "subregion_id" is 8-bit information indicating the identifier of a subregion. "subregion_visible_flag" is 1-bit flag information (command information) that controls whether display (superimposition) of the corresponding subregion is on or off. "subregion_visible_flag=1" indicates that display of the corresponding subregion is turned on and that display of the corresponding subregion shown before it is turned off.

"subregion_extent_flag" is 1-bit flag information indicating whether the subregion and the region are identical in size and position. "subregion_extent_flag=1" indicates that the subregion and the region are identical in size and position. "subregion_extent_flag=0" indicates that the subregion is smaller than the region.

"subregion_position_flag" is 1-bit flag information indicating whether the data that follows contains information on the area (position and size) of the subregion. "subregion_position_flag=1" indicates that the data that follows contains this area information. "subregion_position_flag=0" indicates that it does not.

"target_stereo_frame" is 1-bit information specifying the target frame (the frame to be displayed on) of the corresponding subregion. This "target_stereo_frame" constitutes the target frame information. "target_stereo_frame=0" indicates that the corresponding subregion is displayed on frame 0 (for example, the left-eye frame or the base view frame). "target_stereo_frame=1" indicates that the corresponding subregion is displayed on frame 1 (for example, the right-eye frame or the non-base view frame).

"rendering_level" indicates the level of disparity information (disparity) support that is mandatory on the receiving side (decoder side) when displaying captions. "00" indicates that three-dimensional caption display using disparity information is optional. "01" indicates that three-dimensional caption display using the disparity information commonly used throughout the caption display period (default_disparity) is mandatory. "10" indicates that three-dimensional caption display using the disparity information sequentially updated within the caption display period (disparity_update) is mandatory.
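The three values of "rendering_level" can be expressed as a small enumeration, sketched here in Python. The enum member names and the helper `required_disparity_source` are hypothetical; the value "11" is not described in the text, so the sketch rejects it rather than guessing its meaning.

```python
from enum import IntEnum

class RenderingLevel(IntEnum):
    OPTIONAL = 0b00                    # 3D caption display is optional
    DEFAULT_DISPARITY_REQUIRED = 0b01  # must use default_disparity
    DISPARITY_UPDATE_REQUIRED = 0b10   # must use disparity_update

def required_disparity_source(rendering_level: int):
    """Return which disparity information the decoder must support, or None."""
    level = RenderingLevel(rendering_level)  # raises ValueError for 0b11
    return {
        RenderingLevel.OPTIONAL: None,
        RenderingLevel.DEFAULT_DISPARITY_REQUIRED: "default_disparity",
        RenderingLevel.DISPARITY_UPDATE_REQUIRED: "disparity_update",
    }[level]

print(required_disparity_source(0b10))
```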

"temporal_extension_flag" is 1-bit flag information indicating whether disparity information that is sequentially updated within the caption display period (disparity_update) is present. "1" indicates that it is present, and "0" indicates that it is not. "shared_disparity" indicates whether common disparity information (disparity) control is performed across all regions. "1" indicates that one common piece of disparity information is applied to all subsequent regions. "0" indicates that the disparity information is applied to only one region.

The 8-bit field "subregion_disparity" indicates the default disparity information. This is the disparity information used when no updating is performed, that is, the disparity information commonly used throughout the caption display period. When "subregion_position_flag=1", the following information on the area (position and size) of the subregion is included.

“subregion_horizontal_position” is 16-bit information indicating the position of the left edge of the subregion, which is a rectangular area. “subregion_vertical_position” is 16-bit information indicating the position of the top edge of the subregion. “subregion_width” is 16-bit information indicating the horizontal size of the subregion in pixels, and “subregion_height” is 16-bit information indicating its vertical size in pixels. Together, this position and size information constitutes the area information of the subregion.
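As an illustration of the four 16-bit fields, the sketch below unpacks them from a byte string. It assumes, without confirmation from the text, that the fields are byte-aligned, unsigned, big-endian, and stored consecutively in the order listed above; `SubregionArea` and `parse_subregion_area` are hypothetical names.

```python
import struct
from typing import NamedTuple


class SubregionArea(NamedTuple):
    horizontal_position: int  # left edge of the rectangular subregion
    vertical_position: int    # top edge of the rectangular subregion
    width: int                # horizontal size in pixels
    height: int               # vertical size in pixels


def parse_subregion_area(payload: bytes) -> SubregionArea:
    # Assumption: four consecutive big-endian unsigned 16-bit fields.
    return SubregionArea(*struct.unpack(">HHHH", payload[:8]))
```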

When “temporal_extension_flag” is “1”, “disparity_temporal_extension()” is present. It basically stores the disparity information to be updated every base segment period (BSP). FIG. 20 shows an example of updating disparity information every base segment period (BSP). Here, the base segment period means the update frame interval. As the figure makes clear, the disparity information that is sequentially updated within the subtitle display period consists of the disparity information of the first frame of the subtitle display period and the disparity information of the frames at each subsequent base segment period (update frame interval).

FIG. 21 shows a structure example (syntax) of “disparity_temporal_extension()”, and FIG. 22 shows its main data definitions (semantics). The 2-bit “temporal_division_size” field indicates the number of frames in the base segment period (update frame interval): “00” indicates 16 frames, “01” indicates 25 frames, “10” indicates 30 frames, and “11” indicates 32 frames.

The 5-bit “temporal_division_count” field indicates the number of base segments in the subtitle display period. “disparity_curve_no_update_flag” is 1-bit flag information indicating whether the disparity information is updated. “1” indicates that the disparity information is not updated at the edge of the corresponding base segment, that is, the update is skipped; “0” indicates that the disparity information is updated at the edge of the corresponding base segment.

FIG. 23 shows an example of updating disparity information every base segment period (BSP). In the figure, the disparity information is not updated at base segment edges marked “skip”. With this flag information, when a period in which the disparity information changes in the same way in the frame direction continues for a long time, the update can be skipped and the transmission of disparity information for that period omitted, which reduces the amount of disparity data.

When “disparity_curve_no_update_flag” is “0” and the disparity information is updated, “shifting_interval_counts” of the corresponding base segment is included. Conversely, when “disparity_curve_no_update_flag” is “1” and the disparity information is not updated, “disparity_update” is not included. The 6-bit “shifting_interval_counts” field indicates a draw factor that adjusts the base segment period (update frame interval), that is, the number of frames to subtract.

In the update example of FIG. 23, the base segment period is adjusted by the draw factor for the disparity update timings at time points C to F. This adjustment information makes it possible to adjust the base segment period (update frame interval) and thus to convey changes of the disparity information in the time (frame) direction to the receiving side more accurately.

Besides shortening the base segment period (update frame interval) by a number of subtracted frames as described above, lengthening it by a number of added frames is also conceivable. For example, treating the 5-bit “shifting_interval_counts” field as a signed integer enables adjustment in both directions.

The 8-bit “disparity_update” field indicates the disparity information of the corresponding base segment. Note that “disparity_update” at k = 0 is the initial value of the disparity information that is sequentially updated at the update frame interval within the subtitle display period, that is, the disparity information of the first frame of the subtitle display period.
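The interaction of “temporal_division_size”, “disparity_curve_no_update_flag”, “shifting_interval_counts”, and “disparity_update” can be sketched as a small routine that turns already-decoded entries into a list of update points. This is a sketch under assumptions, not the syntax of FIG. 21: it assumes the entry at k = 0 sits on frame 0, that each later edge lies one base segment period after the previous edge minus that entry's “shifting_interval_counts”, and that a skipped edge advances by a full base segment period. `SegmentEntry` and `update_timeline` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# temporal_division_size -> frames per base segment period (values from the text)
BSP_FRAMES = {0b00: 16, 0b01: 25, 0b10: 30, 0b11: 32}


@dataclass
class SegmentEntry:
    no_update: bool                         # disparity_curve_no_update_flag
    shifting_interval_counts: int = 0       # frames subtracted from the BSP
    disparity_update: Optional[int] = None  # absent when the update is skipped


def update_timeline(temporal_division_size: int,
                    entries: List[SegmentEntry]) -> List[Tuple[int, int]]:
    """Return (frame offset, disparity) pairs for the edges that carry an update."""
    bsp = BSP_FRAMES[temporal_division_size]
    timeline = []
    frame = 0  # assumption: k = 0 is the first frame of the display period
    for k, entry in enumerate(entries):
        if k > 0:
            shift = 0 if entry.no_update else entry.shifting_interval_counts
            frame += bsp - shift
        if not entry.no_update:
            timeline.append((frame, entry.disparity_update))
    return timeline
```

For example, with 16-frame base segments, an initial value of 10, one skipped edge, and a third entry with a draw factor of 4 and a value of 20, the routine yields updates at frame 0 and frame 28.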

FIG. 24 schematically shows the flow of stereoscopic image data and subtitle data (including display control information) from the broadcast station 100 to the television receiver 300 via the set-top box 200, or from the broadcast station 100 directly to the television receiver 300. In this case, the broadcast station 100 generates subtitle data for stereoscopic images that conforms to the side-by-side format. The stereoscopic image data is transmitted in the video data stream, and the subtitle data for stereoscopic images is transmitted in the subtitle data stream.

First, the case is described in which stereoscopic image data and subtitle data (including display control information) are sent from the broadcast station 100 to the set-top box 200 and this set-top box 200 is a legacy 2D-compatible device (Legacy 2D STB). Based on the subtitle data (excluding the subregion display control information), the set-top box 200 generates display data of the region for displaying the left-eye subtitle and the right-eye subtitle, and superimposes this display data on the stereoscopic image data to obtain output stereoscopic image data. The superimposition position in this case is the position of the region.

The set-top box 200 transmits this output stereoscopic image data to the television receiver 300 through a digital interface such as HDMI. In this case, the transmission format of the stereoscopic image data from the set-top box 200 to the television receiver 300 is, for example, the side-by-side format.

When the television receiver 300 is a 3D-compatible device (3D TV), it applies 3D signal processing to the side-by-side stereoscopic image data sent from the set-top box 200 and generates left-eye image data and right-eye image data on which the subtitles are superimposed. The television receiver 300 then displays, on a display panel such as an LCD, binocular parallax images (a left-eye image and a right-eye image) that allow the user to perceive a stereoscopic image.

Next, the case is described in which stereoscopic image data and subtitle data (including display control information) are sent from the broadcast station 100 to the set-top box 200 and this set-top box 200 is a 3D-compatible device (3D STB). Based on the subtitle data (excluding the subregion display control information), the set-top box 200 generates display data of the region for displaying the left-eye subtitle and the right-eye subtitle. The set-top box 200 then extracts, from the display data of this region, the display data corresponding to the left-eye SR and the display data corresponding to the right-eye SR.

The set-top box 200 then superimposes the display data corresponding to the left-eye SR and the right-eye SR on the stereoscopic image data to obtain output stereoscopic image data. Here, the display data corresponding to the left-eye SR is superimposed on the frame portion (left-eye image frame portion) indicated by frame0, the target frame information of the left-eye SR. The display data corresponding to the right-eye SR is superimposed on the frame portion (right-eye image frame portion) indicated by frame1, the target frame information of the right-eye SR.

In this case, the display data corresponding to the left-eye SR is superimposed on the side-by-side stereoscopic image data at the position obtained by shifting the position indicated by Position1, the area information of the left-eye SR, by half of Disparity1, the disparity information of the left-eye SR. Likewise, the display data corresponding to the right-eye SR is superimposed on the side-by-side stereoscopic image data at the position obtained by shifting the position indicated by Position2, the area information of the right-eye SR, by half of Disparity2, the disparity information of the right-eye SR.
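The half-disparity shift used when superimposing onto a side-by-side frame can be written as a one-line calculation. The sketch below is illustrative only: the text does not state the sign convention of the shift or how an odd disparity is halved, so an additive shift and truncation toward zero are assumed, and `sbs_overlay_x` is a hypothetical name. The same function serves the left-eye SR (Position1, Disparity1) and the right-eye SR (Position2, Disparity2).

```python
def sbs_overlay_x(position: int, disparity: int) -> int:
    """Horizontal overlay position inside a side-by-side frame.

    Assumptions: the shift is added to the position, and an odd
    disparity is halved by truncating toward zero.
    """
    return position + int(disparity / 2)
```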

The set-top box 200 then transmits the output stereoscopic image data obtained as described above to the television receiver 300 through a digital interface such as HDMI. In this case, the transmission format of the stereoscopic image data from the set-top box 200 to the television receiver 300 is, for example, the side-by-side format.

Next, the case is described in which stereoscopic image data and subtitle data (including display control information) are sent from the broadcast station 100 to the television receiver 300 and this television receiver 300 is a 3D-compatible device (3D TV). Based on the subtitle data (excluding the subregion display control information), the television receiver 300 generates display data of the region for displaying the left-eye subtitle and the right-eye subtitle. The television receiver 300 then extracts, from the display data of this region, the display data corresponding to the left-eye SR and the display data corresponding to the right-eye SR (right-eye display data).

The television receiver 300 scales the display data corresponding to the left-eye SR by a factor of two horizontally to obtain full-resolution left-eye display data. It then superimposes this left-eye display data on the full-resolution left-eye image data corresponding to frame0, the target frame information of the left-eye SR. That is, the television receiver 300 superimposes the left-eye display data on the full-resolution left-eye image data obtained by scaling the left-eye image portion of the side-by-side stereoscopic image data by a factor of two horizontally, and thereby generates left-eye image data on which the subtitle is superimposed.

The television receiver 300 scales the display data corresponding to the right-eye SR by a factor of two horizontally to obtain full-resolution right-eye display data. It then superimposes this right-eye display data on the full-resolution right-eye image data corresponding to frame1, the target frame information of the right-eye SR. That is, the television receiver 300 superimposes the right-eye display data on the full-resolution right-eye image data obtained by scaling the right-eye image portion of the side-by-side stereoscopic image data by a factor of two horizontally, and thereby generates right-eye image data on which the subtitle is superimposed.

In this case, the left-eye display data is superimposed on the full-resolution left-eye image data at the position obtained by doubling Position1, the area information of the left-eye SR, and shifting the result by Disparity1, the disparity information of the left-eye SR. The right-eye display data is superimposed on the full-resolution right-eye image data at the position obtained by subtracting H/2 from Position2, the area information of the right-eye SR, doubling the result, and shifting it by Disparity2, the disparity information of the right-eye SR.
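The full-resolution positions follow directly from the two rules above. In this sketch, H is assumed to be the horizontal pixel count of the whole side-by-side frame (the text does not define it in this passage), the shift is assumed to be additive, and both function names are hypothetical.

```python
def full_res_left_x(position1: int, disparity1: int) -> int:
    # Position1 is doubled, then shifted by the full Disparity1.
    return 2 * position1 + disparity1


def full_res_right_x(position2: int, disparity2: int, h: int) -> int:
    # H/2 is removed first so the position is relative to the right half,
    # then the result is doubled and shifted by the full Disparity2.
    return 2 * (position2 - h // 2) + disparity2
```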

Based on the left-eye image data and right-eye image data generated as described above with the subtitles superimposed, the television receiver 300 displays, on a display panel such as an LCD, binocular parallax images (a left-eye image and a right-eye image) that allow the user to perceive a stereoscopic image.

FIG. 25 schematically shows the flow of stereoscopic image data and subtitle data (including display control information) from the broadcast station 100 to the television receiver 300 via the set-top box 200, or from the broadcast station 100 directly to the television receiver 300. In this case, the broadcast station 100 generates subtitle data for stereoscopic images that conforms to the MVC (Multi-view Video Coding) format. Here, the stereoscopic image data consists of base view image data (left-eye image data) and non-base view image data (right-eye image data). This stereoscopic image data is transmitted in the video data stream, and the subtitle data for stereoscopic images is transmitted in the subtitle data stream.

First, the case is described in which stereoscopic image data and subtitle data (including display control information) are sent from the broadcast station 100 to the set-top box 200 and this set-top box 200 is a legacy 2D-compatible device (Legacy 2D STB). Based on the subtitle data (excluding the subregion display control information), the set-top box 200 generates display data of the region for displaying the left-eye subtitle and the right-eye subtitle, and superimposes this display data on the base view (left-eye image data) to obtain output image data. The superimposition position in this case is the position of the region.

The set-top box 200 transmits this output image data to the television receiver 300 through a digital interface such as HDMI. The television receiver 300 displays a 2D image on its display panel whether it is a 2D-compatible device (2D TV) or a 3D-compatible device (3D TV).

When the set-top box 200 is instead a 3D-compatible device (3D STB), it superimposes the display data corresponding to the left-eye SR on the image data of the base view (left-eye image) indicated by frame0, the target frame information of the left-eye SR, and obtains output image data of the base view (left-eye image) on which the left-eye subtitle is superimposed. In this case, the display data corresponding to the left-eye SR is superimposed on the image data of the base view (left-eye image) at the position obtained by shifting the position indicated by Position1, the area information of the left-eye SR, by Disparity1, the disparity information of the left-eye SR.

The set-top box 200 also superimposes the display data corresponding to the right-eye SR on the image data of the non-base view (right-eye image) indicated by frame1, the target frame information of the right-eye SR, and obtains output image data of the non-base view (right-eye image) on which the right-eye subtitle is superimposed. In this case, the display data corresponding to the right-eye SR is superimposed on the image data of the non-base view (right-eye image) at the position obtained by shifting the position indicated by Position2, the area information of the right-eye SR, by Disparity2, the disparity information of the right-eye SR.

The set-top box 200 then transmits the image data of the base view (left-eye image) and the non-base view (right-eye image) obtained as described above to the television receiver 300 through a digital interface such as HDMI. In this case, the transmission format of the stereoscopic image data from the set-top box 200 to the television receiver 300 is, for example, the frame packing format.

When the television receiver 300 is a 3D-compatible device (3D TV), it applies 3D signal processing to the frame packing stereoscopic image data sent from the set-top box 200 and generates left-eye image data and right-eye image data on which the subtitles are superimposed. The television receiver 300 then displays, on a display panel such as an LCD, binocular parallax images (a left-eye image and a right-eye image) that allow the user to perceive a stereoscopic image.

Next, the case is described in which stereoscopic image data and subtitle data (including display control information) are sent from the broadcast station 100 to the television receiver 300 and this television receiver 300 is a 3D-compatible device (3D TV). Based on the subtitle data (excluding the subregion display control information), the television receiver 300 generates display data of the region for displaying the left-eye subtitle and the right-eye subtitle. The television receiver 300 then extracts, from the display data of this region, the display data corresponding to the left-eye SR and the display data corresponding to the right-eye SR.

The television receiver 300 superimposes the display data corresponding to the left-eye SR on the image data of the base view (left-eye image) indicated by frame0, the target frame information of the left-eye SR, and obtains output image data of the base view (left-eye image) on which the left-eye subtitle is superimposed. In this case, the display data corresponding to the left-eye SR is superimposed on the image data of the base view (left-eye image) at the position obtained by shifting the position indicated by Position1, the area information of the left-eye SR, by Disparity1, the disparity information of the left-eye SR.

The television receiver 300 also superimposes the display data corresponding to the right-eye SR on the image data of the non-base view (right-eye image) indicated by frame1, the target frame information of the right-eye SR, and obtains output image data of the non-base view (right-eye image) on which the right-eye subtitle is superimposed. In this case, the display data corresponding to the right-eye SR is superimposed on the image data of the non-base view (right-eye image) at the position obtained by shifting the position indicated by Position2, the area information of the right-eye SR, by Disparity2, the disparity information of the right-eye SR.

Based on the image data of the base view (left-eye image) and the non-base view (right-eye image) generated as described above with the subtitles superimposed, the television receiver 300 displays, on a display panel such as an LCD, binocular parallax images (a left-eye image and a right-eye image) that allow the user to perceive a stereoscopic image.

In the description above, display control information (area information, target frame information, and disparity information) is created separately for the left-eye SR and the right-eye SR. However, it is also conceivable to create display control information for only one of them, for example the left-eye SR. In that case, of the right-eye SR's area information, target frame information, and disparity information, the display control information of the left-eye SR includes the target frame information and the disparity information but not the area information.

FIG. 26 schematically shows an example of the flow, in that case, of stereoscopic image data and subtitle data (including display control information) from the broadcast station 100 to the television receiver 300 via the set-top box 200, or from the broadcast station 100 directly to the television receiver 300. In this case, the broadcast station 100 generates subtitle data for stereoscopic images that conforms to the side-by-side format. The stereoscopic image data is transmitted in the video data stream, and the subtitle data for stereoscopic images is transmitted in the subtitle data stream.

Next, the case is described in which stereoscopic image data and subtitle data (including display control information) are sent from the broadcast station 100 to the set-top box 200 and this set-top box 200 is a 3D-compatible device (3D STB). Based on the subtitle data (excluding the subregion display control information), the set-top box 200 generates display data of the region for displaying the left-eye subtitle and the right-eye subtitle. The set-top box 200 then extracts, from the display data of this region, the display data corresponding to the left-eye SR.

The set-top box 200 then superimposes the display data corresponding to the left-eye SR on the stereoscopic image data to obtain output stereoscopic image data. Here, the display data corresponding to the left-eye SR is superimposed on the frame portion (left-eye image frame portion) indicated by frame0, the target frame information of the left-eye SR. The same display data corresponding to the left-eye SR is also superimposed on the frame portion (right-eye image frame portion) indicated by frame1, the target frame information of the right-eye SR.

In this case, the display data corresponding to the left-eye SR is superimposed on the side-by-side stereoscopic image data at the position obtained by shifting the position indicated by Position, the area information, by half of Disparity1, the disparity information of the left-eye SR. The display data corresponding to the left-eye SR is also superimposed on the side-by-side stereoscopic image data at the position obtained by shifting the position indicated by Position + H/2 by half of Disparity2, the disparity information of the right-eye SR.
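When only the left-eye SR carries area information, both overlay positions derive from the single Position value. The sketch below assumes that H is the horizontal pixel count of the whole side-by-side frame, that the shift is additive, and that an odd disparity is halved by truncating toward zero; none of these conventions is stated in the text, and `single_sr_sbs_positions` is a hypothetical name.

```python
from typing import Tuple


def single_sr_sbs_positions(position: int, disparity1: int,
                            disparity2: int, h: int) -> Tuple[int, int]:
    """Overlay x positions in the left and right halves of a side-by-side frame."""
    left_x = position + int(disparity1 / 2)             # left-eye half (frame0)
    right_x = position + h // 2 + int(disparity2 / 2)   # right-eye half (frame1)
    return left_x, right_x
```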

Next, the case is described in which stereoscopic image data and subtitle data (including display control information) are sent from the broadcast station 100 to the television receiver 300 and this television receiver 300 is a 3D-compatible device (3D TV). Based on the subtitle data (excluding the subregion display control information), the television receiver 300 generates display data of the region for displaying the left-eye subtitle and the right-eye subtitle. The television receiver 300 then extracts, from the display data of this region, the display data corresponding to the left-eye SR.

The television receiver 300 scales the display data corresponding to the left-eye SR by a factor of two horizontally to obtain full-resolution left-eye display data. It then superimposes this left-eye display data on the full-resolution left-eye image data corresponding to frame0, the target frame information. That is, the television receiver 300 superimposes the left-eye display data on the full-resolution left-eye image data obtained by scaling the left-eye image portion of the side-by-side stereoscopic image data by a factor of two horizontally, and thereby generates left-eye image data on which the subtitle is superimposed.

The television receiver 300 also scales the display data corresponding to the left-eye SR by a factor of two horizontally to obtain full-resolution right-eye display data. It then superimposes this right-eye display data on the full-resolution right-eye image data corresponding to frame1, the target frame information. That is, the television receiver 300 superimposes the right-eye display data on the full-resolution right-eye image data obtained by scaling the right-eye image portion of the side-by-side stereoscopic image data by a factor of two horizontally, and thereby generates right-eye image data on which the subtitle is superimposed.

この場合、左眼表示データは、フル解像度の左眼画像データの、領域情報であるPositionが２倍とされる位置を、視差情報であるDisparity1分だけずらした位置に、重畳される。また、この場合、右眼表示データは、フル解像度の右眼画像データの、領域情報であるPositionが２倍とされる位置を、視差情報であるDisparity2分だけずらした位置に、重畳される。 In this case, the left-eye display data is superimposed on the full-resolution left-eye image data at the position obtained by doubling Position, which is the region information, and then shifting it by Disparity1, which is the disparity information. Likewise, the right-eye display data is superimposed on the full-resolution right-eye image data at the position obtained by doubling Position, which is the region information, and then shifting it by Disparity2, which is the disparity information.
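The placement rule above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, Position is taken to be a horizontal pixel coordinate in the half-width side-by-side picture, the disparity values are taken to be in full-resolution pixels with a positive value meaning a shift to the right, and horizontal doubling is done by simple pixel repetition. None of these details is specified in the text.

```python
def full_res_overlay_x(position_x: int, disparity: int) -> int:
    """Horizontal overlay position in the full-resolution frame.

    position_x: Position of the left-eye SR in the half-width picture
                (assumed to be a pixel coordinate).
    disparity:  Disparity1 or Disparity2 (assumed full-resolution pixels,
                positive = shift right).
    """
    return 2 * position_x + disparity


def scale_row_2x(row):
    """Double one bitmap row horizontally by pixel repetition (assumed method)."""
    return [pixel for pixel in row for _ in range(2)]


# Hypothetical values: the same subregion is placed differently for each eye.
left_x = full_res_overlay_x(position_x=100, disparity=8)    # uses Disparity1
right_x = full_res_overlay_x(position_x=100, disparity=-8)  # uses Disparity2
```

Because both eyes start from the same doubled Position, the parallax between the two subtitles in this sketch is simply the difference between the two disparity values.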

図２に示す送信データ生成部１１０において、マルチプレクサ１１９から出力されるビットストリームデータＢＳＤは、ビデオデータストリームとサブタイトルデータストリームとを有する多重化データストリームである。ビデオデータストリームには、立体画像データが含まれている。また、サブタイトルデータストリームには、その立体画像データの伝送フォーマットに対応した立体画像用（三次元画像用）のサブタイトルデータが含まれている。 In the transmission data generation unit 110 illustrated in FIG. 2, the bit stream data BSD output from the multiplexer 119 is a multiplexed data stream including a video data stream and a subtitle data stream. The video data stream includes stereoscopic image data. Further, the subtitle data stream includes subtitle data for stereoscopic images (for 3D images) corresponding to the transmission format of the stereoscopic image data.

この立体画像用のサブタイトルデータは、左眼サブタイトルのデータおよび右眼サブタイトルのデータを持っている。そのため、受信側においては、このサブタイトルデータに基づいて、立体画像データが持つ左眼画像データに重畳する左眼サブタイトルの表示データおよび立体画像データが持つ右眼画像データに重畳する右眼サブタイトルの表示データを容易に発生できる。これにより、処理の容易化が図られる。 This stereoscopic image subtitle data contains left-eye subtitle data and right-eye subtitle data. Therefore, on the receiving side, the display data of the left-eye subtitle to be superimposed on the left-eye image data of the stereoscopic image data, and the display data of the right-eye subtitle to be superimposed on the right-eye image data of the stereoscopic image data, can easily be generated from this subtitle data. This simplifies processing.

また、図２に示す送信データ生成部１１０において、マルチプレクサ１１９から出力されるビットストリームデータＢＳＤには、立体画像データ、立体画像用のサブタイトルデータの他に、表示制御情報も含まれる。この表示制御情報には、左眼ＳＲおよび右眼ＳＲに関連した表示制御情報（領域情報、ターゲットフレーム情報、視差情報）が含まれている。 In the transmission data generation unit 110 illustrated in FIG. 2, the bit stream data BSD output from the multiplexer 119 includes display control information in addition to stereoscopic image data and stereoscopic image subtitle data. This display control information includes display control information (region information, target frame information, parallax information) related to the left eye SR and the right eye SR.

そのため、受信側においては、左眼ＳＲ内の左眼サブタイトルおよび右眼ＳＲ内のサブタイトルのみをそれぞれターゲットフレームに重畳表示することが容易となる。そして、これら左眼ＳＲ内の左眼サブタイトルおよび右眼ＳＲ内のサブタイトルの表示位置に視差を付与でき、サブタイトル（字幕）の表示において、画像内の各物体との間の遠近感の整合性を最適な状態に維持することが可能となる。 Therefore, on the receiving side, it is easy to superimpose only the left-eye subtitle in the left-eye SR and the subtitle in the right-eye SR on their respective target frames. Parallax can then be given to the display positions of the left-eye subtitle in the left-eye SR and the subtitle in the right-eye SR, so that, when the subtitle (caption) is displayed, the consistency of perspective with each object in the image can be kept in an optimum state.

また、図２に示す送信データ生成部１１０において、サブタイトル処理部１１６からは、サブタイトル表示期間において順次更新される視差情報を含むＳＣＳセグメントを送信できるので、左眼ＳＲ内の左眼サブタイトルおよび右眼ＳＲ内の右眼サブタイトルの表示位置を動的に制御できる。これにより、受信側においては、左眼サブタイトルおよび右眼サブタイトルの間に付与する視差を画像内容の変化に連動して動的に変化させることが可能となる。 In the transmission data generation unit 110 shown in FIG. 2, the subtitle processing unit 116 can transmit an SCS segment containing disparity information that is sequentially updated during the subtitle display period, so the display positions of the left-eye subtitle in the left-eye SR and the right-eye subtitle in the right-eye SR can be controlled dynamically. As a result, on the receiving side, the parallax given between the left-eye subtitle and the right-eye subtitle can be changed dynamically in conjunction with changes in the image content.

また、図２に示す送信データ生成部１１０において、サブタイトル処理部１１６で作成されるＳＣＳのセグメントに含まれる視差情報は、サブタイトル表示期間の最初のフレームの視差情報と、その後の更新フレーム間隔毎のフレームの視差情報とからなるものとされる。そのため、送信データ量を低減でき、また、受信側において、視差情報を保持するためのメモリ容量の大幅な節約が可能となる。 In the transmission data generation unit 110 illustrated in FIG. 2, the disparity information included in the SCS segment created by the subtitle processing unit 116 includes disparity information of the first frame in the subtitle display period and each update frame interval thereafter. It is assumed to consist of disparity information of the frame. Therefore, the amount of transmission data can be reduced, and the memory capacity for holding the parallax information can be greatly saved on the receiving side.

また、図２に示す送信データ生成部１１０において、サブタイトル処理部１１６で作成されるＳＣＳのセグメントに含まれる更新フレーム間隔毎のフレームの視差情報は、前回の視差情報からのオフセット値ではなく、視差情報そのものである。そのため、受信側において、補間過程でエラーが生じても、一定遅延時間内にエラーからの復帰が可能になる。 Further, in the transmission data generation unit 110 shown in FIG. 2, the disparity information of the frame at each update frame interval, contained in the SCS segment created by the subtitle processing unit 116, is not an offset value from the previous disparity information but the disparity information itself. Therefore, even if an error occurs in the interpolation process on the receiving side, recovery from the error is possible within a fixed delay time.
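The benefit of sending the disparity value itself rather than an offset can be seen in a toy comparison. The following Python sketch is illustrative only: the decoder functions and the sample values are hypothetical, and a single corrupted update stands in for whatever error the receiver might encounter.

```python
def decode_absolute(samples):
    """Each update carries the disparity value itself (the scheme described above)."""
    return list(samples)


def decode_offsets(first, offsets):
    """Each update carries a delta from the previous value (the alternative not adopted)."""
    values = [first]
    for delta in offsets:
        values.append(values[-1] + delta)
    return values


# Hypothetical disparity values at successive update frame intervals.
true_values = [10, 12, 15, 15, 11]
true_offsets = [2, 3, 0, -4]

# Corrupt the second update in each representation.
absolute_rx = list(true_values)
absolute_rx[2] = 99
offsets_rx = list(true_offsets)
offsets_rx[1] = 87

absolute_decoded = decode_absolute(absolute_rx)
offset_decoded = decode_offsets(true_values[0], offsets_rx)

# With absolute values only the corrupted sample is wrong; with offsets the
# error is carried into every later sample.
absolute_errors = [a != t for a, t in zip(absolute_decoded, true_values)]
offset_errors = [o != t for o, t in zip(offset_decoded, true_values)]
```

In this sketch the absolute representation is correct again at the very next update, which corresponds to the bounded recovery delay mentioned in the text.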

また、図２に示す送信データ生成部１１０において、サブタイトル処理部１１６で作成されるＳＣＳのセグメントに含まれる視差情報は整数画素精度とされている。そのため、受信機毎の能力差は生じにくく、よって、時間経過に伴って異なる受信機同士で差が開くことはない。また、更新フレーム間隔の間の補間は受信機性能によって自由度が与えられているため、受信機設計の自由度があがる。 In addition, in the transmission data generation unit 110 shown in FIG. 2, the disparity information contained in the SCS segment created by the subtitle processing unit 116 has integer-pixel precision. Differences in capability between receivers are therefore unlikely to arise, and the results of different receivers do not drift apart as time passes. In addition, because the interpolation between update frame intervals is left to the capabilities of each receiver, the freedom of receiver design increases.

［セットトップボックスの説明］ [Description of Set Top Box]
図１に戻って、セットトップボックス２００は、放送局１００から放送波に載せて送信されてくるビットストリームデータ（トランスポートストリーム）ＢＳＤを受信する。このビットストリームデータＢＳＤには、左眼画像データおよび右眼画像データを含む立体画像データ、音声データが含まれている。また、このビットストリームデータＢＳＤには、サブタイトル（字幕）を表示するための立体画像用のサブタイトルデータ（表示制御情報を含む）が含まれている。 Returning to FIG. 1, the set top box 200 receives the bit stream data (transport stream) BSD transmitted from the broadcast station 100 on broadcast waves. This bit stream data BSD contains audio data and stereoscopic image data, the latter comprising left-eye image data and right-eye image data. The bit stream data BSD also contains stereoscopic image subtitle data (including display control information) for displaying a subtitle (caption).

セットトップボックス２００は、ビットストリーム処理部２０１を有している。このビットストリーム処理部２０１は、ビットストリームデータＢＳＤから、立体画像データ、音声データ、サブタイトルデータを抽出する。そして、このビットストリーム処理部２０１は、立体画像データ、サブタイトルデータ等を用いて、サブタイトルが重畳された立体画像データを生成する。 The set top box 200 has a bit stream processing unit 201. The bit stream processing unit 201 extracts stereoscopic image data, audio data, and subtitle data from the bit stream data BSD. The bit stream processing unit 201 generates stereoscopic image data on which the subtitle is superimposed using stereoscopic image data, subtitle data, and the like.

この場合、左眼画像に重畳する左眼サブタイトルと右眼画像に重畳する右眼サブタイトルとの間に視差を付与できる。例えば、上述したように、放送局１００から送信する立体画像用のサブタイトルデータを、左眼サブタイトルと右眼サブタイトルとの間に視差が付与されるように生成できる。また、例えば、上述したように、放送局１００から送られてくる立体画像用のサブタイトルデータに付加されている表示制御情報には、視差情報が含まれており、この視差情報に基づいて、左眼サブタイトルと右眼サブタイトルとの間に視差を付与できる。このように、左眼サブタイトルと右眼サブタイトルとの間に視差が付与されることで、ユーザは、サブタイトル（字幕）を画像の手前に認識可能となる。 In this case, parallax can be given between the left-eye subtitle superimposed on the left-eye image and the right-eye subtitle superimposed on the right-eye image. For example, as described above, the stereoscopic image subtitle data transmitted from the broadcast station 100 can be generated so that parallax is given between the left-eye subtitle and the right-eye subtitle. Also, as described above, the display control information added to the stereoscopic image subtitle data sent from the broadcast station 100 contains disparity information, and parallax can be given between the left-eye subtitle and the right-eye subtitle on the basis of this disparity information. Because parallax is given between the left-eye subtitle and the right-eye subtitle in this way, the user can perceive the subtitle (caption) in front of the image.

図２７（ａ）は、画像上におけるサブタイトル（字幕）の表示例を示している。この表示例では、背景と近景オブジェクトとからなる画像上に、字幕が重畳された例である。図２７（ｂ）は、背景、近景オブジェクト、字幕の遠近感を示し、字幕が最も手前に認識されることを示している。 FIG. 27A shows a display example of a subtitle (caption) on an image. In this display example, captions are superimposed on an image composed of a background and a foreground object. FIG. 27B shows the perspective of the background, the foreground object, and the caption, and indicates that the caption is recognized most forward.

図２８（ａ）は、図２７（ａ）と同じ、画像上におけるサブタイトル（字幕）の表示例を示している。図２８（ｂ）は、左眼画像に重畳される左眼字幕ＬＧＩと、右眼画像に重畳される右眼字幕ＲＧＩを示している。図２８（ｃ）は、字幕が最も手前に認識されるために、左眼字幕ＬＧＩと右眼字幕ＲＧＩとの間に視差が与えられることを示している。 FIG. 28A shows a display example of subtitles (captions) on the same image as FIG. 27A. FIG. 28B shows a left-eye caption LGI superimposed on the left-eye image and a right-eye caption RGI superimposed on the right-eye image. FIG. 28C shows that a parallax is given between the left-eye caption LGI and the right-eye caption RGI because the caption is recognized most forward.

［セットトップボックスの構成例］ [Configuration Example of Set Top Box]
セットトップボックス２００の構成例を説明する。図２９は、セットトップボックス２００の構成例を示している。このセットトップボックス２００は、ビットストリーム処理部２０１と、ＨＤＭＩ端子２０２と、アンテナ端子２０３と、デジタルチューナ２０４と、映像信号処理回路２０５と、ＨＤＭＩ送信部２０６と、音声信号処理回路２０７を有している。また、このセットトップボックス２００は、ＣＰＵ２１１と、フラッシュＲＯＭ２１２と、ＤＲＡＭ２１３と、内部バス２１４と、リモコン受信部２１５と、リモコン送信機２１６を有している。 A configuration example of the set top box 200 will be described. FIG. 29 shows a configuration example of the set top box 200. The set top box 200 has a bit stream processing unit 201, an HDMI terminal 202, an antenna terminal 203, a digital tuner 204, a video signal processing circuit 205, an HDMI transmission unit 206, and an audio signal processing circuit 207. The set top box 200 also has a CPU 211, a flash ROM 212, a DRAM 213, an internal bus 214, a remote control receiving unit 215, and a remote control transmitter 216.

アンテナ端子２０３は、受信アンテナ（図示せず）で受信されたテレビ放送信号を入力する端子である。デジタルチューナ２０４は、アンテナ端子２０３に入力されたテレビ放送信号を処理して、ユーザの選択チャネルに対応した所定のビットストリームデータ（トランスポートストリーム）ＢＳＤを出力する。 The antenna terminal 203 is a terminal for inputting a television broadcast signal received by a receiving antenna (not shown). The digital tuner 204 processes the television broadcast signal input to the antenna terminal 203 and outputs predetermined bit stream data (transport stream) BSD corresponding to the user's selected channel.

ビットストリーム処理部２０１は、上述したように、ビットストリームデータＢＳＤから立体画像データ、音声データ、立体画像用のサブタイトルデータ（表示制御情報を含む）等を抽出する。ビットストリーム処理部２０１は、音声データを出力する。また、このビットストリーム処理部２０１は、立体画像データに対して、左眼サブタイトルおよび右眼サブタイトルの表示データを合成し、サブタイトルが重畳された出力立体画像データを得る。表示制御情報は、左眼ＳＲおよび右眼ＳＲの領域情報、ターゲットフレーム情報、視差情報を含んでいる。 As described above, the bit stream processing unit 201 extracts stereoscopic image data, audio data, stereoscopic image subtitle data (including display control information), and the like from the bit stream data BSD. The bit stream processing unit 201 outputs audio data. Further, the bit stream processing unit 201 combines the display data of the left eye subtitle and the right eye subtitle with the stereoscopic image data, and obtains output stereoscopic image data on which the subtitle is superimposed. The display control information includes left eye SR and right eye SR area information, target frame information, and parallax information.

この場合、ビットストリーム処理部２０１は、サブタイトルデータ(サブリージョンの表示制御情報を除く)に基づいて、左眼サブタイトルおよび右眼サブタイトルを表示するためのリージョンの表示データを生成する。そして、ビットストリーム処理部２０１は、このリージョンの表示データから、左眼ＳＲおよび右眼ＳＲの領域情報に基づいて、左眼ＳＲに対応した表示データおよび右眼ＳＲに対応した表示データを抽出する。 In this case, the bit stream processing unit 201 generates region display data for displaying the left-eye subtitle and the right-eye subtitle on the basis of the subtitle data (excluding the subregion display control information). The bit stream processing unit 201 then extracts, from this region display data, the display data corresponding to the left-eye SR and the display data corresponding to the right-eye SR on the basis of the region information of the left-eye SR and the right-eye SR.

そして、ビットストリーム処理部２０１は、左眼ＳＲ、右眼ＳＲに対応した表示データを、立体画像データに重畳して、出力立体画像データ（表示用立体画像データ）を得る。この場合、左眼ＳＲに対応した表示データは、この左眼ＳＲのターゲットフレーム情報であるframe0で示されるフレーム部分(左眼画像フレーム部分)に重畳される。また、右眼ＳＲに対応した表示データは、この右眼ＳＲのターゲットフレーム情報であるframe1で示されるフレーム部分（右眼画像フレーム部分）に重畳される。この際、ビットストリーム処理部２０１は、左眼ＳＲ内の左眼サブタイトルおよび右眼ＳＲ内の右眼サブタイトルの表示位置（重畳位置）を、視差情報に基づいて、シフト調整する。 Then, the bit stream processing unit 201 superimposes display data corresponding to the left eye SR and right eye SR on the stereoscopic image data to obtain output stereoscopic image data (display stereoscopic image data). In this case, the display data corresponding to the left eye SR is superimposed on the frame portion (left eye image frame portion) indicated by frame0 which is the target frame information of the left eye SR. The display data corresponding to the right eye SR is superimposed on a frame portion (right eye image frame portion) indicated by frame1 which is target frame information of the right eye SR. At this time, the bit stream processing unit 201 shift-adjusts the display positions (superimposition positions) of the left eye subtitle in the left eye SR and the right eye subtitle in the right eye SR based on the parallax information.

映像信号処理回路２０５は、ビットストリーム処理部２０１で得られた出力立体画像データに対して必要に応じて画質調整処理などを行い、処理後の出力立体画像データをＨＤＭＩ送信部２０６に供給する。音声信号処理回路２０７は、ビットストリーム処理部２０１から出力された音声データに対して必要に応じて音質調整処理等を行い、処理後の音声データをＨＤＭＩ送信部２０６に供給する。 The video signal processing circuit 205 performs image quality adjustment processing on the output stereoscopic image data obtained by the bit stream processing unit 201 as necessary, and supplies the processed output stereoscopic image data to the HDMI transmission unit 206. The audio signal processing circuit 207 performs sound quality adjustment processing or the like on the audio data output from the bit stream processing unit 201 as necessary, and supplies the processed audio data to the HDMI transmission unit 206.

ＨＤＭＩ送信部２０６は、ＨＤＭＩに準拠した通信により、例えば、非圧縮の画像データおよび音声データを、ＨＤＭＩ端子２０２から送出する。この場合、ＨＤＭＩのＴＭＤＳチャネルで送信するため、画像データおよび音声データがパッキングされて、ＨＤＭＩ送信部２０６からＨＤＭＩ端子２０２に出力される。 The HDMI transmission unit 206 sends, for example, uncompressed image data and audio data from the HDMI terminal 202 by communication conforming to HDMI. In this case, since transmission is performed using an HDMI TMDS channel, image data and audio data are packed and output from the HDMI transmission unit 206 to the HDMI terminal 202.

例えば、放送局１００からの立体画像データの伝送フォーマットがサイド・バイ・サイド方式であるとき、ＴＭＤＳ伝送フォーマットはサイド・バイ・サイド方式とされる（図２４参照）。また、例えば、放送局１００からの立体画像データの伝送フォーマットがトップ・アンド・ボトム方式であるとき、ＴＭＤＳ伝送フォーマットはトップ・アンド・ボトム方式とされる。また、例えば、放送局１００からの立体画像データの伝送フォーマットがＭＶＣ方式であるとき、ＴＭＤＳ伝送フォーマットはフレームパッキング方式とされる（図２５参照）。 For example, when the transmission format of stereoscopic image data from the broadcasting station 100 is a side-by-side format, the TMDS transmission format is a side-by-side format (see FIG. 24). For example, when the transmission format of stereoscopic image data from the broadcasting station 100 is the top-and-bottom method, the TMDS transmission format is the top-and-bottom method. For example, when the transmission format of the stereoscopic image data from the broadcasting station 100 is the MVC method, the TMDS transmission format is a frame packing method (see FIG. 25).
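The correspondence described above between the broadcast transmission format and the TMDS transmission format amounts to a small lookup table. The Python sketch below records only the three cases named in the text; the string keys and the function name are hypothetical labels chosen for illustration, not identifiers from the HDMI specification.

```python
# Broadcast transmission format -> TMDS transmission format, as listed in the text.
TMDS_FORMAT_FOR_BROADCAST = {
    "side_by_side": "side_by_side",
    "top_and_bottom": "top_and_bottom",
    "mvc": "frame_packing",
}


def tmds_format(broadcast_format: str) -> str:
    """Return the TMDS format for a broadcast format; KeyError if not listed."""
    return TMDS_FORMAT_FOR_BROADCAST[broadcast_format]
```

Formats not mentioned in this passage are deliberately left out of the table rather than guessed.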

ＣＰＵ２１１は、セットトップボックス２００の各部の動作を制御する。フラッシュＲＯＭ２１２は、制御ソフトウェアの格納およびデータの保管を行う。ＤＲＡＭ２１３は、ＣＰＵ２１１のワークエリアを構成する。ＣＰＵ２１１は、フラッシュＲＯＭ２１２から読み出したソフトウェアやデータをＤＲＡＭ２１３上に展開してソフトウェアを起動させ、セットトップボックス２００の各部を制御する。 The CPU 211 controls the operation of each part of the set top box 200. The flash ROM 212 stores control software and data. The DRAM 213 constitutes a work area for the CPU 211. The CPU 211 develops software and data read from the flash ROM 212 on the DRAM 213 to activate the software, and controls each part of the set top box 200.

リモコン受信部２１５は、リモコン送信機２１６から送信されたリモートコントロール信号（リモコンコード）を受信し、ＣＰＵ２１１に供給する。ＣＰＵ２１１は、このリモコンコードに基づいて、セットトップボックス２００の各部を制御する。ＣＰＵ２１１、フラッシュＲＯＭ２１２およびＤＲＡＭ２１３は内部バス２１４に接続されている。 The remote control receiving unit 215 receives the remote control signal (remote control code) transmitted from the remote control transmitter 216 and supplies it to the CPU 211. The CPU 211 controls each part of the set top box 200 based on the remote control code. The CPU 211, flash ROM 212 and DRAM 213 are connected to the internal bus 214.

セットトップボックス２００の動作を簡単に説明する。アンテナ端子２０３に入力されたテレビ放送信号はデジタルチューナ２０４に供給される。このデジタルチューナ２０４では、テレビ放送信号が処理されて、ユーザの選択チャネルに対応した所定のビットストリームデータ（トランスポートストリーム）ＢＳＤが出力される。 The operation of the set top box 200 will be briefly described. A television broadcast signal input to the antenna terminal 203 is supplied to the digital tuner 204. The digital tuner 204 processes the television broadcast signal and outputs predetermined bit stream data (transport stream) BSD corresponding to the user's selected channel.

デジタルチューナ２０４から出力されるビットストリームデータＢＳＤは、ビットストリーム処理部２０１に供給される。このビットストリーム処理部２０１では、ビットストリームデータＢＳＤから立体画像データ、音声データ、立体画像用のサブタイトルデータ（表示制御情報を含む）等が抽出される。ビットストリーム処理部２０１では、立体画像データに対して、左眼サブタイトルおよび右眼サブタイトルの表示データ（ビットマップデータ）が合成され、サブタイトルが重畳された出力立体画像データが得られる。 The bit stream data BSD output from the digital tuner 204 is supplied to the bit stream processing unit 201. The bit stream processing unit 201 extracts stereoscopic image data, audio data, stereoscopic image subtitle data (including display control information), and the like from the bit stream data BSD. In the bit stream processing unit 201, display data (bitmap data) of the left eye subtitle and the right eye subtitle is combined with the stereoscopic image data, and output stereoscopic image data on which the subtitle is superimposed is obtained.

ビットストリーム処理部２０１で得られた出力立体画像データは、映像信号処理回路２０５に供給される。この映像信号処理回路２０５では、出力立体画像データに対して、必要に応じて画質調整処理等が行われる。この映像信号処理回路２０５から出力される処理後の出力立体画像データは、ＨＤＭＩ送信部２０６に供給される。 The output stereoscopic image data obtained by the bit stream processing unit 201 is supplied to the video signal processing circuit 205. In the video signal processing circuit 205, image quality adjustment processing or the like is performed on the output stereoscopic image data as necessary. The processed output stereoscopic image data output from the video signal processing circuit 205 is supplied to the HDMI transmission unit 206.

また、ビットストリーム処理部２０１で得られた音声データは、音声信号処理回路２０７に供給される。この音声信号処理回路２０７では、音声データに対して、必要に応じて音質調整処理等の処理が行われる。この音声信号処理回路２０７から出力される処理後の音声データは、ＨＤＭＩ送信部２０６に供給される。そして、ＨＤＭＩ送信部２０６に供給された立体画像データおよび音声データは、ＨＤＭＩのＴＭＤＳチャネルにより、ＨＤＭＩ端子２０２からＨＤＭＩケーブル４００に送出される。 Also, the audio data obtained by the bit stream processing unit 201 is supplied to the audio signal processing circuit 207. The audio signal processing circuit 207 performs processing such as sound quality adjustment processing on the audio data as necessary. The processed audio data output from the audio signal processing circuit 207 is supplied to the HDMI transmission unit 206. The stereoscopic image data and audio data supplied to the HDMI transmission unit 206 are transmitted from the HDMI terminal 202 to the HDMI cable 400 via the HDMI TMDS channel.

［ビットストリーム処理部の構成例］ [Configuration Example of Bit Stream Processing Unit]
図３０は、ビットストリーム処理部２０１の構成例を示している。このビットストリーム処理部２０１は、上述の図２に示す送信データ生成部１１０に対応した構成となっている。このビットストリーム処理部２０１は、デマルチプレクサ２２１と、ビデオデコーダ２２２と、オーディオデコーダ２２９を有している。また、このビットストリーム処理部２０１は、サブタイトルデコーダ２２３と、立体画像用サブタイトル発生部２２４と、表示制御部２２５と、表示制御情報取得部２２６と、視差情報処理部２２７と、ビデオ重畳部２２８を有している。 FIG. 30 shows a configuration example of the bit stream processing unit 201. The bit stream processing unit 201 is configured to correspond to the transmission data generation unit 110 shown in FIG. 2 described above. The bit stream processing unit 201 has a demultiplexer 221, a video decoder 222, and an audio decoder 229. It also has a subtitle decoder 223, a stereoscopic image subtitle generation unit 224, a display control unit 225, a display control information acquisition unit 226, a disparity information processing unit 227, and a video superimposing unit 228.

デマルチプレクサ２２１は、ビットストリームデータＢＳＤから、ビデオ、オーディオ、サブタイトルのパケットを抽出し、各デコーダに送る。なお、デマルチプレクサ２２１は、ビットストリームデータＢＳＤに挿入されているＰＭＴ、ＥＩＴ等の情報を抽出し、ＣＰＵ２１１に送る。上述したように、ＥＩＴの配下にあるコンポーネント・デスクリプタに,Stream_content(‘0x03’=DVB subtitles) ＆ Component_type(for 3Dtarget)が記述されている。これにより、サブタイトルデータストリームに立体画像用のサブタイトルデータが含まれることが識別可能とされている。したがって、ＣＰＵ２１１は、この記述により、サブタイトルデータストリームに立体画像用のサブタイトルデータが含まれることを識別できる。 The demultiplexer 221 extracts video, audio, and subtitle packets from the bit stream data BSD, and sends them to each decoder. The demultiplexer 221 extracts information such as PMT and EIT inserted in the bit stream data BSD, and sends the information to the CPU 211. As described above, Stream_content ('0x03' = DVB subtitles) & Component_type (for 3Dtarget) is described in the component descriptor under the EIT. Thereby, it can be identified that the subtitle data stream includes subtitle data for stereoscopic images. Therefore, the CPU 211 can identify that the subtitle data stream includes the subtitle data for stereoscopic images based on this description.

ビデオデコーダ２２２は、上述の送信データ生成部１１０のビデオエンコーダ１１２とは逆の処理を行う。すなわち、デマルチプレクサ２２１で抽出されたビデオのパケットからビデオデータストリームを再構成し、復号化処理を行って、左眼画像データおよび右眼画像データを含む立体画像データを得る。この立体画像データの伝送フォーマットは、例えば、サイド・バイ・サイド方式、トップ・アンド・ボトム方式、フレーム・シーケンシャル方式、ＭＶＣ方式などである。 The video decoder 222 performs processing reverse to that of the video encoder 112 of the transmission data generation unit 110 described above. In other words, a video data stream is reconstructed from the video packets extracted by the demultiplexer 221, and decoding processing is performed to obtain stereoscopic image data including left eye image data and right eye image data. The transmission format of the stereoscopic image data is, for example, a side-by-side method, a top-and-bottom method, a frame-sequential method, an MVC method, or the like.

サブタイトルデコーダ２２３は、上述の送信データ生成部１１０のサブタイトルエンコーダ１１８とは逆の処理を行う。すなわち、このサブタイトルデコーダ２２３は、デマルチプレクサ２２１で抽出されたサブタイトルのパケットからサブタイトルデータストリームを再構成し、復号化処理を行って、立体画像用のサブタイトルデータ（表示制御情報を含む）を得る。立体画像用サブタイトル発生部２２４は、立体画像用のサブタイトルデータ（表示制御情報を除く）に基づいて、立体画像データに重畳する左眼サブタイトルおよび右眼サブタイトルの表示データ（ビットマップデータ）を発生する。この立体画像用サブタイトル発生部２２４は、表示データ発生部を構成している。 The subtitle decoder 223 performs processing reverse to that of the subtitle encoder 118 of the transmission data generation unit 110 described above. That is, the subtitle decoder 223 reconstructs a subtitle data stream from the subtitle packet extracted by the demultiplexer 221 and performs a decoding process to obtain stereoscopic image subtitle data (including display control information). The stereoscopic image subtitle generating unit 224 generates display data (bitmap data) of the left eye subtitle and the right eye subtitle to be superimposed on the stereoscopic image data based on the stereoscopic image subtitle data (excluding display control information). . The stereoscopic image subtitle generating unit 224 constitutes a display data generating unit.

表示制御部２２５は、表示制御情報（左眼ＳＲ、右眼ＳＲの領域情報、ターゲットフレーム情報、視差情報）に基づいて、立体画像データに重畳する表示データを制御する。すなわち、表示制御部２２５は、左眼ＳＲ、右眼ＳＲの領域情報に基づいて、立体画像データに重畳する左眼サブタイトルおよび右眼サブタイトルの表示データ（ビットマップデータ）から、左眼ＳＲに対応した表示データおよび右眼ＳＲに対応した表示データを抽出する。 The display control unit 225 controls the display data to be superimposed on the stereoscopic image data on the basis of the display control information (the region information, target frame information, and disparity information of the left-eye SR and right-eye SR). That is, on the basis of the region information of the left-eye SR and right-eye SR, the display control unit 225 extracts the display data corresponding to the left-eye SR and the display data corresponding to the right-eye SR from the display data (bitmap data) of the left-eye subtitle and right-eye subtitle to be superimposed on the stereoscopic image data.

また、表示制御部２２５は、左眼ＳＲ、右眼ＳＲに対応した表示データを、ビデオ重畳部２２８に供給して、立体画像データに重畳する。この場合、左眼ＳＲに対応した表示データは、この左眼ＳＲのターゲットフレーム情報であるframe0で示されるフレーム部分(左眼画像フレーム部分)に重畳される。また、右眼ＳＲに対応した表示データは、この右眼ＳＲのターゲットフレーム情報であるframe1で示されるフレーム部分（右眼画像フレーム部分）に重畳される。この際、表示制御部２２５は、左眼ＳＲ内の左眼サブタイトルおよび右眼ＳＲ内の右眼サブタイトルの表示位置（重畳位置）を、視差情報に基づいて、シフト調整して、左眼サブタイトルおよび右眼サブタイトルの間に視差を付与する。 The display control unit 225 also supplies the display data corresponding to the left-eye SR and right-eye SR to the video superimposing unit 228, where it is superimposed on the stereoscopic image data. In this case, the display data corresponding to the left-eye SR is superimposed on the frame portion (left-eye image frame portion) indicated by frame0, which is the target frame information of the left-eye SR. The display data corresponding to the right-eye SR is superimposed on the frame portion (right-eye image frame portion) indicated by frame1, which is the target frame information of the right-eye SR. At this time, the display control unit 225 shift-adjusts the display positions (superimposition positions) of the left-eye subtitle in the left-eye SR and the right-eye subtitle in the right-eye SR on the basis of the disparity information, thereby giving parallax between the left-eye subtitle and the right-eye subtitle.

表示制御情報取得部２２６は、サブタイトルデータストリームから表示制御情報（領域情報、ターゲットフレーム情報、視差情報）を取得する。この表示制御情報には、字幕表示期間内で共通に使用される視差情報（図１８の「subregion_disparity」参照）が含まれる。また、この表示制御情報には、さらに、字幕表示期間内で順次更新される視差情報（図２１の「disparity_update」参照）が含まれることもある。この字幕表示期間内で順次更新される視差情報は、上述したように、字幕表示期間の最初のフレームの視差情報と、その後のベースセグメント期間（更新フレーム間隔）毎のフレームの視差情報とからなっている。 The display control information acquisition unit 226 acquires the display control information (region information, target frame information, and disparity information) from the subtitle data stream. This display control information contains disparity information that is used in common throughout the caption display period (see "subregion_disparity" in FIG. 18). It may also contain disparity information that is sequentially updated within the caption display period (see "disparity_update" in FIG. 21). As described above, the disparity information that is sequentially updated within the caption display period consists of the disparity information of the first frame of the caption display period and the disparity information of the frame at each subsequent base segment period (update frame interval).

視差情報処理部２２７は、表示制御情報に含まれる領域情報およびターゲットフレーム情報、さらに、字幕表示期間内で共通に使用される視差情報に関しては、そのまま表示制御部２２５に送る。一方、視差情報処理部２２７は、字幕表示期間内で順次更新される視差情報に関しては、補間処理を施して、字幕表示期間内における任意のフレーム間隔、例えば、１フレーム間隔の視差情報を生成して、表示制御部２２５に送る。 The disparity information processing unit 227 passes the region information and target frame information contained in the display control information, as well as the disparity information used in common throughout the caption display period, to the display control unit 225 unchanged. For the disparity information that is sequentially updated within the caption display period, on the other hand, the disparity information processing unit 227 performs interpolation processing to generate disparity information at an arbitrary frame interval within the caption display period, for example at one-frame intervals, and sends it to the display control unit 225.

視差情報処理部２２７は、この補間処理として、線形補間処理ではなく、時間方向（フレーム方向）にローパスフィルタ（ＬＰＦ）処理を伴った補間処理を行って、補間処理後の所定フレーム間隔の視差情報の時間方向（フレーム方向）を変化がなだらかになるようにしている。図３１は、視差情報処理部２２７における上述のＬＰＦ処理を伴った補間処理の一例を示している。この例では、上述の図２３の視差情報の更新例に対応している。 As this interpolation processing, the disparity information processing unit 227 performs not linear interpolation but interpolation accompanied by low-pass filter (LPF) processing in the time direction (frame direction), so that the disparity information at the predetermined frame interval after interpolation changes gently in the time direction (frame direction). FIG. 31 shows an example of this interpolation processing with LPF processing in the disparity information processing unit 227. This example corresponds to the disparity information update example of FIG. 23 described above.
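One possible way to realize interpolation accompanied by LPF processing is sketched below in Python. This is not the filter of FIG. 31, which is not reproduced in this section: the sketch assumes the transmitted values are first upsampled to one-frame intervals by straight-line segments and then smoothed with a centered moving average, and the function names, the tap count, and the edge handling are all hypothetical choices.

```python
def interpolate_linear(keyframes, interval):
    """Upsample values given every `interval` frames to one value per frame."""
    frames = []
    for start, end in zip(keyframes, keyframes[1:]):
        for k in range(interval):
            frames.append(start + (end - start) * k / interval)
    frames.append(float(keyframes[-1]))
    return frames


def low_pass(values, taps=3):
    """Centered moving average; the ends are padded by repeating the edge value."""
    half = taps // 2
    padded = [values[0]] * half + list(values) + [values[-1]] * half
    return [sum(padded[i:i + taps]) / taps for i in range(len(values))]


def interpolate_with_lpf(keyframes, interval, taps=3):
    """Per-frame disparity whose changes over time are smoothed (assumed filter)."""
    return low_pass(interpolate_linear(keyframes, interval), taps)


# Hypothetical update values sent every 4 frames.
per_frame = interpolate_with_lpf([0, 8, 0], interval=4)
```

Compared with plain linear interpolation, the moving average rounds off the corner at each update point, which is the gentle change in the time direction that the text describes. A receiver could substitute any other low-pass filter, since the interpolation method is left to the receiver.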

ここで、上述の表示制御部２２５は、視差情報処理部２２７から字幕表示期間内で共通に使用される視差情報（視差ベクトル）のみが送られてくる場合、その視差情報を使用する。また、表示制御部２２５は、視差情報処理部２２７から、さらに字幕表示期間内で順次更新される視差情報も送られてくる場合には、いずれかを使用する。 Here, when only the disparity information (disparity vector) used in common throughout the caption display period is sent from the disparity information processing unit 227, the display control unit 225 uses that disparity information. When disparity information that is sequentially updated within the caption display period is also sent from the disparity information processing unit 227, the display control unit 225 uses one of the two.

いずれを使用するかは、例えば、上述したように、拡張表示制御のデータユニットに含まれている、字幕表示の際に受信側（デコーダ側）で必須の視差情報（disparity）対応レベルを示す情報（図１８の「rendering_level」参照）に拘束される。その場合、例えば、“００”であるときは、ユーザ設定による。字幕表示期間内で順次更新される視差情報を用いることで、左眼サブタイトルおよび右眼サブタイトルに付与する視差を画像内容の変化に連動して動的に変化させることが可能となる。 Which of the two is used is constrained, for example, by the information contained in the extended display control data unit, as described above, that indicates the level of disparity support required on the receiving side (decoder side) when captions are displayed (see "rendering_level" in FIG. 18). In that case, for example, when this value is "00", the choice follows the user setting. By using the disparity information that is sequentially updated within the caption display period, the parallax given to the left-eye subtitle and the right-eye subtitle can be changed dynamically in conjunction with changes in the image content.

ビデオ重畳部２２８は、出力立体画像データＶoutを得る。この場合、ビデオ重畳部２２８は、ビデオデコーダ２２２で得られた立体画像データに対し、表示制御部２２５でシフト調整された左眼ＳＲ、右眼ＳＲの表示データ（ビットマップデータ）を、対応するターゲットフレーム部分に、重畳する。そして、ビデオ重畳部２２８は、この出力立体画像データＶoutを、ビットストリーム処理部２０１の外部に出力する。 The video superimposing unit 228 obtains output stereoscopic image data Vout. In this case, the video superimposing unit 228 corresponds to the display data (bitmap data) of the left eye SR and the right eye SR that are shift-adjusted by the display control unit 225 with respect to the stereoscopic image data obtained by the video decoder 222. Superimpose on the target frame part. Then, the video superimposing unit 228 outputs the output stereoscopic image data Vout to the outside of the bit stream processing unit 201.

また、オーディオデコーダ２２９は、上述の送信データ生成部１１０のオーディオエンコーダ１１３とは逆の処理を行う。すなわち、このオーディオデコーダ２２９は、デマルチプレクサ２２１で抽出されたオーディオのパケットからオーディオのエレメンタリストリームを再構成し、復号化処理を行って、音声データＡoutを得る。そして、このオーディオデコーダ２２９は、音声データＡoutを、ビットストリーム処理部２０１の外部に出力する。 Also, the audio decoder 229 performs a process reverse to that of the audio encoder 113 of the transmission data generation unit 110 described above. That is, the audio decoder 229 reconstructs an audio elementary stream from the audio packet extracted by the demultiplexer 221 and performs a decoding process to obtain audio data Aout. The audio decoder 229 outputs the audio data Aout to the outside of the bit stream processing unit 201.

図３０に示すビットストリーム処理部２０１の動作を簡単に説明する。デジタルチューナ２０４（図２９参照）から出力されるビットストリームデータＢＳＤは、デマルチプレクサ２２１に供給される。このデマルチプレクサ２２１では、ビットストリームデータＢＳＤから、ビデオ、オーディオおよびサブタイトルのパケットが抽出され、各デコーダに供給される。 The operation of the bit stream processing unit 201 illustrated in FIG. 30 will be briefly described. The bit stream data BSD output from the digital tuner 204 (see FIG. 29) is supplied to the demultiplexer 221. In the demultiplexer 221, video, audio, and subtitle packets are extracted from the bit stream data BSD and supplied to each decoder.

ビデオデコーダ２２２では、デマルチプレクサ２２１で抽出されたビデオのパケットからビデオデータストリームが再構成され、さらに復号化処理が行われて、左眼画像データおよび右眼画像データを含む立体画像データが得られる。この立体画像データは、ビデオ重畳部２２８に供給される。 In the video decoder 222, a video data stream is reconstructed from the video packets extracted by the demultiplexer 221 and then decoded to obtain stereoscopic image data containing left-eye image data and right-eye image data. This stereoscopic image data is supplied to the video superimposing unit 228.

また、サブタイトルデコーダ２２３では、デマルチプレクサ２２１で抽出されたサブタイトルのパケットからサブタイトルデータストリームが再構成され、さらに復号化処理が行われて、立体画像用のサブタイトルデータ（表示制御情報を含む）が得られる。このサブタイトルデータは、立体画像用サブタイトル発生部２２４に供給される。 In the subtitle decoder 223, a subtitle data stream is reconstructed from the subtitle packets extracted by the demultiplexer 221 and then decoded to obtain stereoscopic image subtitle data (including display control information). This subtitle data is supplied to the stereoscopic image subtitle generation unit 224.

立体画像用サブタイトル発生部２２４では、立体画像用のサブタイトルデータ（表示制御情報を除く）に基づいて、立体画像データに重畳する左眼サブタイトルおよび右眼サブタイトルの表示データ（ビットマップデータ）が発生される。この表示データは、表示制御部２２５に供給される。 Based on the stereoscopic image subtitle data (excluding display control information), the stereoscopic image subtitle generating unit 224 generates display data (bitmap data) of the left eye subtitle and the right eye subtitle to be superimposed on the stereoscopic image data. This display data is supplied to the display control unit 225.

また、表示制御情報取得部２２６では、サブタイトルデータストリームから表示制御情報（領域情報、ターゲットフレーム情報、視差情報）が取得される。この表示制御情報は、視差情報処理部２２７を通じて表示制御部２２５に供給される。この際、視差情報処理部２２７では、字幕表示期間内で順次更新される視差情報に関して、以下の処理が行われる。すなわち、視差情報処理部２２７では、時間方向（フレーム方向）のＬＰＦ処理を伴った補間処理が施されて、字幕表示期間内における任意のフレーム間隔、例えば、１フレーム間隔の視差情報が生成されて、表示制御部２２５に送られる。 In addition, the display control information acquisition unit 226 acquires display control information (region information, target frame information, disparity information) from the subtitle data stream. This display control information is supplied to the display control unit 225 through the disparity information processing unit 227. At this time, the disparity information processing unit 227 performs the following processing on disparity information that is sequentially updated within the caption display period. That is, the disparity information processing unit 227 performs interpolation processing accompanied by LPF processing in the time direction (frame direction), generates disparity information at an arbitrary frame interval, for example, a one-frame interval, within the caption display period, and sends it to the display control unit 225.

表示制御部２２５では、表示制御情報（左眼ＳＲ、右眼ＳＲの領域情報、ターゲットフレーム情報、視差情報）に基づいて、立体画像データに対する表示データの重畳が制御される。すなわち、立体画像用サブタイトル発生部２２４で発生された表示データから、左眼ＳＲ、右眼ＳＲの表示データが抽出されて、シフト調整される。その後に、シフト調整された左眼ＳＲ、右眼ＳＲの表示データが、立体画像データのターゲットフレームに重畳されるように、ビデオ重畳部２２８に供給される。 The display control unit 225 controls superimposition of display data on stereoscopic image data based on display control information (left eye SR, right eye SR region information, target frame information, and parallax information). That is, the display data of the left eye SR and the right eye SR is extracted from the display data generated by the stereoscopic image subtitle generation unit 224, and shift adjustment is performed. Thereafter, the display data of the left eye SR and the right eye SR that have been subjected to the shift adjustment is supplied to the video superimposing unit 228 so as to be superimposed on the target frame of the stereoscopic image data.

ビデオ重畳部２２８では、ビデオデコーダ２２２で得られた立体画像データに対し、表示制御部２２５でシフト調整された表示データが重畳され、出力立体画像データＶoutが得られる。この出力立体画像データＶoutは、ビットストリーム処理部２０１の外部に出力される。 The video superimposing unit 228 superimposes the display data shift-adjusted by the display control unit 225 on the stereoscopic image data obtained by the video decoder 222 to obtain output stereoscopic image data Vout. The output stereoscopic image data Vout is output to the outside of the bit stream processing unit 201.
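The shift adjustment and superimposition described above can be sketched as follows. This is a minimal illustration, not the method of the display control unit 225 or the video superimposing unit 228: frames are modelled as lists of pixel rows, `None` marks a transparent subtitle pixel, and applying the whole disparity to the right-eye view (with this sign) is an assumption made only for the example. The function names are hypothetical.

```python
def superimpose(frame, bitmap, x, y):
    """Return a copy of frame with bitmap drawn at (x, y); None is transparent."""
    out = [row[:] for row in frame]
    for dy, bitmap_row in enumerate(bitmap):
        for dx, pixel in enumerate(bitmap_row):
            fx, fy = x + dx, y + dy
            inside = 0 <= fy < len(out) and 0 <= fx < len(out[0])
            if pixel is not None and inside:
                out[fy][fx] = pixel
    return out


def overlay_stereo(left, right, bitmap, x, y, disparity):
    """Draw the subtitle in both views, offsetting the right-eye copy."""
    # Assumption: the left-eye subtitle stays at x and the right-eye
    # subtitle is moved by the full disparity value.
    left_out = superimpose(left, bitmap, x, y)
    right_out = superimpose(right, bitmap, x - disparity, y)
    return left_out, right_out


left = [[0] * 8 for _ in range(2)]
right = [[0] * 8 for _ in range(2)]
subtitle = [[9, 9]]
l_out, r_out = overlay_stereo(left, right, subtitle, x=4, y=1, disparity=2)
```

The horizontal offset between the two drawn copies is what gives the subtitle its apparent depth relative to the objects in the image.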

また、オーディオデコーダ２２９では、デマルチプレクサ２２１で抽出されたオーディオのパケットからオーディオエレメンタリストリームが再構成され、さらに復号化処理が行われて、上述の表示用立体画像データＶoutに対応した音声データＡoutが得られる。この音声データＡoutは、ビットストリーム処理部２０１の外部に出力される。 Also, the audio decoder 229 reconstructs an audio elementary stream from the audio packets extracted by the demultiplexer 221 and decodes it to obtain audio data Aout corresponding to the display stereoscopic image data Vout described above. The audio data Aout is output to the outside of the bit stream processing unit 201.

図２９に示すセットトップボックス２００において、デジタルチューナ２０４から出力されるビットストリームデータＢＳＤは、ビデオデータストリームとサブタイトルデータストリームとを有する多重化データストリームである。ビデオデータストリームには、立体画像データが含まれている。また、サブタイトルデータストリームには、その立体画像データの伝送フォーマットに対応した立体画像用（三次元画像用）のサブタイトルデータが含まれている。 In the set top box 200 shown in FIG. 29, the bit stream data BSD output from the digital tuner 204 is a multiplexed data stream having a video data stream and a subtitle data stream. The video data stream includes stereoscopic image data. Further, the subtitle data stream includes subtitle data for stereoscopic images (for 3D images) corresponding to the transmission format of the stereoscopic image data.

この立体画像用のサブタイトルデータは、左眼サブタイトルのデータおよび右眼サブタイトルのデータを持っている。そのため、ビットストリーム処理部２０１の立体画像用サブタイトル発生部２２４では、立体画像データが持つ左眼画像データに重畳する左眼サブタイトルの表示データを容易に発生できる。また、ビットストリーム処理部２０１の立体画像用サブタイトル発生部２２４では、立体画像データが持つ右眼画像データに重畳する右眼サブタイトルの表示データを容易に発生できる。これにより、処理の容易化が図られる。 The stereoscopic image subtitle data includes left-eye subtitle data and right-eye subtitle data. Therefore, the stereoscopic image subtitle generating unit 224 of the bitstream processing unit 201 can easily generate display data for the left eye subtitle to be superimposed on the left eye image data included in the stereoscopic image data. Further, the stereoscopic image subtitle generating unit 224 of the bitstream processing unit 201 can easily generate display data of the right eye subtitle to be superimposed on the right eye image data included in the stereoscopic image data. This facilitates processing.

また、図２９に示すセットトップボックス２００において、デジタルチューナ２０４から出力されるビットストリームデータＢＳＤには、立体画像データ、立体画像用のサブタイトルデータの他に、表示制御情報も含まれる。この表示制御情報には、左眼ＳＲおよび右眼ＳＲに関連した表示制御情報（領域情報、ターゲットフレーム情報、視差情報）が含まれている。そのため、左眼ＳＲ内の左眼サブタイトルおよび右眼ＳＲ内のサブタイトルのみをそれぞれターゲットフレームに重畳表示することが容易となる。また、これら左眼ＳＲ内の左眼サブタイトルおよび右眼ＳＲ内のサブタイトルの表示位置に視差を付与でき、サブタイトル（字幕）の表示において、画像内の各物体との間の遠近感の整合性を最適な状態に維持することが可能となる。 In the set top box 200 shown in FIG. 29, the bit stream data BSD output from the digital tuner 204 includes display control information in addition to stereoscopic image data and stereoscopic image subtitle data. This display control information includes display control information (region information, target frame information, disparity information) related to the left eye SR and the right eye SR. Therefore, it becomes easy to superimpose only the left eye subtitle in the left eye SR and the right eye subtitle in the right eye SR on their respective target frames. Further, disparity can be given to the display positions of the left eye subtitle in the left eye SR and the right eye subtitle in the right eye SR, so that, when subtitles (captions) are displayed, the consistency of perspective with each object in the image can be maintained in an optimum state.

また、図２９に示すセットトップボックス２００において、ビットストリーム処理部２０１の表示制御情報取得部２２６で取得される表示制御情報に字幕表示期間内で順次更新される視差情報が含まれる場合、表示制御部２２５により、左眼ＳＲ内の左眼サブタイトルおよび右眼ＳＲ内の右眼サブタイトルの表示位置を動的に制御できる。これにより、左眼サブタイトルおよび右眼サブタイトルの間に付与する視差を画像内容の変化に連動して動的に変化させることが可能となる。 Also, in the set top box 200 shown in FIG. 29, when the display control information acquired by the display control information acquisition unit 226 of the bit stream processing unit 201 includes disparity information that is sequentially updated within the caption display period, the display control unit 225 can dynamically control the display positions of the left eye subtitle in the left eye SR and the right eye subtitle in the right eye SR. Thereby, the disparity provided between the left eye subtitle and the right eye subtitle can be dynamically changed in conjunction with changes in the image content.

また、図２９に示すセットトップボックス２００において、ビットストリーム処理部２０１の視差情報処理部２２７で、字幕表示期間（所定数のフレーム期間）内で順次更新される視差情報を構成する複数フレームの視差情報に対して補間処理が施される。この場合、送信側から更新フレーム間隔毎に視差情報が送信される場合であっても、左眼サブタイトルおよび右眼サブタイトルの間に付与する視差を、細かな間隔で、例えばフレーム毎に制御することが可能となる。 Also, in the set top box 200 shown in FIG. 29, the disparity information processing unit 227 of the bit stream processing unit 201 performs interpolation processing on the disparity information of the plurality of frames that constitute the disparity information sequentially updated within the caption display period (a predetermined number of frame periods). In this case, even when the transmission side sends disparity information only at every update frame interval, the disparity provided between the left eye subtitle and the right eye subtitle can be controlled at fine intervals, for example, for each frame.

また、図２９に示すセットトップボックス２００において、ビットストリーム処理部２０１の視差情報処理部２２７における補間処理は、例えば、時間方向（フレーム方向）のローパスフィルタ処理を伴うようにされる。そのため、送信側から更新フレーム間隔毎に視差情報が送信される場合であっても、補間処理後の視差情報の時間方向の変化をなだらかにでき、左眼サブタイトルおよび右眼サブタイトルの間に付与される視差の推移が、更新フレーム間隔毎に不連続となることによる違和感を抑制できる。 In the set top box 200 shown in FIG. 29, the interpolation processing in the disparity information processing unit 227 of the bit stream processing unit 201 is accompanied by, for example, low-pass filter processing in the time direction (frame direction). Therefore, even when the transmission side sends disparity information only at every update frame interval, the change of the interpolated disparity information in the time direction can be made gradual. This suppresses the sense of incongruity that would arise if the transition of the disparity provided between the left eye subtitle and the right eye subtitle became discontinuous at every update frame interval.
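The effect of interpolation accompanied by low-pass filtering can be illustrated with a short sketch. The text does not specify the interpolation method or the filter, so linear interpolation and a 3-tap moving average are assumptions chosen for simplicity; the function names and the sample values are hypothetical. The sketch also assumes that a disparity value is available for the first frame (index 0) of the caption display period.

```python
def interpolate_disparity(keyframes, total_frames):
    """Linearly interpolate (frame_index, disparity) pairs to one value per frame."""
    values = []
    for frame in range(total_frames):
        # Nearest update point at or before this frame (frame 0 must exist).
        prev = max((k for k in keyframes if k[0] <= frame), key=lambda k: k[0])
        later = [k for k in keyframes if k[0] > frame]
        if not later:  # past the last update point: hold the last value
            values.append(float(prev[1]))
            continue
        nxt = min(later, key=lambda k: k[0])
        t = (frame - prev[0]) / (nxt[0] - prev[0])
        values.append(prev[1] + t * (nxt[1] - prev[1]))
    return values


def low_pass(values, taps=3):
    """Moving average in the frame direction; edges reuse the nearest sample."""
    half = taps // 2
    n = len(values)
    out = []
    for i in range(n):
        window = [values[min(max(j, 0), n - 1)] for j in range(i - half, i + half + 1)]
        out.append(sum(window) / len(window))
    return out


# Disparity sent only at an update frame interval of 4 frames.
keyframes = [(0, 0), (4, 8), (8, 0)]
per_frame = interpolate_disparity(keyframes, total_frames=9)
smoothed = low_pass(per_frame)
```

In this example the interpolated sequence has a sharp corner at frame 4, and the filter rounds that corner off, which is the kind of gradual transition the paragraph above describes.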

［テレビ受信機の説明］
図１に戻って、テレビ受信機３００は、セットトップボックス２００からＨＤＭＩケーブル４００を介して送られてくる立体画像データを受信する。このテレビ受信機３００は、３Ｄ信号処理部３０１を有している。この３Ｄ信号処理部３０１は、立体画像データに対して、伝送フォーマットに対応した処理（デコード処理）を行って、左眼画像データおよび右眼画像データを生成する。 [Description of TV receiver]
Returning to FIG. 1, the television receiver 300 receives stereoscopic image data sent from the set top box 200 via the HDMI cable 400. The television receiver 300 includes a 3D signal processing unit 301. The 3D signal processing unit 301 performs processing (decoding processing) corresponding to the transmission format on the stereoscopic image data to generate left-eye image data and right-eye image data.

［テレビ受信機の構成例］
テレビ受信機３００の構成例を説明する。図３２は、テレビ受信機３００の構成例を示している。このテレビ受信機３００は、３Ｄ信号処理部３０１と、ＨＤＭＩ端子３０２と、ＨＤＭＩ受信部３０３と、アンテナ端子３０４と、デジタルチューナ３０５と、ビットストリーム処理部３０６を有している。 [Configuration example of TV receiver]
A configuration example of the television receiver 300 will be described. FIG. 32 illustrates a configuration example of the television receiver 300. The television receiver 300 includes a 3D signal processing unit 301, an HDMI terminal 302, an HDMI receiving unit 303, an antenna terminal 304, a digital tuner 305, and a bit stream processing unit 306.

また、このテレビ受信機３００は、映像・グラフィック処理回路３０７と、パネル駆動回路３０８と、表示パネル３０９と、音声信号処理回路３１０と、音声増幅回路３１１と、スピーカ３１２を有している。また、このテレビ受信機３００は、ＣＰＵ３２１と、フラッシュＲＯＭ３２２と、ＤＲＡＭ３２３と、内部バス３２４と、リモコン受信部３２５と、リモコン送信機３２６を有している。 The television receiver 300 includes a video / graphic processing circuit 307, a panel drive circuit 308, a display panel 309, an audio signal processing circuit 310, an audio amplification circuit 311, and a speaker 312. In addition, the television receiver 300 includes a CPU 321, a flash ROM 322, a DRAM 323, an internal bus 324, a remote control receiving unit 325, and a remote control transmitter 326.

アンテナ端子３０４は、受信アンテナ（図示せず）で受信されたテレビ放送信号を入力する端子である。デジタルチューナ３０５は、アンテナ端子３０４に入力されたテレビ放送信号を処理して、ユーザの選択チャネルに対応した所定のビットストリームデータ（トランスポートストリーム）ＢＳＤを出力する。ビットストリーム処理部３０６は、ビットストリームデータＢＳＤから立体画像データ、音声データ、立体画像用のサブタイトルデータ（表示制御情報も含む）等を抽出する。 The antenna terminal 304 is a terminal for inputting a television broadcast signal received by a receiving antenna (not shown). The digital tuner 305 processes the television broadcast signal input to the antenna terminal 304 and outputs predetermined bit stream data (transport stream) BSD corresponding to the user's selected channel. The bit stream processing unit 306 extracts stereoscopic image data, audio data, stereoscopic image subtitle data (including display control information), and the like from the bit stream data BSD.

また、このビットストリーム処理部３０６は、セットトップボックス２００のビットストリーム処理部２０１と同様に、構成される。このビットストリーム処理部３０６は、立体画像データに対して、左眼サブタイトルおよび右眼サブタイトルの表示データを合成し、サブタイトルが重畳された出力立体画像データを生成して出力する。なお、このビットストリーム処理部３０６は、例えば、立体画像データの伝送フォーマットがサイド・バイ・サイド方式、あるいはトップ・アンド・ボトム方式などの場合、スケーリング処理を施し、フル解像度の左眼画像データおよび右眼画像データを出力する（図２４〜図２６のテレビ受信機３００の部分参照）。また、ビットストリーム処理部３０６は、音声データを出力する。 The bit stream processing unit 306 is configured in the same manner as the bit stream processing unit 201 of the set top box 200. The bit stream processing unit 306 combines the display data of the left eye subtitle and the right eye subtitle with the stereoscopic image data, and generates and outputs output stereoscopic image data on which the subtitles are superimposed. Note that when the transmission format of the stereoscopic image data is, for example, the side-by-side method or the top-and-bottom method, the bit stream processing unit 306 performs scaling processing and outputs full-resolution left eye image data and right eye image data (see the portion of the television receiver 300 in FIGS. 24 to 26). The bit stream processing unit 306 also outputs audio data.
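The scaling step for a side-by-side frame can be sketched as follows. This is an illustration only: it assumes that the left half of the frame carries the left eye image, and it restores the full width by repeating each pixel (nearest-neighbour scaling), which is an assumption and not necessarily what the bit stream processing unit 306 does. The function name is hypothetical.

```python
def split_side_by_side(frame):
    """Split a side-by-side frame into left/right views at full width."""
    half = len(frame[0]) // 2

    def upscale(row):
        # Nearest-neighbour scaling: repeat each pixel twice (assumption;
        # a receiver may use a different interpolation filter).
        return [pixel for pixel in row for _ in range(2)]

    left_view = [upscale(row[:half]) for row in frame]
    right_view = [upscale(row[half:]) for row in frame]
    return left_view, right_view


# One-row toy frame: left half (1, 2), right half (7, 8).
frame = [[1, 2, 7, 8]]
left_view, right_view = split_side_by_side(frame)
```

A top-and-bottom frame would be handled the same way with rows and columns exchanged.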

ＨＤＭＩ受信部３０３は、ＨＤＭＩに準拠した通信により、ＨＤＭＩケーブル４００を介してＨＤＭＩ端子３０２に供給される非圧縮の画像データおよび音声データを受信する。このＨＤＭＩ受信部３０３は、そのバージョンが例えばＨＤＭＩ１．４ａとされており、立体画像データの取り扱いが可能な状態にある。 The HDMI receiving unit 303 receives uncompressed image data and audio data supplied to the HDMI terminal 302 via the HDMI cable 400 by communication conforming to HDMI. The HDMI receiving unit 303 has a version of, for example, HDMI 1.4a, and can handle stereoscopic image data.

３Ｄ信号処理部３０１は、ＨＤＭＩ受信部３０３で受信された立体画像データに対して、デコード処理を行って、フル解像度の左眼画像データおよび右眼画像データを生成する。３Ｄ信号処理部３０１は、ＴＭＤＳ伝送データフォーマットに対応したデコード処理を行う。なお、３Ｄ信号処理部３０１は、ビットストリーム処理部３０６で得られたフル解像度の左眼画像データおよび右眼画像データに対しては何もしない。 The 3D signal processing unit 301 performs decoding processing on the stereoscopic image data received by the HDMI receiving unit 303 to generate full-resolution left-eye image data and right-eye image data. The 3D signal processing unit 301 performs a decoding process corresponding to the TMDS transmission data format. Note that the 3D signal processing unit 301 does nothing with the full-resolution left-eye image data and right-eye image data obtained by the bit stream processing unit 306.

映像・グラフィック処理回路３０７は、３Ｄ信号処理部３０１で生成された左眼画像データおよび右眼画像データに基づいて、立体画像を表示するための画像データを生成する。また、映像・グラフィック処理回路３０７は、画像データに対して、必要に応じて、画質調整処理を行う。また、映像・グラフィック処理回路３０７は、画像データに対して、必要に応じて、メニュー、番組表などの重畳情報のデータを合成する。パネル駆動回路３０８は、映像・グラフィック処理回路３０７から出力される画像データに基づいて、表示パネル３０９を駆動する。表示パネル３０９は、例えば、ＬＣＤ(Liquid Crystal Display)、ＰＤＰ(Plasma Display Panel)等で構成されている。 The video / graphic processing circuit 307 generates image data for displaying a stereoscopic image based on the left eye image data and the right eye image data generated by the 3D signal processing unit 301. The video / graphic processing circuit 307 performs image quality adjustment processing on the image data as necessary. Further, the video / graphic processing circuit 307 combines superimposition information data such as a menu or a program guide with the image data as necessary. The panel drive circuit 308 drives the display panel 309 based on the image data output from the video / graphic processing circuit 307. The display panel 309 is, for example, an LCD (Liquid Crystal Display) or a PDP (Plasma Display Panel).

音声信号処理回路３１０は、ＨＤＭＩ受信部３０３で受信された、あるいはビットストリーム処理部３０６で得られた音声データに対してＤ／Ａ変換等の必要な処理を行う。音声増幅回路３１１は、音声信号処理回路３１０から出力される音声信号を増幅してスピーカ３１２に供給する。 The audio signal processing circuit 310 performs necessary processing such as D / A conversion on the audio data received by the HDMI receiving unit 303 or obtained by the bit stream processing unit 306. The audio amplification circuit 311 amplifies the audio signal output from the audio signal processing circuit 310 and supplies the amplified audio signal to the speaker 312.

ＣＰＵ３２１は、テレビ受信機３００の各部の動作を制御する。フラッシュＲＯＭ３２２は、制御ソフトウェアの格納およびデータの保管を行う。ＤＲＡＭ３２３は、ＣＰＵ３２１のワークエリアを構成する。ＣＰＵ３２１は、フラッシュＲＯＭ３２２から読み出したソフトウェアやデータをＤＲＡＭ３２３上に展開してソフトウェアを起動させ、テレビ受信機３００の各部を制御する。リモコン受信部３２５は、リモコン送信機３２６から送信されたリモートコントロール信号（リモコンコード）を受信し、ＣＰＵ３２１に供給する。ＣＰＵ３２１は、このリモコンコードに基づいて、テレビ受信機３００の各部を制御する。ＣＰＵ３２１、フラッシュＲＯＭ３２２およびＤＲＡＭ３２３は、内部バス３２４に接続されている。 The CPU 321 controls the operation of each unit of the television receiver 300. The flash ROM 322 stores control software and data. The DRAM 323 constitutes a work area for the CPU 321. The CPU 321 develops software and data read from the flash ROM 322 on the DRAM 323 to activate the software, and controls each unit of the television receiver 300. The remote control receiving unit 325 receives the remote control signal (remote control code) transmitted from the remote control transmitter 326 and supplies it to the CPU 321. The CPU 321 controls each part of the television receiver 300 based on the remote control code. The CPU 321, flash ROM 322, and DRAM 323 are connected to the internal bus 324.

図３２に示すテレビ受信機３００の動作を簡単に説明する。ＨＤＭＩ受信部３０３では、ＨＤＭＩ端子３０２にＨＤＭＩケーブル４００を介して接続されているセットトップボックス２００から送信されてくる、立体画像データおよび音声データが受信される。このＨＤＭＩ受信部３０３で受信された立体画像データは、３Ｄ信号処理部３０１に供給される。また、このＨＤＭＩ受信部３０３で受信された音声データは音声信号処理回路３１０に供給される。 The operation of the television receiver 300 shown in FIG. 32 will be briefly described. The HDMI receiving unit 303 receives stereoscopic image data and audio data transmitted from the set top box 200 connected to the HDMI terminal 302 via the HDMI cable 400. The stereoscopic image data received by the HDMI receiving unit 303 is supplied to the 3D signal processing unit 301. The audio data received by the HDMI receiving unit 303 is supplied to the audio signal processing circuit 310.

アンテナ端子３０４に入力されたテレビ放送信号はデジタルチューナ３０５に供給される。このデジタルチューナ３０５では、テレビ放送信号が処理されて、ユーザの選択チャネルに対応した所定のビットストリームデータ（トランスポートストリーム）ＢＳＤが出力される。 A television broadcast signal input to the antenna terminal 304 is supplied to the digital tuner 305. The digital tuner 305 processes the television broadcast signal and outputs predetermined bit stream data (transport stream) BSD corresponding to the user's selected channel.

デジタルチューナ３０５から出力されるビットストリームデータＢＳＤは、ビットストリーム処理部３０６に供給される。このビットストリーム処理部３０６では、ビットストリームデータＢＳＤから立体画像データ、音声データ、立体画像用のサブタイトルデータ（表示制御情報も含む）等を抽出する。また、このビットストリーム処理部３０６では、立体画像データに対して、左眼サブタイトルおよび右眼サブタイトルの表示データが合成されて、サブタイトルが重畳された出力立体画像データ（フル解像度の左眼画像データおよび右眼画像データ）が生成される。この出力立体画像データは、３Ｄ信号処理部３０１通って、映像・グラフィック処理回路３０７に供給される。 The bit stream data BSD output from the digital tuner 305 is supplied to the bit stream processing unit 306. The bit stream processing unit 306 extracts stereoscopic image data, audio data, stereoscopic image subtitle data (including display control information), and the like from the bit stream data BSD. Further, the bit stream processing unit 306 combines the display data of the left eye subtitle and the right eye subtitle with the stereoscopic image data to generate output stereoscopic image data (full-resolution left eye image data and right eye image data) on which the subtitles are superimposed. The output stereoscopic image data is supplied to the video / graphic processing circuit 307 through the 3D signal processing unit 301.

３Ｄ信号処理部３０１では、ＨＤＭＩ受信部３０３で受信された立体画像データに対してデコード処理が行われて、フル解像度の左眼画像データおよび右眼画像データが生成される。この左眼画像データおよび右眼画像データは、映像・グラフィック処理回路３０７に供給される。この映像・グラフィック処理回路３０７では、左眼画像データおよび右眼画像データに基づいて、立体画像を表示するための画像データが生成され、必要に応じて、画質調整処理、ＯＳＤ（オンスクリーンディスプレイ）等の重畳情報データの合成処理も行われる。 In the 3D signal processing unit 301, the stereoscopic image data received by the HDMI receiving unit 303 is decoded, and full-resolution left eye image data and right eye image data are generated. The left eye image data and right eye image data are supplied to the video / graphic processing circuit 307. In the video / graphic processing circuit 307, image data for displaying a stereoscopic image is generated based on the left eye image data and the right eye image data, and, as necessary, image quality adjustment processing and processing for combining superimposition information data such as an OSD (on-screen display) are also performed.

この映像・グラフィック処理回路３０７で得られる画像データはパネル駆動回路３０８に供給される。そのため、表示パネル３０９により立体画像が表示される。例えば、表示パネル３０９に、左眼画像データによる左眼画像および右眼画像データによる右眼画像が交互に時分割的に表示される。視聴者は、例えば、表示パネル３０９の表示に同期して左眼シャッタおよび右眼シャッタが交互に開くシャッタメガネを装着することで、左眼では左眼画像のみを見ることができ、右眼では右眼画像のみを見ることができ、立体画像を知覚できる。 Image data obtained by the video / graphic processing circuit 307 is supplied to the panel drive circuit 308. Therefore, a stereoscopic image is displayed on the display panel 309. For example, the left eye image based on the left eye image data and the right eye image based on the right eye image data are alternately displayed on the display panel 309 in a time division manner. For example, by wearing shutter glasses whose left eye shutter and right eye shutter open alternately in synchronization with the display on the display panel 309, the viewer sees only the left eye image with the left eye and only the right eye image with the right eye, and can thus perceive a stereoscopic image.

また、ビットストリーム処理部３０６で得られた音声データは、音声信号処理回路３１０に供給される。この音声信号処理回路３１０では、ＨＤＭＩ受信部３０３で受信された、あるいはビットストリーム処理部３０６で得られた音声データに対してＤ／Ａ変換等の必要な処理が施される。この音声データは、音声増幅回路３１１で増幅された後に、スピーカ３１２に供給される。そのため、スピーカ３１２から表示パネル３０９の表示画像に対応した音声が出力される。 Also, the audio data obtained by the bit stream processing unit 306 is supplied to the audio signal processing circuit 310. In the audio signal processing circuit 310, necessary processing such as D / A conversion is performed on the audio data received by the HDMI receiving unit 303 or obtained by the bit stream processing unit 306. The audio data is amplified by the audio amplification circuit 311 and then supplied to the speaker 312. Therefore, sound corresponding to the display image on the display panel 309 is output from the speaker 312.

［送信データ生成部およびビットストリーム処理部の他の構成例（１）］
「送信データ生成部の構成例」
図３３は、放送局１００（図１参照）における送信データ生成部１１０Ａの構成例を示している。この送信データ生成部１１０Ａは、既存の放送規格の一つであるＡＲＩＢ（Association of Radio Industries and Businesses）方式に容易に連携できるデータ構造で視差情報（視差ベクトル）を送信する。この送信データ生成部１１０Ａは、データ取り出し部（アーカイブ部）１２１と、ビデオエンコーダ１２２と、オーディオエンコーダ１２３と、字幕発生部１２４と、視差情報作成部１２５と、字幕エンコーダ１２６と、マルチプレクサ１２７を有している。 [Another example of configuration of transmission data generation unit and bit stream processing unit (1)]
"Configuration example of transmission data generator"
FIG. 33 illustrates a configuration example of the transmission data generation unit 110A in the broadcast station 100 (see FIG. 1). The transmission data generation unit 110A transmits disparity information (disparity vectors) in a data structure that can easily be made compatible with the ARIB (Association of Radio Industries and Businesses) system, which is one of the existing broadcasting standards. The transmission data generation unit 110A includes a data extraction unit (archive unit) 121, a video encoder 122, an audio encoder 123, a caption generation unit 124, a disparity information creating unit 125, a caption encoder 126, and a multiplexer 127.

データ取り出し部１２１には、データ記録媒体１２１ａが、例えば、着脱自在に装着される。このデータ記録媒体１２１ａには、図２に示す送信データ生成部１１０のデータ取り出し部１１１におけるデータ記録媒体１１１ａと同様に、左眼画像データおよび右眼画像データを含む立体画像データと共に、音声データ、視差情報が対応付けて記録されている。データ取り出し部１２１は、データ記録媒体１２１ａから、立体画像データ、音声データ、視差情報等を取り出して出力する。データ記録媒体１２１ａは、ディスク状記録媒体、半導体メモリ等である。 A data recording medium 121a is, for example, detachably attached to the data extraction unit 121. As with the data recording medium 111a in the data extraction unit 111 of the transmission data generation unit 110 shown in FIG. 2, audio data and disparity information are recorded on the data recording medium 121a in association with stereoscopic image data including left eye image data and right eye image data. The data extraction unit 121 extracts stereoscopic image data, audio data, disparity information, and the like from the data recording medium 121a and outputs them. The data recording medium 121a is a disk-shaped recording medium, a semiconductor memory, or the like.

図３３に戻って、字幕発生部１２４は、字幕データ（ＡＲＩＢ方式の字幕文データ）を発生する。字幕エンコーダ１２６は、字幕発生部１２４で発生された字幕データを含む字幕データストリーム（字幕エレメンタリストリーム）を生成する。図３４（ａ）は、字幕データストリームの構成例を示している。この例は、図３４（ｂ）に示すように、同一の画面に、「1st Caption Unit」、「2nd Caption Unit」、「3rd Caption Unit」の３つのキャプション・ユニット（字幕）が表示される例を示している。 Returning to FIG. 33, the caption generation unit 124 generates caption data (ARIB-style caption text data). The caption encoder 126 generates a caption data stream (caption elementary stream) including the caption data generated by the caption generation unit 124. FIG. 34A shows a configuration example of a caption data stream. As shown in FIG. 34B, this is an example in which three caption units (captions), “1st Caption Unit”, “2nd Caption Unit”, and “3rd Caption Unit”, are displayed on the same screen.

字幕データストリームには、字幕文データグループの字幕文データ（字幕符号）として、各キャプション・ユニットの字幕データが挿入される。なお、各キャプション・ユニットの表示領域などの設定データは、図示していないが、字幕管理データグループのデータとして、字幕データストリームに挿入される。「1st Caption Unit」、「2nd Caption Unit」、「3rd Caption Unit」のキャプション・ユニットの表示領域は、それぞれ、（x1,y1）、（x2,y2）、（x3,y3）で示されている。 The caption data of each caption unit is inserted into the caption data stream as the caption text data (caption code) of the caption text data group. Although not shown, setting data such as the display area of each caption unit is inserted into the caption data stream as data of the caption management data group. The display areas of the caption units “1st Caption Unit”, “2nd Caption Unit”, and “3rd Caption Unit” are indicated by (x1, y1), (x2, y2), and (x3, y3), respectively.

視差情報作成部１２５は、ビューワ機能を持っている。この視差情報作成部１２５は、データ取り出し部１２１から出力される視差情報、すなわちピクセル（画素）毎の視差ベクトルにダウンサイジング処理を施し、所定の領域に属する視差ベクトルを生成する。視差情報作成部１２５は、詳細説明は省略するが、上述した図２に示す送信データ生成部１１０の視差情報作成部１１５と同様のダウンサイジング処理を行う。 The disparity information creating unit 125 has a viewer function. The disparity information creating unit 125 performs a downsizing process on the disparity information output from the data extraction unit 121, that is, the disparity vector for each pixel, and generates a disparity vector belonging to a predetermined region. Although the detailed description is omitted, the disparity information creating unit 125 performs the same downsizing process as the disparity information creating unit 115 of the transmission data generation unit 110 illustrated in FIG. 2 described above.

視差情報作成部１２５は、上述したダウンサイジング処理により、同一の画面に表示される所定数のキャプション・ユニット（字幕）に対応した視差ベクトルを作成する。この場合、視差情報作成部１２５は、キャプション・ユニット毎の視差ベクトル（個別視差ベクトル）を作成するか、あるいは各キャプション・ユニットに共通の視差ベクトル（共通視差ベクトル）を作成する。この選択は、例えば、ユーザの設定による。 The disparity information creating unit 125 creates disparity vectors corresponding to a predetermined number of caption units (captions) displayed on the same screen by the above-described downsizing process. In this case, the disparity information creating unit 125 creates a disparity vector (individual disparity vector) for each caption unit, or creates a disparity vector common to each caption unit (common disparity vector). This selection depends on, for example, user settings.

視差情報作成部１２５は、個別視差ベクトルを作成する場合、各キャプション・ユニットの表示領域に基づき、上述のダウンサイジング処理によって、その表示領域に属する視差ベクトルを求める。また、視差情報作成部１２５は、共通視差ベクトルを作成する場合、上述のダウンサイジング処理によって、ピクチャ全体（画像全体）の視差ベクトルを求める（図９（ｄ）参照）。なお、視差情報作成部１２５は、共通視差ベクトルを作成する場合、各キャプション・ユニットの表示領域に属する視差ベクトルを求め、最も値の大きな視差ベクトルを選択してもよい。 When creating an individual disparity vector, the disparity information creating unit 125 obtains a disparity vector belonging to the display area by the above-described downsizing process based on the display area of each caption unit. Further, when creating the common disparity vector, the disparity information creating unit 125 obtains the disparity vector of the entire picture (entire image) by the above-described downsizing process (see FIG. 9D). Note that the disparity information creating unit 125 may obtain a disparity vector belonging to the display area of each caption unit and select the disparity vector having the largest value when creating a common disparity vector.
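The choice between individual and common disparity vectors can be sketched as follows. This is an illustration under stated assumptions, not the downsizing process itself: the disparity map is a small list of per-pixel values, each caption unit's display area is modelled as a rectangle `(x, y, width, height)` (the width and height are hypothetical, since only the positions (x1, y1) and so on are given), and the value for a region is taken as the largest per-pixel value inside it. The common vector follows the alternative mentioned above, selecting the largest of the per-unit values.

```python
def region_disparity(disparity_map, region):
    """Largest per-pixel disparity inside region = (x, y, width, height)."""
    x, y, w, h = region
    return max(
        disparity_map[row][col]
        for row in range(y, y + h)
        for col in range(x, x + w)
    )


def caption_disparities(disparity_map, regions, common=False):
    """Return one disparity per caption unit, or one shared value for all."""
    individual = [region_disparity(disparity_map, r) for r in regions]
    if common:
        # Common disparity vector: the largest of the per-unit values.
        return [max(individual)] * len(regions)
    return individual


disparity_map = [
    [1, 1, 5, 5],
    [1, 3, 5, 9],
    [2, 2, 4, 4],
]
# Hypothetical display areas of the 1st, 2nd and 3rd caption units.
regions = [(0, 0, 2, 2), (2, 0, 2, 2), (0, 2, 4, 1)]
individual = caption_disparities(disparity_map, regions)
shared = caption_disparities(disparity_map, regions, common=True)
```

With the common setting every caption unit receives the same value, so only one disparity vector needs to be carried in the caption data stream.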

字幕エンコーダ１２６は、上述したように視差情報作成部１２５で作成された視差ベクトル（視差情報）を、字幕データストリームに含める。この場合、字幕データストリームには、字幕文データグループのＰＥＳストリームに、字幕文データ（字幕符号）として、同一画面に表示される各キャプション・ユニットの字幕データが挿入される。また、この字幕データストリームには、字幕管理データのＰＥＳストリームに、あるいは、字幕文データグループのＰＥＳストリームに、字幕の表示制御情報として、視差ベクトル（視差情報）が挿入される。 The caption encoder 126 includes the disparity vector (disparity information) created by the disparity information creating unit 125 as described above in the caption data stream. In this case, the caption data of each caption unit displayed on the same screen is inserted as the caption text data (caption code) in the PES stream of the caption text data group. Also, disparity vectors (disparity information) are inserted into the caption data stream as caption display control information in the caption management data PES stream or in the caption text data group PES stream.

ここで、視差情報作成部１２５で個別視差ベクトルが作成される場合であって、字幕管理データのＰＥＳストリームに視差ベクトル（視差情報）が挿入される場合について説明する。ここでは、同一の画面に、「1st Caption Unit」、「2nd Caption Unit」、「3rd Caption Unit」の３つのキャプション・ユニット（字幕）が表示される例とする。 Here, a case where an individual disparity vector is created by the disparity information creating unit 125 and a disparity vector (disparity information) is inserted into the PES stream of caption management data will be described. Here, an example is shown in which three caption units (captions) of “1st Caption Unit”, “2nd Caption Unit”, and “3rd Caption Unit” are displayed on the same screen.

視差情報作成部１２５は、図３５（ｂ）に示すように、各キャプション・ユニットに対応した個別視差ベクトルを作成する。「Disparity 1」は、「1st Caption Unit」に対応した個別視差ベクトルである。「Disparity 2」は、「2nd Caption Unit」に対応した視差ベクトルである。「Disparity 3」は、「3rd Caption Unit」に対応した個別視差ベクトルである。 The disparity information creating unit 125 creates an individual disparity vector corresponding to each caption unit, as shown in FIG. 35B. “Disparity 1” is an individual disparity vector corresponding to “1st Caption Unit”. “Disparity 2” is a disparity vector corresponding to “2nd Caption Unit”. “Disparity 3” is an individual disparity vector corresponding to “3rd Caption Unit”.

図３５（ａ）は、字幕エンコーダ１２６で生成される字幕データストリーム（ＰＥＳストリーム）の構成例を示している。字幕文データグループのＰＥＳストリームには、各キャプション・ユニットの字幕文情報と、それぞれの字幕文情報に関連付けられた拡張表示制御情報（データユニットＩＤ）が挿入される。また、字幕管理データグループのＰＥＳストリームには、各キャプション・ユニットの字幕文情報にそれぞれ対応した拡張表示制御情報（視差情報）が挿入される。 FIG. 35A shows a configuration example of a caption data stream (PES stream) generated by the caption encoder 126. The caption text information of each caption unit and the extended display control information (data unit ID) associated with each caption text information are inserted into the PES stream of the caption text data group. Further, extended display control information (disparity information) corresponding to the caption text information of each caption unit is inserted into the PES stream of the caption management data group.

字幕文データグループの拡張表示制御情報（データユニットＩＤ）は、字幕管理データグループの各拡張表示制御情報（視差情報）を、字幕文データグループの各字幕文情報に対応付けするために必要とされる。この場合、字幕管理データグループの各拡張表示制御情報としての視差情報は、対応するキャプション・ユニットの個別視差ベクトルである。なお、各キャプション・ユニットの表示領域などの設定データは、図示していないが、字幕管理データグループのＰＥＳストリームに、字幕管理データ（制御符号）として、挿入される。「1st Caption Unit」、「2nd Caption Unit」、「3rd Caption Unit」のキャプション・ユニットの表示領域は、それぞれ、（x1,y1）、（x2,y2）、（x3,y3）で示されている。 The extended display control information (data unit ID) of the caption text data group is required to associate each piece of extended display control information (disparity information) of the caption management data group with the corresponding caption text information of the caption text data group. In this case, the disparity information serving as each piece of extended display control information of the caption management data group is the individual disparity vector of the corresponding caption unit. Although not shown, setting data such as the display area of each caption unit is inserted as caption management data (control code) in the PES stream of the caption management data group. The display areas of the caption units “1st Caption Unit”, “2nd Caption Unit”, and “3rd Caption Unit” are indicated by (x1, y1), (x2, y2), and (x3, y3), respectively.
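The role of the data unit ID as a link between the two data groups can be sketched as follows. The dictionaries, field names, and disparity values are hypothetical and serve only to illustrate the association; they do not reflect the actual ARIB caption syntax.

```python
# Caption text data group: caption text plus the data unit ID that links it
# to its extended display control information (hypothetical layout).
caption_text_group = [
    {"data_unit_id": 1, "text": "1st Caption Unit"},
    {"data_unit_id": 2, "text": "2nd Caption Unit"},
    {"data_unit_id": 3, "text": "3rd Caption Unit"},
]

# Caption management data group: extended display control information
# (disparity) keyed by the same data unit ID. Values are made up.
caption_management_group = [
    {"data_unit_id": 3, "disparity": 2},
    {"data_unit_id": 1, "disparity": 4},
    {"data_unit_id": 2, "disparity": 7},
]


def attach_disparity(text_group, management_group):
    """Pair each caption text with the disparity that shares its data unit ID."""
    by_id = {entry["data_unit_id"]: entry["disparity"] for entry in management_group}
    return [(unit["text"], by_id[unit["data_unit_id"]]) for unit in text_group]


pairs = attach_disparity(caption_text_group, caption_management_group)
```

Because the lookup is by ID, the order of entries in the caption management data group does not have to match the order of the caption units.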

図３５（ｃ）は、各キャプション・ユニット（字幕）が重畳された第１のビュー（1st View）、例えば右眼画像を示している。また、図３５（ｄ）は、各キャプション・ユニットが重畳された第２のビュー（2nd View）、例えば左眼画像を示している。各キャプション・ユニットに対応した個別視差ベクトルは、図示のように、例えば、右眼画像に重畳する各キャプション・ユニットと、左眼画像に重畳する各キャプション・ユニットとの間に視差を付与するために用いられる。 FIG. 35C shows a first view (1st View) on which each caption unit (caption) is superimposed, for example, a right eye image. FIG. 35D shows a second view (2nd View) on which each caption unit is superimposed, for example, a left eye image. As shown in the figure, the individual disparity vector corresponding to each caption unit is used, for example, to give disparity between each caption unit superimposed on the right eye image and each caption unit superimposed on the left eye image.

次に、視差情報作成部１２５で共通視差ベクトルが作成される場合であって、字幕管理データのＰＥＳストリームに視差ベクトル（視差情報）が挿入される場合について説明する。ここでは、同一の画面に、「1st Caption Unit」、「2nd Caption Unit」、「3rd Caption Unit」の３つのキャプション・ユニット（字幕）が表示される例とする。視差情報作成部１２５は、図３６（ｂ）に示すように、各キャプション・ユニットに共通の共通視差ベクトル「Disparity」を作成する。 Next, a case where a common disparity vector is created by the disparity information creating unit 125 and a disparity vector (disparity information) is inserted into the PES stream of caption management data will be described. Here, an example is used in which three caption units (captions), “1st Caption Unit”, “2nd Caption Unit”, and “3rd Caption Unit”, are displayed on the same screen. The disparity information creating unit 125 creates a common disparity vector “Disparity” shared by the caption units, as shown in FIG. 36B.

FIG. 36(a) shows a configuration example of the caption data stream (PES stream) generated by the caption encoder 126. The caption text information of each caption unit is inserted into the PES stream of the caption text data group. In addition, extended display control information (disparity information) that corresponds in common to the caption text information of all the caption units is inserted into the PES stream of the caption management data group. In this case, the disparity information carried as the extended display control information in the caption management data group is the common disparity vector of the caption units.

Although not shown, setting data such as the display area of each caption unit is inserted into the PES stream of the caption management data group as caption management data (control codes). The display areas of the caption units "1st Caption Unit", "2nd Caption Unit", and "3rd Caption Unit" are indicated by (x1, y1), (x2, y2), and (x3, y3), respectively.

FIG. 36(c) shows a first view (1st View), for example a right-eye image, on which the caption units (captions) are superimposed. FIG. 36(d) shows a second view (2nd View), for example a left-eye image, on which the caption units are superimposed. As illustrated, the common disparity vector shared by the caption units is used, for example, to give disparity between each caption unit as superimposed on the right-eye image and the same caption unit as superimposed on the left-eye image.

Next, the case where the disparity information creating unit 125 creates individual disparity vectors and the disparity vectors (disparity information) are inserted into the PES stream of the caption text data group will be described. In this example, three caption units (captions), "1st Caption Unit", "2nd Caption Unit", and "3rd Caption Unit", are displayed on the same screen.

As shown in FIG. 37(b), the disparity information creating unit 125 creates an individual disparity vector for each caption unit. "Disparity 1" is the individual disparity vector corresponding to "1st Caption Unit". "Disparity 2" is the individual disparity vector corresponding to "2nd Caption Unit". "Disparity 3" is the individual disparity vector corresponding to "3rd Caption Unit".

FIG. 37(a) shows a configuration example of the PES stream of the caption text data group within the caption data stream (PES stream) generated by the caption encoder 126. The caption text information (caption text data) of each caption unit is inserted into this PES stream. In addition, display control information (disparity information) corresponding to the caption text information of each caption unit is inserted into this PES stream. In this case, the disparity information carried as each piece of display control information is the individual disparity vector created by the disparity information creating unit 125 as described above.

Although not shown, setting data such as the display area of each caption unit is inserted into the PES stream of the caption management data group as caption management data (control codes). The display areas of the caption units "1st Caption Unit", "2nd Caption Unit", and "3rd Caption Unit" are indicated by (x1, y1), (x2, y2), and (x3, y3), respectively.

FIG. 37(c) shows a first view (1st View), for example a right-eye image, on which the caption units (captions) are superimposed. FIG. 37(d) shows a second view (2nd View), for example a left-eye image, on which the caption units are superimposed. As illustrated, the individual disparity vector corresponding to each caption unit is used, for example, to give disparity between that caption unit as superimposed on the right-eye image and the same caption unit as superimposed on the left-eye image.

Next, the case where the disparity information creating unit 125 creates a common disparity vector and the disparity vector (disparity information) is inserted into the PES stream of the caption text data group will be described. In this example, three caption units (captions), "1st Caption Unit", "2nd Caption Unit", and "3rd Caption Unit", are displayed on the same screen. As shown in FIG. 38(b), the disparity information creating unit 125 creates a common disparity vector "Disparity" that is shared by all the caption units.

FIG. 38(a) shows a configuration example of the PES stream of the caption text data group within the caption data stream (PES stream) generated by the caption encoder 126. The caption text information (caption text data) of each caption unit is inserted into this PES stream. In addition, display control information (disparity information) that corresponds in common to the caption text information of all the caption units is inserted into this PES stream. In this case, the disparity information carried as the display control information is the common disparity vector created by the disparity information creating unit 125 as described above.

Although not shown, setting data such as the display area of each caption unit is inserted into the PES stream of the caption management data group as caption management data (control codes). The display areas of the caption units "1st Caption Unit", "2nd Caption Unit", and "3rd Caption Unit" are indicated by (x1, y1), (x2, y2), and (x3, y3), respectively.

FIG. 38(c) shows a first view (1st View), for example a right-eye image, on which the caption units (captions) are superimposed. FIG. 38(d) shows a second view (2nd View), for example a left-eye image, on which the caption units are superimposed. As illustrated, the common disparity vector shared by the caption units is used, for example, to give disparity between each caption unit as superimposed on the right-eye image and the same caption unit as superimposed on the left-eye image.

In the examples of FIGS. 35(c) and (d), 36(c) and (d), 37(c) and (d), and 38(c) and (d), only the positions of the caption units superimposed on the second view (for example, the left-eye image) are shifted. However, it is also conceivable to shift only the positions of the caption units superimposed on the first view (for example, the right-eye image), or to shift the positions of the caption units superimposed on both views.

FIGS. 39(a) and (b) show the case where the positions of the caption units superimposed on both the first view and the second view are shifted. In this case, the shift value (offset value) D[i] of each caption unit in the first view and in the second view is obtained from the value "disparity[i]" of the disparity vector "Disparity" corresponding to that caption unit, as follows.

When disparity[i] is even, the shift value is "D[i] = -disparity[i]/2" in the first view and "D[i] = disparity[i]/2" in the second view. As a result, the position of each caption unit superimposed on the first view (for example, the right-eye image) is shifted to the left by "disparity[i]/2", and the position of each caption unit superimposed on the second view (for example, the left-eye image) is shifted to the right by "disparity[i]/2".

When disparity[i] is odd, the shift value is "D[i] = -(disparity[i]+1)/2" in the first view and "D[i] = (disparity[i]-1)/2" in the second view. As a result, the position of each caption unit superimposed on the first view (for example, the right-eye image) is shifted to the left by "(disparity[i]+1)/2", and the position of each caption unit superimposed on the second view (for example, the left-eye image) is shifted to the right by "(disparity[i]-1)/2".
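The even and odd cases above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `caption_shift_values` is hypothetical, and a negative return value is taken to mean a shift to the left, following the sign convention of the formulas above.

```python
def caption_shift_values(disparity: int) -> tuple[int, int]:
    """Return (D for the first view, D for the second view) for one caption unit.

    Hypothetical helper: negative values mean a shift to the left,
    positive values a shift to the right.
    """
    if disparity % 2 == 0:
        # Even case: split the disparity equally between the two views.
        half = disparity // 2
        return -half, half
    # Odd case: the first view takes the extra pixel.
    first_view = -((disparity + 1) // 2)
    second_view = (disparity - 1) // 2
    return first_view, second_view
```

In both cases the difference between the second-view and first-view shift values equals disparity[i], so the full disparity is preserved even though it is split between the two views.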

Here, the packet structures of the caption code and the control code will be briefly described. First, the basic packet structure of the caption code included in the PES stream of the caption text data group will be described. FIG. 40 shows the packet structure of the caption code. "Data_group_id" indicates the data group identification; here it indicates a caption text data group. The "Data_group_id" of a caption text data group also specifies the language. For example, "Data_group_id==0x21" indicates a caption text data group carrying caption text in the first language.

"Data_group_size" indicates the number of bytes of the data group data that follows. In a caption text data group, this data group data is caption text data (caption_data). One or more data units are arranged in the caption text data. The data units are separated from one another by a data unit separation code (unit_parameter). A caption code is placed in each data unit as data unit data (data_unit_data).

Next, the packet structure of the control code will be described. FIG. 41 shows the packet structure of the control code included in the PES stream of the caption management data group. "Data_group_id" indicates the data group identification. Here it indicates a caption management data group and is set to "Data_group_id==0x20". "Data_group_size" indicates the number of bytes of the data group data that follows. In a caption management data group, this data group data is caption management data (caption_management_data).

One or more data units are arranged in the caption management data. The data units are separated from one another by a data unit separation code (unit_parameter). A control code is placed in each data unit as data unit data (data_unit_data). In this embodiment, the value of the disparity vector is given as an 8-unit code. "TCS" is 2-bit data indicating the character encoding method. Here, "TCS==00" is set, which indicates the 8-unit code.

FIG. 42 shows the structure of a data group in the caption data stream (PES stream). The 6-bit field "data_group_id" indicates the data group identification and distinguishes between caption management data and caption text data. The 16-bit field "data_group_size" indicates the number of bytes of the data group data that follows in this data group field. The data group data is stored in "data_group_data_byte". "CRC_16" is a 16-bit cyclic redundancy check code. The section covered by this CRC code runs from the beginning of "data_group_id" to the end of "data_group_data_byte".

In a caption management data group, "data_group_data_byte" in the data group structure of FIG. 42 is caption management data (caption_management_data). In a caption text data group, "data_group_data_byte" in the data group structure of FIG. 42 is caption data (caption_data).
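As a minimal sketch, the relation between "data_group_id" and the kind of payload carried in "data_group_data_byte" might be expressed as follows. Only the two identifier values named in the text (0x20 and 0x21) are handled; the function name `payload_kind` and the "unknown" fallback are assumptions introduced for illustration, and other identifier values are deliberately left uninterpreted.

```python
CAPTION_MANAGEMENT_GROUP = 0x20   # caption management data group (from the text)
CAPTION_TEXT_GROUP_LANG1 = 0x21   # caption text data group, first language (from the text)


def payload_kind(data_group_id: int) -> str:
    """Name the payload carried in data_group_data_byte (hypothetical helper)."""
    if not 0 <= data_group_id <= 0x3F:
        # data_group_id is described as a 6-bit field.
        raise ValueError("data_group_id must fit in 6 bits")
    if data_group_id == CAPTION_MANAGEMENT_GROUP:
        return "caption_management_data"
    if data_group_id == CAPTION_TEXT_GROUP_LANG1:
        return "caption_data"
    # Any other value is outside what the text specifies.
    return "unknown"
```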

FIG. 43 schematically shows the structure of the caption management data when disparity vectors (disparity information) are inserted into the PES stream of caption management data. "advanced_rendering_version" is 1-bit flag information, newly defined in this embodiment, that indicates whether extended caption display is supported. On the receiving side, whether extended caption display is supported can easily be determined from this flag information placed in the management information layer. The 24-bit field "data_unit_loop_length" indicates the number of bytes of the data units that follow in this caption management data field. The data units transmitted in this caption management data field are stored in "data_unit".

FIG. 44 schematically shows the structure of the caption data when disparity vectors (disparity information) are inserted into the PES stream of caption management data. The 24-bit field "data_unit_loop_length" indicates the number of bytes of the data units that follow in this caption data field. The data units transmitted in this caption data field are stored in "data_unit". Note that this caption data structure has no "advanced_rendering_version" flag information.

FIG. 45 schematically shows the structure of the caption data when disparity vectors (disparity information) are inserted into the PES stream of the caption text data group. "advanced_rendering_version" is 1-bit flag information, newly defined in this embodiment, that indicates whether extended caption display is supported. On the receiving side, whether extended caption display is supported can easily be determined from this flag information placed in the layer above the data units. The 24-bit field "data_unit_loop_length" indicates the number of bytes of the data units that follow in this caption text data field. The data units transmitted in this caption text data field are stored in "data_unit".

FIG. 46 schematically shows the structure of the caption management data when disparity vectors (disparity information) are inserted into the PES stream of the caption text data group. The 24-bit field "data_unit_loop_length" indicates the number of bytes of the data units that follow in this caption management data field. The data units transmitted in this caption management data field are stored in "data_unit". Note that this caption management data structure has no "advanced_rendering_version" flag information.

FIG. 47 shows the structure (Syntax) of a data unit (data_unit) included in the caption data stream. The 8-bit field "unit_separator" is the data unit separation code and is set to "0x1F". The 8-bit field "data_unit_parameter" is the data unit parameter that identifies the type of the data unit.

FIG. 48 shows the data unit types together with their data unit parameters and functions. For example, the data unit parameter indicating a text data unit is "0x20", the data unit parameter indicating a geometric data unit is "0x28", and the data unit parameter indicating a bitmap data unit is "0x35". In this embodiment, an extended display control data unit that stores display control information (extended display control information) is newly defined, and the data unit parameter indicating this data unit is set to, for example, "0x4F".

The 24-bit field "data_unit_size" indicates the number of bytes of the data unit data that follows in this data unit field. The data unit data is stored in "data_unit_data_byte". FIG. 49 shows the structure (Syntax) of the extended display control data unit (data_unit). In this case, the data unit parameter is "0x4F", and the display control information is stored in "Advanced_Rendering_Control", which serves as "data_unit_data_byte".
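The data unit layout described above (an 8-bit "unit_separator" of 0x1F, an 8-bit "data_unit_parameter", a 24-bit "data_unit_size", then the data bytes) can be walked with a short parser. The following Python sketch is illustrative only: it assumes the fields are packed back to back in that order with the size in big-endian byte order, and the name `iter_data_units` is hypothetical.

```python
from typing import Iterator, Tuple

UNIT_SEPARATOR = 0x1F             # data unit separation code (from the text)
EXTENDED_DISPLAY_CONTROL = 0x4F   # example parameter for extended display control


def iter_data_units(buf: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (data_unit_parameter, data_unit_data) for each data unit in buf.

    Assumed layout: 1-byte separator, 1-byte parameter, 3-byte big-endian size.
    """
    pos = 0
    while pos < len(buf):
        if len(buf) - pos < 5:
            raise ValueError("truncated data unit header")
        if buf[pos] != UNIT_SEPARATOR:
            raise ValueError(f"expected unit_separator 0x1F at offset {pos}")
        parameter = buf[pos + 1]
        size = int.from_bytes(buf[pos + 2:pos + 5], "big")
        start, end = pos + 5, pos + 5 + size
        if end > len(buf):
            raise ValueError("data_unit_size exceeds available bytes")
        yield parameter, buf[start:end]
        pos = end
```

A receiver built along these lines could select the units whose parameter equals `EXTENDED_DISPLAY_CONTROL` and hand their data bytes to the display control processing, while passing the other units to ordinary caption decoding.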

FIG. 50 shows the structure (Syntax) of "Advanced_Rendering_Control" in the extended display control data unit carried in the PES stream of the caption management data group in the examples of FIGS. 35 and 36 described above. FIG. 50 also shows the structure (Syntax) of "Advanced_Rendering_Control" in the extended display control data unit carried in the PES stream of the caption text data group in the examples of FIGS. 37 and 38 described above. In other words, FIG. 50 shows the structure used when stereo video disparity information is inserted as the display control information.

The 8-bit field "start_code" indicates the start of "Advanced_Rendering_Control". The 16-bit field "data_unit_id" indicates the data unit ID. The 16-bit field "data_length" indicates the number of data bytes that follow in this advanced rendering control field. The 8-bit field "Advanced_rendering_type" is the advanced rendering type, which specifies the kind of display control information. Here, the type value is, for example, "0x01", indicating that the display control information is "stereo video disparity information". The disparity information is stored in "disparity_information".

FIG. 51 shows the structure (Syntax) of "Advanced_Rendering_Control" in the extended display control data unit carried in the PES stream of the caption text data group in the example of FIG. 35 described above. In other words, FIG. 51 shows the structure used when a data unit ID is inserted as the display control information.

The 8-bit field "start_code" indicates the start of "Advanced_Rendering_Control". The 16-bit field "data_unit_id" indicates the data unit ID. The 16-bit field "data_length" indicates the number of data bytes that follow in this advanced rendering control field. The 8-bit field "Advanced_rendering_type" is the advanced rendering type, which specifies the kind of display control information. Here, the type value is, for example, "0x00", indicating that the display control information is a "data unit ID".
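The two "Advanced_Rendering_Control" variants share the same header fields and differ only in the type value and in what follows it. A possible reading of that header is sketched below in Python. Several points are assumptions rather than facts from the text: the fields are taken to be byte-aligned and big-endian, "data_length" is taken to count the type byte plus everything after it, the value of "start_code" is not checked because the text does not give it, and the names `AdvancedRenderingControl` and `parse_advanced_rendering_control` are hypothetical.

```python
from dataclasses import dataclass

TYPE_DATA_UNIT_ID = 0x00       # display control information is a data unit ID
TYPE_STEREO_DISPARITY = 0x01   # display control information is stereo video disparity


@dataclass
class AdvancedRenderingControl:
    start_code: int
    data_unit_id: int
    rendering_type: int
    payload: bytes  # e.g. disparity_information when rendering_type == 0x01


def parse_advanced_rendering_control(buf: bytes) -> AdvancedRenderingControl:
    """Parse the header fields described in the text (assumed byte layout)."""
    if len(buf) < 6:
        raise ValueError("buffer too short for Advanced_Rendering_Control")
    start_code = buf[0]                              # 8 bits
    data_unit_id = int.from_bytes(buf[1:3], "big")   # 16 bits
    data_length = int.from_bytes(buf[3:5], "big")    # 16 bits
    body = buf[5:5 + data_length]
    if data_length < 1 or len(body) != data_length:
        raise ValueError("data_length does not match available bytes")
    # Assumption: the first counted byte is Advanced_rendering_type.
    return AdvancedRenderingControl(start_code, data_unit_id, body[0], body[1:])
```

With this reading, a unit whose `rendering_type` is `TYPE_DATA_UNIT_ID` carries only the identifier used for association, while one whose type is `TYPE_STEREO_DISPARITY` carries the disparity information in `payload`.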

FIG. 53 shows the main data definitions in the "Advanced_Rendering_Control" structures described above and in the "disparity_information" structure shown in FIG. 52, described below.

FIG. 52 shows the structure (Syntax) of "disparity_information" within "Advanced_Rendering_Control" in the extended display control data unit (data_unit) included in the caption text data group. The 8-bit field "sync_byte" is identification information for "disparity_information" and indicates its start. "interval_PTS[32..0]" specifies, in units of 90 kHz, the frame period (the interval of one frame) used for the update frame interval of the disparity information (disparity). In other words, "interval_PTS[32..0]" expresses, as a 33-bit value, the frame period measured with a 90 kHz clock.

Specifying the frame period with "interval_PTS[32..0]" in the disparity information makes it possible to convey correctly to the receiving side the update frame interval of the disparity information intended by the transmitting side. When this information is not included, the receiving side refers, for example, to the frame period of the video.
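As a worked illustration of "a frame period measured with a 90 kHz clock", the value can be computed as 90000 divided by the frame rate. The Python sketch below is an assumption-laden example, not part of the described syntax: the function name `interval_pts_for` is hypothetical, and it simply rejects frame rates whose period is not a whole number of 90 kHz ticks rather than guessing a rounding rule.

```python
from fractions import Fraction

PTS_CLOCK_HZ = 90_000  # 90 kHz clock used to express the frame period


def interval_pts_for(frame_rate: Fraction) -> int:
    """Return the frame period in 90 kHz ticks for the given frame rate (fps)."""
    ticks = Fraction(PTS_CLOCK_HZ) / frame_rate
    if ticks.denominator != 1:
        # No rounding rule is given in the text, so non-integer periods are rejected.
        raise ValueError("frame period is not a whole number of 90 kHz ticks")
    value = ticks.numerator
    if not 0 < value < 2 ** 33:
        raise ValueError("value does not fit in the 33-bit interval_PTS field")
    return value
```

For example, a 30000/1001 fps (about 29.97 fps) video has a frame period of 3003 ticks, and a 25 fps video has a frame period of 3600 ticks.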

"temporal_extension_flag" is 1-bit flag information indicating whether disparity information that is sequentially updated within the caption display period (disparity_update) is present. A value of "1" indicates that it is present, and "0" indicates that it is not. The 8-bit field "default_disparity" indicates the default disparity information. This is the disparity information used when no update is performed, that is, the disparity information used in common throughout the caption display period.

"shared_disparity" indicates whether common disparity information (disparity) control is performed across data units (data_unit). A value of "1" indicates that one common piece of disparity information is applied to the multiple data units that follow. A value of "0" indicates that the disparity information is applied to only one data unit.

When "temporal_extension_flag" is "1", the disparity information includes "disparity_temporal_extension()". The structure example (Syntax) of "disparity_temporal_extension()" is the same as that described above, so its description is omitted here (see FIGS. 21 and 22).

In the structure (Syntax) of "disparity_information" shown in FIG. 52 above, "interval_PTS[32..0]" is included. However, a "disparity_information" structure (Syntax) without "interval_PTS[32..0]" is also conceivable. In that case, the structure of "disparity_information" is as shown in FIG. 54.

Returning to FIG. 33, the video encoder 122 encodes the stereoscopic image data supplied from the data extraction unit 121 using MPEG4-AVC, MPEG2, VC-1, or the like, and generates a video elementary stream. The audio encoder 123 encodes the audio data supplied from the data extraction unit 121 using MPEG-2 Audio AAC or the like, and generates an audio elementary stream.

The multiplexer 127 multiplexes the elementary streams output from the video encoder 122, the audio encoder 123, and the caption encoder 126. The multiplexer 127 then outputs bit stream data (transport stream) BSD as transmission data (a multiplexed data stream).

The operation of the transmission data generation unit 110A shown in FIG. 33 will be briefly described. The stereoscopic image data output from the data extraction unit 121 is supplied to the video encoder 122. The video encoder 122 encodes the stereoscopic image data using MPEG4-AVC, MPEG2, VC-1, or the like, and generates a video elementary stream containing the encoded video data. This video elementary stream is supplied to the multiplexer 127.

The caption generation unit 124 generates ARIB-format caption data. This caption data is supplied to the caption encoder 126. The caption encoder 126 generates a caption elementary stream (caption data stream) containing the caption data generated by the caption generation unit 124. This caption elementary stream is supplied to the multiplexer 127.

The per-pixel disparity vectors output from the data extraction unit 121 are supplied to the disparity information creating unit 125. Through downsizing processing, the disparity information creating unit 125 creates disparity vectors (horizontal disparity vectors) corresponding to a predetermined number of caption units (captions) displayed on the same screen. In this case, the disparity information creating unit 125 creates either a disparity vector for each caption unit (individual disparity vector) or a disparity vector common to all the caption units (common disparity vector).

The disparity vectors created by the disparity information creating unit 125 are supplied to the caption encoder 126. The caption encoder 126 includes the disparity vectors in the caption data stream (see FIGS. 35 to 38). In the caption data stream, the caption data of each caption unit displayed on the same screen is inserted into the PES stream of the caption text data group as caption text data (caption codes). In addition, the disparity vectors (disparity information) are inserted into the PES stream of the caption management data group, or into the PES stream of the caption text data group, as caption display control information. In this case, the disparity vectors are inserted into the newly defined extended display control data unit, which carries display control information (see FIG. 49).

また、データ取り出し部１２１から出力される音声データはオーディオエンコーダ１２３に供給される。このオーディオエンコーダ１２３では、音声データに対して、ＭＰＥＧ−２ＡｕｄｉｏＡＡＣ等の符号化が施され、符号化オーディオデータを含むオーディオエレメンタリストリームが生成される。このオーディオエレメンタリストリームはマルチプレクサ１２７に供給される。 The audio data output from the data extraction unit 121 is supplied to the audio encoder 123. In the audio encoder 123, encoding such as MPEG-2 Audio AAC is performed on the audio data, and an audio elementary stream including the encoded audio data is generated. This audio elementary stream is supplied to the multiplexer 127.

マルチプレクサ１２７には、上述したように、ビデオエンコーダ１２２、オーディオエンコーダ１２３および字幕エンコーダ１２６からのエレメンタリストリームが供給される。そして、このマルチプレクサ１２７では、各エンコーダから供給されるエレメンタリストリームがパケット化されて多重され、伝送データとしてのビットストリームデータ（トランスポートストリーム）ＢＳＤが得られる。 As described above, the elementary stream from the video encoder 122, the audio encoder 123, and the caption encoder 126 is supplied to the multiplexer 127. In the multiplexer 127, the elementary streams supplied from the encoders are packetized and multiplexed to obtain bit stream data (transport stream) BSD as transmission data.

図５５は、ビデオエレメンタリストリーム、オーディオエレメンタリストリーム、字幕エレメンタリストリームを含む一般的なトランスポートストリーム（多重化データストリーム）の構成例を示している。このトランスポートストリームには、各エレメンタリストリームをパケット化して得られたＰＥＳパケットが含まれている。この構成例では、ビデオエレメンタリストリームのＰＥＳパケット「Video PES」が含まれている。また、この構成例では、オーディオエレメンタリストリームのＰＥＳパケット「Audio PES」および字幕エレメンタリストリームのＰＥＳパケット「SubtitlePES」が含まれている。 FIG. 55 illustrates a configuration example of a general transport stream (multiplexed data stream) including a video elementary stream, an audio elementary stream, and a caption elementary stream. This transport stream includes PES packets obtained by packetizing each elementary stream. In this configuration example, a PES packet “Video PES” of a video elementary stream is included. In addition, in this configuration example, the PES packet “Audio PES” of the audio elementary stream and the PES packet “SubtitlePES” of the caption elementary stream are included.

また、トランスポートストリームには、ＰＳＩ（Program Specific Information）として、ＰＭＴ（Program Map Table）が含まれている。このＰＳＩは、トランスポートストリームに含まれる各エレメンタリストリームがどのプログラムに属しているかを記した情報である。また、トランスポートストリームには、イベント単位の管理を行うＳＩ（Service Information）としてのＥＩＴ（Event Information Table）が含まれている。 In addition, the transport stream includes a PMT (Program Map Table) as PSI (Program Specific Information). This PSI is information describing to which program each elementary stream included in the transport stream belongs. The transport stream also includes an EIT (Event Information Table) as SI (Service Information) for management on a per-event basis.

ＰＭＴには、プログラム全体に関連する情報を記述するプログラム・デスクリプタ（Program Descriptor）が存在する。また、このＰＭＴには、各エレメンタリストリームに関連した情報を持つエレメンタリ・ループが存在する。この構成例では、ビデオエレメンタリ・ループ、オーディオエレメンタリ・ループ、サブタイトルエレメンタリ・ループが存在する。各エレメンタリ・ループには、ストリーム毎に、パケット識別子（PID）、ストリームタイプ（Stream_Type）等の情報が配置されると共に、図示していないが、そのエレメンタリストリームに関連する情報を記述するデスクリプタも配置される。 The PMT includes a program descriptor (Program Descriptor) that describes information related to the entire program. The PMT also includes elementary loops, each holding information related to one elementary stream. In this configuration example, there are a video elementary loop, an audio elementary loop, and a subtitle elementary loop. In each elementary loop, information such as a packet identifier (PID) and a stream type (Stream_Type) is arranged for each stream and, although not shown, a descriptor that describes information related to that elementary stream is also arranged.
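
As an illustration of the elementary loops described above, the following is a minimal Python sketch that walks the loop bytes of a PMT and collects the stream type, PID, and descriptor bytes of each entry. The field widths follow the general MPEG-2 Systems PMT syntax (8-bit stream type, 13-bit PID, 12-bit descriptor length); the names `ElementaryLoopEntry` and `parse_elementary_loops` are hypothetical and do not come from this document, and section headers and CRC handling are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ElementaryLoopEntry:
    stream_type: int    # Stream_Type of the elementary stream
    pid: int            # 13-bit packet identifier (PID)
    descriptors: bytes  # raw descriptor bytes attached to this stream


def parse_elementary_loops(data: bytes) -> List[ElementaryLoopEntry]:
    """Parse the elementary-loop region of a PMT (hypothetical helper)."""
    entries = []
    pos = 0
    while pos + 5 <= len(data):
        stream_type = data[pos]
        # The upper 3 bits of the next byte are reserved; the PID is 13 bits.
        pid = ((data[pos + 1] & 0x1F) << 8) | data[pos + 2]
        # The upper 4 bits are reserved; the descriptor length is 12 bits.
        es_info_length = ((data[pos + 3] & 0x0F) << 8) | data[pos + 4]
        start = pos + 5
        end = start + es_info_length
        if end > len(data):
            raise ValueError("truncated elementary loop")
        entries.append(ElementaryLoopEntry(stream_type, pid, data[start:end]))
        pos = end
    return entries
```

A receiver would typically scan these entries to find the PID of the caption elementary stream before any caption packet is decoded.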

この実施の形態において、マルチプレクサ１２７（図３３参照）から出力されるトランスポートストリーム（多重化データストリーム）には、字幕データストリームが、字幕の拡張表示制御に対応しているか否かを示すフラグ情報が挿入されている。ここで、字幕の拡張表示制御は、例えば視差情報を用いた３次元字幕表示などである。この場合、受信側（セットトップボックス２００）においては、字幕データストリーム内のデータを開くことなく、この字幕データストリームが字幕の拡張表示制御に対応しているか否かを把握可能となる。 In this embodiment, flag information indicating whether or not the caption data stream supports extended display control of captions is inserted into the transport stream (multiplexed data stream) output from the multiplexer 127 (see FIG. 33). Here, the extended display control of captions is, for example, three-dimensional caption display using disparity information. In this case, the receiving side (set top box 200) can determine whether or not the caption data stream supports extended display control of captions without opening the data in the caption data stream.

マルチプレクサ１２７は、このフラグ情報を、例えば、上述のＥＩＴの配下に挿入する。図５５の構成例では、ＥＩＴの配下に、データコンテンツ記述子が挿入されている。このデータコンテンツ記述子に、フラグ情報「Advanced_Rendering_support」が含まれている。図５６は、データコンテンツ記述子の構造例（Syntax）を示している。「descriptor_tag」は、デスクリプタ（記述子）のタイプを示す８ビットのデータであり、ここでは、データコンテンツ記述子であることを示す。「descriptor_length」は、デスクリプタの長さ（サイズ）を示す８ビットのデータである。このデータは、デスクリプタの長さとして、「descriptor_length」以降のバイト数を示す。 The multiplexer 127 inserts this flag information, for example, under the above-mentioned EIT. In the configuration example of FIG. 55, a data content descriptor is inserted under the EIT. This data content descriptor includes the flag information “Advanced_Rendering_support”. FIG. 56 shows a structure example (Syntax) of the data content descriptor. “descriptor_tag” is 8-bit data indicating the type of the descriptor, and here indicates a data content descriptor. “descriptor_length” is 8-bit data indicating the length (size) of the descriptor, expressed as the number of bytes that follow “descriptor_length”.

「component_tag」は、字幕のエレメンタリストリームとの関連付けを行う８ビットのデータである。この「component_tag」の後に、「arib_caption_info」が定義されている。図５７（ａ）は、この「arib_caption_info」の構造例（Syntax）を示している。「Advanced_Rendering_support」は、図５７（ｂ）に示すように、字幕データストリームが字幕の拡張表示制御に対応しているか否かを示す１ビットのフラグ情報である。“１”は、字幕の拡張表示制御に対応していることを示す。“０”は、字幕の拡張表示制御に対応していないことを示す。 “component_tag” is 8-bit data that associates the descriptor with the caption elementary stream. After this “component_tag”, “arib_caption_info” is defined. FIG. 57(a) shows a structure example (Syntax) of this “arib_caption_info”. As shown in FIG. 57(b), “Advanced_Rendering_support” is 1-bit flag information indicating whether or not the caption data stream supports extended display control of captions. “1” indicates that extended display control of captions is supported. “0” indicates that it is not supported.
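
The descriptor fields described above can be read with a short loop. The Python sketch below relies only on the stated rule that “descriptor_length” counts the bytes that follow it. The position of “Advanced_Rendering_support” is an assumption made for illustration: the sketch treats the first body byte as “component_tag” and the most significant bit of the next byte as the flag, whereas the real bit layout is defined by FIG. 56 and FIG. 57(a), which are not reproduced here. The function names are hypothetical.

```python
from typing import Iterator, Tuple


def iter_descriptors(buf: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (descriptor_tag, body) pairs from a descriptor loop."""
    pos = 0
    while pos + 2 <= len(buf):
        tag = buf[pos]
        length = buf[pos + 1]  # number of bytes after descriptor_length
        body = buf[pos + 2:pos + 2 + length]
        if len(body) != length:
            raise ValueError("descriptor runs past the end of the buffer")
        yield tag, body
        pos += 2 + length


def advanced_rendering_support(body: bytes) -> bool:
    """Read the flag from a data content descriptor body.

    ASSUMPTION: body[0] is component_tag and the flag is the MSB of
    body[1] (the first byte of arib_caption_info). This layout is
    hypothetical; see FIG. 56 and FIG. 57(a) for the real syntax.
    """
    if len(body) < 2:
        raise ValueError("descriptor body too short")
    return bool(body[1] & 0x80)
```

Because the flag sits in a descriptor rather than in the caption stream, a receiver can run this kind of check on the EIT alone.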

なお、マルチプレクサ１２７は、ＰＭＴの配下に、上述のフラグ情報を挿入することもできる。図５８は、その場合におけるトランスポートストリーム（多重化データストリーム）の構成例を示している。この構成例では、ＰＭＴの字幕ＥＳループの配下にデータ符号化方式記述子が挿入されている。このデータ符号化方式記述子に、フラグ情報「Advanced_Rendering_support」が含まれている。 The multiplexer 127 can also insert the flag information described above under the PMT. FIG. 58 shows a configuration example of a transport stream (multiplexed data stream) in that case. In this configuration example, a data encoding scheme descriptor is inserted under the subtitle ES loop of the PMT. The data encoding scheme descriptor includes flag information “Advanced_Rendering_support”.

図５９は、データ符号化方式記述子の構造例（Syntax）を示している。「descriptor_tag」は、デスクリプタ（記述子）のタイプを示す８ビットのデータであり、ここでは、データ符号化方式記述子であることを示す。「descriptor_length」は、デスクリプタの長さ（サイズ）を示す８ビットのデータである。このデータは、デスクリプタの長さとして、「descriptor_length」以降のバイト数を示す。 FIG. 59 shows a structure example (Syntax) of the data encoding scheme descriptor. “descriptor_tag” is 8-bit data indicating the type of the descriptor, and here indicates a data encoding scheme descriptor. “descriptor_length” is 8-bit data indicating the length (size) of the descriptor, expressed as the number of bytes that follow “descriptor_length”.

「component_tag」は、字幕のエレメンタリストリームとの関連付けを行う８ビットのデータである。「data_component_id」は、ここでは、字幕データを示す“0x0008”とされる。この「data_component_id」の後に、「additional_arib_caption_info」が定義されている。図６０は、この「additional_arib_caption_info」の構造例（Syntax）を示している。「Advanced_Rendering_support」は、上述の図５７（ｂ）に示すように、字幕データストリームが字幕の拡張表示制御に対応しているか否かを示す１ビットのフラグ情報である。“１”は、字幕の拡張表示制御に対応していることを示す。“０”は、字幕の拡張表示制御に対応していないことを示す。 “component_tag” is 8-bit data that associates the descriptor with the caption elementary stream. Here, “data_component_id” is set to “0x0008”, which indicates caption data. After this “data_component_id”, “additional_arib_caption_info” is defined. FIG. 60 shows a structure example (Syntax) of this “additional_arib_caption_info”. As shown in FIG. 57(b) described above, “Advanced_Rendering_support” is 1-bit flag information indicating whether or not the caption data stream supports extended display control of captions. “1” indicates that extended display control of captions is supported. “0” indicates that it is not supported.
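
For the PMT variant, a receiver could first confirm that “data_component_id” is 0x0008 and only then read the flag. The Python sketch below shows that check. The byte layout is an assumption for illustration: an 8-bit “component_tag”, a 16-bit big-endian “data_component_id”, and the flag in the most significant bit of the first “additional_arib_caption_info” byte. The actual syntax is the one in FIG. 59 and FIG. 60, and the function name is hypothetical.

```python
import struct

ARIB_CAPTION_DATA_COMPONENT_ID = 0x0008  # value given in the text for caption data


def caption_stream_supports_extended_display(body: bytes) -> bool:
    """Check a data encoding scheme descriptor body (hypothetical layout).

    ASSUMED layout of body (the bytes after descriptor_length):
      byte 0     component_tag
      bytes 1-2  data_component_id, big-endian
      byte 3     first byte of additional_arib_caption_info,
                 Advanced_Rendering_support in the MSB
    """
    if len(body) < 4:
        return False
    _component_tag, data_component_id = struct.unpack_from(">BH", body, 0)
    if data_component_id != ARIB_CAPTION_DATA_COMPONENT_ID:
        return False  # not a caption data component
    return bool(body[3] & 0x80)
```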

上述したように、図３３に示す送信データ生成部１１０Ａにおいては、マルチプレクサ１２７から出力されるビットストリームデータＢＳＤは、ビデオデータストリームと字幕データストリームとを有する多重化データストリームである。ビデオデータストリームには、立体画像データが含まれている。また、字幕データストリームには、ＡＲＩＢ方式の字幕（キャプション・ユニット）のデータおよび視差ベクトル（視差情報）が含まれている。 As described above, in the transmission data generation unit 110A illustrated in FIG. 33, the bit stream data BSD output from the multiplexer 127 is a multiplexed data stream including a video data stream and a caption data stream. The video data stream includes stereoscopic image data. Also, the caption data stream includes ARIB caption data (caption unit) and disparity vectors (disparity information).

また、字幕管理データグループのＰＥＳストリーム内、あるいは字幕文データグループのＰＥＳストリーム内の字幕表示制御情報を送出するデータユニットに視差情報が挿入され、字幕文データ（字幕文情報）と視差情報との対応付けが行われている。そのため、受信側（セットトップボックス２００）においては、左眼画像および右眼画像に重畳されるキャプション・ユニット（字幕）に、対応する視差ベクトル（視差情報）を用いて適切な視差を付与できる。したがって、キャプション・ユニット（字幕）の表示において、画像内の各物体との間の遠近感の整合性を最適な状態に維持できる。 Also, disparity information is inserted into a data unit for transmitting caption display control information in the PES stream of the caption management data group or in the PES stream of the caption sentence data group, and the caption sentence data (subtitle sentence information) and the disparity information are Correspondence has been made. Therefore, on the reception side (set top box 200), appropriate parallax can be given to caption units (captions) superimposed on the left-eye image and right-eye image using the corresponding disparity vector (disparity information). Therefore, in the display of caption units (captions), perspective consistency with each object in the image can be maintained in an optimum state.

また、図３３に示す送信データ生成部１１０Ａにおいては、新たに定義された拡張表示制御のデータユニットに、字幕表示期間内で共通に使用される視差情報（図５２の「default_disparity」参照）が挿入される。また、このデータユニットに、字幕表示期間内で順次更新される視差情報（図２１の「disparity_update」参照）の挿入が可能とされている。そして、この拡張表示制御のデータユニットには、字幕表示期間内で順次更新される視差情報の存在を示すフラグ情報が挿入される（図５２の「temporal_extension_flag」参照）。 Also, in the transmission data generation unit 110A illustrated in FIG. 33, disparity information commonly used within the caption display period (see “default_disparity” in FIG. 52) is inserted into the newly defined extended display control data unit. Disparity information that is sequentially updated within the caption display period (see “disparity_update” in FIG. 21) can also be inserted into this data unit. In addition, flag information indicating the presence of the sequentially updated disparity information is inserted into the extended display control data unit (see “temporal_extension_flag” in FIG. 52).

そのため、字幕表示期間内で共通に使用される視差情報のみを送信するか、さらに、字幕表示期間内で順次更新される視差情報を送信するかを選択することが可能となる。字幕表示期間内で順次更新される視差情報を送信することで、受信側（セットトップボックス２００）において、重畳情報に付与する視差を画像内容の変化に連動して動的に変化させることが可能となる。 For this reason, it is possible to select whether to transmit only the disparity information commonly used within the caption display period, or to additionally transmit disparity information that is sequentially updated within the caption display period. Transmitting the sequentially updated disparity information allows the receiving side (set top box 200) to dynamically change the disparity given to the superimposition information in conjunction with changes in the image content.

また、図３３に示す送信データ生成部１１０Ａにおいて、拡張表示制御のデータユニットに含まれる視差情報は、サブタイトル表示期間の最初のフレームの視差情報と、その後の更新フレーム間隔毎のフレームの視差情報とからなるものとされる。そのため、送信データ量を低減でき、また、受信側において、視差情報を保持するためのメモリ容量の大幅な節約が可能となる。 In addition, in the transmission data generation unit 110A illustrated in FIG. 33, the disparity information included in the extended display control data unit consists of the disparity information of the first frame in the subtitle display period and the disparity information of the frames at each subsequent update frame interval. Therefore, the amount of transmission data can be reduced, and the memory capacity needed on the receiving side to hold the disparity information can be greatly reduced.
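
The data reduction can be seen with a small worked example. The Python sketch below keeps the disparity of the first frame and of every frame at the update frame interval, assuming a per-frame disparity list is available on the transmission side. The function name and the 65-frame display period are hypothetical; the 16-frame interval is the example value used elsewhere in this description.

```python
from typing import List


def subsample_disparity(per_frame: List[int], update_interval: int) -> List[int]:
    """Keep the first frame's disparity and one value per update interval."""
    if update_interval < 1:
        raise ValueError("update_interval must be at least 1 frame")
    # Index 0 is the first frame of the display period; the remaining
    # indices are multiples of the update frame interval.
    return per_frame[::update_interval]


# Hypothetical 65-frame caption display period with a 16-frame interval:
# only frames 0, 16, 32, 48 and 64 are transmitted, 5 values instead of 65.
transmitted = subsample_disparity(list(range(65)), 16)
```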

また、図３３に示す送信データ生成部１１０Ａにおいて、拡張表示制御のデータユニットに挿入される「disparity_temporal_extension()」は、上述のＳＣＳのセグメントに含まれる「disparity_temporal_extension()」と同じ構造のものである（図２１参照）。そのため、詳細説明は省略するが、図３３に示す送信データ生成部１１０Ａは、この「disparity_temporal_extension()」の構造により、図２に示す送信データ生成部１１０と同様の効果を得ることができる。 In addition, in the transmission data generation unit 110A illustrated in FIG. 33, “disparity_temporal_extension ()” inserted into the extended display control data unit has the same structure as “disparity_temporal_extension ()” included in the SCS segment described above. (See FIG. 21). Therefore, although detailed description is omitted, the transmission data generation unit 110A illustrated in FIG. 33 can obtain the same effect as the transmission data generation unit 110 illustrated in FIG. 2 by the structure of “disparity_temporal_extension ()”.

「ビットストリーム処理部の構成例」 “Configuration example of the bit stream processing unit”
図６１は、上述の図３３に示す送信データ生成部１１０Ａに対応した、セットトップボックス２００のビットストリーム処理部２０１Ａの構成例を示している。このビットストリーム処理部２０１Ａは、上述の図３３に示す送信データ生成部１１０Ａに対応した構成となっている。このビットストリーム処理部２０１Ａは、デマルチプレクサ２３１と、ビデオデコーダ２３２と、字幕デコーダ２３３を有している。さらに、このビットストリーム処理部２０１Ａは、立体画像用字幕発生部２３４と、視差情報取り出し部２３５と、視差情報処理部２３６と、ビデオ重畳部２３７と、オーディオデコーダ２３８を有している。 FIG. 61 illustrates a configuration example of the bit stream processing unit 201A of the set top box 200, which corresponds to the transmission data generation unit 110A illustrated in FIG. 33 described above. This bit stream processing unit 201A has a configuration corresponding to the transmission data generation unit 110A illustrated in FIG. 33. The bit stream processing unit 201A includes a demultiplexer 231, a video decoder 232, and a caption decoder 233. The bit stream processing unit 201A further includes a stereoscopic image caption generation unit 234, a disparity information extracting unit 235, a disparity information processing unit 236, a video superimposing unit 237, and an audio decoder 238.

デマルチプレクサ２３１は、ビットストリームデータＢＳＤから、ビデオ、オーディオ、字幕のパケットを抽出し、各デコーダに送る。ビデオデコーダ２３２は、上述の送信データ生成部１１０Ａのビデオエンコーダ１２２とは逆の処理を行う。すなわち、デマルチプレクサ２３１で抽出されたビデオのパケットからビデオのエレメンタリストリームを再構成し、復号化処理を行って、左眼画像データおよび右眼画像データを含む立体画像データを得る。この立体画像データの伝送方式は、例えば、上述の第１の伝送方式（「Top & Bottom」方式）、第２の伝送方式（「Side By Side」方式）、第３の伝送方式（「Frame Sequential」方式）などである（図４参照）。 The demultiplexer 231 extracts video, audio, and caption packets from the bit stream data BSD and sends them to the respective decoders. The video decoder 232 performs processing reverse to that of the video encoder 122 of the transmission data generation unit 110A described above. That is, it reconstructs a video elementary stream from the video packets extracted by the demultiplexer 231 and performs decoding processing to obtain stereoscopic image data including left-eye image data and right-eye image data. The transmission method of this stereoscopic image data is, for example, the first transmission method (“Top & Bottom” method), the second transmission method (“Side By Side” method), or the third transmission method (“Frame Sequential” method) described above (see FIG. 4).

字幕デコーダ２３３は、上述の送信データ生成部１１０Ａの字幕エンコーダ１２６とは逆の処理を行う。すなわち、字幕デコーダ２３３は、デマルチプレクサ２３１で抽出された字幕のパケットから字幕エレメンタリストリーム（字幕データストリーム）を再構成し、復号化処理を行って、各キャプション・ユニットの字幕データ（ＡＲＩＢ方式の字幕データ）を得る。 The caption decoder 233 performs processing reverse to that of the caption encoder 126 of the transmission data generation unit 110A described above. That is, the caption decoder 233 reconstructs a caption elementary stream (caption data stream) from the caption packets extracted by the demultiplexer 231 and performs decoding processing to obtain the caption data (ARIB format caption data) of each caption unit.

視差情報取り出し部２３５は、字幕デコーダ２３３を通じて得られる字幕のストリームから、各キャプション・ユニットに対応した視差ベクトル（視差情報）を取り出す。この場合、キャプション・ユニット毎の視差ベクトル（個別視差ベクトル）、あるいは各キャプション・ユニットに共通の視差ベクトル（共通視差ベクトル）が得られる（図３５〜図３８参照）。 The disparity information extracting unit 235 extracts disparity vectors (disparity information) corresponding to each caption unit from the caption stream obtained through the caption decoder 233. In this case, a disparity vector (individual disparity vector) for each caption unit or a disparity vector common to each caption unit (common disparity vector) is obtained (see FIGS. 35 to 38).

上述したように、字幕データストリームには、ＡＲＩＢ方式の字幕（キャプション・ユニット）のデータおよび視差情報（視差ベクトル）が含まれている。そして、視差情報は、字幕の表示制御情報を送出するデータユニットに挿入されている。そのため、視差情報取り出し部２３５は、各キャプション・ユニットの字幕データと対応付けて、視差情報（視差ベクトル）を取り出すことができる。 As described above, the caption data stream includes ARIB-style caption (caption unit) data and disparity information (disparity vector). The disparity information is inserted into a data unit that transmits subtitle display control information. Therefore, the disparity information extracting unit 235 can extract disparity information (disparity vector) in association with the caption data of each caption unit.

視差情報取り出し部２３５は、字幕表示期間内で共通に使用される視差情報（図５２の「default_disparity」参照）を取得する。また、この視差情報取り出し部２３５は、さらに、字幕表示期間内で順次更新される視差情報（図２１の「disparity_update」参照）を取得することもある。視差情報取り出し部２３５は、この視差情報（視差ベクトル）を、視差情報処理部２３６を通じて、立体画像用字幕発生部２３４に送る。この字幕表示期間内で順次更新される視差情報は、上述したように、字幕表示期間の最初のフレームの視差情報と、その後のベースセグメント期間（更新フレーム間隔）毎のフレームの視差情報とからなっている。 The disparity information extracting unit 235 acquires disparity information (see “default_disparity” in FIG. 52) that is commonly used in the caption display period. Further, the disparity information extracting unit 235 may acquire disparity information (see “disparity_update” in FIG. 21) that is sequentially updated within the caption display period. The disparity information extracting unit 235 sends the disparity information (disparity vector) to the stereoscopic image caption generating unit 234 through the disparity information processing unit 236. As described above, the disparity information sequentially updated within the caption display period is composed of the disparity information of the first frame in the caption display period and the disparity information of the frame for each subsequent base segment period (update frame interval). ing.

視差情報処理部２３６は、字幕表示期間内で共通に使用される視差情報に関しては、そのまま立体画像用字幕発生部２３４に送る。一方、視差情報処理部２３６は、字幕表示期間内で順次更新される視差情報に関しては、補間処理を施して、字幕表示期間内における任意のフレーム間隔、例えば、１フレーム間隔の視差情報を生成して、立体画像用字幕発生部２３４に送る。視差情報処理部２３６は、この補間処理として、線形補間処理ではなく、時間方向（フレーム方向）にローパスフィルタ（ＬＰＦ）処理を伴った補間処理を行って、補間処理後の所定フレーム間隔の視差情報の時間方向（フレーム方向）の変化をなだらかにしている（図３１参照）。 The disparity information processing unit 236 sends the disparity information commonly used within the caption display period to the stereoscopic image caption generation unit 234 as it is. On the other hand, for the disparity information that is sequentially updated within the caption display period, the disparity information processing unit 236 performs interpolation processing to generate disparity information at an arbitrary frame interval within the caption display period, for example, at one-frame intervals, and sends it to the stereoscopic image caption generation unit 234. As this interpolation processing, the disparity information processing unit 236 performs not simple linear interpolation but interpolation accompanied by low-pass filter (LPF) processing in the time direction (frame direction), so that the interpolated disparity information at the predetermined frame interval changes smoothly in the time direction (see FIG. 31).
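
One way to picture interpolation accompanied by low-pass filtering is sketched below in Python. The description does not specify the filter, so the sketch makes two labeled assumptions: the received values are first expanded to one value per frame by piecewise-linear upsampling, and the result is then smoothed with a simple moving-average filter whose window shrinks at the ends of the display period. The function names and the default tap count are hypothetical.

```python
from typing import List, Sequence


def upsample_linear(samples: Sequence[float], interval: int) -> List[float]:
    """Expand values sent every `interval` frames to one value per frame."""
    if not samples:
        return []
    frames: List[float] = []
    for a, b in zip(samples, samples[1:]):
        for k in range(interval):
            frames.append(a + (b - a) * k / interval)
    frames.append(float(samples[-1]))
    return frames


def low_pass(frames: Sequence[float], taps: int) -> List[float]:
    """Moving-average LPF in the time (frame) direction (assumed filter)."""
    half = taps // 2
    out = []
    for i in range(len(frames)):
        window = frames[max(0, i - half):min(len(frames), i + half + 1)]
        out.append(sum(window) / len(window))
    return out


def interpolate_with_lpf(samples: Sequence[float], interval: int,
                         taps: int = 5) -> List[float]:
    """Per-frame disparity whose change over time is smoothed."""
    return low_pass(upsample_linear(samples, interval), taps)
```

With the values 0, 16, 0 sent at 16-frame intervals, the sharp peak that plain linear interpolation would produce at frame 16 is rounded off by the filter, which is the kind of gentle transition the text describes.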

立体画像用字幕発生部２３４は、左眼画像および右眼画像にそれぞれ重畳する左眼字幕および右眼字幕のデータを生成する。この生成処理は、字幕デコーダ２３３で得られた各キャプション・ユニットの字幕データと、視差情報処理部２３６を通じて供給される視差情報（視差ベクトル）に基づいて行われる。そして、この立体画像用字幕発生部２３４は、左眼字幕および右眼字幕のデータ（ビットマップデータ）を出力する。 The stereoscopic image caption generation unit 234 generates left-eye caption data and right-eye caption data to be superimposed on the left-eye image and the right-eye image, respectively. This generation processing is performed based on the caption data of each caption unit obtained by the caption decoder 233 and the disparity information (disparity vector) supplied through the disparity information processing unit 236. The stereoscopic image caption generation unit 234 then outputs the left-eye caption and right-eye caption data (bitmap data).

この場合、左眼および右眼の字幕（キャプション・ユニット）は同一の情報である。しかし、画像内の重畳位置が、例えば、左眼の字幕と右眼の字幕とは、視差ベクトル分だけ、水平方向にずれるようにされる。これにより、左眼画像および右眼画像に重畳される同一の字幕として、画像内の各物体の遠近感に応じて視差調整が施されたものを用いることができ、この字幕の表示において、画像内の各物体との間の遠近感の整合性を維持するようにされる。 In this case, the left-eye and right-eye captions (caption units) carry the same information. However, their superimposed positions in the image are shifted from each other in the horizontal direction by, for example, the amount of the disparity vector. As a result, the same caption superimposed on the left-eye image and the right-eye image can be given a disparity adjusted according to the perspective of each object in the image, so that perspective consistency with the objects in the image is maintained when the caption is displayed.
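
The horizontal offset between the two captions can be sketched in a few lines of Python. The sketch assumes that the left-eye caption stays at its nominal position and the right-eye caption is moved by the disparity value, in pixels. The sign convention and the function name are hypothetical and chosen only for illustration.

```python
from typing import Tuple

Position = Tuple[int, int]


def caption_positions(x: int, y: int, disparity: int) -> Tuple[Position, Position]:
    """Return (left_eye_position, right_eye_position) for one caption unit.

    ASSUMPTION: the left-eye caption stays at (x, y) and the right-eye
    caption is shifted horizontally by `disparity` pixels.
    """
    left = (x, y)
    right = (x + disparity, y)  # same bitmap, shifted horizontally only
    return left, right
```

The caption bitmap itself is identical for both eyes; only the horizontal superimposition position differs.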

ここで、立体画像用字幕発生部２３４は、視差情報処理部２３６から字幕表示期間内で共通に使用される視差情報（視差ベクトル）のみが送られてくる場合、その視差情報を使用する。また、立体画像用字幕発生部２３４は、視差情報処理部２３６から、さらに字幕表示期間内で順次更新される視差情報も送られてくる場合には、いずれかを使用する。 Here, when only the disparity information (disparity vector) commonly used within the caption display period is sent from the disparity information processing unit 236, the stereoscopic image caption generation unit 234 uses that disparity information. When disparity information that is sequentially updated within the caption display period is also sent from the disparity information processing unit 236, the stereoscopic image caption generation unit 234 uses one of the two.

いずれを使用するかは、例えば、上述したように、拡張表示制御のデータユニットに含まれている、字幕表示の際に受信側（デコーダ側）で必須の視差情報（disparity）対応レベルを示す情報（図５２の「rendering_level」参照）に拘束される。その場合、例えば、“００”であるときは、ユーザ設定による。字幕表示期間内で順次更新される視差情報を用いることで、左眼および右眼に付与する視差を画像内容の変化に連動して動的に変化させることが可能となる。 Which one is used is, for example, as described above, information indicating the disparity information (disparity) correspondence level that is included in the extended display control data unit and is essential on the reception side (decoder side) when displaying captions. (See “rendering_level” in FIG. 52). In this case, for example, when “00”, it depends on the user setting. By using the disparity information that is sequentially updated within the caption display period, it is possible to dynamically change the disparity to be given to the left eye and the right eye in conjunction with the change in the image content.

ビデオ重畳部２３７は、ビデオデコーダ２３２で得られた立体画像データ（左眼画像データ、右眼画像データ）に対し、立体画像用字幕発生部２３４で発生された左眼および右眼の字幕のデータ（ビットマップデータ）を重畳し、表示用立体画像データＶoutを得る。そして、このビデオ重畳部２３７は、表示用立体画像データＶoutを、ビットストリーム処理部２０１Ａの外部に出力する。 The video superimposing unit 237 superimposes the left-eye and right-eye caption data (bitmap data) generated by the stereoscopic image caption generation unit 234 on the stereoscopic image data (left-eye image data and right-eye image data) obtained by the video decoder 232 to obtain display stereoscopic image data Vout. The video superimposing unit 237 then outputs the display stereoscopic image data Vout to the outside of the bit stream processing unit 201A.

また、オーディオデコーダ２３８は、上述の送信データ生成部１１０Ａのオーディオエンコーダ１２３とは逆の処理を行う。すなわち、このオーディオデコーダ２３８は、デマルチプレクサ２３１で抽出されたオーディオのパケットからオーディオのエレメンタリストリームを再構成し、復号化処理を行って、音声データＡoutを得る。そして、このオーディオデコーダ２３８は、音声データＡoutを、ビットストリーム処理部２０１Ａの外部に出力する。 Further, the audio decoder 238 performs a process reverse to that of the audio encoder 123 of the transmission data generation unit 110A described above. That is, the audio decoder 238 reconstructs an audio elementary stream from the audio packet extracted by the demultiplexer 231 and performs a decoding process to obtain audio data Aout. The audio decoder 238 then outputs the audio data Aout to the outside of the bit stream processing unit 201A.

図６１に示すビットストリーム処理部２０１Ａの動作を簡単に説明する。デジタルチューナ２０４（図２９参照）から出力されるビットストリームデータＢＳＤは、デマルチプレクサ２３１に供給される。このデマルチプレクサ２３１では、ビットストリームデータＢＳＤから、ビデオ、オーディオおよび字幕のパケットが抽出され、各デコーダに供給される。 The operation of the bit stream processing unit 201A shown in FIG. 61 will be briefly described. The bit stream data BSD output from the digital tuner 204 (see FIG. 29) is supplied to the demultiplexer 231. The demultiplexer 231 extracts video, audio, and subtitle packets from the bit stream data BSD and supplies them to each decoder.
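
The packet extraction step can be illustrated with a minimal Python sketch that routes fixed-size transport stream packets to decoders by PID. The 188-byte packet size, the 0x47 sync byte, and the 13-bit PID position follow the general MPEG-2 transport stream packet header. The PID values in the usage note and the decoder names are hypothetical, and the sketch stops at routing whole packets (no PES reassembly).

```python
from typing import Dict, List

TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def packet_pid(packet: bytes) -> int:
    """Extract the 13-bit PID from one transport stream packet."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid transport stream packet")
    return ((packet[1] & 0x1F) << 8) | packet[2]


def demultiplex(stream: bytes,
                pid_to_decoder: Dict[int, str]) -> Dict[str, List[bytes]]:
    """Group packets by destination decoder; unknown PIDs are dropped."""
    routed: Dict[str, List[bytes]] = {name: [] for name in pid_to_decoder.values()}
    for off in range(0, len(stream) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = stream[off:off + TS_PACKET_SIZE]
        name = pid_to_decoder.get(packet_pid(packet))
        if name is not None:
            routed[name].append(packet)
    return routed
```

A call such as `demultiplex(bsd, {0x100: "video", 0x110: "audio", 0x130: "caption"})` would return one packet list per decoder; in practice the PIDs come from the PMT rather than being hard-coded.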

ビデオデコーダ２３２では、デマルチプレクサ２３１で抽出されたビデオのパケットからビデオのエレメンタリストリームが再構成され、さらに復号化処理が行われて、左眼画像データおよび右眼画像データを含む立体画像データが得られる。この立体画像データは、ビデオ重畳部２３７に供給される。 In the video decoder 232, a video elementary stream is reconstructed from the video packet extracted by the demultiplexer 231, and further, decoding processing is performed so that stereoscopic image data including left eye image data and right eye image data is converted. can get. The stereoscopic image data is supplied to the video superimposing unit 237.

また、字幕デコーダ２３３では、デマルチプレクサ２３１で抽出された字幕のパケットから字幕エレメンタリストリームが再構成され、さらに復号化処理が行われて、各キャプション・ユニットの字幕データ（ＡＲＩＢ方式の字幕データ）が得られる。この各キャプション・ユニットの字幕データは、立体画像用字幕発生部２３４に供給される。 Also, the subtitle decoder 233 reconstructs a subtitle elementary stream from the subtitle packet extracted by the demultiplexer 231, further performs decoding processing, and subtitle data of each caption unit (ARIB method subtitle data) Is obtained. The caption data of each caption unit is supplied to the stereoscopic image caption generation unit 234.

また、視差情報取り出し部２３５では、字幕デコーダ２３３を通じて得られる字幕のストリームから、各キャプション・ユニットに対応した視差ベクトル（視差情報）が取り出される。この場合、キャプション・ユニット毎の視差ベクトル（個別視差ベクトル）、あるいは各キャプション・ユニットに共通の視差ベクトル（共通視差ベクトル）が得られる。 Also, the disparity information extracting unit 235 extracts disparity vectors (disparity information) corresponding to each caption unit from the caption stream obtained through the caption decoder 233. In this case, a disparity vector (individual disparity vector) for each caption unit, or a disparity vector common to each caption unit (common disparity vector) is obtained.

また、視差情報取り出し部２３５では、字幕表示期間内で共通に使用される視差情報、または、これと共に字幕表示期間内で順次更新される視差情報が取得される。視差情報取り出し部２３５で取り出された視差情報（視差ベクトル）は、視差情報処理部２３６を通じて、立体画像用字幕発生部２３４に送られる。視差情報処理部２３６では、字幕表示期間内で順次更新される視差情報に関して、以下の処理が行われる。すなわち、視差情報処理部２３６では、時間方向（フレーム方向）のＬＰＦ処理を伴った補間処理が施されて、字幕表示期間内における任意のフレーム間隔、例えば、１フレーム間隔の視差情報が生成されて、立体画像用字幕発生部２３４に送られる。 In addition, the disparity information extracting unit 235 acquires disparity information that is commonly used within the caption display period, or disparity information that is sequentially updated within the caption display period. The disparity information (disparity vector) extracted by the disparity information extracting unit 235 is sent to the stereoscopic image caption generating unit 234 through the disparity information processing unit 236. In the disparity information processing unit 236, the following processing is performed on the disparity information that is sequentially updated within the caption display period. That is, the disparity information processing unit 236 performs interpolation processing with LPF processing in the time direction (frame direction), and generates disparity information at an arbitrary frame interval, for example, one frame interval within the caption display period. And sent to the stereoscopic image caption generation unit 234.

立体画像用字幕発生部２３４では、各キャプション・ユニットの字幕データと、各キャプション・ユニットに対応した視差ベクトルに基づいて、左眼画像および右眼画像にそれぞれ重畳する左眼字幕および右眼字幕のデータ（ビットマップデータ）が生成される。この場合、画像内の重畳位置が、例えば、左眼の字幕に対して、右眼の字幕は、視差ベクトル分だけ、水平方向にずれるようにされる。この左眼字幕および右眼字幕のデータはビデオ重畳部２３７に供給される。 In the stereoscopic image caption generation unit 234, left-eye caption and right-eye caption data (bitmap data) to be superimposed on the left-eye image and the right-eye image, respectively, are generated based on the caption data of each caption unit and the disparity vector corresponding to each caption unit. In this case, the superimposed position of the right-eye caption in the image is shifted in the horizontal direction by, for example, the amount of the disparity vector relative to the left-eye caption. The left-eye caption and right-eye caption data are supplied to the video superimposing unit 237.

ビデオ重畳部２３７では、ビデオデコーダ２３２で得られた立体画像データに対し、立体画像用字幕発生部２３４で発生された左眼字幕および右眼字幕のデータ（ビットマップデータ）が重畳され、表示用立体画像データＶoutが得られる。この表示用立体画像データＶoutは、ビットストリーム処理部２０１Ａの外部に出力される。 The video superimposing unit 237 superimposes the left-eye caption data and the right-eye caption data (bitmap data) generated by the stereoscopic image caption generation unit 234 on the stereoscopic image data obtained by the video decoder 232 for display. Stereoscopic image data Vout is obtained. The display stereoscopic image data Vout is output to the outside of the bit stream processing unit 201A.

また、オーディオデコーダ２３８では、デマルチプレクサ２３１で抽出されたオーディオのパケットからオーディオエレメンタリストリームが再構成され、さらに復号化処理が行われて、上述の表示用立体画像データＶoutに対応した音声データＡoutが得られる。この音声データＡoutは、ビットストリーム処理部２０１Ａの外部に出力される。 Also, the audio decoder 238 reconstructs an audio elementary stream from the audio packet extracted by the demultiplexer 231, further performs decoding processing, and audio data Aout corresponding to the display stereoscopic image data Vout described above. Is obtained. The audio data Aout is output to the outside of the bit stream processing unit 201A.

上述したように、図６１に示すビットストリーム処理部２０１Ａに供給されるビットストリームデータＢＳＤに含まれる字幕データストリームに、字幕（キャプション・ユニット）のデータおよび視差ベクトル（視差情報）が含まれている。そして、字幕文データグループのＰＥＳストリーム内の字幕表示制御情報を送出するデータユニットに視差ベクトル（視差情報）が挿入され、字幕データと視差ベクトルとが対応付けられている。 As described above, the caption data stream included in the bit stream data BSD supplied to the bit stream processing unit 201A illustrated in FIG. 61 includes caption (caption unit) data and disparity vectors (disparity information). . A disparity vector (disparity information) is inserted into a data unit for transmitting caption display control information in the PES stream of the caption text data group, and the caption data and the disparity vector are associated with each other.

そのため、ビットストリーム処理部２０１Ａでは、左眼画像および右眼画像に重畳されるキャプション・ユニット（字幕）に、対応する視差ベクトル（視差情報）を用いて適切な視差を付与できる。したがって、キャプション・ユニット（字幕）の表示において、画像内の各物体との間の遠近感の整合性を最適な状態に維持できる。 Therefore, the bit stream processing unit 201A can give appropriate parallax to the caption unit (caption) superimposed on the left eye image and the right eye image using the corresponding parallax vector (disparity information). Therefore, in the display of caption units (captions), perspective consistency with each object in the image can be maintained in an optimum state.

また、図６１に示すビットストリーム処理部２０１Ａの視差情報取り出し部２３５では、字幕表示期間内で共通に使用される視差情報、または、これと共に字幕表示期間内で順次更新される視差情報が取得される。立体画像用字幕発生部２３４では、字幕表示期間内で順次更新される視差情報が使用されることで、左眼および右眼の字幕に付与する視差を画像内容の変化に連動して動的に変化させることが可能となる。 In addition, the disparity information extracting unit 235 of the bitstream processing unit 201A illustrated in FIG. 61 acquires disparity information that is commonly used within the caption display period, or the disparity information that is sequentially updated within the caption display period. The The stereoscopic image caption generation unit 234 uses disparity information that is sequentially updated within the caption display period, so that the disparity to be given to the left-eye and right-eye captions is dynamically linked with changes in the image content. It can be changed.

また、図６１に示すビットストリーム処理部２０１Ａの視差情報処理部２３６では、字幕表示期間内で順次更新される視差情報に対して補間処理が施されて字幕表示期間内における任意のフレーム間隔の視差情報が生成される。この場合、送信側（放送局１００）から１６フレーム等のベースセグメント期間（更新フレーム間隔）毎に視差情報が送信される場合であっても、左眼および右眼の字幕に付与される視差を、細かな間隔で、例えばフレーム毎に制御することが可能となる。 In addition, in the disparity information processing unit 236 of the bit stream processing unit 201A illustrated in FIG. 61, the disparity information sequentially updated within the caption display period is subjected to interpolation processing, and disparity information at an arbitrary frame interval within the caption display period is generated. In this case, even when disparity information is transmitted from the transmission side (broadcast station 100) only once per base segment period (update frame interval), such as every 16 frames, the disparity given to the left-eye and right-eye captions can be controlled at fine intervals, for example, for each frame.

また、図６１に示すビットストリーム処理部２０１Ａの視差情報処理部２３６では、時間方向（フレーム方向）のローパスフィルタ処理を伴った補間処理が行われる。そのため、送信側（放送局１００）からベースセグメント期間（更新フレーム間隔）毎に視差情報が送信される場合であっても、補間処理後の視差情報の時間方向（フレーム方向）の変化をなだらかにできる（図３１参照）。したがって、左眼および右眼の字幕に付与される視差の推移が、更新フレーム間隔毎に不連続となることによる違和感を抑制できる。 In addition, the disparity information processing unit 236 of the bit stream processing unit 201A illustrated in FIG. 61 performs interpolation processing accompanied by low-pass filter processing in the time direction (frame direction). Therefore, even when disparity information is transmitted from the transmission side (broadcast station 100) only once per base segment period (update frame interval), the change in the time direction (frame direction) of the disparity information after interpolation can be made gentle (see FIG. 31). This suppresses the sense of incongruity that would arise if the disparity given to the left-eye and right-eye captions changed discontinuously at each update frame interval.

[Another configuration example of the transmission data generation unit and bit stream processing unit (2)]
"Configuration example of transmission data generation unit"
FIG. 62 shows a configuration example of the transmission data generation unit 110B in the broadcast station 100 (see FIG. 1). The transmission data generation unit 110B transmits disparity information (disparity vectors) in a data structure that can easily be linked to the CEA system, which is one of the existing broadcast standards. The transmission data generation unit 110B includes a data extraction unit (archive unit) 131, a video encoder 132, and an audio encoder 133. It also includes a closed caption encoder (CC encoder) 134, a disparity information creation unit 135, and a multiplexer 136.

A data recording medium 131a is, for example, detachably attached to the data extraction unit 131. As with the data recording medium 111a in the data extraction unit 111 of the transmission data generation unit 110 shown in FIG. 2, audio data and disparity information are recorded on the data recording medium 131a in association with stereoscopic image data that includes left-eye image data and right-eye image data. The data extraction unit 131 extracts and outputs the stereoscopic image data, audio data, disparity information, and the like from the data recording medium 131a. The data recording medium 131a is a disc-shaped recording medium, a semiconductor memory, or the like.

The CC encoder 134 is a CEA-708 compliant encoder and outputs CC data (closed caption information data) for displaying closed captions. In this case, the CC encoder 134 sequentially outputs the CC data of each piece of closed caption information displayed in time series.

The disparity information creation unit 135 applies downsizing processing to the disparity vectors output from the data extraction unit 131, that is, the per-pixel disparity vectors, and outputs disparity information (disparity vectors) associated with each window ID (WindowID) included in the CC data output from the CC encoder 134. Although a detailed description is omitted, the disparity information creation unit 135 performs the same downsizing processing as the disparity information creation unit 115 of the transmission data generation unit 110 shown in FIG. 2.

Through the downsizing processing described above, the disparity information creation unit 135 creates disparity vectors corresponding to a predetermined number of caption units (captions) displayed on the same screen. In this case, the disparity information creation unit 135 creates either a disparity vector for each caption unit (individual disparity vector) or a disparity vector shared by all caption units (common disparity vector). The choice is made, for example, by a user setting. Shift target designation information is also added to this disparity information; it designates which of the closed caption information superimposed on the left-eye image and the closed caption information superimposed on the right-eye image is to be shifted based on the disparity information.

When creating individual disparity vectors, the disparity information creation unit 135 obtains, by the downsizing processing described above, the disparity vector belonging to the display area of each caption unit. When creating a common disparity vector, the disparity information creation unit 135 obtains the disparity vector of the entire picture (entire image) by the downsizing processing (see FIG. 9(d)). Alternatively, when creating a common disparity vector, the disparity information creation unit 135 may obtain the disparity vector belonging to the display area of each caption unit and select the one with the largest value.
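The choice between individual and common disparity vectors can be sketched as follows. This is only an illustration of the alternative in which the largest per-caption-unit value is selected as the common vector; the function name `caption_disparities` and the dictionary input are hypothetical, and the downsizing step that produces the per-region values is assumed to have run already.

```python
def caption_disparities(region_disparity, use_common):
    """Sketch: return the disparity to signal for each caption unit.

    region_disparity: dict mapping a caption unit id to the disparity
                      obtained for its display area (hypothetical input).
    use_common: True to signal one common disparity for all caption units.
    """
    if not use_common:
        # Individual disparity vectors: one value per caption unit.
        return dict(region_disparity)
    # Common disparity vector: pick the largest per-unit value.
    common = max(region_disparity.values())
    return {unit: common for unit in region_disparity}
```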

This disparity information is, for example, disparity information used in common throughout the predetermined number of frame periods during which the closed caption information is displayed (the caption display period), or disparity information that is sequentially updated within that caption display period. The disparity information sequentially updated within the caption display period consists of the disparity information of the first frame of the predetermined number of frame periods and the disparity information of the frames at each subsequent update frame interval.

The video encoder 132 encodes the stereoscopic image data supplied from the data extraction unit 131 using MPEG4-AVC, MPEG2, VC-1, or the like, and obtains encoded video data. The video encoder 132 also uses a stream formatter 132a provided in its subsequent stage to generate a video elementary stream whose payload portion contains the encoded video data.

The CC data output from the CC encoder 134 and the disparity information created by the disparity information creation unit 135 are supplied to the stream formatter 132a in the video encoder 132. The stream formatter 132a embeds the CC data and the disparity information in the video elementary stream as user data. That is, the payload portion of the video elementary stream contains the stereoscopic image data, and the user data area of its header portion contains the CC data and the disparity information.

As shown in FIG. 63, a sequence header portion containing sequence-level parameters is placed at the head of the video elementary stream. The sequence header portion is followed by a picture header portion containing picture-level parameters and user data. The picture header portion is followed by a payload portion containing picture data. Thereafter, the picture header portion and the payload portion are repeated. The CC data and disparity information described above are embedded, for example, in the user data area of the picture header portion. The method of embedding (inserting) the disparity information in the user data area is described in detail later.

The audio encoder 133 encodes the audio data extracted by the data extraction unit 131 using MPEG-2 Audio AAC or the like, and generates an audio elementary stream. The multiplexer 136 multiplexes the elementary streams output from the video encoder 132 and the audio encoder 133, and outputs bit stream data (transport stream) BSD as transmission data (multiplexed data stream).

The operation of the transmission data generation unit 110B shown in FIG. 62 is briefly described below. The stereoscopic image data output from the data extraction unit 131 is supplied to the video encoder 132. The video encoder 132 encodes the stereoscopic image data using MPEG4-AVC, MPEG2, VC-1, or the like, and generates a video elementary stream containing the encoded video data. This video elementary stream is supplied to the multiplexer 136.

The CC encoder 134 outputs CC data (closed caption information data) for displaying closed captions. In this case, the CC encoder 134 sequentially outputs the CC data of each piece of closed caption information displayed in time series.

The per-pixel disparity vectors output from the data extraction unit 131 are supplied to the disparity information creation unit 135. The disparity information creation unit 135 applies downsizing processing and the like to these disparity vectors, and outputs disparity information (disparity vectors) associated with each window ID (WindowID) included in the CC data output from the CC encoder 134.

The CC data output from the CC encoder 134 and the disparity information created by the disparity information creation unit 135 are supplied to the stream formatter 132a of the video encoder 132. The stream formatter 132a inserts the CC data and the disparity information into the user data area of the header portion of the video elementary stream. In this case, as described later, the disparity information is embedded (inserted) by, for example, (A) a method of extending within the range of the existing table (CEA table), or (B) a method of newly defining an extension for bytes that have so far been skipped as padding bytes.

The audio data output from the data extraction unit 131 is supplied to the audio encoder 133. The audio encoder 133 encodes the audio data using MPEG-2 Audio AAC or the like, and generates an audio elementary stream containing the encoded audio data. This audio elementary stream is supplied to the multiplexer 136. The multiplexer 136 multiplexes the elementary streams supplied from the encoders and obtains bit stream data BSD as transmission data.

[Method of embedding (inserting) disparity information in the user data area]
Next, the method of embedding disparity information in the user data area is described in detail. Possible methods include (A) extending within the range of the existing table (CEA table) and (B) newly defining an extension for bytes that have so far been skipped as padding bytes. In method (A), the extension command EXT1 and the value that follows it indicate the number of extension bytes, and the parameters are inserted after them. Each method is described below.

"(A) Method of extending within the range of the existing table (1)"
FIG. 64 schematically shows the CEA table. When an extension is made within this CEA table, the start of an extended command is declared with the 0x10 (EXT1) command in the C0 table, and then an address in the C2 table, C3 table, G2 table, or G3 table is designated according to the byte length of the extended command. Here a 3-byte command is constructed, so the byte string below, which uses a C2 table code indicating that 3 bytes follow, is defined. The CEA standard specifies that the address space 0x18 to 0x1F in the C2 table indicates that 3 bytes follow.

The complete extended command in this case is as follows.
Extended command: EXT1 (0x10) + 0x18 (3 bytes follow)
+ [Byte1] + [Byte2] + [Byte3]

FIG. 65 shows a structure example of the 3-byte field consisting of “Byte1”, “Byte2”, and “Byte3”. “window_id” is placed in the 3-bit field from bit 7 to bit 5 of “Byte1”. This “window_id” associates the information in the extended command with the window to which it applies. “temporal_division_count” is placed in the 5-bit field from bit 4 to bit 0 of “Byte1”. This “temporal_division_count” indicates the number of base segments included in the caption display period (see FIG. 22).

“temporal_division_size” is placed in the 2-bit field consisting of bit 7 and bit 6 of “Byte2”. This “temporal_division_size” indicates the number of frames included in the base segment period (update frame interval). “00” indicates 16 frames, “01” indicates 25 frames, “10” indicates 30 frames, and “11” indicates 32 frames (see FIG. 22).

“shared_disparity” is placed in the 1-bit field at bit 5 of “Byte2”. This “shared_disparity” indicates whether common disparity control is performed across all windows. “1” indicates that one common piece of disparity information is applied to all subsequent windows. “0” indicates that the disparity information is applied to only one window (see FIG. 19).

“shifting_interval_counts” is placed in the 5-bit field from bit 4 to bit 0 of “Byte2”. This “shifting_interval_counts” indicates the draw factor that adjusts the base segment period (update frame interval), that is, the number of frames to subtract (see FIG. 22).

In the example of updating disparity information for each base segment period (BSP) shown in FIG. 66, the base segment period is adjusted by the draw factor for the disparity update timings at time points C to F. Because this adjustment information is present, the base segment period (update frame interval) can be adjusted, and changes of the disparity information in the time direction (frame direction) can be conveyed to the reception side more accurately.

“disparity_update” is placed in the 8-bit field from bit 7 to bit 0 of “Byte3”. This “disparity_update” indicates the disparity information of the corresponding base segment. The “disparity_update” at k = 0 is the initial value of the disparity information that is sequentially updated at the update frame interval within the caption display period, that is, the disparity information of the first frame in the caption display period.

By repeatedly transmitting the 5-byte extended command described above in the user data area, the disparity information that is sequentially updated within the caption display period, together with the update frame interval adjustment information attached to it, can be transmitted.
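The bit layout of the 5-byte extended command can be sketched as a pair of pack and unpack helpers. The function names and the returned dictionary are hypothetical; the bit positions follow the description of FIG. 65. Treating “disparity_update” as a raw 8-bit value (0 to 255) is an assumption, because the description does not state how its sign is encoded.

```python
EXT1 = 0x10            # extended command start (C0 table)
C2_THREE_BYTES = 0x18  # C2 table code: 3 bytes follow
FRAMES_PER_SEGMENT = {0b00: 16, 0b01: 25, 0b10: 30, 0b11: 32}


def pack_command(window_id, division_count, division_size,
                 shared, shifting, disparity):
    """Sketch: build EXT1 + 0x18 + Byte1 + Byte2 + Byte3."""
    if not (0 <= window_id < 8 and 0 <= division_count < 32
            and division_size in FRAMES_PER_SEGMENT and shared in (0, 1)
            and 0 <= shifting < 32 and 0 <= disparity < 256):
        raise ValueError("field value out of range")
    byte1 = (window_id << 5) | division_count                   # 3 + 5 bits
    byte2 = (division_size << 6) | (shared << 5) | shifting     # 2 + 1 + 5 bits
    return bytes([EXT1, C2_THREE_BYTES, byte1, byte2, disparity])


def unpack_command(cmd):
    """Sketch: split a 5-byte extended command back into its fields."""
    if len(cmd) != 5 or cmd[0] != EXT1 or cmd[1] != C2_THREE_BYTES:
        raise ValueError("not a 3-byte disparity extended command")
    byte1, byte2, byte3 = cmd[2], cmd[3], cmd[4]
    size_code = byte2 >> 6
    return {
        "window_id": byte1 >> 5,
        "temporal_division_count": byte1 & 0x1F,
        "temporal_division_size": size_code,
        "frames_per_segment": FRAMES_PER_SEGMENT[size_code],
        "shared_disparity": (byte2 >> 5) & 0x01,
        "shifting_interval_counts": byte2 & 0x1F,
        "disparity_update": byte3,
    }
```

A sender would emit one such command per update point, which is why the description speaks of transmitting the command repeatedly.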

"(A) Method of extending within the range of the existing table (2)"
FIG. 67 schematically shows the CEA table. When an extension is made within this CEA table, the start of an extended command is declared with the 0x10 (EXT1) command in the C0 table, and then an address in the C2 table, C3 table, G2 table, or G3 table is designated according to the byte length of the extended command. Here a variable-length command is constructed, so the byte string below, which uses the C3 table, is defined. The CEA standard specifies that the address space 0x90 to 0x9F in the C3 table is used for variable-length commands.

The complete extended command in this case is as follows.
Extended command: EXT1 (0x10) + EXTCode (0x90)
+ [Header(Byte1)] + [Byte2] + ... + [ByteN]

FIG. 68 shows a structure example of the 4-byte field consisting of “Header(Byte1)”, “Byte2”, “Byte3”, and “Byte4”. “type_field” is placed in the 2-bit field consisting of bit 7 and bit 6 of “Header(Byte1)”. This “type_field” indicates the command type. “00” indicates the beginning of a command (BOC: Beginning Of Command). “01” indicates the continuation of a command (COC: Continuation Of Command). “10” indicates the end of a command (EOC: End Of Command).

“Length_field” is placed in the 5-bit field from bit 4 to bit 0 of “Header(Byte1)”. This “Length_field” indicates the number of bytes that follow in this extended command. The maximum is set to 28 bytes within one service block. Within this range, the disparity information can be updated by repeating Byte2 to Byte4 in a loop. In this case, up to nine sets of disparity information updates can be carried in one service block.

“window_id” is placed in the 3-bit field from bit 7 to bit 5 of “Byte2”. This “window_id” associates the information in the extended command with the window to which it applies. “temporal_division_count” is placed in the 5-bit field from bit 4 to bit 0 of “Byte2”. This “temporal_division_count” indicates the number of base segments included in the caption display period (see FIG. 22).

“temporal_division_size” is placed in the 2-bit field consisting of bit 7 and bit 6 of “Byte3”. This “temporal_division_size” indicates the number of frames included in the base segment period (update frame interval). “00” indicates 16 frames, “01” indicates 25 frames, “10” indicates 30 frames, and “11” indicates 32 frames (see FIG. 22).

“shared_disparity” is placed in the 1-bit field at bit 5 of “Byte3”. This “shared_disparity” indicates whether common disparity control is performed across all windows. “1” indicates that one common piece of disparity information is applied to all subsequent windows. “0” indicates that the disparity information is applied to only one window (see FIG. 19).

“shifting_interval_counts” is placed in the 5-bit field from bit 4 to bit 0 of “Byte3”. This “shifting_interval_counts” indicates the draw factor that adjusts the base segment period (update frame interval), that is, the number of frames to subtract (see FIG. 22).

“disparity_update” is placed in the 8-bit field from bit 7 to bit 0 of “Byte4”. This “disparity_update” indicates the disparity information of the corresponding base segment. The “disparity_update” at k = 0 is the initial value of the disparity information that is sequentially updated at the update frame interval within the caption display period, that is, the disparity information of the first frame in the caption display period.

By transmitting the variable-length extended command described above in the user data area, the disparity information that is sequentially updated within the caption display period, together with the update frame interval adjustment information attached to it, can be transmitted.
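The framing of the variable-length command can be sketched as below. The helper names are hypothetical. Each update set is taken to be the 3 bytes corresponding to Byte2 to Byte4, and the limit of nine sets follows from the 28-byte maximum stated above. Leaving bit 5 of the header at 0 is an assumption, since the description does not assign that bit.

```python
EXT1 = 0x10      # extended command start (C0 table)
EXT_CODE = 0x90  # C3 table code used for the variable-length command
MAX_SETS = 9     # nine 3-byte sets fit in the 28-byte limit


def build_variable_command(update_sets, type_field=0b00):
    """Sketch: EXT1 + EXTCode + Header(Byte1) + repeated 3-byte sets."""
    if len(update_sets) > MAX_SETS:
        raise ValueError("at most nine update sets per service block")
    if any(len(s) != 3 for s in update_sets):
        raise ValueError("each update set must be exactly 3 bytes")
    body = b"".join(update_sets)
    # Header: type_field in bits 7-6, Length_field in bits 4-0.
    header = (type_field << 6) | len(body)
    return bytes([EXT1, EXT_CODE, header]) + body


def split_variable_command(cmd):
    """Sketch: return (type_field, list of 3-byte update sets)."""
    if len(cmd) < 3 or cmd[0] != EXT1 or cmd[1] != EXT_CODE:
        raise ValueError("not a variable-length disparity command")
    header = cmd[2]
    type_field = header >> 6
    length = header & 0x1F
    body = cmd[3:3 + length]
    if len(body) != length or length % 3 != 0:
        raise ValueError("truncated or misaligned command body")
    return type_field, [body[i:i + 3] for i in range(0, length, 3)]
```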

"(B) Method of newly defining an extension for padding bytes"
FIG. 69 shows a structure example (syntax) of conventional closed caption data (CC data). When “cc_valid = 0” and “cc_type = 00”, the reception side (decoder) skips the “cc_data_1” and “cc_data_2” fields. Here, this space is used to define an extension for transmitting disparity information.

FIG. 70 shows a structure example (syntax) of closed caption data (CC data) modified to support disparity information. The 2-bit field “extended_control” is information that controls the two fields “cc_data_1” and “cc_data_2”. As shown in FIG. 71(a), when “cc_valid = 0” and “cc_type = 00”, the two fields “cc_data_1” and “cc_data_2” are used for transmitting disparity information if the 2-bit field “extended_control” is “01” or “10”.

In this case, as shown in FIG. 71(b), when “extended_control = 01”, the “cc_data_1” field means “Start of Extended Packet” and carries the first extended packet data (1 byte). The “cc_data_2” field then means “Extended Packet Data” and carries the next extended packet data (1 byte).

Also as shown in FIG. 71(b), when “extended_control = 10”, the “cc_data_1” and “cc_data_2” fields each mean “Extended Packet Data” and carry the subsequent extended packet data (1 byte each). When “extended_control = 00”, the “cc_data_1” and “cc_data_2” fields each mean “Padding”.
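The way a receiver might collect extended packet bytes from the repurposed padding constructs can be sketched as follows. The tuple input format and the function name are hypothetical. The sketch also assumes that a packet ends when the next “Start of Extended Packet” arrives or the input ends, which the description does not state.

```python
def reassemble_extended_packets(constructs):
    """Sketch: gather extended packet bytes carried in cc_data_1/cc_data_2.

    constructs: iterable of tuples
        (cc_valid, cc_type, extended_control, cc_data_1, cc_data_2)
    Returns a list of bytes objects, one per extended packet.
    """
    packets = []
    current = None
    for cc_valid, cc_type, extended_control, cc_data_1, cc_data_2 in constructs:
        if cc_valid != 0 or cc_type != 0b00:
            continue  # ordinary caption data, not part of the extension
        if extended_control == 0b01:
            # Start of Extended Packet: close any open packet, begin a new one.
            if current is not None:
                packets.append(bytes(current))
            current = [cc_data_1, cc_data_2]
        elif extended_control == 0b10 and current is not None:
            # Extended Packet Data: both bytes continue the open packet.
            current.extend([cc_data_1, cc_data_2])
        # extended_control == 0b00 is padding; other values are ignored here.
    if current is not None:
        packets.append(bytes(current))
    return packets
```

Each returned packet would then be parsed as “caption_disparity_data()”.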

“Extended Packet Data” is defined as the transport for “caption_disparity_data()”. FIG. 72 and FIG. 73 show a structure example (syntax) of “caption_disparity_data()”. FIG. 74 shows the main data definitions (semantics) in the structure example of “caption_disparity_data()”.

“service_number” is 1-bit information indicating the service type. “shared_windows” indicates whether common disparity control is performed across all windows. “1” indicates that one common piece of disparity information is applied to all subsequent windows. “0” indicates that the disparity information is applied to only one window.

“caption_window_count” is 3-bit information indicating the number of caption windows. “caption_window_id” is 3-bit identification information that identifies a caption window. “temporal_extension_flag” is 1-bit flag information indicating whether disparity information that is sequentially updated within the caption display period (disparity_update) is present for the corresponding window. “1” indicates that it is present, and “0” indicates that it is not.

“select_view_shift” is 2-bit information that constitutes the shift target designation information. It designates which of the closed caption information superimposed on the left-eye image and the closed caption information superimposed on the right-eye image is to be shifted based on the disparity information. “select_view_shift=00” is reserved. “select_view_shift=01” indicates that only the closed caption information superimposed on the right-eye image is shifted horizontally by the amount of the disparity information.

“select_view_shift=10” indicates that only the closed caption information superimposed on the left-eye image is shifted horizontally by the amount of the disparity information. “select_view_shift=11” indicates that both the closed caption information superimposed on the left-eye image and the closed caption information superimposed on the right-eye image are shifted horizontally in mutually opposite directions.
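How a receiver might apply “select_view_shift” is sketched below. The function name is hypothetical. The sign convention (the left-eye position minus the right-eye position equals the disparity) and the near-equal split between the two views for “11” are assumptions made for illustration; this passage states only which view or views are shifted.

```python
def shifted_positions(x, disparity, select_view_shift):
    """Sketch: return (left_x, right_x) horizontal caption positions.

    x: unshifted horizontal position of the caption.
    disparity: disparity in pixels (assumed sign: left_x - right_x).
    """
    if select_view_shift == 0b01:
        # Shift only the caption superimposed on the right-eye image.
        return x, x - disparity
    if select_view_shift == 0b10:
        # Shift only the caption superimposed on the left-eye image.
        return x + disparity, x
    if select_view_shift == 0b11:
        # Shift both captions in opposite directions (assumed split).
        half = disparity // 2
        return x + half, x - (disparity - half)
    raise ValueError("select_view_shift=00 is reserved")
```

Under these assumptions all three modes give the same relative offset between the views and differ only in which caption moves.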

The 8-bit field “default_disparity” indicates the default disparity information. This is the disparity information used when no update is performed, that is, the disparity information used in common throughout the caption display period. When “temporal_extension_flag” is “1”, “caption_disparity_data()” contains “disparity_temporal_extension()”. This basically stores the disparity information to be updated for each base segment period (BSP: Base Segment Period).

As described above, FIG. 20 shows an example of updating disparity information for each base segment period (BSP). The base segment period means the update frame interval. As is clear from that figure, the disparity information sequentially updated within the caption display period consists of the disparity information of the first frame of the caption display period and the disparity information of the frames at each subsequent base segment period (update frame interval).

FIG. 73 shows a structure example (syntax) of “disparity_temporal_extension()”. The 2-bit field “temporal_division_size” indicates the number of frames included in the base segment period (update frame interval). “00” indicates 16 frames, “01” indicates 25 frames, “10” indicates 30 frames, and “11” indicates 32 frames.

“temporal_division_count” indicates the number of base segments included in the caption display period. “disparity_curve_no_update_flag” is 1-bit flag information indicating whether the disparity information is updated. “1” indicates that the disparity information is not updated at the edge of the corresponding base segment, that is, the update is skipped; “0” indicates that the disparity information is updated at the edge of the corresponding base segment.

In the example of updating disparity information for each base segment period (BSP) in FIG. 23 described above, the disparity information is not updated at the edges of base segments marked “skip”. Because of this flag information, when a period in which the disparity information changes in the same way in the frame direction continues for a long time, the update can be skipped and the transmission of disparity information within that period omitted, which suppresses the amount of disparity data.

When “disparity_curve_no_update_flag” is “0” and the disparity information is updated, the disparity information includes “shifting_interval_counts” and “disparity_update” for the corresponding base segment. The 6-bit field “shifting_interval_counts” indicates a draw factor that adjusts the base segment period (update frame interval), that is, the number of frames to subtract.

In the example of updating disparity information for each base segment period (BSP) in FIG. 23 described above, the base segment period is adjusted by the draw factor for the update timings of the disparity information at time points C to F. The presence of this adjustment information makes it possible to adjust the base segment period (update frame interval) and to convey changes of the disparity information in the time direction (frame direction) more accurately to the receiving side.
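How the skip flag and the draw factor combine into an update schedule can be sketched as follows. This is only an illustration under stated assumptions: the text says the draw factor is a number of frames to subtract, so the sketch assumes that an updated base segment ends `bsp_frames - shifting_interval_counts` frames after the previous edge and that a skipped segment simply advances by one full base segment period. The function name and the tuple layout are hypothetical.

```python
from typing import List, Optional, Tuple

# One base segment: (disparity_curve_no_update_flag,
#                    shifting_interval_counts, disparity_update).
# For a skipped segment the last two values are not transmitted.
Segment = Tuple[int, int, Optional[int]]


def update_points(first_disparity: int, bsp_frames: int,
                  segments: List[Segment]) -> List[Tuple[int, int]]:
    """Return (frame_index, disparity) pairs for one caption display period.

    Assumption: the draw factor shortens the segment it belongs to, and a
    skipped edge still advances the timeline by bsp_frames.
    """
    points = [(0, first_disparity)]  # disparity of the first frame
    frame = 0
    for no_update, draw_factor, disparity in segments:
        if no_update == 1:
            frame += bsp_frames          # "skip": no new value at this edge
            continue
        frame += bsp_frames - draw_factor
        points.append((frame, disparity))
    return points
```

With a 16-frame base segment, one plain update, one skipped edge, and one update with a draw factor of 4, the update points fall at frames 0, 16 and 44.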

As described above, by newly defining an extension for bytes that were previously skipped as padding bytes, it becomes possible to transmit the disparity information that is sequentially updated within the caption display period, together with the update frame interval adjustment information added to it.

図７５は、ビデオエレメンタリストリーム、オーディオエレメンタリストリーム、字幕エレメンタリストリームを含む一般的なトランスポートストリーム（多重化データストリーム）の構成例を示している。このトランスポートストリームには、各エレメンタリストリームをパケット化して得られたＰＥＳパケットが含まれている。この構成例では、ビデオエレメンタリストリームのＰＥＳパケット「Video PES」が含まれている。また、この構成例では、オーディオエレメンタリストリームのＰＥＳパケット「Audio PES」および字幕エレメンタリストリームのＰＥＳパケット「SubtitlePES」が含まれている。 FIG. 75 illustrates a configuration example of a general transport stream (multiplexed data stream) including a video elementary stream, an audio elementary stream, and a caption elementary stream. This transport stream includes PES packets obtained by packetizing each elementary stream. In this configuration example, a PES packet “Video PES” of a video elementary stream is included. In addition, in this configuration example, the PES packet “Audio PES” of the audio elementary stream and the PES packet “SubtitlePES” of the caption elementary stream are included.

In the transmission data generation unit 110B illustrated in FIG. 62, the disparity information (disparity) is embedded in the user data area of the video elementary stream and transmitted, as illustrated in FIG. 75.

In the transmission data generation unit 110B illustrated in FIG. 62, stereoscopic image data including left-eye image data and right-eye image data for displaying a stereoscopic image is transmitted in the payload portion of the video elementary stream. In addition, the CC data, and the disparity information for giving disparity to the closed caption information based on that CC data, are inserted into the user data area of the header portion of the video elementary stream and transmitted.

そのため、受信側（セットトップボックス２００）においては、このビデオエレメンタリストリームから、立体画像データを取得できる他に、ＣＣデータおよび視差情報を容易に取得できる。また、受信側においては、左眼画像および右眼画像に重畳される同一のクローズド・キャプション情報に、視差情報を用いて、適切な視差を付与できる。そのため、クローズド・キャプション情報の表示において、画像内の各物体との間の遠近感の整合性を最適な状態に維持できる。 Therefore, on the receiving side (set top box 200), in addition to acquiring stereoscopic image data from this video elementary stream, CC data and disparity information can be easily acquired. On the receiving side, appropriate parallax can be given to the same closed caption information superimposed on the left eye image and the right eye image using the parallax information. Therefore, in the display of closed caption information, the perspective consistency with each object in the image can be maintained in an optimum state.

また、図６２に示す送信データ生成部１１０Ｂにおいては、字幕表示期間内で順次更新される視差情報（図６５、図６８、図７３の「disparity_update」参照）の挿入が可能とされている。そのため、受信側（セットトップボックス２００）において、クローズド・キャプション情報に付与する視差を画像内容の変化に連動して動的に変化させることが可能となる。 In addition, in the transmission data generation unit 110B illustrated in FIG. 62, disparity information (see “disparity_update” in FIGS. 65, 68, and 73) that is sequentially updated within the caption display period can be inserted. Therefore, on the receiving side (set top box 200), the parallax to be added to the closed caption information can be dynamically changed in conjunction with the change of the image content.

また、図６２に示す送信データ生成部１１０Ｂにおいて、字幕表示期間内で順次更新される視差情報は、字幕（クローズド・キャプション情報）の表示期間の最初のフレームの視差情報と、その後の更新フレーム間隔毎のフレームの視差情報とからなるものとされる。そのため、送信データ量を低減でき、また、受信側において、視差情報を保持するためのメモリ容量の大幅な節約が可能となる。 In addition, in the transmission data generation unit 110B illustrated in FIG. 62, the disparity information sequentially updated within the caption display period includes the disparity information of the first frame in the caption (closed caption information) display period, and the subsequent update frame interval. It is assumed to be composed of disparity information of each frame. Therefore, the amount of transmission data can be reduced, and the memory capacity for holding the parallax information can be greatly saved on the receiving side.

Also, in the transmission data generation unit 110B illustrated in FIG. 62, the “disparity_temporal_extension()” included in “caption_disparity_data()” has the same structure as the “disparity_temporal_extension()” included in the SCS segment described above (see FIG. 21). Therefore, although a detailed description is omitted, the transmission data generation unit 110B illustrated in FIG. 62 can obtain, through this structure of “disparity_temporal_extension()”, the same effects as the transmission data generation unit 110 illustrated in FIG. 2.

“Configuration example of transmission data generator”
FIG. 76 illustrates a configuration example of the bit stream processing unit 201B of the set-top box 200, which corresponds to the transmission data generation unit 110B illustrated in FIG. 62 described above. The bit stream processing unit 201B includes a demultiplexer 241, a video decoder 242, and a CC decoder 243. It also includes a stereoscopic image CC generating unit 244, a disparity information extracting unit 245, a disparity information processing unit 246, a video superimposing unit 247, and an audio decoder 248.

The demultiplexer 241 extracts video and audio packets from the bit stream data BSD and sends them to the respective decoders. The video decoder 242 performs the reverse of the processing of the video encoder 132 of the transmission data generation unit 110B described above. That is, the video decoder 242 reconstructs the video elementary stream from the video packets extracted by the demultiplexer 241 and performs decoding to obtain stereoscopic image data including left-eye image data and right-eye image data.

The transmission method of this stereoscopic image data is, for example, the first transmission method (“Top & Bottom” method), the second transmission method (“Side By Side” method), or the third transmission method (“Frame Sequential” method) described above (see FIGS. 4(a) to 4(c)). The video decoder 242 sends this stereoscopic image data to the video superimposing unit 247.

The CC decoder 243 extracts CC data from the video elementary stream reconstructed by the video decoder 242. From this CC data, the CC decoder 243 acquires, for each caption window (Caption Window), the closed caption information (caption character codes) as well as control data for the superimposition position and display time.

視差情報取り出し部２４５は、ビデオデコーダ２４２を通じて得られるビデオエレメンタリストリームから視差情報を取り出す。この視差情報は、上述のＣＣデコーダ２４３で取得されるキャプション・ウインドウ（Caption Window）毎のクローズド・キャプションデータ（字幕のキャラクタコード）に対応付けられている。この視差情報は、キャプション・ウインドウ毎の視差ベクトル（個別視差ベクトル）、あるいは各キャプション・ウインドウに共通の視差ベクトル（共通視差ベクトル）である。 The disparity information extracting unit 245 extracts disparity information from the video elementary stream obtained through the video decoder 242. This disparity information is associated with closed caption data (caption character code) for each caption window acquired by the CC decoder 243 described above. This disparity information is a disparity vector (individual disparity vector) for each caption window, or a disparity vector common to each caption window (common disparity vector).

The disparity information extracting unit 245 acquires disparity information that is used in common within the caption display period, or disparity information that is sequentially updated within the caption display period. The disparity information extracting unit 245 sends this disparity information to the stereoscopic image CC generating unit 244 through the disparity information processing unit 246. As described above, the disparity information sequentially updated within the caption display period consists of the disparity information of the first frame in the caption display period and the disparity information of the frames at each subsequent base segment period (update frame interval).

The disparity information processing unit 246 sends disparity information that is used in common within the caption display period to the stereoscopic image CC generating unit 244 as it is. For disparity information that is sequentially updated within the caption display period, on the other hand, the disparity information processing unit 246 performs interpolation to generate disparity information at an arbitrary frame interval within the caption display period, for example at one-frame intervals, and sends it to the stereoscopic image CC generating unit 244. As this interpolation, the disparity information processing unit 246 performs not simple linear interpolation but interpolation accompanied by low-pass filter (LPF) processing in the time direction (frame direction), so that the interpolated disparity information at the predetermined frame interval changes gently in the time direction (see FIG. 31).
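The interpolation step can be illustrated with a short sketch. The description does not specify the filter, so this example assumes the simplest possible combination: the update points are first expanded to one value per frame by linear interpolation, and a moving average over a few frames then serves as the low-pass filter in the frame direction. The function names and the tap count are hypothetical.

```python
from typing import List, Tuple


def interpolate_per_frame(points: List[Tuple[int, float]]) -> List[float]:
    """Expand (frame, disparity) update points to one value per frame."""
    out: List[float] = []
    for (f0, d0), (f1, d1) in zip(points, points[1:]):
        for f in range(f0, f1):
            out.append(d0 + (d1 - d0) * (f - f0) / (f1 - f0))
    out.append(float(points[-1][1]))  # value at the last update point
    return out


def low_pass(values: List[float], taps: int = 5) -> List[float]:
    """Moving average in the frame direction; the window shrinks at the ends.

    Assumption: a moving average stands in for the unspecified LPF.
    """
    half = taps // 2
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed
```

Applying `low_pass(interpolate_per_frame(points))` rounds off the corners that plain linear interpolation leaves at each update point, which is the effect the text attributes to the LPF processing.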

The stereoscopic image CC generating unit 244 generates, for each caption window (Caption Window), the data of the left-eye closed caption information (caption) and the right-eye closed caption information (caption) to be superimposed on the left-eye image and the right-eye image, respectively. This generation is based on the closed caption data and superimposition position control data obtained by the CC decoder 243 and on the disparity information (disparity vector) sent from the disparity information extracting unit 245 through the disparity information processing unit 246. The stereoscopic image CC generating unit 244 then outputs the data (bitmap data) of the left-eye caption and the right-eye caption.

In this case, the left-eye caption and the right-eye caption carry the same information. However, their superimposition positions in the image are shifted relative to each other in the horizontal direction, for example by the amount of the disparity vector. As a result, the same caption superimposed on the left-eye image and the right-eye image has its disparity adjusted according to the perspective of each object in the image, so that the displayed caption stays consistent in perspective with the objects in the image.

Here, when only disparity information (disparity vector) used in common within the caption display period is sent from the disparity information processing unit 246, for example, the stereoscopic image CC generating unit 244 uses that disparity information. Likewise, when only disparity information (disparity vector) that is sequentially updated within the caption display period is sent from the disparity information processing unit 246, the stereoscopic image CC generating unit 244 uses that disparity information. When the sequentially updated disparity information is sent in addition to the commonly used disparity information, the stereoscopic image CC generating unit 244 uses one of the two.

Which one is used is constrained, for example, by the information indicating the level of disparity handling that is mandatory on the receiving side (decoder side) when displaying captions (see “rendering_level” in FIG. 72), as described above. In that case, for example, when the value is “00”, the choice follows the user setting. Using the disparity information that is sequentially updated within the caption display period makes it possible to dynamically change the disparity given to the left-eye and right-eye captions in conjunction with changes in the image content.

The video superimposing unit 247 superimposes the left-eye and right-eye caption data (bitmap data) generated by the stereoscopic image CC generating unit 244 on the stereoscopic image data (left-eye image data and right-eye image data) obtained by the video decoder 242, and obtains display stereoscopic image data Vout. The video superimposing unit 247 then outputs the display stereoscopic image data Vout to the outside of the bit stream processing unit 201B.

また、オーディオデコーダ２４８は、上述の送信データ生成部１１０Ｂのオーディオエンコーダ１３３とは逆の処理を行う。すなわち、このオーディオデコーダ２４８は、デマルチプレクサ２４１で抽出されたオーディオのパケットからオーディオのエレメンタリストリームを再構成し、復号化処理を行って、音声データＡoutを得る。そして、このオーディオデコーダ２４８は、音声データＡoutを、ビットストリーム処理部２０１Ｂの外部に出力する。 Also, the audio decoder 248 performs a process reverse to that of the audio encoder 133 of the transmission data generation unit 110B described above. That is, the audio decoder 248 reconstructs an audio elementary stream from the audio packet extracted by the demultiplexer 241 and performs a decoding process to obtain audio data Aout. The audio decoder 248 then outputs the audio data Aout to the outside of the bit stream processing unit 201B.

The operation of the bit stream processing unit 201B illustrated in FIG. 76 will be briefly described. The bit stream data BSD output from the digital tuner 204 (see FIG. 29) is supplied to the demultiplexer 241. The demultiplexer 241 extracts video and audio packets from the bit stream data BSD and supplies them to the respective decoders. The video decoder 242 reconstructs the video elementary stream from the video packets extracted by the demultiplexer 241 and decodes it to obtain stereoscopic image data including left-eye image data and right-eye image data. This stereoscopic image data is supplied to the video superimposing unit 247.

The video elementary stream reconstructed by the video decoder 242 is also supplied to the CC decoder 243. The CC decoder 243 extracts CC data from the video elementary stream. From the CC data, the CC decoder 243 acquires, for each caption window (Caption Window), the closed caption information (caption character codes) as well as control data for the superimposition position and display time. The closed caption information and the control data for the superimposition position and display time are supplied to the stereoscopic image CC generating unit 244.

The video elementary stream reconstructed by the video decoder 242 is also supplied to the disparity information extracting unit 245. The disparity information extracting unit 245 extracts disparity information from the video elementary stream. This disparity information is associated with the closed caption data (caption character codes) for each caption window (Caption Window) acquired by the CC decoder 243 described above. The disparity information is supplied to the stereoscopic image CC generating unit 244 through the disparity information processing unit 246.

視差情報処理部２４６では、字幕表示期間内で順次更新される視差情報に関して、以下の処理が行われる。すなわち、視差情報処理部２４６では、時間方向（フレーム方向）のＬＰＦ処理を伴った補間処理が施されて、字幕表示期間内における任意のフレーム間隔、例えば、１フレーム間隔の視差情報が生成されて、立体画像用ＣＣ発生部２４４に送られる。 The disparity information processing unit 246 performs the following processing on the disparity information that is sequentially updated within the caption display period. That is, the disparity information processing unit 246 performs interpolation processing with LPF processing in the time direction (frame direction), and generates disparity information at an arbitrary frame interval within the caption display period, for example, one frame interval. And sent to the CC generating unit 244 for stereoscopic images.

立体画像用ＣＣ発生部２４４では、キャプション・ウインドウ（Caption Window）毎に、左眼画像、右眼画像にそれぞれ重畳する左眼クローズド・キャプション情報（字幕）、右眼クローズド・キャプション情報（字幕）のデータが生成される。この生成処理は、ＣＣデコーダ２４３で得られたクローズド・キャプションデータおよび重畳位置制御データと、視差情報取り出し部２４５から視差情報処理部２４６を通じて供給された視差情報（視差ベクトル）に基づいて行われる。 In the stereoscopic image CC generating unit 244, for each caption window (Caption Window), left-eye closed caption information (caption) and right-eye closed caption information (caption) to be superimposed on the left-eye image and right-eye image, respectively. Data is generated. This generation process is performed based on the closed caption data and superposition position control data obtained by the CC decoder 243 and the disparity information (disparity vector) supplied from the disparity information extracting unit 245 through the disparity information processing unit 246.

In the stereoscopic image CC generating unit 244, shift processing for giving disparity is performed on either or both of the left-eye closed caption information and the right-eye closed caption information. In this case, when the disparity information supplied through the disparity information processing unit 246 is disparity information used in common for every frame, disparity is given to the closed caption information superimposed on the left-eye image and the right-eye image based on this common disparity information. When the disparity information is sequentially updated from frame to frame, disparity is given to the closed caption information superimposed on the left-eye image and the right-eye image based on the disparity information updated for each frame.
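The shift processing can be sketched for the two “select_view_shift” modes described in this section (“10”: shift only the left-eye caption; “11”: shift both captions in opposite horizontal directions). The sign convention and the even split of the disparity between the two views are assumptions made for illustration, and the function name is hypothetical; the remaining modes are not covered here.

```python
from typing import Tuple


def caption_positions(x: int, disparity: int,
                      select_view_shift: int) -> Tuple[int, int]:
    """Return (left_eye_x, right_eye_x) for a caption nominally placed at x.

    Assumptions: a positive disparity moves the left-eye caption to the
    right, and in mode 0b11 the disparity is split between the two views
    so that left_eye_x - right_eye_x equals the disparity.
    """
    if select_view_shift == 0b10:      # shift the left-eye caption only
        return x + disparity, x
    if select_view_shift == 0b11:      # shift both in opposite directions
        left_part = disparity // 2
        right_part = disparity - left_part
        return x + left_part, x - right_part
    raise ValueError("only modes 0b10 and 0b11 are sketched here")
```

Calling this once per frame with the interpolated disparity value yields caption positions that follow the image content over the caption display period.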

このように、立体画像用ＣＣ発生部２４４でキャプション・ウインドウ（Caption Window）毎に生成された左眼および右眼のクローズド・キャプション情報のデータ（ビットマップデータ）は、表示時間の制御データと共に、ビデオ重畳部２４７に供給される。ビデオ重畳部２４７では、ビデオデコーダ２４２で得られた立体画像データ（左眼画像データ、右眼画像データ）に対して、立体画像用ＣＣ発生部２４４から供給されるクローズド・キャプション情報のデータが重畳され、表示用立体画像データＶoutが得られる。 As described above, the data (bitmap data) of the closed caption information of the left eye and the right eye generated for each caption window (Caption Window) by the stereoscopic image CC generating unit 244, together with display time control data, This is supplied to the video superimposing unit 247. The video superimposing unit 247 superimposes the closed caption information data supplied from the stereoscopic image CC generating unit 244 on the stereoscopic image data (left-eye image data and right-eye image data) obtained by the video decoder 242. As a result, display stereoscopic image data Vout is obtained.

In the audio decoder 248, the audio elementary stream is reconstructed from the audio packets extracted by the demultiplexer 241 and then decoded, and audio data Aout corresponding to the display stereoscopic image data Vout described above is obtained. This audio data Aout is output to the outside of the bit stream processing unit 201B.

図７６に示すビットストリーム処理部２０１Ｂにおいては、ビデオエレメンタリストリームのペイロード部から立体画像データを取得でき、また、そのヘッダ部のユーザデータ領域からＣＣデータおよび視差情報を取得できる。そのため、左眼画像および右眼画像に重畳されるクローズド・キャプション情報に、このクローズド・キャプション情報に合った視差情報を用いて、適切な視差を付与できる。したがって、クローズド・キャプション情報の表示において、画像内の各物体との間の遠近感の整合性を最適な状態に維持できる。 In the bit stream processing unit 201B illustrated in FIG. 76, stereoscopic image data can be acquired from the payload portion of the video elementary stream, and CC data and disparity information can be acquired from the user data area of the header portion. Therefore, appropriate disparity can be given to the closed caption information superimposed on the left eye image and the right eye image by using the disparity information suitable for the closed caption information. Therefore, in the display of closed caption information, perspective consistency with each object in the image can be maintained in an optimum state.

Also, the disparity information extracting unit 245 of the bit stream processing unit 201B illustrated in FIG. 76 acquires disparity information used in common within the caption display period, or, together with it, disparity information that is sequentially updated within the caption display period. By using the sequentially updated disparity information, the stereoscopic image CC generating unit 244 can dynamically change the disparity given to the closed caption information superimposed on the left-eye image and the right-eye image in conjunction with changes in the image content.

Also, the disparity information processing unit 246 of the bit stream processing unit 201B illustrated in FIG. 76 interpolates the disparity information that is sequentially updated within the caption display period and generates disparity information at an arbitrary frame interval within the caption display period. Therefore, even when the transmitting side (broadcast station 100) sends disparity information only once per base segment period (update frame interval), such as every 16 frames, the disparity given to the closed caption information superimposed on the left-eye image and the right-eye image can be controlled at fine intervals, for example frame by frame.

Also, the disparity information processing unit 246 of the bit stream processing unit 201B illustrated in FIG. 76 performs interpolation accompanied by low-pass filter processing in the time direction (frame direction). Therefore, even when the transmitting side (broadcast station 100) sends disparity information only once per base segment period (update frame interval), the interpolated disparity information can be made to change gently in the time direction (frame direction) (see FIG. 31). This suppresses the sense of incongruity that would arise if the disparity given to the closed caption information superimposed on the left-eye image and the right-eye image changed discontinuously at every update frame interval.

<2. Modification>
FIG. 77 illustrates another structure example (syntax) of “disparity_temporal_extension()”. FIG. 78 shows the main data definitions (semantics) related to that structure example. The 8-bit field “disparity_update_count” indicates the number of updates of the disparity information (disparity). The structure contains a for loop bounded by this number of updates.

「interval_count」の８ビットフィールドは、更新期間を、後述する「interval_PTS」で示されるインターバル期間（Interval period）の倍数で示す。「disparity_update」の８ビットフィールドは、対応する更新期間の視差情報を示す。なお、ｋ＝０における「disparity_update」は、字幕表示期間内において更新フレーム間隔で順次更新される視差情報の初期値、つまり、字幕表示期間における最初のフレームの視差情報である。 The 8-bit field of “interval_count” indicates the update period as a multiple of an interval period (Interval period) indicated by “interval_PTS” described later. The 8-bit field of “disparity_update” indicates disparity information for the corresponding update period. Note that “disparity_update” at k = 0 is an initial value of disparity information that is sequentially updated at update frame intervals within the caption display period, that is, disparity information of the first frame in the caption display period.

なお、図２１に示す構造の「disparity_temporal_extension（）」の代わりに、図７７に示す構造の「disparity_temporal_extension（）」を用いる場合、例えば、図１８に示すＳＣＳ（Subregion Composition segment）の実質的な情報を含む部分に、「interval_PTS」の３３ビットフィールドが設けられる。この「interval_PTS」は、インターバル期間（Interval period）を９０ＫＨｚ単位で指定する。つまり、「interval_PTS」は、このインターバル期間（Interval period）を９０ＫＨｚのクロックで計測した値を３３ビット長で表す。 When “disparity_temporal_extension ()” of the structure shown in FIG. 77 is used instead of “disparity_temporal_extension ()” of the structure shown in FIG. 21, for example, substantial information of the SCS (Subregion Composition segment) shown in FIG. A 33-bit field of “interval_PTS” is provided in the included portion. This “interval_PTS” designates an interval period (Interval period) in units of 90 KHz. That is, “interval_PTS” represents a value obtained by measuring this interval period (Interval period) with a 90 KHz clock in a 33-bit length.

図７９、図８０は、図７７に示す構造の「disparity_temporal_extension（）」を用いた場合における、視差情報の更新例を示している。図７９は、「interval_PTS」で示されるインターバル期間（Interval period）が固定で、しかも、その期間が更新期間と等しい場合を示している。この場合、「interval_count」は、「１」となる。 79 and 80 illustrate an example of disparity information update in the case where “disparity_temporal_extension ()” having the structure illustrated in FIG. 77 is used. FIG. 79 shows a case where the interval period (Interval period) indicated by “interval_PTS” is fixed and the period is equal to the update period. In this case, “interval_count” is “1”.

一方、図８０は、一般的なもので、「interval_PTS」で示されるインターバル期間（Interval period）を短期間（例えば、フレーム周期でもよい）とした場合の、視差情報の更新例を示している。この場合、「interval_count」は、各更新期間において、Ｍ，Ｎ，Ｐ，Ｑ，Ｒとなる。なお、図７９、図８０において、“Ａ”は字幕表示期間の開始フレーム（開始時点）を示し、“Ｂ”〜“Ｆ”は、その後の更新フレーム（更新時点）を示している。 On the other hand, FIG. 80 is a general example and shows an example of updating disparity information when the interval period (Interval period) indicated by “interval_PTS” is a short period (for example, a frame period may be used). In this case, “interval_count” is M, N, P, Q, and R in each update period. 79 and 80, “A” indicates the start frame (start time) of the caption display period, and “B” to “F” indicate subsequent update frames (update time).
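The relationship between “interval_PTS” and “interval_count” reduces to simple arithmetic: each update period lasts interval_PTS × interval_count ticks of the 90 kHz clock. The sketch below accumulates these periods into time offsets from the start of the caption display period. The function name is hypothetical, and the list passed in is assumed to hold one “interval_count” per update period after the start frame (the M, N, P, Q, R of FIG. 80).

```python
from typing import List

PTS_CLOCK_HZ = 90_000  # interval_PTS is expressed in 90 kHz units


def update_times_seconds(interval_pts: int,
                         interval_counts: List[int]) -> List[float]:
    """Return the offset in seconds of each update point.

    The first entry is the start of the caption display period (0.0);
    each later entry adds interval_pts * interval_count ticks.
    """
    times = [0.0]
    ticks = 0
    for count in interval_counts:
        ticks += interval_pts * count
        times.append(ticks / PTS_CLOCK_HZ)
    return times
```

For instance, with an interval period of one frame at 30 frames per second (3000 ticks) and counts of 30 and 15, the updates fall 1.0 s and 1.5 s after the start. Setting every count to 1 corresponds to the fixed-interval case of FIG. 79.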

Even when disparity information that is sequentially updated within the caption display period is sent to the receiving side (the set-top box 200 or the like) using the “disparity_temporal_extension()” of the structure shown in FIG. 77, the receiving side can perform the same processing as described above. That is, in this case as well, the receiving side can interpolate the disparity information of each update period to generate and use disparity information at an arbitrary frame interval, for example, at one-frame intervals.
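
As a non-limiting illustration of the interpolation mentioned above, the following Python sketch linearly interpolates disparity values between update frames to obtain one value per frame. The function name, the argument names, and the sample numbers are hypothetical and do not appear in the figures; linear interpolation is only one possible choice.

```python
def interpolate_per_frame(update_frames, disparities):
    """Linearly interpolate disparity for every frame between update points.

    update_frames: increasing frame indices (start frame first). Hypothetical input.
    disparities:   disparity value signalled at each of those frames.
    Returns a dict mapping frame index -> interpolated disparity.
    """
    if not update_frames or len(update_frames) != len(disparities):
        raise ValueError("need one disparity value per update frame")

    result = {}
    points = list(zip(update_frames, disparities))
    for (f0, d0), (f1, d1) in zip(points, points[1:]):
        if f1 <= f0:
            raise ValueError("update frames must be strictly increasing")
        span = f1 - f0
        for frame in range(f0, f1):
            # Straight line from (f0, d0) to (f1, d1).
            result[frame] = d0 + (d1 - d0) * (frame - f0) / span
    # The last update point is kept as signalled.
    result[update_frames[-1]] = float(disparities[-1])
    return result


if __name__ == "__main__":
    print(interpolate_per_frame([0, 4, 6], [0, 8, 4]))
```

With update frames 0, 4, and 6 and disparities 0, 8, and 4, the sketch yields one disparity value for each of frames 0 through 6.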

FIG. 81(a) shows a configuration example of the subtitle data stream when the “disparity_temporal_extension()” of the structure shown in FIG. 77 is used. The PES header contains time information (PTS). The PES payload data contains the DDS, PCS, RCS, CDS, ODS, SCS, and EOS segments. These are transmitted together before the start of the subtitle display period. Although not described above, the subtitle data stream has the same configuration when the “disparity_temporal_extension()” of the structure shown in FIG. 21 is used.

Disparity information that is sequentially updated within the caption display period can also be sent to the receiving side (the set-top box 200 or the like) without including “disparity_temporal_extension()” in the SCS segment. In this case, “temporal_extension_flag = 0” is set, and only “subregion_disparity” is encoded in the SCS segment (see FIG. 18). An SCS segment is then inserted into the subtitle data stream at each update timing. Although not shown, a time difference value (delta_PTS) is added as time information to the SCS segment at each update timing.

FIG. 81(b) shows a configuration example of the subtitle data stream in that case. First, the DDS, PCS, RCS, CDS, ODS, and SCS segments are transmitted as PES payload data. After that, at each update timing, a predetermined number of SCS segments with updated time difference values (delta_PTS) and disparity information are transmitted. At the end, an EOS segment is transmitted together with the SCS segment.

FIG. 82 shows an example of updating disparity information when SCS segments are transmitted sequentially as described above. In FIG. 82, “A” indicates the start frame (start point) of the caption display period, and “B” to “F” indicate the subsequent update frames (update points).

Even when SCS segments are transmitted sequentially to send disparity information that is sequentially updated within the caption display period to the receiving side (the set-top box 200 or the like), the receiving side can perform the same processing as described above. That is, in this case as well, the receiving side can interpolate the disparity information of each update period to generate and use disparity information at an arbitrary frame interval, for example, at one-frame intervals.

The use of the “disparity_temporal_extension()” of the structure shown in FIG. 77 has been described above in terms of the transmission data generation unit 110 shown in FIG. 2 (FIG. 21 and the like). Although a detailed description is omitted, the same approach is of course applicable not only to the DVB system but also to the ARIB and CEA systems.

FIG. 83 shows an example of updating disparity information, similar to FIG. 80 described above. Each update frame interval is expressed as a multiple of the interval period (ID: Interval Duration) serving as the unit period. For example, the update frame interval “Division Period 1” is expressed as “ID * M”, the update frame interval “Division Period 2” is expressed as “ID * N”, and the subsequent update frame intervals are expressed in the same way. In the update example of FIG. 83, the update frame interval is not fixed but is set according to the disparity information curve.

In this update example, the receiving side obtains the start frame (start time) T1_0 of the caption display period from the PTS (Presentation Time Stamp) inserted in the header of the PES stream that contains this disparity information. The receiving side then obtains each update time of the disparity information from the information on each update frame interval, namely the interval period information (unit period information) and the information on the number of interval periods.

In this case, the update times are obtained one after another, starting from the start frame (start time) T1_0 of the caption display period, according to equation (1) below. In equation (1), “interval_count” is the number of interval periods and corresponds to M, N, P, Q, R, and S in FIG. 83, and “interval_time” corresponds to the interval period (ID) in FIG. 83.

Tm_n = Tm_(n-1) + (interval_time * interval_count) ... (1)

For example, in the update example of FIG. 83, each update time is obtained from equation (1) as follows. The update time T1_1 is obtained from the start time (T1_0), the interval period (ID), and the count (M) as “T1_1 = T1_0 + (ID * M)”. The update time T1_2 is obtained from the update time (T1_1), the interval period (ID), and the count (N) as “T1_2 = T1_1 + (ID * N)”. The subsequent update times are obtained in the same way.
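
The recurrence of equation (1) can be sketched in Python as follows. The names “interval_time” and “interval_count” follow the text; the function name, the sample values, and the wrap-around at 2^33 (the range of the 33-bit PTS) are assumptions made for this illustration only.

```python
PTS_MODULUS = 2 ** 33  # the PTS is 33 bits long, so times are assumed to wrap here


def update_times(start_pts, interval_time, interval_counts):
    """Apply Tm_n = Tm_(n-1) + interval_time * interval_count repeatedly.

    start_pts:       start time Tm_0 in 90 kHz ticks (taken from the PES header).
    interval_time:   interval period (ID) in 90 kHz ticks.
    interval_counts: the counts for the updates after the start (M, N, P, ...).
    Returns [Tm_0, Tm_1, Tm_2, ...].
    """
    times = [start_pts % PTS_MODULUS]
    for count in interval_counts:
        times.append((times[-1] + interval_time * count) % PTS_MODULUS)
    return times


if __name__ == "__main__":
    # Hypothetical values: ID = 3003 ticks, counts M = 2, N = 4, P = 1.
    print(update_times(900_000, 3003, [2, 4, 1]))
```

The sample interval of 3003 ticks is merely one convenient value (one frame period at 30000/1001 frames per second measured with a 90 kHz clock); any unit period can be used.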

In the update example of FIG. 83, the receiving side interpolates the disparity information that is sequentially updated within the caption display period, and generates and uses disparity information at an arbitrary frame interval within the caption display period, for example, at one-frame intervals. For example, instead of linear interpolation, interpolation accompanied by low-pass filter (LPF) processing in the time direction (frame direction) is performed, so that the interpolated disparity information at the predetermined frame interval changes gently in the time direction (frame direction). The broken line a in FIG. 83 shows an example of the LPF output.
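
The low-pass filtering in the time direction can be illustrated, purely as an assumption, by a simple moving average over per-frame disparity values. The text does not specify the filter; the filter type, the tap count, and the function name below are hypothetical.

```python
def smooth_disparity(values, taps=3):
    """Moving-average low-pass filter over per-frame disparity values.

    values: one disparity value per frame (for example, after interpolation).
    taps:   odd window length; the window is shortened at both ends.
    """
    if taps < 1 or taps % 2 == 0:
        raise ValueError("taps must be a positive odd number")
    half = taps // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


if __name__ == "__main__":
    # A step from 0 to 6 becomes a gentler ramp.
    print(smooth_disparity([0, 0, 6, 6, 6]))
```

A longer window gives a gentler change at the cost of a slower response to the signalled updates.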

FIG. 84 shows a configuration example of the subtitle data stream. The PES header contains time information (PTS). The PES payload data contains the DDS, PCS, RCS, CDS, ODS, DSS (Display Signaling Segment), and EOS segments. These are transmitted together before the start of the subtitle display period.

The DSS segment contains the disparity information for realizing the disparity information update shown in FIG. 83. That is, the DSS contains the disparity information of the start frame (start time) of the caption display period and the disparity information of the frame at each subsequent update frame interval. Interval period information (unit period information) and information on the number of interval periods are added to this disparity information as the update frame interval information. The receiving side can therefore obtain each update frame interval simply by calculating “unit period * count”.

As disparity information that is sequentially updated during the caption display period, the DSS segment selectively contains either or both of the following: disparity information in units of regions (or of the subregions contained in a region), and disparity information in units of pages containing all the regions. As disparity information that is fixed during the caption display period, the DSS also contains disparity information in units of regions (or of the subregions contained in a region) and disparity information in units of pages containing all the regions.

FIG. 85 shows a display example of subtitles as captions. In this example, the page area (Area for Page_default) contains two regions (region 1 and region 2) as caption display areas. A region contains one or more subregions. Here, each region is assumed to contain one subregion, and the region area and the subregion area are assumed to be equal.

FIG. 86 shows an example of the disparity information curves of the regions and the page when the DSS segment contains both region-unit and page-unit disparity information as disparity information that is sequentially updated during the caption display period. Here, the disparity information curve of the page takes the minimum of the disparity information curves of the two regions.
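
A minimal Python sketch of deriving a page curve as the per-frame minimum of the region curves is given below. It assumes that every region curve has already been sampled at the same frames; the function name and the sample values are hypothetical.

```python
def page_disparity_curve(region_curves):
    """Return the per-frame minimum over all region disparity curves.

    region_curves: list of equally long lists, one per region, each holding
                   one disparity value per frame (hypothetical representation).
    """
    if not region_curves:
        raise ValueError("at least one region curve is required")
    length = len(region_curves[0])
    if any(len(curve) != length for curve in region_curves):
        raise ValueError("all region curves must cover the same frames")
    # zip(*curves) groups the values of all regions frame by frame.
    return [min(frame_values) for frame_values in zip(*region_curves)]


if __name__ == "__main__":
    region1 = [5, 3, 4]
    region2 = [2, 6, 4]
    print(page_disparity_curve([region1, region2]))
```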

For region 1 (Region1), there are seven pieces of disparity information: one at the start time T1_0 and one at each of the subsequent update times T1_1, T1_2, T1_3, ..., T1_6. For region 2 (Region2), there are eight pieces of disparity information: one at the start time T2_0 and one at each of the subsequent update times T2_1, T2_2, T2_3, ..., T2_7. For the page (Page_default), there are seven pieces of disparity information: one at the start time T0_0 and one at each of the subsequent update times T0_1, T0_2, T0_3, ..., T0_6.

FIG. 87 shows the structure in which the disparity information of the page and the regions shown in FIG. 86 is sent. The page layer is described first. In the page layer, “page_default_disparity”, a fixed disparity value, is placed. For the disparity information that is sequentially updated during the caption display period, “interval_count”, which indicates the number of interval periods, and “disparity_page_update”, which indicates the disparity information, are placed in sequence for the start time and each subsequent update time. The “interval_count” for the start time is set to “0”.

Next, the region layer is described. For region 1 (subregion 1), “subregion_disparity_integer_part” and “subregion_disparity_fractional_part”, which are fixed disparity values, are placed. Here, “subregion_disparity_integer_part” indicates the integer part of the disparity information, and “subregion_disparity_fractional_part” indicates the fractional part.

For the disparity information that is sequentially updated during the caption display period, “interval_count”, which indicates the number of interval periods, and “disparity_region_update_integer_part” and “disparity_region_update_fractional_part”, which indicate the disparity information, are placed in sequence for the start time and each subsequent update time. Here, “disparity_region_update_integer_part” indicates the integer part of the disparity information, and “disparity_region_update_fractional_part” indicates the fractional part. The “interval_count” for the start time is set to “0”.

Region 2 (subregion 2) is handled in the same way as region 1 described above: “subregion_disparity_integer_part” and “subregion_disparity_fractional_part”, which are fixed disparity values, are placed. For the disparity information that is sequentially updated during the caption display period, “interval_count”, which indicates the number of interval periods, and “disparity_region_update_integer_part” and “disparity_region_update_fractional_part”, which indicate the disparity information, are placed in sequence for the start time and each subsequent update time.

FIGS. 88 to 91 show a structure example (syntax) of the DSS (Disparity_Signaling_Segment). FIGS. 92 and 93 show the main data definitions (semantics) of the DSS. This structure contains the “Sync_byte”, “segment_type”, “page_id”, “segment_length”, and “dss_version_number” fields. “segment_type” is 8-bit data indicating the segment type; here it is set to the value indicating DSS. “segment_length” is 8-bit data indicating the number of subsequent bytes.

The 1-bit flag “disparity_page_update_sequence_flag” indicates whether there is page-unit disparity information that is sequentially updated during the caption display period. “1” indicates that it is present, and “0” indicates that it is not. The 1-bit flag “disparity_region_update_sequence_present_flag” indicates whether there is region-unit (subregion-unit) disparity information that is sequentially updated during the caption display period. “1” indicates that it is present, and “0” indicates that it is not. Note that “disparity_region_update_sequence_present_flag” is located outside the while loop and is sent so that the receiving side can easily tell whether a “disparity update” exists for at least one region. Whether to send “disparity_region_update_sequence_present_flag” is left to the transmitting side.

The 8-bit field “page_default_disparity” indicates the fixed page-unit disparity information, that is, the disparity information used in common throughout the caption display period. When the “disparity_page_update_sequence_flag” described above is “1”, “disparity_page_update_sequence()” is read.

FIG. 90 shows a structure example (syntax) of “disparity_page_update_sequence()”. “disparity_page_update_sequence_length” is 8-bit data indicating the number of subsequent bytes. “segment_NOT_continued_flag” indicates whether the sequence is completed within the current packet. “1” indicates that it is completed in the current packet. “0” indicates that it is not completed in the current packet and continues in the next packet.

The 24-bit field “interval_time[23..0]” specifies the interval period (Interval Duration) serving as the unit period (see FIG. 83) in units of 90 kHz. That is, “interval_time[23..0]” represents, as a 24-bit value, the interval period measured with a 90 kHz clock.

This field is 24 bits long, whereas the PTS inserted in the PES header is 33 bits long, for the following reasons. A 33-bit value can express a time of more than 24 hours, which is unnecessarily long for an interval period within a caption display period. Using 24 bits also reduces the data size and makes transmission more compact. In addition, 24 bits is 8 × 3 bits, which simplifies byte alignment.
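
The ranges involved can be checked with a short calculation. The Python sketch below computes the longest duration expressible by a 33-bit and a 24-bit counter of a 90 kHz clock; the function name is hypothetical.

```python
CLOCK_HZ = 90_000  # 90 kHz clock used for PTS and interval_time


def max_duration_seconds(bits):
    """Longest duration, in seconds, that a counter of the given width can hold."""
    return (2 ** bits - 1) / CLOCK_HZ


if __name__ == "__main__":
    print(f"33 bits: {max_duration_seconds(33) / 3600:.2f} hours")
    print(f"24 bits: {max_duration_seconds(24):.2f} seconds")
```

A 33-bit value covers roughly 26.5 hours, while a 24-bit value covers roughly 186 seconds (about three minutes) per unit period, and each update interval is further multiplied by “interval_count”.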

The 8-bit field “division_period_count” indicates the number of periods (Division Periods) for which disparity information is transmitted. For example, in the update example of FIG. 83, this number is “7”, corresponding to the start time T1_0 and the subsequent update times T1_1 to T1_6. The for loop that follows is repeated the number of times indicated by this 8-bit “division_period_count” field.

The 8-bit field “interval_count” indicates the number of interval periods. In the update example of FIG. 83, for example, it corresponds to M, N, P, Q, R, and S. The 8-bit field “disparity_page_update” indicates the disparity information. For the disparity information at the start time (the initial value of the disparity information), “interval_count” is set to “0”. In other words, when “interval_count” is “0”, “disparity_page_update” indicates the disparity information at the start time (the initial value of the disparity information).
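
For illustration only, the fields described above can be read with a sketch such as the following. The field widths of the length, “interval_time”, “division_period_count”, “interval_count”, and “disparity_page_update” follow the text. The exact bit layout is an assumption: the sketch assumes that “segment_NOT_continued_flag” occupies the top bit of one byte followed by 7 reserved bits, and that “disparity_page_update” is a signed value. Neither point is stated in the text, and FIG. 90 governs the actual layout.

```python
def parse_page_update_sequence(buf):
    """Parse one disparity_page_update_sequence() under the ASSUMED layout above."""
    if not buf:
        raise ValueError("empty buffer")
    length = buf[0]                       # disparity_page_update_sequence_length (8 bits)
    body = buf[1:1 + length]
    if len(body) != length or length < 5:
        raise ValueError("truncated sequence")

    not_continued = bool(body[0] & 0x80)  # ASSUMED: flag in top bit, 7 reserved bits
    interval_time = int.from_bytes(body[1:4], "big")  # interval_time[23..0], 90 kHz ticks
    division_period_count = body[4]       # number of (interval_count, disparity) pairs
    if 5 + 2 * division_period_count > length:
        raise ValueError("declared count exceeds sequence length")

    updates = []
    pos = 5
    for _ in range(division_period_count):
        interval_count = body[pos]        # 0 marks the start-time (initial) value
        # ASSUMED: disparity_page_update is a signed 8-bit value.
        disparity = int.from_bytes(body[pos + 1:pos + 2], "big", signed=True)
        updates.append((interval_count, disparity))
        pos += 2

    return {
        "segment_not_continued": not_continued,
        "interval_time": interval_time,
        "updates": updates,
    }


if __name__ == "__main__":
    sample = bytes([9, 0x80, 0x00, 0x0B, 0xBB, 2, 0, 10, 3, 0xFE])
    print(parse_page_update_sequence(sample))
```

In the sample bytes, the first pair has “interval_count” equal to 0 and therefore carries the initial disparity value, and the second pair is an update three interval periods later.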

The while loop in FIG. 89 is repeated as long as the length of the data processed so far (processed_length) has not reached the segment data length (segment_length). Disparity information in units of regions, or of subregions within a region, is placed in this while loop. A region contains one or more subregions, and the subregion area may be the same as the region area.

The while loop contains the “region_id” and “subregion_id” information. When the subregion area is the same as the region area, “subregion_id” is set to “0”. Accordingly, when “subregion_id” is not “0”, the while loop also contains the position information “subregion_horizontal_position” and the width information “subregion_width”, which define the subregion area.

The 1-bit flag “disparity_region_update_sequence_flag” indicates whether there is region-unit (subregion-unit) disparity information that is sequentially updated during the caption display period. “1” indicates that it is present, and “0” indicates that it is not. The 8-bit field “subregion_disparity_integer_part” indicates the integer part of the fixed region-unit (subregion-unit) disparity information, that is, of the disparity information used in common throughout the caption display period. The 4-bit field “subregion_disparity_fractional_part” indicates the fractional part of that fixed region-unit (subregion-unit) disparity information.

When the “disparity_region_update_sequence_flag” described above is “1”, “disparity_region_update_sequence()” is read. FIG. 91 shows a structure example (syntax) of “disparity_region_update_sequence()”. “disparity_region_update_sequence_length” is 8-bit data indicating the number of subsequent bytes. “segment_NOT_continued_flag” indicates whether the sequence is completed within the current packet. “1” indicates that it is completed in the current packet. “0” indicates that it is not completed in the current packet and continues in the next packet.

The 24-bit field “interval_time[23..0]” specifies the interval period (Interval Duration) serving as the unit period (see FIG. 83) in units of 90 kHz. That is, “interval_time[23..0]” represents, as a 24-bit value, the interval period measured with a 90 kHz clock. The reason for the 24-bit length is the same as that given for the structure example (syntax) of “disparity_page_update_sequence()” described above.

The 8-bit field “interval_count” indicates the number of interval periods. In the update example of FIG. 83, for example, it corresponds to M, N, P, Q, R, and S. The 8-bit field “disparity_region_update_integer_part” indicates the integer part of the disparity information. The 4-bit field “disparity_region_update_fractional_part” indicates the fractional part of the disparity information. For the disparity information at the start time (the initial value of the disparity information), “interval_count” is set to “0”. In other words, when “interval_count” is “0”, “disparity_region_update_integer_part” and “disparity_region_update_fractional_part” indicate the disparity information at the start time (the initial value of the disparity information).
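
One way a receiver might combine the integer and fractional parts into a single disparity value is sketched below. The text gives only the field widths (8 bits and 4 bits). The sketch assumes, without support from the text, that the integer part is signed and that the 4-bit fractional part counts sixteenths that are added to the integer part; the function name is hypothetical.

```python
def combine_disparity(integer_part, fractional_part):
    """Combine an 8-bit integer part and a 4-bit fractional part.

    ASSUMPTIONS (not stated in the text): the integer part is a signed value in
    [-128, 127], and the fractional part is a count of 1/16 steps that is added
    to the integer part.
    """
    if not -128 <= integer_part <= 127:
        raise ValueError("integer part does not fit in 8 bits")
    if not 0 <= fractional_part <= 15:
        raise ValueError("fractional part does not fit in 4 bits")
    return integer_part + fractional_part / 16


if __name__ == "__main__":
    print(combine_disparity(3, 8))
    print(combine_disparity(-2, 4))
```

A different sign or scaling convention would change only the final expression.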

In the embodiment described above, the image transmission/reception system 10 consists of the broadcasting station 100, the set-top box 200, and the television receiver 300. However, as shown in FIG. 32, the television receiver 300 includes a bit stream processing unit 306 that functions in the same way as the bit stream processing unit 201 (201A, 201B) in the set-top box 200. An image transmission/reception system 10A consisting of the broadcasting station 100 and the television receiver 300, as shown in FIG. 94, is therefore also conceivable.

In the embodiment described above, an example was shown in which a data stream (bit stream data) containing stereoscopic image data is broadcast from the broadcasting station 100. However, the present invention is equally applicable to a system in which this data stream is distributed to receiving terminals over a network such as the Internet.

In the embodiment described above, the set-top box 200 and the television receiver 300 are connected through an HDMI digital interface. However, the present invention is equally applicable when they are connected through a digital interface similar to HDMI (wireless as well as wired).

In the embodiment described above, subtitles (captions) are handled as the superimposition information. However, the present invention is equally applicable to systems that handle other superimposition information such as graphics information and text information.

The present invention can be applied to an image transmission/reception system capable of displaying superimposition information such as subtitles (captions) superimposed on a stereoscopic image.

DESCRIPTION OF SYMBOLS
10, 10A ... Image transmission/reception system
100 ... Broadcasting station
110, 110A, 110B ... Transmission data generation unit
111, 121, 131 ... Data extraction unit
112, 122, 132 ... Video encoder
132a ... Stream formatter
113, 123, 133 ... Audio encoder
114 ... Subtitle generation unit
115, 125, 135 ... Disparity information creation unit
116 ... Subtitle processing unit
117 ... Display control information generation unit
118 ... Subtitle encoder
119, 127, 136 ... Multiplexer
124 ... Caption generation unit
126 ... Caption encoder
134 ... CC encoder
126 ... Multiplexer
200 ... Set-top box (STB)
201, 201A, 201B ... Bit stream processing unit
202 ... HDMI terminal
203 ... Antenna terminal
204 ... Digital tuner
205 ... Video signal processing circuit
206 ... HDMI transmission unit
207 ... Audio signal processing circuit
211 ... CPU
215 ... Remote control reception unit
216 ... Remote control transmitter
221, 231, 241 ... Demultiplexer
222, 232, 242 ... Video decoder
223 ... Subtitle decoder
224 ... Stereoscopic image subtitle generation unit
225 ... Display control unit
226 ... Display control information acquisition unit
227, 236, 246 ... Disparity information processing unit
228, 237, 247 ... Video superimposing unit
229, 238, 248 ... Audio decoder
233 ... Caption decoder
234 ... Stereoscopic image caption generation unit
235, 245 ... Disparity information extraction unit
243 ... CC decoder
244 ... Stereoscopic image CC generation unit
300 ... Television receiver (TV)
301 ... 3D signal processing unit
302 ... HDMI terminal
303 ... HDMI reception unit
304 ... Antenna terminal
305 ... Digital tuner
306 ... Bit stream processing unit
307 ... Video/graphic processing circuit
308 ... Panel drive circuit
309 ... Display panel
310 ... Audio signal processing circuit
311 ... Audio amplification circuit
312 ... Speaker
321 ... CPU
325 ... Remote control reception unit
326 ... Remote control transmitter
400 ... HDMI cable

Claims

A stereoscopic image data transmission device comprising:
an image data output unit that outputs stereoscopic image data having left-eye image data and right-eye image data;
a superimposition information data output unit that outputs data of superimposition information to be superimposed on an image based on the left-eye image data and the right-eye image data;
a disparity information output unit that outputs disparity information for giving parallax by shifting the superimposition information to be superimposed on the image based on the left-eye image data and the right-eye image data; and
a data transmission unit that transmits the stereoscopic image data output from the image data output unit, the superimposition information data output from the superimposition information data output unit, and the disparity information output from the disparity information output unit,
wherein the disparity information is sequentially updated within a predetermined number of frame periods in which the superimposition information is displayed, and consists of disparity information of the first frame of the predetermined number of frame periods and disparity information of the frame at each subsequent update frame interval.

The stereoscopic image data transmission device according to claim 1, wherein information of a unit period and information on the number of unit periods are added to the disparity information as information of each update frame interval.

The stereoscopic image data transmission device according to claim 2, wherein the information of the unit period is a 24-bit value obtained by measuring the unit period with a 90 kHz clock.
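The 24-bit unit period field described above can be illustrated with a minimal sketch that converts a frame period into 90 kHz clock ticks and packs the result into three bytes. The function names and the big-endian byte order are assumptions made for illustration; the claims state only the 90 kHz clock and the 24-bit length.

```python
from fractions import Fraction

CLOCK_HZ = 90_000          # 90 kHz clock named in the claim
MAX_24BIT = (1 << 24) - 1  # largest value a 24-bit field can hold


def unit_period_ticks(frame_rate, frames=1):
    """Return the duration of `frames` frames measured in 90 kHz ticks."""
    ticks = Fraction(CLOCK_HZ) * frames / frame_rate
    if ticks.denominator != 1:
        raise ValueError("period is not a whole number of 90 kHz ticks")
    value = int(ticks)
    if not 0 < value <= MAX_24BIT:
        raise ValueError("period does not fit in 24 bits")
    return value


def pack_24bit(value):
    """Pack a tick count into 3 bytes (big-endian order is an assumption)."""
    return value.to_bytes(3, "big")
```

For example, one frame at 30000/1001 frames per second lasts 3003 ticks and one frame at 25 frames per second lasts 3600 ticks, both well within the 24-bit range.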

The stereoscopic image data transmission device according to claim 1, wherein flag information indicating whether or not the disparity information is updated is added to the disparity information for each frame at each update frame interval.

The stereoscopic image data transmission device according to claim 1, wherein information for adjusting the update frame interval is added to the disparity information for each frame at each update frame interval.

The stereoscopic image data transmission device according to claim 1, wherein information specifying a frame period is added to the disparity information.

The stereoscopic image data transmission device according to claim 1, wherein the disparity information is disparity information corresponding to specific superimposition information displayed on the same screen and/or disparity information corresponding to a plurality of pieces of superimposition information displayed on the same screen.

The stereoscopic image data transmission device according to claim 1, wherein information indicating the level of support for the disparity information that is essential when the superimposition information is displayed is added to the disparity information.

The stereoscopic image data transmission device according to claim 1, wherein the superimposition information data is DVB subtitle data, and the data transmission unit transmits the disparity information by including it in a subtitle data stream that includes the subtitle data.

The stereoscopic image data transmission device according to claim 9, wherein the disparity information is disparity information in units of regions or subregions included in the regions.

The stereoscopic image data transmission device according to claim 9, wherein the disparity information is disparity information in units of pages including all regions.

The stereoscopic image data transmission device according to claim 1, wherein the superimposition information data is ARIB subtitle data, and the data transmission unit transmits the disparity information by including it in a subtitle data stream that includes the subtitle data.

The stereoscopic image data transmission device according to claim 1, wherein the superimposition information data is CEA closed caption data, and the data transmission unit transmits the disparity information by including it in a user data area of a video data stream that includes the closed caption data.

The stereoscopic image data transmission device according to claim 13, wherein the superimposition information data is inserted into an extended command based on a CEA table arranged in the user data area.

The stereoscopic image data transmission device according to claim 13, wherein the superimposition information data is inserted into the closed caption data arranged in the user data area.

A stereoscopic image data transmission method comprising:
an image data output step of outputting stereoscopic image data having left-eye image data and right-eye image data;
a superimposition information data output step of outputting data of superimposition information to be superimposed on an image based on the left-eye image data and the right-eye image data;
a disparity information output step of outputting disparity information for giving parallax by shifting the superimposition information to be superimposed on the image based on the left-eye image data and the right-eye image data; and
a data transmission step of transmitting the stereoscopic image data output in the image data output step, the superimposition information data output in the superimposition information data output step, and the disparity information output in the disparity information output step,
wherein the disparity information is sequentially updated within a predetermined number of frame periods in which the superimposition information is displayed, and consists of disparity information of the first frame of the predetermined number of frame periods and disparity information of the frame at each subsequent update frame interval.

A stereoscopic image data receiving device comprising:
a data receiving unit that receives stereoscopic image data including left-eye image data and right-eye image data, data of superimposition information to be superimposed on an image based on the left-eye image data and the right-eye image data, and disparity information for giving parallax by shifting the superimposition information to be superimposed on the image based on the left-eye image data and the right-eye image data,
wherein the disparity information received by the data receiving unit is sequentially updated within a predetermined number of frame periods in which the superimposition information is displayed, and consists of disparity information of the first frame of the predetermined number of frame periods and disparity information of the frame at each subsequent update frame interval; and
an image data processing unit that uses the left-eye image data and the right-eye image data, the superimposition information data, and the disparity information received by the data receiving unit to give parallax to the same superimposition information to be superimposed on a left-eye image and a right-eye image, and obtains left-eye image data on which the superimposition information is superimposed and right-eye image data on which the superimposition information is superimposed.

The stereoscopic image data receiving device according to claim 17, wherein the image data processing unit performs interpolation processing on the disparity information of the plurality of frames that is sequentially updated within the predetermined number of frame periods, and generates disparity information at arbitrary frame intervals within the predetermined number of frame periods.

The stereoscopic image data receiving device according to claim 18, wherein the interpolation processing includes low-pass filter processing in the time direction.
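The interpolation with time-direction low-pass filtering described in the two preceding claims can be sketched as follows. This is only an illustration under stated assumptions: the claims do not name an interpolation method or a filter, so linear interpolation and a moving-average filter with clamped edges are assumed here, and both function names are hypothetical.

```python
def interpolate_disparity(updates, num_frames):
    """Expand sparse (frame_index, disparity) update points to one value per frame.

    `updates` must be sorted by frame index and start at frame 0.
    Linear interpolation is an assumption; the last value is held to the end.
    """
    out = []
    seg = 0
    for f in range(num_frames):
        # Advance to the update segment that contains frame f.
        while seg + 1 < len(updates) and updates[seg + 1][0] <= f:
            seg += 1
        f0, d0 = updates[seg]
        if seg + 1 < len(updates):
            f1, d1 = updates[seg + 1]
            out.append(d0 + (d1 - d0) * (f - f0) / (f1 - f0))
        else:
            out.append(float(d0))
    return out


def low_pass(values, taps=3):
    """Smooth values over time with a moving average (an assumed filter).

    `taps` should be odd; samples beyond either end are clamped to the edge.
    """
    half = taps // 2
    n = len(values)
    out = []
    for i in range(n):
        window = [values[min(max(j, 0), n - 1)]
                  for j in range(i - half, i + half + 1)]
        out.append(sum(window) / len(window))
    return out
```

A receiver following this sketch would call `low_pass(interpolate_disparity(updates, num_frames))` so that the disparity applied to the left-eye and right-eye superimposition information changes gradually between update frames.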

The stereoscopic image data receiving device according to claim 17, wherein information on a unit period and information on the number of unit periods are added to the disparity information as information on each update frame interval, and
the image data processing unit obtains each update time of the disparity information from the unit period information and the number-of-unit-periods information for each update frame interval, with the display start time of the superimposition information as a reference.

The stereoscopic image data receiving device according to claim 20, wherein the display start time of the superimposition information is given by a PTS inserted in a header part of a PES stream including the disparity information.
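The derivation of update times from the display start time can be sketched as below. The sketch assumes that each update frame interval equals the unit period multiplied by that interval's number of unit periods, and that all times are 90 kHz ticks measured from the PTS; the function name and parameter names are hypothetical. The modulo reflects the 33-bit width of a PTS in MPEG-2 systems.

```python
from itertools import accumulate

PTS_MODULO = 1 << 33  # a PTS is a 33-bit counter of 90 kHz ticks


def update_times(pts_start, unit_period, interval_counts):
    """Return the time, in 90 kHz ticks, at which each disparity value applies.

    pts_start       -- display start time of the superimposition information
    unit_period     -- unit period in 90 kHz ticks
    interval_counts -- number of unit periods for each update frame interval
    """
    # Running sum of interval lengths, starting at 0 for the first frame.
    offsets = accumulate((n * unit_period for n in interval_counts), initial=0)
    return [(pts_start + off) % PTS_MODULO for off in offsets]
```

With a start PTS of 900000, a unit period of 3003 ticks, and interval counts of 10 and 5, the disparity values take effect at ticks 900000, 930030, and 945045.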

A stereoscopic image data receiving method comprising:
a data receiving step of receiving stereoscopic image data including left-eye image data and right-eye image data, data of superimposition information to be superimposed on an image based on the left-eye image data and the right-eye image data, and disparity information for giving parallax by shifting the superimposition information to be superimposed on the image based on the left-eye image data and the right-eye image data,
wherein the disparity information received in the data receiving step is sequentially updated within a predetermined number of frame periods in which the superimposition information is displayed, and consists of disparity information of the first frame of the predetermined number of frame periods and disparity information of the frame at each subsequent update frame interval; and
an image data processing step of using the left-eye image data and the right-eye image data, the superimposition information data, and the disparity information received in the data receiving step to give parallax to the same superimposition information to be superimposed on a left-eye image and a right-eye image, and obtaining left-eye image data on which the superimposition information is superimposed and right-eye image data on which the superimposition information is superimposed.