JP2011511593A - Apparatus and method for generating and displaying media files

Info

Publication number: JP2011511593A
Application number: JP2010545808A
Authority: JP
Inventors: ソ−ヨンファン，; ゼ−ヨンソン，; ゴン−イルリ，; クッ−ヘリ，; ヨン−テキム，; ゼ−ソンキム，
Original assignee: Samsung Electronics Co Ltd
Current assignee: Samsung Electronics Co Ltd
Priority date: 2008-02-05
Filing date: 2009-02-05
Publication date: 2011-04-07
Anticipated expiration: 2029-02-05
Also published as: AU2009210926A1; CN101971639B; KR20090086017A; JP5483205B2; CN101971639A; CA2713857C; KR101530713B1; RU2462771C2; RU2010132853A; CA2713857A1; AU2009210926B2

Abstract

A computer-readable recording medium having data stored thereon is provided. The data structure includes a media data box containing two or more media data items and a movie data ('moov') box containing information on view sequence data of the media data. The 'moov' box includes track reference information indicating that a track box for one view sequence refers to a track box of another view sequence.
[Selected drawing] Figure 2

Description

The present invention relates to an apparatus and method for generating and displaying stereoscopic media files.

MPEG (Moving Picture Experts Group), an international standardization organization for multimedia, began with its first standard, MPEG-1, and is currently carrying out standardization of MPEG-2, MPEG-4, MPEG-7, and MPEG-21. The development of these various standards has created a need to combine different standard technologies into a single profile. As one such effort, various Multimedia Application Formats (MAFs) are being created in the MPEG-A (MPEG Multimedia Application Format: ISO/IEC (International Organization for Standardization / International Electrotechnical Commission) 23000) multimedia application standardization activity. MAF aims to increase the utility of standards by combining not only existing MPEG standards but also non-MPEG standards. By creating a MAF from standard technologies that have already been verified, without the effort of developing a separate new standard, the utility of those technologies can be maximized.

Recently, intensive research has been conducted on methods of realizing three-dimensional (3D) video in order to represent more realistic video information. Among these, a method that exploits human visual characteristics and is considered effective in many respects scans a left view image and a right view image onto an existing display device at their corresponding positions, so that the left view and the right view are imaged on the user's left eye and right eye, respectively, and the user perceives a 3D effect. For example, a mobile terminal equipped with a barrier LCD (Liquid Crystal Display) can reproduce stereoscopic content and provide realistic video to the user.

However, for stereoscopic content composed of two or more view sequences, the file format does not define a syntax by which it can be determined whether the tracks of the view sequences of the stereoscopic content are related to each other. A view sequence is a video bitstream made up of one or more video frames, and can also be called an elementary stream. The same gap exists for content that includes both stereoscopic video and monoscopic video, in which the stereoscopic video is spatially combined with two-dimensional (2D) video or the stereoscopic video and the monoscopic video appear together in one scene. For example, in a service in which an image caption is placed at the bottom of a monoscopic music video and the caption is displayed as a 2D image, the file format defines no syntax for determining whether the music video and the image caption are related to each other. It is therefore necessary to additionally provide information indicating whether the music video and the image caption are related to each other.

The present invention has been made in view of the above problems. An object of the invention is to provide an apparatus and method for generating and displaying a media file that make it possible to clearly determine whether the tracks of view sequences are related to each other, for stereoscopic content composed of two or more view sequences or for content having stereoscopic video and monoscopic video displayed simultaneously in one scene.

To achieve the above object, according to one aspect of the present invention, a computer-readable recording medium having data stored thereon is provided. The data structure includes a media data box containing two or more media data items and a movie data ('moov') box containing information on view sequence data of the media data. The 'moov' box includes track reference information indicating that a track box for one view sequence refers to a track box of another view sequence.

According to another aspect of the present invention, a computer-implemented method is provided. The method includes receiving a media file; analyzing a media data box of the received media file, which contains two or more items of view sequence data, and a movie data ('moov') box, which contains information on the view sequence data; and generating video based on a referencing view sequence and a referenced view sequence, in accordance with track reference information that is included in the 'moov' box and indicates that a track box for one view sequence refers to a track box for another view sequence.

According to still another aspect of the present invention, a terminal apparatus is provided. The terminal apparatus includes a file analysis unit that analyzes a media data box of a media file, which contains two or more items of view sequence data, and a movie data ('moov') box, which contains information on the view sequence data, and that extracts video based on a referencing view sequence and a referenced view sequence, in accordance with track reference information that is included in the 'moov' box and indicates that a track box for one view sequence refers to a track box of another view sequence; and a display unit that displays the extracted video.

According to the present invention, among the tracks included in stereoscopic content composed of two or more view sequences, or in content having stereoscopic video and monoscopic video displayed simultaneously in one scene, the tracks that are related to each other can be clearly determined, and redundancy in additional metadata can be avoided.

The above and other aspects, features, and advantages of the present invention will become more apparent from the following detailed description taken in conjunction with the accompanying drawings, in which:

A diagram illustrating an ISO-based media file format.
A diagram illustrating a file structure according to the first embodiment of the present invention.
A diagram illustrating a file structure designed to interconnect related tracks according to the first embodiment of the present invention.
A diagram illustrating a file structure designed to interconnect related tracks according to the first embodiment of the present invention.
A diagram illustrating the operation of a terminal according to the first embodiment of the present invention.
A diagram illustrating a file structure according to the second embodiment of the present invention.
A diagram illustrating a method of expressing a first view sequence according to the second embodiment of the present invention.
A diagram illustrating the operation of a terminal according to the second embodiment of the present invention.
A diagram illustrating a file structure according to the third embodiment of the present invention.
A diagram illustrating a method of expressing a first view sequence according to the third embodiment of the present invention.
A diagram illustrating a file structure according to the fourth embodiment of the present invention.
A diagram illustrating a file structure of stereoscopic video according to the fifth embodiment of the present invention.
A diagram illustrating a file structure of stereoscopic video according to the fifth embodiment of the present invention.
A diagram illustrating a file structure of multi-view content according to the fifth embodiment of the present invention.
A diagram illustrating a file structure of multi-view content according to the fifth embodiment of the present invention.
A diagram illustrating a media file generation apparatus according to an embodiment of the present invention.
A diagram illustrating a media file playback apparatus according to an embodiment of the present invention.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings.

In the following description, detailed descriptions of well-known functions or configurations related to the present invention are omitted for clarity and conciseness. The terms used herein are defined in consideration of the functions of the present invention and may vary according to the intention or practice of users or operators. The terms must therefore be interpreted on the basis of the entire content of this specification.

The ISO-based media file format is described first. The present invention then provides a method of indicating the relationship between paired tracks in stereoscopic content composed of two or more view sequences, and a method of indicating the relationship between a stereoscopic video track and a monoscopic video track in content having stereoscopic video and monoscopic video displayed simultaneously in one scene.

FIG. 1 shows an ISO-based media file format.

Referring to FIG. 1, an ISO-based media file 100 includes a file type box ('ftyp' box; not shown), a movie data box ('moov' box) 110, and a media data box ('mdat' box) 120. The file type box contains details of the file type and the compatible types; normal playback is possible with the corresponding decoder according to the compatible type. The 'moov' box 110 corresponds to the header box of the file format, and each piece of data in it is organized in a structure based on objects called 'atoms'. The 'moov' box 110 contains all the information needed to play the file, including content information such as frame rate, bit rate, and image size, and synchronization information used to support playback functions such as FF (Fast-Forward) and REW (Rewind). The media data box 120 is a data box that contains the actual media data; video data and audio data are stored in each track on a frame-by-frame basis.
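The box layout described above can be illustrated with a minimal sketch. It assumes the general box framing of ISO/IEC 14496-12 (a 32-bit big-endian size, a four-character type, then the payload); the helper names `make_box` and `iter_boxes` and the sample bytes are hypothetical and are not taken from this document.

```python
import struct

def make_box(box_type, payload=b""):
    """Serialize one box: 32-bit big-endian size, 4-character type, payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type.encode("ascii")) + payload

def iter_boxes(data, offset=0, end=None):
    """Yield (type, payload_start, payload_end) for each box in data[offset:end]."""
    end = len(data) if end is None else end
    while offset + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            size = struct.unpack_from(">Q", data, offset + 8)[0]
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of its container.
            size = end - offset
        if size < header or offset + size > end:
            raise ValueError("malformed box at offset %d" % offset)
        yield box_type.decode("latin-1"), offset + header, offset + size
        offset += size

# Hypothetical file: 'ftyp', an empty 'moov' header box, and an 'mdat' data box.
sample = (
    make_box("ftyp", b"isom" + struct.pack(">I", 0))
    + make_box("moov")
    + make_box("mdat", b"\x01\x02\x03")
)
top_level = [name for name, _, _ in iter_boxes(sample)]
print(top_level)
```

A real 'moov' box would contain nested track boxes; the same iterator can be applied again to a payload range to walk them.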

Stereoscopic video includes stereoscopic video related information. This information may be required information, such as the composition type of the stereoscopic video, or additional data, such as camera parameters and display information. When stereoscopic video is composed of two or more view sequences, the view sequences may have identical stereoscopic video related information. For example, for stereoscopic video composed of two view sequences, the left view and the right view may each contain the same additional information about the camera and the display. When the view sequences have identical stereoscopic video related information in this way, the information can be included in only one view sequence to prevent it from being stored redundantly in every view sequence, and the remaining view sequences can refer to that view sequence and use the stereoscopic video related information it contains. To do so, however, the other elementary streams must be informed of which elementary stream contains the stereoscopic video related information, and the view sequence that contains it must be identified. For stereoscopic video composed of two or more view sequences, the view sequences are divided into a first (primary) view sequence and a second (secondary) view sequence. As described above, when the stereoscopic video related information is included in only one elementary stream, the information can be located by distinguishing the first view sequence from the second view sequence. The first and second view sequences described in the present invention also identify the view sequence that has the higher display priority when only one of the two or more view sequences must be selected and displayed on the screen.

A first method of distinguishing the first view sequence from the second view sequence checks the track ID (track_ID) of each view sequence. The track header ('tkhd') box of each view sequence has a track ID (track_ID), an identifier by which each track can be identified. Because the track ID is an integer value assigned sequentially to the track of each view sequence, the view sequence of the track with the smallest track ID is determined to be the first view sequence.
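The track ID rule above can be sketched in a few lines. The function name and the list-of-integers input are hypothetical; the sketch assumes the track IDs have already been read from the 'tkhd' boxes.

```python
def first_view_by_track_id(track_ids):
    """Return (first_view_track_id, remaining_track_ids) using the smallest-ID rule."""
    if not track_ids:
        raise ValueError("at least one track ID is required")
    first = min(track_ids)
    # Every other track is treated as a second view sequence under this rule.
    others = sorted(t for t in track_ids if t != first)
    return first, others

first, others = first_view_by_track_id([3, 1, 2])
print(first, others)
```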

A second method checks the 'is_left_first' parameter in the composition type information of the stereoscopic video, which indicates which of the left view sequence and the right view sequence (or of the two or more view sequences) was encoded first, and determines from the value of that parameter which of the left view sequence and the right view sequence (or of the two or more view sequences) is the first view sequence and which is the second view sequence. A third method determines a track that references another track to be the first view sequence or the second view sequence.

When the first view sequence is determined from the track reference information, if the referenced track (the track that another track refers to) is determined to be the first view sequence, the referencing track (the track that refers to another track) is determined to be the second view sequence. Because a track that references another track has a track reference box ('tref' box), in this example the stereoscopic video of the other side, or of the remaining view, is determined to be the first view sequence. The location of the 'tref' box, which carries the track reference information, thus serves as a way of distinguishing the first view sequence from the second view sequence. Using track references, mutually related view sequences in a media file composed of two or more video tracks can be connected, which makes it possible to determine which tracks are related to each other. Track references can also be used to connect the tracks of multi-view video so that one video can be produced from them. With the track reference method, stereoscopic video related information that would otherwise be duplicated is inserted into only one specific track, that is, into only one of the first view sequence and the second view sequence, so that the information is not inserted redundantly into several tracks.
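As a minimal sketch of the 'tref'-based rule, the function below follows the convention in the example above, where a track with a 'tref' box is the second view sequence and the track it references is the first. The dictionary input (track ID mapped to the list of track IDs found in its 'tref' box, empty if it has none) and the role labels are hypothetical; the opposite convention is equally possible.

```python
def classify_by_tref(tref_map):
    """Map each track ID to 'first', 'second', or 'unrelated'.

    tref_map: {track_id: [referenced track IDs]}; an empty list means the
    track has no 'tref' box.
    """
    referenced = {r for refs in tref_map.values() for r in refs}
    roles = {}
    for track_id, refs in tref_map.items():
        if refs:
            roles[track_id] = "second"     # referencing track
        elif track_id in referenced:
            roles[track_id] = "first"      # referenced track
        else:
            roles[track_id] = "unrelated"  # no track reference in either direction
    return roles

roles = classify_by_tref({1: [2], 2: [], 3: []})
print(roles)
```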

According to another method of distinguishing the first view sequence from the second view sequence, the two are determined not from a single item of information or a single parameter as in the methods above, but by combining two or more items of stereoscopic video related information, fields, parameters, and boxes, such as the track ID, the stereoscopic video information required to express the composition of the stereoscopic video (including the 'is_left_first' parameter), a parameter that identifies the 'tref' box information, and the handler type. The following methods determine the first and second view sequences by such a combination. First, for stereoscopic video composed of two view sequences, a left view and a right view, a track can be determined to be the first view sequence or the second view sequence, according to the criterion for distinguishing the two, using the value of the 'is_left_first' field together with the information in the 'tref' box that references the stereoscopic video of another track. Alternatively, a track can be determined to be the first view sequence or the second view sequence, according to the criterion for distinguishing the two, using the information in the 'tref' box that references the stereoscopic video of another track together with the track ID.

There is also another method of determining the first and second view sequences, by a combination of two or more items of stereoscopic video related information, fields, parameters, and boxes, for stereoscopic video composed of two or more view sequences (that is, multiple or multi-view sequences). In this method, the first view sequence and the second view sequence can be determined using the value of the 'is_left_first' field, the track ID, and the 'tref' box that references a stereoscopic video track.

Parameters or information other than those mentioned above may also be used to determine the first and second view sequences in the manner described, and the combination of two or more items of stereoscopic video related information, fields, parameters, and boxes can be extended or supplemented in various ways.

The following describes a method according to the present invention of indicating the relationship between the tracks of paired view sequences in stereoscopic content composed of two or more view sequences. The description also covers a method according to the present invention of indicating the relationship between a stereoscopic view sequence and a monoscopic view sequence in content having stereoscopic video and monoscopic video.

<First Embodiment>
To decode stereoscopic content composed of two or more view sequences and display it on a screen, it must be indicated that the track of the left view sequence and the track of the right view sequence are related to each other. However, no box or information indicating the relationship between tracks currently exists in the stereoscopic file format, so the first embodiment of the present invention provides the following method to solve this problem.

Among the boxes defined in the document 'ISO/IEC 14496-12 ISO base media file format' are the handler reference box ('hdlr' box) and the track reference box ('tref' box). The handler reference box ('hdlr' box) indicates the type of media data in the current track by means of the handler type ('handler_type'), and is defined as shown in <Table 1>.

To connect two related tracks for stereoscopic content composed of two or more video tracks, the first embodiment of the present invention adds the value 'svid' to the handler type ('handler_type') of the handler reference box ('hdlr' box), as shown in <Table 2>. The value 'svid' indicates that the type of media data in the track is stereoscopic video.
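A minimal sketch of a handler reference box carrying the 'svid' value follows. It assumes the 'hdlr' field order of ISO/IEC 14496-12 (version and flags, pre_defined, handler_type, 12 reserved bytes, null-terminated name); the helper names and the name strings are hypothetical.

```python
import struct

def make_box(box_type, payload=b""):
    """Serialize one box: 32-bit big-endian size, 4-character type, payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type.encode("ascii")) + payload

def make_hdlr(handler_type, name=""):
    """Build an 'hdlr' box whose handler_type is a four-character code."""
    if len(handler_type) != 4:
        raise ValueError("handler_type must be four characters")
    payload = (
        struct.pack(">I", 0)              # version (8 bits) and flags (24 bits)
        + struct.pack(">I", 0)            # pre_defined
        + handler_type.encode("ascii")    # handler_type, e.g. 'vide' or 'svid'
        + b"\x00" * 12                    # reserved
        + name.encode("utf-8") + b"\x00"  # null-terminated name
    )
    return make_box("hdlr", payload)

def read_handler_type(hdlr_box):
    """handler_type sits after the 8-byte box header and 8 bytes of fields."""
    return hdlr_box[16:20].decode("ascii")

def is_stereoscopic_track(hdlr_box):
    """True when the handler type is the 'svid' value added by this embodiment."""
    return read_handler_type(hdlr_box) == "svid"

stereo_hdlr = make_hdlr("svid", "stereoscopic video")
plain_hdlr = make_hdlr("vide", "video")
print(is_stereoscopic_track(stereo_hdlr), is_stereoscopic_track(plain_hdlr))
```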

The track reference box ('tref' box) is used to connect the current track to another track that it references, by means of a reference type ('reference_type') and a track ID (track_ID). The 'reference_type' values currently defined in the document 'ISO/IEC 14496-12 ISO base media file format' are shown in <Table 3>.

To connect two related tracks, the first embodiment of the present invention adds 'avmi' to the 'reference_type' values of the track reference box ('tref' box), as shown in <Table 4>.
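The 'avmi' reference can be sketched as bytes. The sketch assumes the 'tref' layout of ISO/IEC 14496-12, in which the 'tref' box contains one child box per reference type and each child holds a list of 32-bit track IDs; the helper names and the sample track ID are hypothetical.

```python
import struct

def make_box(box_type, payload=b""):
    """Serialize one box: 32-bit big-endian size, 4-character type, payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type.encode("ascii")) + payload

def make_tref(reference_type, track_ids):
    """Build a 'tref' box holding one reference-type box with the given track IDs."""
    ids = b"".join(struct.pack(">I", t) for t in track_ids)
    return make_box("tref", make_box(reference_type, ids))

def parse_tref(tref_box):
    """Return {reference_type: [track IDs]} for a complete 'tref' box."""
    refs = {}
    offset = 8  # skip the 'tref' box header
    while offset + 8 <= len(tref_box):
        size, ref_type = struct.unpack_from(">I4s", tref_box, offset)
        if size < 8 or offset + size > len(tref_box):
            raise ValueError("malformed reference-type box")
        count = (size - 8) // 4
        ids = struct.unpack_from(">%dI" % count, tref_box, offset + 8)
        refs.setdefault(ref_type.decode("ascii"), []).extend(ids)
        offset += size
    return refs

# Hypothetical case: the current track references track 2 with type 'avmi'.
tref = make_tref("avmi", [2])
print(parse_tref(tref))
```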

FIG. 2 shows a file structure according to the first embodiment of the present invention in which, for stereoscopic content composed of two view sequences, the tracks of the related view sequences are connected to each other using the newly defined 'handler_type' and 'reference_type'.

Referring to FIG. 2, the track of the stereoscopic left view sequence includes a 'tref' box and is connected, by means of the track reference box ('tref' box) 210, to the track of the stereoscopic right view sequence that it references. Setting reference_type='avmi' in the track reference box ('tref' box) indicates that the referencing track is a track that includes stereoscopic video related information and that it is related to the track it references, that is, the referenced track. The stereoscopic video related information included in the referencing track is stereoscopic video information that each track of the view sequences constituting the stereoscopic content would otherwise have to include, and it can be stored in only one of the two related tracks. That a track is related to a referenced track means that the two tracks form a pair, which in turn means that there is a dependency between the two tracks. In other words, when the view sequence of the referenced track is the first view sequence, the view sequence of the referencing track is the second view sequence, so the referencing track depends on the referenced track. In addition, setting handler_type='svid' in the handler reference box ('hdlr' box) 220 of the referenced track indicates that the referenced track is a stereoscopic video track.

Because the first view sequence and the second view sequence can be determined by the presence or absence of a track reference box ('tref' box), if the track having the track reference box is determined to be the second view sequence, the track of the stereoscopic left view sequence in FIG. 2 is the second view sequence track. Depending on the method used to determine the first view sequence, the left view sequence can instead be the first view sequence. When the first and second view sequences are determined using the track reference box ('tref' box) 210 and the track of the stereoscopic right view sequence is determined to be the first view sequence, the track of the stereoscopic right view sequence is set so that it is referenced by the track of the stereoscopic left view sequence. In this case, the referencing track, which has the track reference box ('tref' box), is regarded as the second view sequence.

FIG. 3 shows a file structure according to the first embodiment of the present invention in which related tracks are connected to each other for multi-view content having multiple view sequences.

Referring to FIG. 3, assuming that the track of the first (or main) view sequence has a track reference box ('tref' box), the track of the first view sequence can be connected to a plurality of tracks related to it by means of the track reference box ('tref' box) 310. In this case, in the track that contains the first view sequence, the reference type of the track reference box ('tref' box) 310 is set to reference_type='avmi', and the handler type ('handler_type') of the handler reference boxes ('hdlr' boxes) 320 and 330 of the tracks referenced by this track is set to handler_type='svid'.
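The multi-view arrangement above can be sketched with already-parsed track records. The dictionary layout, the function name, and the track IDs are hypothetical; the sketch only assumes that the main track lists its related tracks under 'avmi' and that related tracks carry the 'svid' handler type.

```python
def related_stereo_tracks(tracks, main_id):
    """Return the IDs of the 'svid' tracks that the main track references via 'avmi'."""
    refs = tracks[main_id]["tref"].get("avmi", [])
    missing = [t for t in refs if t not in tracks]
    if missing:
        raise KeyError("referenced tracks not found: %r" % missing)
    # Keep only referenced tracks whose handler type marks them as stereoscopic video.
    return [t for t in refs if tracks[t]["handler_type"] == "svid"]

# Hypothetical multi-view file: track 1 is the first (main) view sequence.
tracks = {
    1: {"handler_type": "svid", "tref": {"avmi": [2, 3]}},
    2: {"handler_type": "svid", "tref": {}},
    3: {"handler_type": "svid", "tref": {}},
    4: {"handler_type": "soun", "tref": {}},
}
print(related_stereo_tracks(tracks, 1))
```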

As described above, the first view sequence can be distinguished from the second view sequence using the information in the track reference box ('tref' box). FIG. 4 shows a file structure, according to the first embodiment of the present invention, that interconnects the related tracks when the sequence of the track that has no track reference box ('tref' box), that is, the referenced track, is assumed to be the first view sequence.

FIG. 5 shows the operation of a terminal that identifies the related video tracks and displays them on a screen when a stereoscopic video is composed of two or more view sequences, according to the first embodiment of the present invention.

Referring to FIG. 5, the terminal analyzes the file type box ('ftyp' box) of the media file in step 401. In steps 402 and 403, the terminal analyzes the 'moov' box and a track box ('trak' box) of the media file. In step 404, the terminal determines whether a track reference box ('tref' box) exists in the track box. If the track has a track reference box ('tref' box), the terminal checks the reference type ('reference_type') of the track reference box ('tref' box) in step 405. If the reference type ('reference_type') is determined to be 'avmi', the terminal checks the track ID ('track_ID') that the track reference box ('tref' box) refers to and determines which stereoscopic view sequence track is paired with the current track. The terminal checks the media box ('mdia' box) in step 406 and, in step 407, checks the handler type ('handler_type') of the handler box ('hdlr' box), from which it can determine the media data type of the track. In step 408, the terminal checks the information in the remaining boxes that contain stereoscopic information, analyzes the stereoscopic-video-related information of the tracks of the stereoscopic view sequences, and displays the related view sequences on the screen. This series of steps for analyzing a track box ('trak' box) is performed in the same way from the first track to the last track of the media file when the track is a track of a stereoscopic view sequence.

If, on the other hand, it is determined in step 404 that the track does not have a track reference box ('tref' box), the terminal proceeds to step 406 and checks the media box ('mdia' box) of the track. The terminal then checks the handler type ('handler_type') in step 407 and, in step 408, checks the remaining boxes that contain stereoscopic information and displays the stereoscopic content on the screen.

The terminal identifies the first view sequence and the second view sequence in step 408 of FIG. 5, but the point in the procedure at which they are identified can change depending on which of the above-described methods is used to distinguish the first view sequence from the second view sequence.

For example, when the first and second view sequences are identified using the track reference box ('tref' box), the terminal identifies them in step 405 of FIG. 5 by checking the reference type ('reference_type') and the track ID ('track_ID') of the track reference box ('tref' box). If the track of the view sequence that has the track reference box ('tref' box) is determined to be the second view sequence, then, when the reference type ('reference_type') of the track reference box ('tref' box) is 'avmi', the referenced track ID ('track_ID') is the track ID of the first view sequence. For example, suppose the track with track_ID = 1 has a track reference box ('tref' box), the reference type of that box is 'avmi' (reference_type = 'avmi'), and the referenced track ID is 2 (track_ID = 2). Then the track with track_ID = 1 is the stereoscopic view sequence track paired with the track with track_ID = 2, and the view sequence of the track with track_ID = 2 is the first view sequence.
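The track_ID = 1 / track_ID = 2 example above can be sketched as follows. This is a minimal in-memory model, not a file parser: the `Track` class and the `pair_views` function are hypothetical names introduced for illustration, and the code assumes the convention stated above that the track carrying the 'tref' box is the second view sequence.

```python
from dataclasses import dataclass, field


@dataclass
class Track:
    """Hypothetical in-memory view of a 'trak' box."""
    track_id: int
    # Contents of the 'tref' box: reference_type -> referenced track_IDs.
    tref: dict = field(default_factory=dict)


def pair_views(tracks):
    """Return (first_view_id, second_view_id) pairs found via 'avmi'.

    Assumes the referencing track (the one with a 'tref' box) is the
    second view sequence and the referenced track is the first.
    """
    known_ids = {t.track_id for t in tracks}
    pairs = []
    for track in tracks:
        for ref_id in track.tref.get("avmi", []):
            if ref_id in known_ids:
                pairs.append((ref_id, track.track_id))
    return pairs


tracks = [
    Track(track_id=1, tref={"avmi": [2]}),  # has 'tref': second view
    Track(track_id=2),                      # no 'tref': first view
]
print(pair_views(tracks))  # [(2, 1)]
```

If the opposite convention is chosen (the referencing track is the first view sequence), only the order of the returned tuple changes.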

As another method of distinguishing the first view sequence from the second view sequence according to an embodiment of the present invention, the terminal can check the 'is_left_first' field in the composition type information of the stereoscopic video, which indicates which of the left view sequence and the right view sequence (or which of two or more view sequences) is encoded first, and determine from the value of that field which of them is the first view sequence and which is the second. In that case, following the operation of FIG. 5, the terminal identifies the first and second view sequences in step 408 by checking the stereoscopic-related information box that contains the 'is_left_first' parameter, and displays the related view sequences.

In this way, the order of operations in the process of identifying the first and second view sequences can change according to each method of distinguishing the first view sequence from the second view sequence according to the present invention.

In an embodiment of the present invention, the handler type of the referenced track, that is, the remaining track that has no 'tref' box, is indicated as the stereoscopic video type ('svid'); however, the referenced track may instead be of the video type ('vide') and the referencing track of the stereoscopic video type ('svid'). Alternatively, the handler type ('handler_type') of both the referencing track and the referenced track may be indicated as the video type ('vide') without any distinction.

The process of checking the tracks of a media file and displaying them on a screen, described with reference to FIG. 5, is not necessarily performed sequentially by the terminal or system. The file format analysis process and terminal operations that are not specifically described here follow ISO/IEC 14496-12 and ISO/IEC 23000-11.

<Second Embodiment>
The second embodiment of the present invention describes a track reference method in which stereoscopic content uses the track reference box ('tref' box) to refer to a track that contains additional information, namely camera parameters and display safety information. The camera parameters included in the stereoscopic content as additional information include baseline, focal_length, convergence_distance, translation, rotation, and the like, and the display safety information includes display-size-related information, viewing distance, disparity information, and the like. Although the camera parameters and display safety information are described here as additional information, they are optional. Accordingly, the box containing this information can be expressed as an optional box.

In the second embodiment of the present invention, 'cdsi' is added to the reference_type of the 'tref' box, as shown in Table 5, in order to refer to the track that contains the camera parameters used to acquire the stereoscopic content and the display safety information.

FIG. 6 shows a method of referring to a track that contains camera parameters and display safety information, which are additional information for stereoscopic content, according to the second embodiment of the present invention.

Referring to FIG. 6, the track of the stereoscopic left view sequence and the track of the stereoscopic right view sequence can each refer to the track that contains the additional information by using the track reference boxes ('tref' boxes) 510 and 520. The additional information then does not need to be stored in every track; because the other tracks refer to the track that contains it, the same information is not stored redundantly in multiple tracks.
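The sharing arrangement of FIG. 6 can be sketched with a small in-memory model. Everything named here is hypothetical: the `Track` class, the `additional_info_for` function, the track IDs, and the parameter values are illustrative assumptions, and the camera parameters and display safety information are represented as a plain dictionary rather than as a parsed box.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Track:
    """Hypothetical in-memory view of a 'trak' box."""
    track_id: int
    tref: dict = field(default_factory=dict)   # reference_type -> track_IDs
    additional_info: Optional[dict] = None     # camera/display safety data


def additional_info_for(track, tracks):
    """Follow a 'cdsi' reference to the track holding the shared information."""
    by_id = {t.track_id: t for t in tracks}
    for ref_id in track.tref.get("cdsi", []):
        target = by_id.get(ref_id)
        if target is not None and target.additional_info is not None:
            return target.additional_info
    return None


left = Track(track_id=1, tref={"cdsi": [3]})
right = Track(track_id=2, tref={"cdsi": [3]})
info = Track(track_id=3, additional_info={"baseline": 6.5, "viewing_distance": 2.0})
tracks = [left, right, info]

# Both views resolve to the same stored object, so nothing is duplicated.
print(additional_info_for(left, tracks) is additional_info_for(right, tracks))  # True
```

The same lookup applies unchanged when more than two view tracks refer to the shared track.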

FIG. 7 shows how the second embodiment of the present invention is applied to multi-view content having multiple view sequences.

In this case as well, as in FIG. 6, each track that contains a view sequence refers to the track that contains the additional information by using the 'tref' boxes 610, 620, and 630.

FIG. 8 is a flowchart showing the operation of a terminal according to the second embodiment of the present invention.

Referring to FIG. 8, the terminal analyzes the file type box ('ftyp' box) of the media file in step 701. In steps 702 and 703, the terminal analyzes the 'moov' box and a track box ('trak' box) of the media file. In step 704, the terminal determines whether a track reference box ('tref' box) exists in the track box. If the track has a track reference box ('tref' box), the terminal checks the reference type ('reference_type') of the track reference box ('tref' box) in step 705. If the reference type ('reference_type') is determined to be 'cdsi', the terminal checks the track ID ('track_ID') that the track reference box ('tref' box) refers to and determines which track contains the additional information about the stereoscopic video, namely the camera parameters and display safety information, that the current track refers to. The terminal checks the media box ('mdia' box) in step 706 and, in step 707, checks the handler type ('handler_type') of the handler box ('hdlr' box), from which it can determine the media data type of the track. Finally, in step 708, the terminal checks the information in the remaining boxes that contain stereoscopic information, analyzes the stereoscopic-video-related information of the tracks of the stereoscopic view sequences, and displays the related tracks on the screen. This series of steps for analyzing a track box ('trak' box) is performed in the same way from the first track to the last track of the media file when the track is a track of a stereoscopic view sequence.

If, on the other hand, it is determined in step 704 that the track does not have a track reference box ('tref' box), the terminal proceeds to step 706 and checks the media box ('mdia' box) of the track. The terminal then checks the handler type ('handler_type') in step 707 and, in step 708, checks the remaining boxes that contain stereoscopic information and displays the stereoscopic content on the screen.

The terminal identifies the first view sequence and the second view sequence in step 708 of FIG. 8, but, as described for FIG. 5 in the first embodiment of the present invention, the point in the procedure at which they are identified can change depending on which of the above-described methods is used to distinguish the first view sequence from the second view sequence.

If the handler type of the track analyzed in step 707 is the stereoscopic video type ('svid'), the track is one that contains the optional information, namely the camera parameters and display safety information that are additional information for the stereoscopic video.

The process of checking the tracks of a media file and displaying them on a screen, described with reference to FIG. 8, is not necessarily performed sequentially by the terminal or system. The file format analysis process and terminal operations that are not specifically described here follow ISO/IEC 14496-12 and ISO/IEC 23000-11.

<Third Embodiment>
In a service in which stereoscopic content and monoscopic content are displayed simultaneously as elements of a single scene, the tracks of the two view sequences must be connected so that the user can tell that they are related, in order to decode and display the stereoscopic view sequence and the monoscopic view sequence that must be presented in that scene. However, the current stereoscopic file format has no way of indicating this relationship, so the third embodiment of the present invention proposes a method for solving this problem.

The boxes defined in the document 'ISO/IEC 14496-12 ISO base media file format' include the handler reference box ('hdlr' box) and the track reference box ('tref' box). As described above, the handler reference box ('hdlr' box) uses the handler type ('handler_type') to indicate the type of media data in the current track. In the third embodiment of the present invention, 'mvid' is added as a handler type ('handler_type') of the handler reference box ('hdlr' box), as shown in Table 6, in order to interconnect the track of a stereoscopic view sequence and the track of a monoscopic view sequence that must be presented in a single scene.

The track reference box ('tref' box) is a box that uses the reference type ('reference_type') and the track ID ('track_ID') to connect the current track to another track that it references. In the third embodiment of the present invention, 'scmi' is added to the reference type ('reference_type') of the track reference box ('tref' box), as shown in Table 7, in order to connect two related tracks.

FIG. 9 shows a file structure, according to the third embodiment of the present invention, for content having a stereoscopic view sequence and a monoscopic view sequence that are displayed simultaneously in a single scene. The structure uses the newly defined 'handler_type' and 'reference_type' to interconnect the track of the stereoscopic view sequence and the track of the monoscopic view sequence that together form the scene.

Referring to FIG. 9, the current track is the track of a stereoscopic view sequence, and the track of the monoscopic view sequence that must be displayed in a single scene together with the stereoscopic view sequence is connected to the current track using the track reference box ('tref' box) 810. When the reference type is set to reference_type = 'scmi', the referenced track is a track that contains monoscopic content that must be displayed in a single scene together with the stereoscopic video track, which is the referencing track (that is, a spatially combined media track). Furthermore, when the handler type of the handler reference box ('hdlr' box) 820 of the referenced track is set to handler_type = 'mvid', the referenced track is the track of a monoscopic view sequence that must be displayed in a single scene together with the stereoscopic view sequence (a spatially combined media track).

FIG. 10 shows a file structure, according to the third embodiment of the present invention, for content having a stereoscopic view sequence and a monoscopic view sequence that are displayed simultaneously in a single scene. The structure interconnects a stereoscopic video composed of two or more view sequences with the monoscopic view sequence.

Referring to FIG. 10, the track of the stereoscopic left view sequence and the track of the stereoscopic right view sequence that make up the stereoscopic video can each be connected, using the track reference boxes ('tref' boxes) 910 and 920 respectively, so as to refer to the track of the monoscopic view sequence that must be displayed together with them in a single scene. In this case as well, when the reference type of the track reference boxes ('tref' boxes) 910 and 920 is set to reference_type = 'scmi' for both the stereoscopic left view sequence track and the stereoscopic right view sequence track, and the handler type of the handler reference box ('hdlr' box) 930 of the referenced track is set to handler_type = 'mvid', the referenced track is the track of a monoscopic view sequence that must be displayed in a single scene together with the stereoscopic view sequence (a spatially combined media track).
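The FIG. 10 arrangement, in which both stereoscopic tracks point at one monoscopic track, can be sketched as a lookup over an in-memory model. The `Track` class, the `monoscopic_companions` function, and the track IDs are hypothetical; the 'svid' handler type on the stereoscopic tracks is also an assumption made only for this illustration. Only the 'scmi' and 'mvid' rules come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Track:
    """Hypothetical in-memory view of a 'trak' box."""
    track_id: int
    handler_type: str                          # from the 'hdlr' box
    tref: dict = field(default_factory=dict)   # reference_type -> track_IDs


def monoscopic_companions(tracks):
    """Map each 'mvid' track to the stereoscopic tracks that reference it.

    A reference counts only if its type is 'scmi' and the referenced
    track's handler_type is 'mvid'.
    """
    by_id = {t.track_id: t for t in tracks}
    companions = {}
    for track in tracks:
        for ref_id in track.tref.get("scmi", []):
            target = by_id.get(ref_id)
            if target is not None and target.handler_type == "mvid":
                companions.setdefault(ref_id, []).append(track.track_id)
    return companions


tracks = [
    Track(1, "svid", {"scmi": [3]}),   # stereoscopic left view
    Track(2, "svid", {"scmi": [3]}),   # stereoscopic right view
    Track(3, "mvid"),                  # monoscopic view for the same scene
]
print(monoscopic_companions(tracks))  # {3: [1, 2]}
```

A terminal could use such a mapping to decide which tracks must be decoded together before composing the scene.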

<Fourth Embodiment>
FIG. 11 shows another file structure, according to the fourth embodiment of the present invention, for content having a stereoscopic view sequence and a monoscopic view sequence that are displayed simultaneously in a single scene. The structure uses the newly defined handler type ('handler_type') 'svid' and reference type ('reference_type') 'avmi' to interconnect the track of the stereoscopic view sequence and the track of the monoscopic view sequence.

The fourth embodiment of the present invention refers to tracks in the same way as the track reference method used in the first embodiment. In the fourth embodiment, however, the track reference box ('tref' box) 1010 is used not only to connect a view sequence to the stereoscopic view sequence paired with it but also to connect it to a monoscopic view sequence that is displayed simultaneously in the same scene. By setting reference_type = 'avmi' in the track reference box ('tref' box) 1010, handler_type = 'svid' in the handler reference box ('hdlr' box) 1020 of the referenced stereoscopic video track, and handler_type = 'vide' in the handler reference box ('hdlr' box) 1030 of the referenced monoscopic video track, the remaining stereoscopic view sequence that is paired with the first view sequence can be distinguished from the monoscopic view sequence that must be displayed simultaneously in a single scene together with the stereoscopic content.
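The distinction drawn in the fourth embodiment, where a single 'avmi' reference list points at both kinds of track and the handler type of each referenced track tells them apart, can be sketched as follows. The `Track` class, the `classify_avmi_targets` function, and the track IDs are hypothetical names and values used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Track:
    """Hypothetical in-memory view of a 'trak' box."""
    track_id: int
    handler_type: str                          # from the 'hdlr' box
    tref: dict = field(default_factory=dict)   # reference_type -> track_IDs


def classify_avmi_targets(track, tracks):
    """Split a track's 'avmi' targets into (stereoscopic, monoscopic) IDs.

    Referenced tracks with handler_type 'svid' are treated as the paired
    stereoscopic view; those with 'vide' as monoscopic content for the
    same scene.
    """
    by_id = {t.track_id: t for t in tracks}
    stereo, mono = [], []
    for ref_id in track.tref.get("avmi", []):
        target = by_id.get(ref_id)
        if target is None:
            continue
        if target.handler_type == "svid":
            stereo.append(ref_id)
        elif target.handler_type == "vide":
            mono.append(ref_id)
    return stereo, mono


referencing = Track(1, "svid", {"avmi": [2, 3]})
tracks = [referencing, Track(2, "svid"), Track(3, "vide")]
print(classify_avmi_targets(referencing, tracks))  # ([2], [3])
```

This classification relies on the handler types being assigned as in FIG. 11; if every view sequence is marked 'vide', the handler type alone no longer separates the two cases.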

In the example of FIG. 11, the track of the stereoscopic right view sequence is determined to be the first view sequence track by the above-described method of determining the first view sequence, so the track that has the track reference box ('tref' box) is set as the second view sequence.

Furthermore, in this embodiment, when the connection between stereoscopic content composed of two or more tracks is indicated using reference_type = 'avmi', the handler_type of the first view sequence track is 'vide' and the handler_type of the second view sequence track is 'svid'. Of course, under this distinction the referenced view sequence can be of the video type ('vide'). Alternatively, all view sequences can be indicated using only the video type ('vide') without any separate distinction.

<Fifth Embodiment>
The fifth embodiment of the present invention provides a structure for a stereoscopic media file in which the relationship between the two or more view sequences that make up a stereoscopic video is indicated using the same track reference method as in the first embodiment, and in which display and camera information relative to the reference view sequence is stored for the remaining view sequences other than the reference view sequence.

As described in the first and second embodiments of the present invention, the stereoscopic-video-related information includes additional information that is contained in the stereoscopic content. This additional information includes display and camera information for the stereoscopic video, which contains stereoscopic-video-related information obtained in the process of acquiring the stereoscopic video. The display and camera information for the stereoscopic video includes baseline, focal_length, convergence_distance, translation, rotation, and the like, and the display safety information can include display-size-related information, viewing distance, disparity information, and the like. Although it is called additional information in the present invention, this information is optional. Accordingly, the box containing this information can be expressed as an optional box.

One method of storing the display and camera information of a stereoscopic video takes one view sequence as the reference and stores, in the remaining view sequences, values relative to the display and camera information of the reference view sequence as the parameter value of each field. For example, if the reference view sequence is the first view sequence, the display and camera information of the first view sequence is all stored as 0, and the parameter values of the display and camera information relative to the reference view sequence are stored, field by field, in the remaining view sequence, that is, the second view sequence. Because the display and camera information of the reference view sequence is all set to 0, that information can be omitted. The display and camera information relative to that of the reference view sequence can therefore be stored in the remaining view sequences only. For example, if the distance between the cameras for the two view sequences, which is one item of the display and camera information of the stereoscopic video, is 5, the value of the corresponding field for the reference view sequence is 0 and is omitted, and 5, the distance from the camera of the reference view sequence, is stored as the value of the corresponding field for the remaining view sequence.
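The relative-storage rule and the camera-distance example above can be sketched as follows. This is an illustrative model only: the `encode_relative` and `read_params` functions are hypothetical, only two of the named fields are modeled, the focal_length value of 35.0 is invented for the example, and the sketch assumes each field can be expressed as a simple difference from the reference view.

```python
# Two of the fields named in the description; the others are omitted here.
FIELDS = ("baseline", "focal_length")


def encode_relative(reference, other):
    """Values to store for a non-reference view, relative to the reference."""
    return {name: other[name] - reference[name] for name in FIELDS}


def read_params(scdi):
    """Read stored values; a missing 'scdi' box means the reference view.

    The reference view's values are all 0 and may be omitted, so an
    absent box (None) is read as all zeros.
    """
    if scdi is None:
        return dict.fromkeys(FIELDS, 0.0)
    return {name: scdi.get(name, 0.0) for name in FIELDS}


# Hypothetical measurements: cameras 5 units apart with equal focal lengths.
first_view = {"baseline": 0.0, "focal_length": 35.0}
second_view = {"baseline": 5.0, "focal_length": 35.0}

stored = encode_relative(first_view, second_view)
print(stored)             # {'baseline': 5.0, 'focal_length': 0.0}
print(read_params(None))  # {'baseline': 0.0, 'focal_length': 0.0}
```

Only the second view sequence's track needs to carry the stored values; the reference view is recovered as all zeros without any box.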

FIG. 12 shows a file structure according to the fifth embodiment of the present invention.

In FIG. 12, the reference view sequence is assumed to be the first view sequence. In a stereoscopic media file structure composed of two view sequences, the 'scdi' box 1140, which stores the display and camera information relative to the first view sequence, is contained in the track of the second view sequence, and the track of the first view sequence is connected to the track of the second view sequence that holds the 'scdi' information using the handler type 'svid' 1110 and the reference type 'avmi' 1120, in the same way as the track reference method used in the first embodiment of the present invention. In this case as well, the video type 'vide' can be used as the handler type 1110 of the stereoscopic view sequence. In FIG. 12, the first view sequence is shown as the left view sequence, which has no track reference box ('tref' box), and the track of the stereoscopic view sequence that has the track reference box ('tref' box), that is, the view sequence with reference type ('reference_type') = 'avmi', contains the 'scdi' box holding the display and camera information relative to the reference view sequence.

FIG. 13 shows a case where a track having a track reference box ('tref' box) 1150 that refers to another track is provided independently of the track having the 'scdi' box 1160. Here, the view sequence that the 'scdi' information takes as its reference is the left view sequence, and the 'scdi' information relative to the left view sequence is included in the track of the right view sequence.

FIG. 14 shows a file structure generated by extending the method of the fifth embodiment of the present invention to multi-view content having two or more view sequences.

Referring to FIG. 14, according to the fifth embodiment of the present invention, the 'scdi' boxes 1224 and 1234, which store display and camera information relative to the first view sequence, are included in the remaining view sequences other than the first view sequence. The track of the first view sequence is connected to the tracks of the remaining view sequences having the 'scdi' information by using the handler type 'svid' 1210 and the reference type 'avmi' 1220 and 1230, in the same manner as the track reference method used in the first embodiment of the present invention. In this case as well, the video type 'vide' can be used as the handler type (1222 and 1232) of the stereoscopic video.
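For multi-view content, a reader of the file may want to know which tracks depend on a given reference view sequence. A minimal Python sketch is given below; it assumes that each track is summarized as a hypothetical pair of its track ID and the list of track IDs it references under 'avmi', and the function name `dependents_by_reference` is likewise an assumption.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def dependents_by_reference(
    tracks: List[Tuple[int, List[int]]]
) -> Dict[int, List[int]]:
    """Map each referenced track ID to the IDs of the tracks that refer to it."""
    groups: Dict[int, List[int]] = defaultdict(list)
    for track_id, avmi_refs in tracks:
        for ref_id in avmi_refs:
            groups[ref_id].append(track_id)
    return dict(groups)

# Track 1 is the reference view; tracks 2 and 3 each refer to it via 'avmi'.
groups = dependents_by_reference([(1, []), (2, [1]), (3, [1])])
```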

FIG. 15 shows a case where a track having a 'tref' box 1250 that refers to another track is provided separately from the tracks having the 'scdi' boxes 1260 and 1270.

In the first to fifth embodiments, the titles, names, and semantics of the reference types and handler types may be expressed with different titles, names, and semantics as long as they correspond to the same objects and methods.

Next, a system for generating and playing back a media file that uses the media file format according to an embodiment of the present invention will be described. The system according to an embodiment of the present invention broadly comprises a media file generation apparatus and a media file playback apparatus.

FIG. 16 shows a media file generation apparatus according to an embodiment of the present invention.

Referring to FIG. 16, the media file generation apparatus according to the present embodiment includes two or more cameras 1301 to 1304, an input unit 1310, a video signal processor 1320, a storage unit 1330, an encoder 1340, and a file generation unit 1350.

Each of the cameras 1301 to 1304 captures a specific subject from the left view or the right view and outputs a view sequence different from the others. When monoscopic video is serviced, the monoscopic video data is input to the input unit 1310 together with the stereoscopic video data. At this time, information such as camera parameters can also be delivered to the input unit 1310.

The video signal processor 1320 preprocesses all video data received through the input unit 1310. Here, preprocessing means converting into digital values the analog values generated when a CCD (Charge Coupled Device) or CMOS (Complementary Metal-Oxide Semiconductor) sensor senses the external video values, that is, the light and color components.

The storage unit 1330 stores the video data preprocessed by the video signal processor 1320 and provides it to the encoder 1340. Although FIG. 16 shows the storage unit 1330, it does not separately show storage for buffering between the components shown in FIG. 16. The encoder 1340 encodes each item of video data provided from the storage unit 1330. The encoding performed by the encoder 1340 is data encoding that can be omitted as necessary.

The file generation unit 1350 generates the media file 1300 using the video data encoded by the encoder 1340. The video data is stored in the data area, specifically the media data area, while the track reference information indicating the relationships among the video data, the handler information indicating the media type of each item of video data, the composition type of the stereoscopic video, and the camera and display information are stored in the corresponding boxes of the track of each item of video data. The generated media file 1300 is input or transferred to the stereoscopic media file playback apparatus, which plays back and displays the stereoscopic service video from the media file 1300.
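By way of a hedged example, the way a file generation unit might write the track reference information can be sketched in Python. The sketch assumes the generic box layout of the ISO base media file format (a 32-bit big-endian size, a four-character type, then the payload) and a 'tref' box that wraps one child box per reference type listing 32-bit track IDs. The helper names `box` and `tref_box` are hypothetical, and the payload layout of the 'scdi' box is not reproduced because it is not given in this passage.

```python
import struct
from typing import Iterable

def box(box_type: bytes, payload: bytes = b"") -> bytes:
    """Serialize a basic box: 32-bit big-endian size, 4-character type, payload."""
    if len(box_type) != 4:
        raise ValueError("box type must be exactly four bytes")
    return struct.pack(">I", 8 + len(payload)) + box_type + payload

def tref_box(reference_type: bytes, track_ids: Iterable[int]) -> bytes:
    """Build a 'tref' box holding one reference-type box that lists track IDs."""
    ids = b"".join(struct.pack(">I", track_id) for track_id in track_ids)
    return box(b"tref", box(reference_type, ids))

# Track reference of the second view sequence: 'avmi' pointing at track ID 1.
tref = tref_box(b"avmi", [1])
```

The resulting 20 bytes consist of the 8-byte 'tref' header, the 8-byte 'avmi' header, and one 4-byte track ID.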

Next, a stereoscopic media file playback apparatus according to an embodiment of the present invention will be described.

FIG. 17 is a block diagram showing a media file playback apparatus according to an embodiment of the present invention. As shown in FIG. 17, the media file playback apparatus includes a file analyzer 1410, a decoder 1420, a storage unit 1430, a playback unit 1440, and a display unit 1450.

The file analyzer 1410 receives and analyzes a media file 1400 generated, for example, by the file generation unit 1350 of the media file generation apparatus. The file analyzer 1410 analyzes the information stored in the file, 'moov', track, and metadata areas, and then extracts the video data 1401 to 1404 stored in the media data area. Through the file analysis operations shown in FIGS. 5 and 8, the file analyzer 1410 can also extract information indicating the relationships between tracks, including the reference information, and identify related tracks.
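The extraction of track reference information by a file analyzer might look roughly like the Python sketch below. It assumes the generic box layout of the ISO base media file format (32-bit big-endian size followed by a four-character type) and does not handle the special size values 0 and 1. The function names `iter_boxes` and `track_references` are hypothetical and are not the operations of FIGS. 5 and 8.

```python
import struct
from typing import Dict, Iterator, List, Optional, Tuple

def iter_boxes(data: bytes, start: int = 0,
               end: Optional[int] = None) -> Iterator[Tuple[bytes, int, int]]:
    """Yield (type, payload_start, payload_end) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        (size,) = struct.unpack_from(">I", data, pos)
        box_type = data[pos + 4:pos + 8]
        if size < 8 or pos + size > end:
            break  # malformed or unsupported size (e.g. 64-bit); stop scanning
        yield box_type, pos + 8, pos + size
        pos += size

def track_references(trak_payload: bytes) -> Dict[bytes, List[int]]:
    """Return {reference_type: [track_ID, ...]} from the 'tref' box of a track."""
    refs: Dict[bytes, List[int]] = {}
    for box_type, s, e in iter_boxes(trak_payload):
        if box_type == b"tref":
            for ref_type, rs, re_ in iter_boxes(trak_payload, s, e):
                count = (re_ - rs) // 4
                refs[ref_type] = list(
                    struct.unpack_from(f">{count}I", trak_payload, rs))
    return refs

# A track payload containing only a 'tref' box with 'avmi' -> track ID 1.
sample = bytes.fromhex("00000014" "74726566" "0000000c" "61766d69" "00000001")
refs = track_references(sample)
```

With the reference types in hand, the analyzer can pair each referring track with its reference view sequence before the video data is handed to the decoder.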

The decoder 1420 decodes the extracted video data. In the present embodiment, the decoder 1420 is used when the media file generation apparatus has encoded the data using the encoder 1340. The decoded data is stored in the storage unit 1430. Based on the identification information, the playback unit 1440 uses the video data stored in the storage unit 1430 to synthesize and play back related stereoscopic view sequences, and/or to play back related stereoscopic view sequences and monoscopic view sequences together. The display unit 1450 displays the played-back view sequences. A barrier LCD can be employed as the display unit 1450. In this case, the barrier LCD is turned off for the monoscopic video of the media file and turned on for the stereoscopic video, so that each video can be displayed on the screen.

While the present invention has been shown and described with reference to specific embodiments, it will be apparent to those of ordinary skill in the art that various changes in form and detail may be made without departing from the spirit and scope of the present invention as defined by the appended claims.

100, 1300, 1400 Media file
110 Movie data box ('moov' box)
120 Media data box ('mdat' box)
210, 310, 510, 520, 610, 620, 630, 810, 910, 920, 1010, 1150, 1250 Track reference box ('tref' box)
220, 320, 330, 820, 930, 1020, 1030 Handler reference box ('hdlr' box)
1110, 1210 Handler type 'svid'
1120, 1220, 1230 Reference type 'avmi'
1140, 1160, 1224, 1234 'scdi' box
1301 Information related to each video
1302 Camera (left)
1303 Camera (right)
1304 Multi-view or mono-view camera
1310 Input unit
1320 Video signal processor
1330, 1430 Storage unit
1340 Encoder
1350 File generation unit
1410 File analyzer
1420 Decoder
1440 Playback unit
1450 Display unit

Claims

A computer-readable recording medium having data stored thereon, the data having a structure comprising:
a media data box containing two or more items of media data; and
a movie data ('moov') box containing information about view sequence data of the media data,
wherein the 'moov' box includes track reference information indicating that a track box for one view sequence refers to a track box of another view sequence.

The computer-readable recording medium according to claim 1, wherein the track reference information is included in a track reference box of the track box.

The computer-readable recording medium according to claim 2, wherein the view sequence data is divided into first view sequence data and second view sequence data, and
the second view sequence data includes the track reference box.

The computer-readable recording medium according to claim 3, wherein the 'moov' box of the second view sequence data includes a box in which display and camera information relative to the first view sequence data is stored.

The computer-readable recording medium according to claim 1, wherein the 'moov' box includes a track header in which header information for each item of view sequence data is stored, and
the referring view sequence data is distinguished from the referenced view sequence data by a track identifier (ID) stored in the track header.

A computer-implemented method comprising:
receiving a media file;
analyzing a media data box of the received media file containing two or more items of view sequence data and a movie data ('moov') box containing information about the view sequence data; and
generating a video based on a referring view sequence and a referenced view sequence according to track reference information that is included in the 'moov' box and indicates that a track box for one view sequence refers to a track box of another view sequence.

The computer-implemented method of claim 6, wherein the track reference information is included in a track reference box of the track box.

The computer-implemented method of claim 7, wherein the view sequence data is divided into first view sequence data and second view sequence data, and
the second view sequence data includes the track reference box.

The computer-implemented method of claim 8, wherein the 'moov' box of the second view sequence data includes a box in which display and camera information relative to the first view sequence data is stored.

The computer-implemented method of claim 6, wherein the 'moov' box includes a track header in which header information for each item of view sequence data is stored, and
the referring view sequence data is distinguished from the referenced view sequence data by a track identifier (ID) stored in the track header.

A terminal apparatus comprising:
a file analysis unit that analyzes a media data box of a media file containing two or more items of view sequence data and a movie data ('moov') box containing information about the view sequence data, and extracts a video based on a referring view sequence and a referenced view sequence according to track reference information that is included in the 'moov' box and indicates that a track box for one view sequence refers to a track box of another view sequence; and
a display unit that displays the extracted video.

The terminal apparatus according to claim 11, wherein the track reference information is included in a track reference box of the track box.

The terminal apparatus according to claim 12, wherein the view sequence data is divided into first view sequence data and second view sequence data, and
the second view sequence data includes the track reference box.

The terminal apparatus according to claim 13, wherein the 'moov' box of the second view sequence data includes a box in which display and camera information relative to the first view sequence data is stored.

The terminal apparatus according to claim 11, wherein the 'moov' box includes a track header in which header information for each item of view sequence data is stored, and
the file analysis unit distinguishes the referring view sequence data from the referenced view sequence data based on a track identifier (ID) stored in the track header.