JP2010178369A

JP2010178369A - Image synthesizing apparatus, program, and recording medium

Info

Publication number: JP2010178369A
Application number: JP2010083123A
Authority: JP
Inventors: Shuichi Watabe; 秀一渡部; Jiro Kiyama; 次郎木山; Takayoshi Yamaguchi; 孝好山口
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2010-03-31
Filing date: 2010-03-31
Publication date: 2010-08-12
Anticipated expiration: 2025-07-27
Also published as: JP4964319B2

Abstract

PROBLEM TO BE SOLVED: To provide display data that gives, for picture-in-picture playback in which a child screen is composited onto a parent screen and displayed, a display area for the child screen that suits the video content of the parent screen, and to provide an image synthesizing apparatus and the like that uses the display data to display the child screen at an appropriate position.

SOLUTION: An image display apparatus 1 has decoding portions 101 and 103 for decoding two sets of image data, a synthesizing portion 105 for synthesizing and outputting the decoded images, an input portion 108 for receiving an instruction to display or hide the child screen, a processing control portion 109 for controlling the processing of the decoding portions 101 and 103 in response to that instruction, and a position assigning portion 110 for indicating the display position of the child screen to the synthesizing portion 105 based on that instruction, the display data, and time information. The position assigning portion 110 determines the display position of the child-screen image according to the displayable time information of the child-screen image and the information indicating the display area or displayable area at each time, both of which are included in the display data.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to a video composition device and the like that composites a second video onto a first video.

As network infrastructure has become faster and recording media larger, the absolute amount of video data that a user, or the video equipment a user operates, can handle at one time has increased dramatically. Along with this, various functions realized using multiple sets of video data, and advanced applications based on them, are appearing. One such function is called “picture-in-picture”.

“Picture-in-picture” is a function that displays two videos simultaneously by overlaying a small child screen on a screen (the parent screen). It is used, for example, for “multi-angle display”, in which video shot from a different angle than the parent-screen video is shown in the child screen, and for “commentary display”, in which additional information about the parent-screen video is shown in commentary form (for example, showing in the child screen a director's commentary video that contains behind-the-scenes stories about a movie).

Picture-in-picture is realized, for example, as shown in FIG. 17, by decoding two sets of video data onto two different layers and then superimposing the decoded videos. At this time, the display size and display position of the child-screen video are adjusted so that it can be superimposed on the parent screen. In addition to the rectangular video illustrated in FIG. 17, video of arbitrary shape may also be used for the child screen in picture-in-picture. Such picture-in-picture functions and methods of realizing them are described, for example, in Patent Document 1.

Japanese Unexamined Patent Application Publication No. 2005-123775 (JP 2005-123775 A)

In conventional picture-in-picture, the display position of the child screen is determined in advance, and the child screen is displayed at that position.

In picture-in-picture, the child-screen video is displayed on top of the parent-screen video, so while the child screen is displayed, part of the parent-screen video is hidden by it. It is therefore desirable to be able to switch where on the parent screen the child screen is displayed in accordance with changes in the content of the parent-screen video.

Furthermore, a conceivable application of picture-in-picture is one in which the child-screen video can be freely started, paused, and resumed at any time within a certain period. The child screen is displayed only while it is in the playback state. This is used, for example, when the child-screen video is bonus video for the parent-screen video: it need not be fully synchronized with the parent screen, but it is to be viewable only during a specific period of the parent-screen video. Even in this case, it is desirable that, each time the child screen is displayed, an appropriate display position on the parent screen is given according to the content of the parent-screen video.

However, no conventional method met these requirements by giving a child-screen display position that changes along with the parent-screen video as described above. Consequently, applications such as the one described above could not be realized.

The present invention has been made in view of the above problems, and provides display data that indicates, for the display position of the child screen during picture-in-picture playback, the displayable time and the display area or displayable area for each time. It thereby provides a video composition device and the like that, as described above, can give an appropriate child-screen display position even when the playback time and stop time of the child-screen video are freely changed.

To solve the problems described above, a video composition device of the present invention is a video composition device that composites a second video onto a first video, wherein
a displayable time of the second video is designated within the playback time of the first video,
the second video starts playback at an arbitrary time within the displayable time, ends playback when the second video ends or the displayable time is exceeded, and can be arbitrarily switched between display and non-display during playback,
the device comprising:
designating means for receiving display data that is specified over all times within the displayable time and includes time information indicating a time of the first video and display area information indicating the display position at which the second video is displayed at that time, and for designating, based on the display data, the display position at which the second video is displayed at an arbitrary time; and
compositing means for superimposing and compositing the second video onto the first video at the display position designated by the designating means.
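
As an illustration only, the designating means described above can be sketched in Python as a lookup from a time of the first video to a display position for the second video. The names `DisplayEntry` and `designate_position`, the use of seconds, and the half-open time ranges are assumptions made for this sketch; the text above does not prescribe a data layout.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DisplayEntry:
    """One hypothetical display-data entry, keyed on first-video time."""
    start: float  # first-video time (seconds) at which this entry begins
    end: float    # first-video time at which it stops applying (exclusive)
    x: int        # upper-left x of the second video on the first video
    y: int        # upper-left y of the second video on the first video


def designate_position(entries: List[DisplayEntry],
                       main_time: float) -> Optional[Tuple[int, int]]:
    """Return the display position for main_time, or None when main_time
    lies outside the displayable time covered by the entries."""
    for entry in entries:
        if entry.start <= main_time < entry.end:
            return (entry.x, entry.y)
    return None


# Assumed example: displayable time from 10 s to 30 s of the first video,
# with the display position changing at 20 s.
metadata = [
    DisplayEntry(start=10.0, end=20.0, x=40, y=30),
    DisplayEntry(start=20.0, end=30.0, x=400, y=300),
]
print(designate_position(metadata, 12.5))  # (40, 30)
print(designate_position(metadata, 25.0))  # (400, 300)
print(designate_position(metadata, 5.0))   # None
```

Because every time within the displayable time is covered by some entry, the lookup returns a position whenever the second video is switched to display inside that time, and `None` only outside it.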

Further, in the video composition device of the present invention, the display data includes second display area information that is given in correspondence with the time of the second video and indicates the display area of the second video within the first video when the second video is switched to display, and
the designating means, when displaying the second video, designates the display position in accordance with the display area information and/or the second display area information included in the display data.

Further, in the video composition device of the present invention, the display data includes displayable area information that is given in correspondence with the time of the second video and indicates the area within the first video in which the second video can be displayed when it is switched to display, the displayable area being an area within which the second video may be displayed at any position when it is switched to display, and
the designating means, when displaying the second video, designates the display position in accordance with the display area information and/or the displayable area information included in the display data.

Further, in the video composition device of the present invention, the display area information includes information indicating the coordinates and/or the size of the area in which the second video is displayed.

Further, in the video composition device of the present invention, the display area information includes the coordinates of the upper-left vertex of the rectangular area in which the second video is displayed.

A video composition device of the present invention is a video composition device that composites a second video onto a first video, wherein
a displayable time of the second video is designated within the playback time of the first video,
the second video starts playback at an arbitrary time within the displayable time, ends playback when the second video ends or the displayable time is exceeded, and can be arbitrarily switched between display and non-display during playback,
the device comprising:
designating means for receiving display data that is specified over all times within the displayable time and includes time information indicating a time of the first video and displayable area information indicating the area in which the second video can be displayed at that time, the displayable area being an area within which the second video may be displayed at any position when it is switched to display, and for designating, based on the display data, the display position at which the second video is displayed at an arbitrary time; and
compositing means for superimposing and compositing the second video onto the first video at the display position designated by the designating means.
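
Since a displayable area only bounds where the second video may go, the designating means still has to pick one position inside it. The following Python sketch shows one possible policy, clamping a preferred position so that the whole child-screen rectangle stays inside the area. The names `Rect` and `place_within`, and the clamping policy itself, are assumptions for illustration and are not taken from the text above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle given by its upper-left corner and size."""
    x: int
    y: int
    width: int
    height: int


def place_within(area: Rect, child_w: int, child_h: int,
                 preferred: Tuple[int, int]) -> Tuple[int, int]:
    """Return the upper-left position closest to `preferred` at which a
    child screen of size child_w x child_h lies fully inside `area`."""
    if child_w > area.width or child_h > area.height:
        raise ValueError("child screen does not fit in the displayable area")
    max_x = area.x + area.width - child_w
    max_y = area.y + area.height - child_h
    x = min(max(preferred[0], area.x), max_x)
    y = min(max(preferred[1], area.y), max_y)
    return (x, y)


# Assumed example: a 400x300 displayable area whose upper-left corner
# is at (100, 50), and a 160x90 child screen.
area = Rect(x=100, y=50, width=400, height=300)
print(place_within(area, 160, 90, (450, 20)))   # (340, 50)
print(place_within(area, 160, 90, (120, 100)))  # (120, 100)
```

A position that is already inside the area is returned unchanged, so a player could, for example, keep the last position the user chose as long as it remains permitted.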

Further, in the video composition device of the present invention, the display data includes display area information that is given in correspondence with the time of the second video and indicates the display area of the second video within the first video when the second video is switched to display, and
the designating means, when displaying the second video, designates the display position in accordance with the displayable area information and/or the display area information included in the display data.

Further, in the video composition device of the present invention, the display data includes second displayable area information that is given in correspondence with the time of the second video and indicates the area within the first video in which the second video can be displayed when it is switched to display, and
the designating means, when displaying the second video, designates the display position in accordance with the displayable area information and/or the second displayable area information included in the display data.

Further, in the video composition device of the present invention, the composited and output video is video in picture-in-picture format, the first video is the video displayed on the parent screen, and the second video is the video displayed on the child screen.

A program of the present invention is a program executed in a video composition device that composites a second video onto a first video, wherein
a displayable time of the second video is designated within the playback time of the first video,
the second video starts playback at an arbitrary time within the displayable time, ends playback when the second video ends or the displayable time is exceeded, and can be arbitrarily switched between display and non-display during playback,
the program causing the video composition device to realize:
a designating function of receiving display data that is specified over all times within the displayable time and includes time information indicating a time of the first video and display area information indicating the display position at which the second video is displayed at that time, and of designating, based on the display data, the display position at which the second video is displayed at an arbitrary time; and
a compositing function of superimposing and compositing the second video onto the first video at the display position designated by the designating function.

Further, in the program of the present invention, the display data includes second display area information that is given in correspondence with the time of the second video and indicates the display area of the second video within the first video when the second video is switched to display, and
the designating function, when displaying the second video, designates the display position in accordance with the display area information and/or the second display area information included in the display data.

Further, in the program of the present invention, the display data includes displayable area information that is given in correspondence with the time of the second video and indicates the area within the first video in which the second video can be displayed when it is switched to display, the displayable area being an area within which the second video may be displayed at any position when it is switched to display, and
the designating function, when displaying the second video, designates the display position in accordance with the display area information and/or the displayable area information included in the display data.

Further, in the program of the present invention, the display area information includes information indicating the coordinates and/or the size of the area in which the second video is displayed.

Further, in the program of the present invention, the display area information includes the coordinates of the upper-left vertex of the rectangular area in which the second video is displayed.

A program of the present invention is a program executed in a video composition device that composites a second video onto a first video, wherein
a displayable time of the second video is designated within the playback time of the first video,
the second video starts playback at an arbitrary time within the displayable time, ends playback when the second video ends or the displayable time is exceeded, and can be arbitrarily switched between display and non-display during playback,
the program causing the video composition device to realize:
a designating function of receiving display data that is specified over all times within the displayable time and includes time information indicating a time of the first video and displayable area information indicating the area in which the second video can be displayed at that time, the displayable area being an area within which the second video may be displayed at any position when it is switched to display, and of designating, based on the display data, the display position at which the second video is displayed at an arbitrary time; and
a compositing function of superimposing and compositing the second video onto the first video at the display position designated by the designating function.

Further, in the program of the present invention, the display data includes display area information that is given in correspondence with the time of the second video and indicates the display area of the second video within the first video when the second video is switched to display, and
the designating function, when displaying the second video, designates the display position in accordance with the displayable area information and/or the display area information included in the display data.

Further, in the program of the present invention, the display data includes second displayable area information that is given in correspondence with the time of the second video and indicates the area within the first video in which the second video can be displayed when it is switched to display, and
the designating function, when displaying the second video, designates the display position in accordance with the displayable area information and/or the second displayable area information included in the display data.

Further, in the program of the present invention, the composited and output video is video in picture-in-picture format, the first video is the video displayed on the parent screen, and the second video is the video displayed on the child screen.

A recording medium of the present invention is a recording medium on which a first video, a second video, and display data for superimposing and displaying the second video on the first video are recorded, and which is played back by a video composition device that composites the second video onto the first video and outputs the result,
wherein the program of the invention described above is recorded on the recording medium.

The present invention provides display data that indicates, for the display position of the child screen during picture-in-picture playback, the displayable time and the display area or displayable area. This display data is either included in the video data of the child-screen video or the parent-screen video, or stored in management data independent of the video data, and is handled together with the video data when the video is transmitted or distributed. The video display device and method read out the display data and use it each time to determine the display position of the child screen corresponding to the playback time of the parent-screen (or child-screen) video. As a result, when the child-screen video is composited onto the parent-screen video in picture-in-picture, it can be displayed and played back at an appropriate position. The child-screen video can thus be freely switched between display and non-display within the displayable time, and however freely it is switched, it is composited at an appropriate position each time. Picture-in-picture playback can therefore be performed as the provider intended.

FIG. 1 is a functional block diagram showing the schematic configuration of the video display device according to the first, second, and third embodiments.
FIG. 2 is a diagram showing an example of display data handled by the video display device according to the first embodiment.
FIG. 3 is a diagram showing another example of display data handled by the video display device according to the first embodiment.
FIG. 4 is an explanatory diagram showing variations of display data handled by the video display device according to the first embodiment.
FIG. 5 is a diagram showing yet another example of display data handled by the video display device according to the first embodiment.
FIG. 6 is a flowchart showing the processing performed when video is displayed by the video display device according to the first, second, and third embodiments.
FIG. 7 is an explanatory diagram showing a first display state when video is displayed by the video display device according to the first embodiment.
FIG. 8 is an explanatory diagram showing a second display state when video is displayed by the video display device according to the first embodiment.
FIG. 9 is an explanatory diagram showing a third display state when video is displayed by the video display device according to the first embodiment.
FIG. 10 is an explanatory diagram showing a fourth display state when video is displayed by the video display device according to the first embodiment.
FIG. 11 is a diagram showing an example of display data handled by the video display device according to the second embodiment.
FIG. 12 is an explanatory diagram showing a first display state when video is displayed by the video display device according to the second embodiment.
FIG. 13 is an explanatory diagram showing a second display state when video is displayed by the video display device according to the second embodiment.
FIG. 14 is an explanatory diagram showing a third display state when video is displayed by the video display device according to the second embodiment.
FIG. 15 is an explanatory diagram showing a fourth display state when video is displayed by the video display device according to the second embodiment.
FIG. 16 is an explanatory diagram showing the processing performed when video is displayed by the video display device according to the third embodiment.
FIG. 17 is an explanatory diagram showing how the conventional picture-in-picture function is realized.

Embodiments for carrying out the present invention are described below with reference to the drawings. In the following embodiments, as an example, the video composition device (program) of the present invention is applied to a video display device that plays back and displays a recording medium.

(First embodiment)
A video display device, a method, and display data according to the first embodiment are described with reference to FIGS. 1 to 10.

FIG. 1 is a functional block diagram showing the schematic configuration of the video display device 1 according to the first embodiment. Two sets of video data (encoded video streams) are input to the video display device 1, decoded, composited, and displayed in what is called a “picture-in-picture display” state. Hereinafter in this specification, the video displayed on the parent screen in picture-in-picture display is called the “main video”, and the video displayed as the child screen is called the “sub video”.

The video display device 1 comprises a decoding unit 101 and a buffering unit 102 that decode and control the output of the video data of the main video; a decoding unit 103 and a buffering unit 104 that decode and control the output of the video data of the sub video; a composition unit 105 that composites the sub video onto the main video, together with an adjustment unit 106 that is an internal component of it; and a display unit 107 that displays the output video. In addition, it comprises an input unit 108 that accepts from the user an instruction to switch the sub video (child screen) between display and non-display; a processing control unit 109 that adjusts the processing of the decoding unit 103 and/or the buffering unit 104 in response to that switching; and a position designation unit 110 that designates the display position of the sub video (child screen) from separately input display data for the sub video and from time information at playback. Hereinafter in this specification, this display data used to designate the display position of the sub video (child screen) is called “metadata” for the video data.

Although the video display device 1 is shown here as including the decoding units 101 and 103, they are not essential. For example, if the input video data is not encoded, the video display device 1 need not include the decoding units 101 and 103. Also, the video display device 1 of FIG. 1 is configured with only the functional blocks related to processing video data (data relating to the video signal). Actual video data, however, contains audio data and management data in addition to the data relating to the video signal (the management data being information needed to decode the encoded data, such as the encoding scheme, and information needed to play the video, such as a playlist specifying how video is cut out and joined), and an actual video display device is configured to also include functional blocks for processing these. In that case, FIG. 1 is implemented as an internal configuration of the actual video display device.

First, the processing in the video display device 1 when the sub video (child screen) is not displayed is described. In this case, the video data of the sub video is either not input, or is input but processed so as not to be displayed.

The input video data of the main video is decoded by the decoding unit 101, and the decoded video is output after its timing is adjusted by the buffering unit 102. Because the sub video is not displayed, the decoded video output from the buffering unit 102 passes through the composition unit 105 unchanged and is input to the display unit 107. The main video is then displayed as is.

Next, the processing in the video display device 1 when the sub video (child screen) is displayed is described.
The input video data of the sub video is decoded by the decoding unit 103, and the decoded video is output after its timing is adjusted by the buffering unit 104. This decoded sub video is input to the adjustment unit 106 in the composition unit 105.

As preprocessing for combining the sub video with the main video, the adjustment unit 106 converts and adjusts the image size of the decoded sub video and its display position on the screen. It makes this adjustment so that the sub video (child screen) is combined at the display position within the main video (parent screen) designated by the position specifying unit 110 described later. The adjusted sub video is then combined with the decoded main video, output, and displayed through the display unit 107. A transparency can also be set at the time of composition so that the main video shows through beneath the combined sub video.
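The composition step, including the optional transparency, can be sketched as follows. This is a minimal illustration and not the implementation of the synthesis unit 105: the function name `composite`, the list-of-rows grayscale frame representation, and the `alpha` parameter are all assumptions introduced here.

```python
def composite(main, sub, top_left, alpha=1.0):
    """Overlay `sub` onto a copy of `main` with its upper-left corner at `top_left`.

    Frames are lists of rows of grayscale values (a hypothetical stand-in
    for decoded video frames). alpha=1.0 makes the sub video opaque;
    smaller values let the main video show through.
    """
    x0, y0 = top_left
    out = [row[:] for row in main]  # leave the main frame untouched
    for dy, sub_row in enumerate(sub):
        for dx, s in enumerate(sub_row):
            y, x = y0 + dy, x0 + dx
            # Ignore sub video pixels that fall outside the parent screen.
            if 0 <= y < len(out) and 0 <= x < len(out[y]):
                out[y][x] = alpha * s + (1.0 - alpha) * out[y][x]
    return out


main = [[0.0] * 4 for _ in range(3)]   # 4x3 black parent screen
sub = [[1.0, 1.0]]                     # 2x1 white child screen
frame = composite(main, sub, (1, 1), alpha=0.5)
```

With `alpha=0.5` each covered pixel is the average of the sub video pixel and the main video pixel beneath it.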

The video display device 1 includes an input unit 108, which accepts from the user an instruction to switch the sub video (child screen) between display and non-display. Based on the switching instruction, the input unit 108 generates display state information indicating whether the sub video (child screen) is currently in the display state or the non-display state, and passes it to the processing control unit 109 and the position specifying unit 110.

The processing control unit 109 receives the display state information from the input unit 108 and, based on it, controls the processing of the decoding unit 103 and/or the buffering unit 104. For example, when the display state information changes to "non-display state", it stops the decoding in the decoding unit 103 and/or the output from the buffering unit 104, and when the display state information changes to "display state", it resumes them. In this way the sub video is kept paused while it is hidden.

The position specifying unit 110 receives the display state information from the input unit 108 and, while the sub video (child screen) is in the display state, uses the metadata described later to determine the display position within the main video (parent screen) at which the sub video (child screen) is to be displayed, and notifies the adjustment unit 106 of that position.

The main video changes over time, and the position within the main video where the sub video should be displayed, or may be displayed, changes with it. Therefore, when the sub video has been hidden and paused as described above by the processing control unit 109 and by the decoding unit 103 and/or the buffering unit 104 that it controls, and playback is resumed and the sub video is shown again after some time has elapsed, it is not necessarily desirable to display it at the same position as before it was hidden. The display data for the sub video provided by the present invention, that is, the metadata, specifies for each time position of the main video where in the main video the sub video is to be displayed or may be displayed. Using the metadata input together with the video data of the sub video, the position specifying unit 110 outputs the display position of the sub video (child screen) corresponding to the time position indicated by the time information during playback.

The metadata for video display handled in this embodiment will be described in more detail with reference to FIGS. 2 to 5.

FIGS. 2 and 3 show specific examples of the metadata for sub video display provided by the present invention. The video stream (sub video stream) contained in the video data consists of a header portion and a video data portion. The header portion contains various information about the stream, and the metadata is included in that header portion.

FIGS. 2 and 3 each show a specific metadata structure (FIGS. 2(a) and 3(a)) and a diagram of the display area or displayable area that the metadata indicates (FIGS. 2(b) and 3(b)). In addition, so that the temporal change of the area is easy to follow, each also includes a schematic diagram in which the display area or displayable area is drawn in one dimension (FIGS. 2(c) and 3(c)). That is, the vertical axis in FIGS. 2(c) and 3(c) stands for the two-dimensional spatial position on the screen, and the vertical width of the illustrated band corresponds to the size of the display area or displayable area.

FIG. 2(a) shows an example of the metadata structure. The metadata includes the total playback time 200 of the sub video; displayable time information 201, which indicates, relative to the playback time of the main video (the elapsed time with the playback start position taken as "00:00:00"), in which time range of the main video the sub video can be displayed; and display area information 202, which indicates, for each time within the displayable time range, the position in the main video at which the sub video is displayed. The display area information 202 shown in FIG. 2 assumes that the display size of the sub video (child screen) is a fixed size given in advance, and gives the position of the upper-left vertex of the child screen. For example, from time "00:00:10" the sub video is displayed with (x1, y1) as its upper-left vertex. The reference point need not be the upper-left vertex; the center of the sub video, for example, may of course be used instead.

FIG. 2(b) shows, in two dimensions, the display area in which the sub video is displayed at each time of the main video. For example, between time "00:00:15" and time "00:00:30", the sub video is combined into and displayed in the area of the main video whose upper-left vertex is (x2, y2).

FIG. 2(c) shows the display area of the sub video in one dimension. The vertical direction indicates the spatial position (area) in the main video, and the horizontal direction indicates time (the time position in the main video). For example, the figure shows that at time "00:00:15" the upper-left vertex of the sub video moves from (x1, y1) to (x2, y2). The display area of the sub video relative to the main video thus appears in FIG. 2(c) as a band whose position changes at times "00:00:15" and "00:00:30".
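As a rough sketch of how the FIG. 2 metadata could be held in memory and queried, the following assumes one possible layout. The class and field names are hypothetical, the pixel coordinates standing in for (x1, y1) and (x2, y2) are invented for illustration, and the end of the displayable time ("00:00:40") and the third position are assumptions, since the text above does not state them.

```python
from dataclasses import dataclass


def to_seconds(hms):
    """Convert an "hh:mm:ss" string into seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


@dataclass
class DisplayAreaEntry:
    start: int        # main video time (s) from which this position applies
    top_left: tuple   # upper-left vertex of the fixed-size child screen


@dataclass
class PipMetadata:
    total_playback_time: int   # corresponds to item 200
    displayable_start: int     # displayable time information 201
    displayable_end: int
    areas: list                # display area information 202, sorted by start

    def position_at(self, main_time):
        """Return the child-screen position for a main video time, or None."""
        if not (self.displayable_start <= main_time < self.displayable_end):
            return None
        current = None
        for entry in self.areas:
            if entry.start <= main_time:
                current = entry.top_left
        return current


FIG2 = PipMetadata(
    total_playback_time=15,
    displayable_start=to_seconds("00:00:10"),
    displayable_end=to_seconds("00:00:40"),                       # assumed
    areas=[
        DisplayAreaEntry(to_seconds("00:00:10"), (40, 30)),       # (x1, y1), assumed values
        DisplayAreaEntry(to_seconds("00:00:15"), (400, 30)),      # (x2, y2), assumed values
        DisplayAreaEntry(to_seconds("00:00:30"), (400, 300)),     # third position, assumed
    ],
)
```

Because the lookup depends only on the main video time, the same call gives the correct position whenever the child screen is shown again after being hidden.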

FIG. 3(a) likewise shows an example of the metadata structure. The metadata shown in FIG. 3(a) includes the total playback time 300 of the sub video; displayable time information 301, which indicates, relative to the playback time of the main video, in which time range of the main video the sub video can be displayed; and displayable area information 302, which indicates, for each time within the displayable time range, the area in the main video in which the sub video can be displayed (is permitted to be displayed). The displayable area information 302 shown in FIG. 3 expresses the area in which the child screen can be displayed by two vertex coordinates, the upper left and the lower right. For example, FIG. 3(b) shows that from time "00:00:10" the sub video (child screen) can be displayed in the rectangular area whose upper-left and lower-right vertices are (x1, y1) and (x1', y1'). When the display size of the sub video (child screen) is a fixed size given in advance and the displayable area designated by the displayable area information 302 of FIG. 3 is larger than that display size, the sub video can be displayed at any position within the displayable area. The displayed sub video (child screen) may also be moved or enlarged within the displayable area. In FIG. 3(c), for example, the area in which the sub video can be displayed relative to the main video appears as a band whose position and width change at times "00:00:15" and "00:00:30".
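One way to pick a position inside a displayable area is to clamp a preferred position so that the fixed-size child screen stays within the permitted rectangle. The sketch below is only an illustration; the function name, the `(left, top, right, bottom)` tuple layout, and all numeric values are assumptions.

```python
def clamp_into_area(requested, sub_size, area):
    """Move `requested` (upper-left corner) so the child screen fits in `area`.

    `area` is (left, top, right, bottom), built from the two vertices of
    the displayable area. Returns None if the child screen is larger than
    the displayable area and therefore cannot fit at all.
    """
    left, top, right, bottom = area
    width, height = sub_size
    if width > right - left or height > bottom - top:
        return None
    x = min(max(requested[0], left), right - width)
    y = min(max(requested[1], top), bottom - height)
    return (x, y)


area = (100, 50, 600, 400)   # hypothetical (x1, y1) and (x1', y1')
sub_size = (320, 180)        # hypothetical fixed child-screen size
```

A user-initiated move of the child screen could pass the dragged position through the same function so that it never leaves the displayable area.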

In the two examples shown in FIGS. 2 and 3, the display (displayable) area expressed by the metadata was described on the assumption that the size of the sub video (child screen) is fixed. However, the present invention is not limited to this, and the display area information may express the display size of the sub video itself. That is, the display area may be expressed by its upper-left and lower-right vertex coordinates as in FIG. 3, and when the sub video is displayed it may be enlarged or reduced to the size of that display area.

The table of FIG. 4 shows variations of the metadata provided by the present invention with respect to how the time range for which a display (displayable) area is designated is set, and with respect to the description format of the display (displayable) area. FIG. 4 illustrates only the case where the display (displayable) area is rectangular.

The time range can be set either by designating arbitrary sections or by giving a display (displayable) area for each fixed unit section. When arbitrary sections are designated, either the start time or the end time of each section may be omitted, provided that consecutive sections have no gaps or overlaps in time. The table of FIG. 4 uses the common "hours:minutes:seconds" notation for times, but the notation is not limited to this; for example, the whole time may be given in seconds or in milliseconds. When an area is given for each fixed unit section, the unit need not be the 5 seconds illustrated in FIG. 4; one display (displayable) area may be given for any period, for example every 1 second, every 250 milliseconds, or every minute. Besides units of time, an area can also be given for each frame or for each video coding unit such as a GOP (Group Of Pictures). The length of the unit section is set as appropriate for the nature of the stored video.
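The time-range variations above can be illustrated with a few small helpers. This is a sketch under assumed names: it converts "hours:minutes:seconds" to milliseconds, reconstructs omitted end times for gap-free sections, and computes the index of a fixed unit section.

```python
def to_milliseconds(hms):
    """Convert "hh:mm:ss" (seconds may be fractional) into milliseconds."""
    h, m, s = hms.split(":")
    return (int(h) * 3600 + int(m) * 60) * 1000 + round(float(s) * 1000)


def fill_end_times(start_times, range_end):
    """Rebuild (start, end) pairs when end times are omitted.

    Valid only when consecutive sections have no gaps or overlaps, so each
    section ends where the next one starts.
    """
    end_times = start_times[1:] + [range_end]
    return list(zip(start_times, end_times))


def unit_index(time_ms, range_start_ms, unit_ms):
    """Index of the fixed unit section that contains `time_ms`."""
    return (time_ms - range_start_ms) // unit_ms


sections = fill_end_times([10, 15, 30], 40)   # seconds; the end value 40 is assumed
index = unit_index(to_milliseconds("00:00:28"), to_milliseconds("00:00:10"), 5000)
```

With a 5-second unit starting at "00:00:10", time "00:00:28" falls in the fourth unit section (index 3).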

The display (displayable) area can be described by a single coordinate, by two coordinates, or by a coordinate and a size. A single coordinate suffices when the display size of the sub video is determined in advance. An area expressed by two coordinates, or by a coordinate and a size, is used in two cases: to express a so-called displayable area, where the display size of the sub video is smaller than the designated area, and to express a display area, where the sub video is resized (enlarged or reduced) to the designated area. A band-shaped area extending across the main video vertically or horizontally (for example, the upper half or the lower half of the screen) can also be designated as a displayable area. Although FIG. 4 shows examples in which the display (displayable) area is rectangular, the area may also be given as a non-rectangular shape such as a polygon or an ellipse, or as an arbitrary shape. An arbitrarily shaped area is given, for example, by a mask image representing it. The specific description format for arbitrary shapes is not described here.
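The three rectangular description formats can be reduced to one internal representation. The sketch below assumes a dictionary-based input with hypothetical keys (`kind`, `top_left`, `bottom_right`, `size`); the actual encoding in the metadata is not specified in the text.

```python
def normalize_area(spec, fixed_size=None):
    """Convert any of the three description formats to (left, top, right, bottom)."""
    kind = spec["kind"]
    if kind == "point":
        # One coordinate only: the sub video size must be known in advance.
        if fixed_size is None:
            raise ValueError("a single coordinate needs a predetermined sub video size")
        x, y = spec["top_left"]
        width, height = fixed_size
        return (x, y, x + width, y + height)
    if kind == "corners":
        # Two coordinates: upper-left and lower-right vertices.
        (x1, y1), (x2, y2) = spec["top_left"], spec["bottom_right"]
        return (x1, y1, x2, y2)
    if kind == "origin_size":
        # One coordinate plus a size.
        x, y = spec["top_left"]
        width, height = spec["size"]
        return (x, y, x + width, y + height)
    raise ValueError(f"unknown description format: {kind}")
```

Whether the resulting rectangle is treated as a displayable area or as a resize target is a separate decision that this normalization does not make.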

Instead of having the position of the display (displayable) area change discretely at certain times as in FIGS. 2 and 3, the area can also be designated so that it changes continuously over time, as shown in FIG. 5. In this case, the display (displayable) area information 502 contained in the metadata (FIG. 5(a)) is given, as shown in FIG. 5 for example, as a combination of a time section, the position of the display (displayable) area at the start time of that section, and the position of the display (displayable) area at the end time of that section. FIG. 5(b) shows an example of the display area of the child screen. At time "00:00:10", the child screen is displayed in the display area whose upper-left coordinate is (x1, y1). The display area is then moved continuously so that at time "00:00:20" the child screen is displayed in the display area whose upper-left coordinate is (x2, y2), and moved continuously again so that at time "00:00:40" the child screen is displayed in the display area whose upper-left coordinate is (x3, y3). FIG. 5(c) schematically shows the display area or displayable area for this case in one dimension.
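The text does not say how intermediate positions are computed; linear interpolation between the start and end positions of a section is one natural reading and is what the following sketch assumes. The function name and the coordinate values are hypothetical.

```python
def interpolate_position(t, t_start, t_end, p_start, p_end):
    """Linearly interpolate the upper-left coordinate within one time section.

    Linear motion is an assumption; the metadata only fixes the positions
    at the start and end of the section.
    """
    ratio = (t - t_start) / (t_end - t_start)
    return tuple(a + (b - a) * ratio for a, b in zip(p_start, p_end))


# Section "00:00:10" to "00:00:20", with invented values for (x1, y1) and (x2, y2).
halfway = interpolate_position(15, 10, 20, (0, 0), (100, 50))
```

Halfway through the section the child screen sits midway between the two designated positions.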

The method of designating a continuously changing area is not limited to this. For example, the area may instead be given by its position at the start time and a change per unit time (a motion vector) from that position.
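The motion-vector form can be sketched as follows, assuming the vector is expressed as displacement per second and the motion is linear. Both function names are hypothetical.

```python
def motion_vector(t_start, t_end, p_start, p_end):
    """Displacement per unit time that carries p_start to p_end over the section."""
    duration = t_end - t_start
    return tuple((b - a) / duration for a, b in zip(p_start, p_end))


def position_from_vector(t, t_start, p_start, vector):
    """Position at time t from a start position and a per-unit-time vector."""
    return tuple(a + v * (t - t_start) for a, v in zip(p_start, vector))


vector = motion_vector(10, 20, (0, 0), (100, 50))
midpoint = position_from_vector(15, 10, (0, 0), vector)
```

Under the linearity assumption, this form and the start/end-position form describe the same trajectory.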

In the present invention, the area expressed by the metadata is treated as a display area (an area in which the sub video is displayed) or a displayable area (an area in which it may be displayed). Conversely, the metadata can also be regarded as designating every other area as a display-prohibited area (an area in which the sub video must not be displayed). That is, the present invention applies equally to metadata that designates a displayable time and a display-prohibited area.
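For metadata that lists display-prohibited areas, a candidate child-screen rectangle is acceptable when it overlaps none of them. The sketch below assumes axis-aligned rectangles given as `(left, top, right, bottom)`; the names and values are illustrative only.

```python
def overlaps(a, b):
    """True if two (left, top, right, bottom) rectangles share any interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def is_allowed(sub_rect, prohibited_areas):
    """True if the child screen avoids every display-prohibited area."""
    return not any(overlaps(sub_rect, area) for area in prohibited_areas)


prohibited = [(0, 0, 200, 200)]   # hypothetical prohibited area
```

Rectangles that merely touch along an edge are treated as non-overlapping here, which is a design choice rather than something the text prescribes.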

Next, the specific operation performed when the sub video is combined with the main video, played back, and displayed using the display metadata described above will be described with reference to FIGS. 6 to 10.

FIG. 6 is a flowchart showing the processing for displaying the sub video, including switching the sub video (child screen) between display and non-display. Of the configuration of the video display device 1 shown in FIG. 1, the flowchart mainly shows the operations of the position specifying unit 110, the processing control unit 109, and the synthesis unit 105. FIGS. 7 to 10 show an example of the result when the video display device 1 of FIG. 1 combines and displays the sub video according to the flowchart of FIG. 6. In FIGS. 7 to 10, the portions filled in black indicate the times at which the sub video was displayed and its display position at those times.

In the following, the playback and display process is described using the metadata of FIG. 2, in which the display area has the same size as the sub video. Even when metadata indicating a so-called displayable area, which is larger than the display size of the sub video, is used, the basic operation is the same, except that the position specifying unit 110 determines a suitable display position within the displayable area and outputs it.

After reading the metadata (step S1), the position specifying unit 110 first determines, according to the displayable time information contained in the metadata (201 in FIG. 2), whether the current playback time of the main video is within the displayable time (steps S2 and S3). If the current playback time is before the start of the displayable time, the unit waits until the displayable time starts, so the sub video is not displayed (step S2; No).

If the current playback time of the main video is within the displayable time (step S2; Yes → step S3; No), the position specifying unit 110 accepts from the input unit 108 an instruction to change the display/non-display state of the sub video. When an instruction to display the sub video has been received and the sub video is in the display state (step S4; Yes), the sub video is decoded and the decoded image is output (step S5). The position specifying unit 110 also acquires time information on the current playback time position of the main video (step S6) and uses the metadata to determine the display position of the sub video corresponding to that playback time position (step S7). The synthesis unit 105 then combines the sub video at the designated display position in the main video and displays it (step S8). If the sub video data has not ended (step S9; No), the processing returns to step S3 and continues.

On the other hand, when the user has instructed, through an instruction to change the display/non-display state, that the sub video be hidden (step S4; No), the decoding and output of the sub video are stopped (step S10), and the display of the sub video itself is put into a paused state.
When the playback of the sub video ends (step S9; Yes), or when the playback time of the main video passes the end of the displayable time of the sub video (step S3; Yes), the sub video display processing ends.
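The decisions in the flowchart of FIG. 6 can be condensed into a single function that returns what to do at a given instant. This is a simplified sketch and not the patented procedure itself: the function name, its parameters, and the returned strings are assumptions, and the end-of-sub-video check (step S9) is evaluated before the display steps rather than after them.

```python
def next_action(main_time, displayable, show_requested, sub_elapsed, sub_total):
    """Decide the action for one instant, following the branches of FIG. 6.

    displayable    -- (start, end) of the displayable time, in main video seconds
    show_requested -- True if the user currently wants the child screen shown
    sub_elapsed    -- seconds of the sub video played so far
    sub_total      -- total playback time of the sub video
    """
    start, end = displayable
    if main_time < start:
        return "wait"      # step S2; No
    if main_time > end:
        return "end"       # step S3; Yes
    if sub_elapsed >= sub_total:
        return "end"       # step S9; Yes
    if not show_requested:
        return "pause"     # step S4; No -> step S10
    return "display"       # steps S5 to S8


DISPLAYABLE = (10, 40)     # the end time 40 is an assumed value
```

Keeping the decision free of side effects makes it straightforward to check each branch separately.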

FIGS. 7 to 10 schematically show the positional relationship between the main video and the sub video. The vertical direction represents the spatial position in the main video, and the horizontal direction represents time. Output of the main video starts at time "00:00:00". The figures show the display state of the sub video when the metadata structure shown in FIG. 2(a) is used.

First, FIG. 7 shows the state up to time "00:00:13". According to the metadata structure in FIG. 2(a), the displayable time of the sub video starts at time "00:00:10". When the user performs an operation to display the sub video at time "00:00:13" (step S2; Yes → step S3; No → step S4; Yes in FIG. 6), the sub video is decoded (step S5). It is then combined with the main video, and display of the sub video starts at the display position that the metadata gives for time "00:00:13" (the black portion in FIG. 7).

Next, FIG. 8 shows the state up to time "00:00:20". The metadata structure in FIG. 2(a) specifies that the display area of the sub video changes at time "00:00:15". The position specifying unit 110 therefore changes the display position of the sub video according to the display area information 202 of the metadata (step S7). When a signal to hide the sub video is input from the input unit 108 at time "00:00:20" (step S4; No), the position specifying unit 110 outputs a signal to the synthesis unit 105 to stop the output of the sub video, and the synthesis unit 105 stops outputting the sub video (step S10).

Next, FIG. 9 shows the state up to time "00:00:28", at which the sub video (child screen) is switched back to the display state. At this point the sub video returns from the paused state to the playback state, and playback continues from the point that had been reached at "00:00:20". The display position of the sub video (child screen) at that moment is the position that the metadata gives for time "00:00:28".

Next, FIG. 10 shows the state up to time "00:00:36", at which playback of the sub video, whose total playback time is 15 seconds, has finished. According to the metadata shown in FIG. 2(a), the display area of the sub video changes at time "00:00:30" (step S7). Then, at "00:00:36", when the total playback time of 15 seconds has elapsed (7 seconds from "00:00:13" to "00:00:20" plus 8 seconds from "00:00:28" to "00:00:36"), the output of the sub video stops (step S9; Yes).
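The timeline of FIGS. 7 to 10 can be reproduced with a small one-second-step simulation. This is an illustrative sketch: the function name, the position labels, and the displayable end time of 40 seconds are assumptions, while the toggle times (13, 20, 28), the position changes (15, 30), and the 15-second total come from the description above.

```python
def simulate(sub_total, displayable, toggles, positions, horizon=60):
    """Return (main_time, position) for every second the sub video is shown.

    toggles   -- {main_time: True/False} user show/hide operations
    positions -- [(start_time, label)] sorted by start_time, keyed on main video time
    """
    start, end = displayable
    shown = False
    elapsed = 0              # seconds of sub video consumed; frozen while hidden
    timeline = []
    for t in range(horizon):
        if t in toggles:
            shown = toggles[t]
        if t < start or t >= end or elapsed >= sub_total:
            continue
        if shown:
            # The position depends on the main video time, not on `elapsed`.
            label = [p for s, p in positions if s <= t][-1]
            timeline.append((t, label))
            elapsed += 1
    return timeline


timeline = simulate(
    sub_total=15,
    displayable=(10, 40),                                   # end time assumed
    toggles={13: True, 20: False, 28: True},
    positions=[(10, "(x1, y1)"), (15, "(x2, y2)"), (30, "third position")],
)
end_of_sub_video = timeline[-1][0] + 1
```

The last displayed second starts at 35, so the sub video finishes at 36 seconds of main video time, matching FIG. 10.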

As described above, the video display device 1 according to the first embodiment uses metadata indicating the display area or displayable area of the sub video, so that when the sub video is combined with and displayed on the main video, its display position within the main video can be designated appropriately for the display time. As a result, the sub video can be freely switched between display and non-display within the displayable time, and even when it is switched freely, the sub video is kept from being combined and displayed at a position that is undesirable for the main video.

In FIG. 1 of this embodiment, the metadata described above is drawn as being input independently of each set of video data. For example, when management data for managing the video data (information needed to decode the encoded data, such as the encoding method, and information needed to play back the video, such as a playlist specifying how the video is clipped and concatenated) is supplied in a stream separate from the video, the metadata can be stored in the management data and supplied to the video display device 1. Alternatively, as already shown in FIG. 2(a) and FIG. 3(a), the metadata may be stored in and provided with the video stream containing the video data of the sub video. In that case, the metadata must be separated from the video stream of the sub video before being input to the video display device 1.

Because the metadata described above is consumed as the sub video is played back, it would normally be provided one-to-one with a sub video. However, other uses are conceivable; for example, the main video may carry the metadata, which is then applied in common to any of a plurality of sub videos. In that case, the metadata may be stored in the video data (video stream) of the main video. Furthermore, although the metadata is stored at the header position of the video stream in FIGS. 2(a) and 3(a), the storage position is not limited to this. The metadata may be placed in the middle of the stream; for example, when the video data is divided into a plurality of packets for transmission, the metadata may be embedded as a new data packet between video packets, or stored in the packet header of each video packet.

By providing the metadata together with the video data in this way, the video provider enables the sub video to be displayed in picture-in-picture at the display position the provider intended.

The synthesis unit 105 of the video display device 1 shown in FIG. 1 adjusts only the sub video and does not adjust the main video (that is, the main video is displayed full screen). However, it is also possible, for example, to provide an adjustment unit 106a (separate from the adjustment unit 106 for the sub video) on the input side of the decoded main video as well, and to configure a synthesis unit 105a that adjusts both the main video and the sub video before output (the synthesis unit 105a and the adjustment unit 106a are not illustrated). In that case, because the metadata expresses the display (displayable) area on the main video where the sub video is combined and displayed, when the adjustment unit 106a adjusts the main video, the display (displayable) area of the sub video indicated by the metadata must be adjusted accordingly. For example, when the main video is reduced to half its size both vertically and horizontally, the display (displayable) area of the sub video combined with that main video is also reduced to half vertically and horizontally. Although this is not stated explicitly in the other embodiments, it applies to them in exactly the same way.
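The adjustment of the metadata area when the main video itself is scaled can be sketched as a simple coordinate transform. The function name, the `(left, top, right, bottom)` layout, the optional `offset` for a repositioned main video, and the numeric values are all assumptions made for illustration.

```python
def scale_area(area, scale_x, scale_y, offset=(0, 0)):
    """Map a display (displayable) area into the coordinates of a scaled main video.

    area   -- (left, top, right, bottom) in original main video coordinates
    offset -- upper-left corner of the scaled main video on the screen
    """
    left, top, right, bottom = area
    offset_x, offset_y = offset
    return (
        offset_x + left * scale_x,
        offset_y + top * scale_y,
        offset_x + right * scale_x,
        offset_y + bottom * scale_y,
    )


# Main video reduced to half size vertically and horizontally.
half = scale_area((400, 200, 800, 600), 0.5, 0.5)
```

When the main video is halved in both directions, both the position and the size of the area are halved.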

(Second Embodiment)
Next, a video display device, a method, and display data according to the second embodiment will be described with reference to FIG. 1, FIG. 6, and FIGS. 11 to 15.

The schematic configuration of the video display device 2 according to the second embodiment is represented by the functional block diagram of FIG. 1, as in the first embodiment. In the second embodiment, however, the metadata handled differs from that of the first embodiment, and as far as the operation of the display device is concerned, only the operation of the position specifying unit differs between the video display device 1 (position specifying unit 110) and the video display device 2 (position specifying unit 210). The following description therefore focuses on the differences from the first embodiment and covers the metadata used in the video display device 2 of the second embodiment and the specific playback operation using that metadata.

FIG. 11 shows an example of the metadata handled in the second embodiment. The metadata given in the first embodiment (FIGS. 2 and 3) provides the display area of the sub video (child screen) within the main video that is preferable for the main video when the sub video is displayed within the displayable time. For this reason, the metadata shown in FIGS. 2 and 3 gives the display area of the sub video corresponding to each playback time of the main video, with the playback time axis of the main video as the reference. In contrast, the metadata according to the second embodiment shown in FIG. 11 provides the display area in which the sub video should be displayed and which is desirable for the sub video itself, depending on the content of the sub video, the production intention, and the like. For this reason, the metadata according to the second embodiment gives the display area of the sub video corresponding to each playback time of the sub video, with the playback time axis of the sub video as the reference.

Here, a display position that is desirable depending on the content of the sub video is used in a use case such as the following: given a 10-second sub video whose first 5 seconds show person A facing right and whose remaining 5 seconds show person B facing left, the sub video is displayed on the left side of the screen for the first 5 seconds and on the right side for the remaining 5 seconds, so that both A and B are shown facing the center of the screen. This is of course only an example; facing the center is not always desirable, and where the sub video is displayed depends on the production intention of the video producer. In other words, the metadata according to the second embodiment as shown in FIG. 11 is additional information for reflecting, during playback, the production intention of the sub video producer with respect to the sub video itself.

As with FIG. 2 of the first embodiment, FIG. 11(a) shows a specific metadata structure, FIG. 11(b) shows the display area indicated by that metadata, and FIG. 11(c) schematically represents the display area as one-dimensional so that its change over time is easy to understand. As mentioned above, the horizontal axes of FIGS. 11(b) and 11(c) represent the playback time position of the sub video. The vertical axis of FIG. 11(c) represents the two-dimensional spatial position on the screen, and the vertical width of the illustrated band corresponds to the size of the display area.

The metadata shown in FIG. 11(a) includes displayable time information 1101, which indicates the time range of the main video in which the sub video can be displayed, and display area information 1102, which indicates, with the playback time axis of the sub video as the reference, the position within the main video at which the sub video should be displayed at each playback time of the sub video. The displayable time information 1101 is not essential and may be omitted. When it is omitted, the entire main video is interpreted as the displayable time of the sub video.

FIG. 11 shows an example in which the display size of the sub video (child screen) is assumed to be a fixed size given in advance, and the display area information 1102 is specified only by the coordinates of the upper-left vertex (or the center) of the child screen. However, the display area information is not limited to this. As in the first embodiment, two coordinates may be given to indicate a displayable area (see FIG. 3), or two coordinates may likewise be given to specify a display area in which the sub video is displayed enlarged or reduced. In FIG. 11(c), the display area in which the sub video should be displayed is represented as a band-shaped area whose position changes at the sub video playback times “00:00:05” (that is, after a total of 5 seconds of playback from the start of the sub video) and “00:00:10” (that is, after a total of 10 seconds of playback from the start of the sub video).
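
As a rough illustration of how display area information keyed to the playback time of the sub video might be looked up, consider the sketch below. The change points “00:00:05” and “00:00:10” follow the description of FIG. 11(c); the list layout, the coordinate values, and the names `DISPLAY_AREA_INFO`, `to_seconds`, and `display_position` are assumptions made for the example and are not taken from FIG. 11(a).

```python
from bisect import bisect_right


def to_seconds(hhmmss: str) -> int:
    """Convert an "HH:MM:SS" time string to seconds."""
    h, m, s = (int(part) for part in hhmmss.split(":"))
    return h * 3600 + m * 60 + s


# Hypothetical display area information: each entry gives the sub-video
# playback time from which a position applies, and the upper-left vertex
# of the child screen (coordinates are invented for illustration).
DISPLAY_AREA_INFO = [
    ("00:00:00", (100, 100)),
    ("00:00:05", (1400, 100)),
    ("00:00:10", (100, 700)),
]


def display_position(sub_time: str, info=DISPLAY_AREA_INFO):
    """Return the position that applies at the given sub-video playback time."""
    starts = [to_seconds(t) for t, _ in info]
    index = bisect_right(starts, to_seconds(sub_time)) - 1
    if index < 0:
        raise ValueError("no display area is defined for this time")
    return info[index][1]


print(display_position("00:00:07"))  # position that applies from 00:00:05
```

Because the lookup key is the playback time of the sub video rather than that of the main video, the same position is returned no matter when, within the displayable time, the sub video happens to be shown.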

Next, the specific operation of compositing the sub video onto the main video for playback and display using the metadata shown in FIG. 11 will be described with reference to FIGS. 6 and 12 to 15.

In the video display device 2 according to the present embodiment, the processing for displaying the sub video, including switching the sub video (child screen) between display and non-display, is represented by the flowchart of FIG. 6, as in the first embodiment. This flowchart shows the operations of the position specifying unit 210, the processing control unit 109, and the composition unit 105 in the configuration of the video display device 2 shown in FIG. 1.

As in the description of the first embodiment, the playback and display process is described below using metadata that represents a display area. Even when metadata that indicates a displayable area is used, the basic operation is unchanged, except that the position specifying unit 210 determines and outputs an appropriate display position from within the displayable area.

When the position specifying unit 210 reads the input metadata (step S1), it determines, according to the displayable time information 1101 included in the metadata, whether the current playback time of the main video is within the displayable time (steps S2 and S3). If the current playback time is before the start time of the displayable time, the sub video is not displayed and the device waits (step S2; No).

If the current playback time of the main video is within the displayable time (step S2; Yes → step S3; No), the position specifying unit 210 accepts, from the input unit 108, an instruction to change the display/non-display state of the sub video. When an instruction to display the sub video has been received and the sub video is in the display state (step S4; Yes), the sub video is decoded and a decoded image is output (step S5). The position specifying unit 210 then acquires time information on the current playback time position of the sub video (step S6) and uses the metadata to determine the display position corresponding to that playback time position (step S7). The composition unit 105 then composites and displays the sub video at the specified display position within the main video (step S8). Thus, there are two differences from the first embodiment: in step S6, the cumulative playback time position of the sub video itself is acquired as the time information, and in step S7, the metadata is used to determine the display position corresponding to the playback time position of the sub video.

FIGS. 12 to 15 schematically show an example of the result of operation when the sub video is composited and displayed by the video display device 2. In the video display device 2 of the present embodiment, the metadata is managed on the basis of the playback time of the sub video, which indicates how far the sub video has been played and displayed, separately from the playback time of the main video. Therefore, in each of FIGS. 12 to 15, (a) shows how the display area is specified from the metadata with the playback time of the sub video as the reference, and (b) shows how the sub video is composited onto the main video with the time of the main video as the reference. In (b) of FIGS. 12 to 15, the portions filled in black indicate the times at which the sub video is displayed and the display positions at those times.

First, FIG. 12 shows the state up to time “00:00:13”. According to the metadata structure in FIG. 11(a), the displayable time of the sub video starts at time “00:00:10”. When the user performs an operation to display the sub video at time “00:00:13” (step S2 in FIG. 6; Yes → step S3; No → step S4; Yes), the sub video is decoded (step S5). The sub video is then composited with the main video, and its display starts at the display position that the metadata indicates for time “00:00:13”. FIG. 12(a) shows the sub video starting to be output from its playback time “00:00:00”, and FIG. 12(b) shows the sub video starting to be output when the playback time of the main video is “00:00:13”.

Next, FIG. 13 shows the state up to time “00:00:20”. According to the display area information 1102 of the metadata structure in FIG. 11(a), the display area of the sub video changes when the playback time of the sub video is “00:00:05”. Accordingly, as shown in FIG. 13(a), the display area changes at the sub video playback time “00:00:05”. On the composited video, as shown in FIG. 13(b), the display position therefore changes at time “00:00:18”, which is 5 seconds after playback (display) of the sub video started. When an operation to hide the sub video is performed at the main video playback time “00:00:20”, display of the sub video on the main video stops. At this point, the sub video has been played up to “00:00:07”.

Next, FIG. 14 shows the state up to time “00:00:28”, when the sub video (child screen) is switched to the display state again. At this time, the sub video returns from the paused state to the playback state, and playback resumes from the continuation of what was being played at main video time “00:00:20”, that is, from the time position “00:00:07” of the sub video itself (the position at a cumulative playback time of 7 seconds). The display position of the sub video (child screen) is likewise given by the metadata as the display position corresponding to the time position “00:00:07” of the sub video itself (the position at a cumulative playback time of 7 seconds).

Next, FIG. 15 shows the state up to main video time “00:00:36”, at which playback of the sub video, whose total playback time is 15 seconds, has finished. According to the display area information 1102 included in the metadata shown in FIG. 11(a), the display position of the sub video changes at sub video time “00:00:10” (the position at a cumulative playback time of 10 seconds). Accordingly, the display position of the sub video changes when the sub video time reaches “00:00:10”, that is, at main video time “00:00:31”.
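
The relation between the main video time and the cumulative playback time of the sub video in FIGS. 12 to 15 can be checked with a short calculation. The sketch below is illustrative only; the function names and the representation of display periods as (show, hide) pairs in seconds are assumptions. It reproduces the times stated above for a sub video shown at 13 s, hidden at 20 s, and shown again from 28 s.

```python
def sub_time_at(main_t, visible_intervals):
    """Cumulative sub-video playback time (seconds) at main-video time main_t.

    visible_intervals: sorted (show, hide) pairs of main-video times during
    which the sub video is displayed; it is paused while hidden.
    """
    return sum(max(0, min(main_t, hide) - show)
               for show, hide in visible_intervals)


def main_time_when_sub_reaches(sub_t, visible_intervals):
    """Main-video time at which the sub video reaches playback time sub_t,
    or None if it never gets that far."""
    played = 0
    for show, hide in visible_intervals:
        if played + (hide - show) >= sub_t:
            return show + (sub_t - played)
        played += hide - show
    return None


# Displayed from 00:00:13 to 00:00:20, then from 00:00:28 to 00:00:36.
intervals = [(13, 20), (28, 36)]

print(sub_time_at(20, intervals))                 # 7  (hidden at sub time 00:00:07)
print(main_time_when_sub_reaches(5, intervals))   # 18 (first position change)
print(main_time_when_sub_reaches(10, intervals))  # 31 (second position change)
print(main_time_when_sub_reaches(15, intervals))  # 36 (15-second sub video ends)
```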

As described above, by using metadata that indicates the display area (or displayable area) of the sub video, the video display device 2 according to the present embodiment can, when the sub video is composited onto the main video, specify the position at which the sub video should be displayed, as determined in advance by the content of the sub video and the production intention, and composite it onto the main video accordingly. As a result, the sub video can be freely switched between display and non-display, and even when it is switched freely, it is composited at a display position that conforms to its content and the production intention.

As in the first embodiment, the metadata according to the present embodiment can be provided either independently of the video data, for example stored in a data stream of management data, or, as shown in FIG. 11(a), stored in a video stream that contains the video data of the sub video. When it is stored in the video stream, processing to separate the metadata from the video stream of the sub video is required before input to the video display device 2. Because the metadata according to the second embodiment is given in one-to-one correspondence with the sub video, it is basically attached to the video data of the sub video or to management data related to the sub video. Furthermore, although FIG. 11(a) shows the metadata stored at the header position of the video stream, the storage position is not limited to this. The metadata may be stored in the middle of the stream; for example, when the video data is divided into a plurality of packets for transmission, the metadata may be embedded as a new data packet between video packets or stored in the packet header of each video packet.

(Third Embodiment)
Next, a video display device, method, and display data according to the third embodiment will be described with reference to FIGS. 1, 6, and 16.

The schematic configuration of the video display device 3 according to the third embodiment is represented by the functional block diagram of FIG. 1, as in the first and second embodiments. Only the operation of the position specifying unit 110 differs, and in the present embodiment this unit is denoted as the position specifying unit 310. The processing for displaying the sub video in the video display device 3 according to the third embodiment is also represented by the flowchart of FIG. 6, as in the first and second embodiments. The operation of the video display device 3 according to the third embodiment is described below, focusing on the differences from the video display device 1 according to the first embodiment.

In the video display device 3 of the present embodiment, the two types of metadata described in the first and second embodiments are both input as metadata for displaying the sub video, and the display area of the sub video is determined on the basis of the combination of these metadata. Accordingly, the position specifying unit 310 of the video display device 3 receives the two types of metadata and two pieces of time information (the playback time position of the main video and the playback time position of the sub video) (step S6 of the flowchart) and determines an appropriate display area for the sub video (step S7 of the flowchart).

FIG. 16 schematically shows the states of the main video and the sub video. FIG. 16(a) shows the displayable area of the sub video specified with respect to the main video, as given by the metadata described in the first embodiment, and FIG. 16(b) shows the display area specified with respect to the sub video itself, as given by the metadata described in the second embodiment. FIG. 16(c) shows how the display area of the sub video is specified during playback by the metadata of FIG. 16(b), and FIG. 16(d) shows how the main video and the sub video are composited and displayed using the metadata of FIGS. 16(a) and 16(b).

FIGS. 16(c) and 16(d) show the display position of the sub video when, using the two types of metadata described above and as in the first and second embodiments, display of the sub video is started at time “00:00:13”, stopped at time “00:00:20”, resumed at time “00:00:28”, and ended at time “00:00:36”. FIG. 16(c) shows the display area 16B for the sub video shown in (b), and FIG. 16(d) shows the displayable area 16A of the sub video in the main video shown in (a). The hatched or black-filled portions in FIG. 16(d) indicate the times at which the sub video is displayed and the display positions at those times.

A sub video is generally provided as supplementary content that adds value to the main video. It is therefore generally desirable to play it back while keeping the main video as undisturbed as possible. For this reason, when the two types of metadata described above are given, the display area is determined by giving priority to the displayable area 16A of the sub video given with respect to the main video over the display area 16B given with respect to the sub video itself.

In FIG. 16(d), in the time range 1601 (“00:00:13” to “00:00:15”), the displayable area 16A and the display area 16B coincide exactly, so the display area of the sub video is determined on the basis of both sets of metadata.

In the time range 1602 (“00:00:15” to “00:00:20” and “00:00:28” to “00:00:30”), the display area 16B is completely contained in the displayable area 16A. In the range 1602, therefore, the sub video is displayed in the display area given with respect to the sub video itself, on the basis of metadata like that shown in the second embodiment.

In the time range 1603 (“00:00:30” to “00:00:36”), the displayable area 16A of the sub video given with respect to the main video and the display area 16B of the sub video specified according to the content of the sub video itself are separated into different areas. In this case, priority is given to the displayable area 16A of the sub video given with respect to the main video. That is, in the time range 1603, the sub video is displayed in the displayable area of the sub video given with respect to the main video, on the basis of metadata like that shown in the first embodiment.

Although not illustrated, when the displayable area shown in FIG. 16(a) and the display area shown in FIG. 16(b), both of which specify the display position of the sub video, are separated into different areas, and the displayable area shown in FIG. 16(a) is larger than the display size of the sub video (child screen), processing may be added that determines the area that is contained in the displayable area of FIG. 16(a) and is closest to the display area of FIG. 16(b), and uses it as the display area of the sub video. Conversely, of course, when the production intention of the sub video is extremely important, the display position of the sub video can be forcibly set on the basis of the display area of FIG. 16(b) by assigning a high priority to that display area.
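
One way to picture the priority rule of the third embodiment, including the optional "closest area" processing mentioned above, is the following sketch. It is not the disclosed implementation of the position specifying unit 310: the `Rect` type, the function names, the `prefer_sub_video` flag, and the coordinate values are assumptions, and "closest" is interpreted here as clamping the preferred rectangle into the displayable area along each axis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Hypothetical rectangle in main-video pixel coordinates."""
    x: int
    y: int
    w: int
    h: int


def clamp(value: int, low: int, high: int) -> int:
    return max(low, min(value, high))


def resolve_display_area(displayable: Rect, preferred: Rect,
                         prefer_sub_video: bool = False) -> Rect:
    """Choose where to show the sub video.

    displayable: displayable area given with respect to the main video (16A).
    preferred:   display area given with respect to the sub video (16B).
    If preferred already lies inside displayable it is returned unchanged;
    otherwise it is moved to the nearest position inside displayable.
    prefer_sub_video=True models the case where the sub video's display
    area is given the higher priority.
    """
    if prefer_sub_video:
        return preferred
    if preferred.w > displayable.w or preferred.h > displayable.h:
        raise ValueError("sub video does not fit in the displayable area")
    x = clamp(preferred.x, displayable.x,
              displayable.x + displayable.w - preferred.w)
    y = clamp(preferred.y, displayable.y,
              displayable.y + displayable.h - preferred.h)
    return Rect(x, y, preferred.w, preferred.h)


area_16a = Rect(0, 0, 800, 400)             # assumed displayable area
inside = Rect(100, 50, 320, 180)            # 16B contained in 16A
outside = Rect(1500, 700, 320, 180)         # 16B separated from 16A

print(resolve_display_area(area_16a, inside))   # unchanged
print(resolve_display_area(area_16a, outside))  # moved inside 16A
```

When the two areas coincide or 16B lies within 16A (time ranges 1601 and 1602), the preferred area is used as is; when they are separated (time range 1603), the result always falls inside the displayable area, which reflects the priority given to the main video.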

In each of the embodiments described above, it does not matter whether the video data (and management data) and metadata input to the video display device are input via a transmission path such as broadcasting or communication, or are recorded in advance on a recording medium and read out sequentially for playback and display. The same applies to use in which the data is first recorded on a recording medium via a transmission path and the recorded video data (and management data) and metadata are then read out and played back. That is, the video display device, method, and display data of each embodiment are applicable as one component of a broadcast video receiver, a video communication receiver, or a recording and playback device having a recording medium, and are also applicable to a recording medium on which the metadata described in each embodiment is recorded.

The video data (and management data) and the metadata described in the embodiments of the present invention can also be managed separately. It is therefore also possible to generate metadata on the playback side and to use that metadata in combination with video data that is input separately via broadcasting, communication, or a recording medium when that video data is played back as picture-in-picture. In this case, the metadata is formed through processing such as setting, according to the user's preference, the areas of the main video that may be hidden and the areas that should not be hidden when the sub video is displayed. Metadata generation on the playback side is executed, for example, when video data (and management data) input through a transmission path such as broadcasting or communication is recorded on a recording medium, or immediately before video data (and management data) is read from a recording medium and played back. The metadata may be entered directly by the user or may be generated dynamically by a program such as one written in Java (registered trademark). That is, the present invention is applicable to video display devices and methods that use the metadata described in each embodiment, regardless of where the metadata is actually set.

The embodiments disclosed herein are illustrative in all respects and are not restrictive. The scope of the present invention is indicated by the claims rather than by the above description, and is intended to include all modifications within the meaning and scope equivalent to the claims.

1, 2, 3 Video display device
101, 103 Decoding unit
102, 104 Buffering unit
105 Composition unit
106 Adjustment unit
107 Display unit
108 Input unit
109 Processing control unit
110, 210, 310 Position specifying unit

Claims

A video composition device for synthesizing a second video with a first video, wherein
a displayable time of the second video is designated within the playback time of the first video, and
the second video starts playback at an arbitrary time within the displayable time, ends playback when the second video ends or when the displayable time is exceeded, and can be switched arbitrarily between display and non-display during playback,
the video composition device comprising:
designating means for designating a display position for displaying the second video at an arbitrary time, based on display data that includes time information indicating times of the first video, specified for all times within the displayable time, and display area information indicating the display position at which the second video is displayed at each such time; and
synthesizing means for superimposing and synthesizing the second video at the display position, designated by the designating means, in the first video.

The video composition device according to claim 1, wherein the display data includes second display area information that is given in correspondence with times of the second video and indicates a display area of the second video in the first video when the second video is switched to display, and
the designating means, when displaying the second video, designates the display position in accordance with the display area information and/or the second display area information included in the display data.

The video composition device according to claim 1, wherein the display data includes displayable area information that is given in correspondence with times of the second video and indicates a displayable area for displaying the second video in the first video when the second video is switched to display, the displayable area being an area within which the second video may be displayed at any position when it is switched to display, and
the designating means, when displaying the second video, designates the display position in accordance with the display area information and/or the displayable area information included in the display data.

The video composition device according to claim 1, wherein the display area information includes information indicating the coordinates and/or size of the area in which the second video is displayed.

The video composition device according to claim 1, wherein the display area information includes the upper-left vertex coordinates of a rectangular area in which the second video is displayed.

A video composition device for synthesizing a second video with a first video, wherein
a displayable time of the second video is designated within the playback time of the first video, and
the second video starts playback at an arbitrary time within the displayable time, ends playback when the second video ends or when the displayable time is exceeded, and can be switched arbitrarily between display and non-display during playback,
the video composition device comprising:
designating means for designating a display position for displaying the second video at an arbitrary time, based on display data that includes time information indicating times of the first video, specified for all times within the displayable time, and displayable area information indicating a displayable area for displaying the second video at each such time, the displayable area being an area within which the second video may be displayed at any position when it is switched to display; and
synthesizing means for superimposing and synthesizing the second video at the display position, designated by the designating means, in the first video.

The video composition device according to claim 6, wherein the display data includes display area information that is given in correspondence with times of the second video and indicates a display area of the second video in the first video when the second video is switched to display, and
the designating means, when displaying the second video, designates the display position in accordance with the displayable area information and/or the display area information included in the display data.

The video composition device according to claim 6, wherein
the display data includes second displayable area information that is given in correspondence with a time of the second video and that indicates a displayable area of the second video within the first video when the second video is switched to display, and
the designation means designates the display position according to the displayable area information and/or the second displayable area information included in the display data when displaying the second video.

The video composition device according to claim 1, wherein the synthesized and output video is a picture-in-picture video, the first video is the video displayed on the main screen, and the second video is the video displayed on the sub-screen.

A program executed in a video composition device that synthesizes a second video with a first video, wherein
a displayable time of the second video is designated within the playback time of the first video, and
the second video starts playing at an arbitrary time within the displayable time, ends when the second video ends or when the displayable time is exceeded, and can be switched between display and non-display at any time during playback,
the program causing the video composition device to realize:
a designation function for designating a display position when displaying the second video at an arbitrary time, based on display data that is specified over all times within the displayable time and that includes time information indicating a time of the first video and display area information indicating the display position when the second video is displayed at that time; and
a composition function for superimposing the second video on the first video at the display position designated by the designation function.
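The designation function and composition function of the program claim above can be sketched in Python as follows. This is only an illustrative sketch: the representation of the display data as a time-sorted list of `(time, (x, y))` pairs, the rule that an entry stays in effect until the next entry's time, the frames modeled as nested lists of pixel values, and the names `position_at` and `superimpose` are all assumptions and are not taken from the claims.

```python
from bisect import bisect_right


def position_at(display_data, t):
    """Look up the display position in effect at first-video time t.

    display_data is a list of (time, (x, y)) pairs sorted by time; each
    entry is assumed to apply until the time of the next entry.
    """
    times = [time for time, _ in display_data]
    i = bisect_right(times, t) - 1
    if i < 0:
        return None  # t precedes the first entry
    return display_data[i][1]


def superimpose(main, sub, pos):
    """Return a copy of main with sub pasted so its upper-left corner is at pos."""
    x0, y0 = pos
    out = [row[:] for row in main]  # leave the main-screen frame untouched
    for dy, sub_row in enumerate(sub):
        for dx, pixel in enumerate(sub_row):
            y, x = y0 + dy, x0 + dx
            # Pixels that would fall outside the main screen are dropped.
            if 0 <= y < len(out) and 0 <= x < len(out[y]):
                out[y][x] = pixel
    return out
```

A player would call `position_at` with the current time of the first video whenever the sub-screen is switched to display, then pass the result to `superimpose` for each output frame.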

The program according to claim 10, wherein
the display data includes second display area information that is given in correspondence with a time of the second video and that indicates a display area of the second video within the first video when the second video is switched to display, and
the designation function designates the display position according to the display area information and/or the second display area information included in the display data when displaying the second video.

The program according to claim 10, wherein
the display data includes displayable area information that is given in correspondence with a time of the second video and that indicates a displayable area of the second video within the first video when the second video is switched to display, the displayable area being an area within which the second video may be displayed at an arbitrary position when the second video is switched to display, and
the designation function designates the display position according to the display area information and/or the displayable area information included in the display data when displaying the second video.

The program according to claim 10, wherein the display area information includes information indicating the coordinates and/or size of an area in which the second video is displayed.

The program according to claim 10, wherein the display area information includes an upper left vertex coordinate of a rectangular area displaying the second video.

A program executed in a video composition device that synthesizes a second video with a first video, wherein
a displayable time of the second video is designated within the playback time of the first video, and
the second video starts playing at an arbitrary time within the displayable time, ends when the second video ends or when the displayable time is exceeded, and can be switched between display and non-display at any time during playback,
the program causing the video composition device to realize:
a designation function for designating a display position when displaying the second video at an arbitrary time, based on display data that is specified over all times within the displayable time and that includes time information indicating a time of the first video and displayable area information indicating a displayable area when the second video is displayed at that time, the displayable area being an area within which the second video may be displayed at an arbitrary position when the second video is switched to display; and
a composition function for superimposing the second video on the first video at the display position designated by the designation function.
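The playback conditions recited in the claim above (start at an arbitrary time within the displayable time, end when the second video ends or the displayable time is exceeded, and switch between display and non-display during playback) can be modeled as a small state holder. The following Python sketch is only one possible reading: the class name `SubVideoPlayback`, its fields, and the choice of treating the end of the displayable time as exclusive are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SubVideoPlayback:
    displayable_start: float  # start of the displayable time (first-video timeline)
    displayable_end: float    # end of the displayable time (treated as exclusive)
    sub_duration: float       # length of the second video
    started_at: Optional[float] = None
    visible: bool = False

    def start(self, t: float) -> bool:
        """Start the second video at first-video time t, if t is displayable."""
        if not (self.displayable_start <= t < self.displayable_end):
            return False
        self.started_at = t
        self.visible = True
        return True

    def is_playing(self, t: float) -> bool:
        """True while the second video has started and has not yet ended."""
        if self.started_at is None or t < self.started_at:
            return False
        ended = t - self.started_at >= self.sub_duration  # second video ran out
        expired = t >= self.displayable_end               # displayable time exceeded
        return not (ended or expired)

    def toggle(self, t: float) -> None:
        """Switch between display and non-display; playback itself continues."""
        if self.is_playing(t):
            self.visible = not self.visible

    def is_shown(self, t: float) -> bool:
        return self.is_playing(t) and self.visible
```

Note that hiding the sub-screen in this sketch does not pause the second video, so it resumes at the position it would have reached had it stayed visible.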

The program according to claim 15, wherein
the display data includes display area information that is given in correspondence with a time of the second video and that indicates a display area of the second video within the first video when the second video is switched to display, and
the designation function designates the display position according to the displayable area information and/or the display area information included in the display data when displaying the second video.

The program according to claim 15, wherein
the display data includes second displayable area information that is given in correspondence with a time of the second video and that indicates a displayable area of the second video within the first video when the second video is switched to display, and
the designation function designates the display position according to the displayable area information and/or the second displayable area information included in the display data when displaying the second video.

The program according to any one of claims 10 to 17, wherein the synthesized and output video is a picture-in-picture video, the first video is the video displayed on the main screen, and the second video is the video displayed on the sub-screen.

A recording medium on which the first video, the second video, and display data for displaying the second video superimposed on the first video are recorded, and which is played back by a video composition device that synthesizes the second video with the first video and outputs the result,
wherein the program according to any one of claims 10 to 18 is recorded on the recording medium.