JP4594432B1 - Movie playback method, movie playback system, and program

Info

Publication number: JP4594432B1
Application number: JP2009195725A
Authority: JP
Inventors: 政佳岩井; 一彦伊藤; 基央笹木; 操松下
Original assignee: CRI Middleware Co Ltd
Current assignee: CRI Middleware Co Ltd
Priority date: 2009-08-26
Filing date: 2009-08-26
Publication date: 2010-12-08
Anticipated expiration: 2029-08-26
Also published as: JP2011049767A

Abstract

[Problem] To multiplex and encode a plurality of temporally synchronized image streams, and to decode them as temporally synchronized image streams.
[Solution] A moving image playback system includes: interface units 632 and 634 that read out an encoded sequence that has been time-convolved by extracting, across a plurality of mutually different image streams, the images that share a common playback timing and arranging them in extraction order; a decoding unit 638 that decodes the encoded sequence to generate a time-convolved integrated stream; a frame synchronization processing unit 640 that acquires the integrated stream, separates the images of each image stream constituting the integrated stream, and writes them out in time synchronization; and a display device 650 that receives a video signal from a graphics accelerator 646 and displays the moving image.
[Selected drawing] FIG. 6

Description

The present invention relates to image encoding technology, and more particularly to a technique for multiplexing and encoding a plurality of temporally synchronized image streams and decoding them as temporally synchronized image streams.

In recent years, as the performance of information processing devices and network technology has improved, the digital content that information processing devices must handle has diversified. Digital content includes documents, still images, audio, moving images, and multimedia content in which moving images and audio are synchronized.

Such digital content is edited to suit a user's particular tastes and purposes, for example as literary works, photo collections, movies and videos, or games, and is then provided to the user. When provided to users, the content is compressed into a format such as MPEG, MPEG-2, or MPEG-4 (hereinafter, compression formats whose names begin with MPEG, such as MPEG, MPEG-2, and MPEG-4, are referred to as MPEG-series formats), MP3, or H.264, and is recorded on an optical recording medium such as a CD-ROM or DVD. Likewise, when the content is transmitted as digital data by an information processing device or by digital broadcasting, it is compressed into a format such as MPEG-2 or H.264 and distributed by streaming or terrestrial digital broadcasting.

Conventionally, such digital content has in most cases provided two-dimensional (hereinafter, 2D) images, and faster information processing devices and transmission infrastructure have made it possible to provide digital content that conveys a sufficient sense of presence even with 2D images.

However, with advances in information processing devices and transmission technology, attempts have also been made to provide digital content to users in 3D rather than as 2D images. Known techniques for making digital content visually perceptible as a three-dimensional (hereinafter, 3D) image include methods such as IP (Integral Photography), which use a lenticular lens to present different images to the user's left and right eyes, and methods such as the parallax barrier method, which arrange two crossed liquid crystal shutters so that each eye receives the image it should perceive.

To provide 3D video, image playback systems that produce 3D perception by what is called the multi-view method are known. In the multi-view method, images captured at a different shooting angle for each viewpoint period are played back in synchronization on a playback device such as a liquid crystal display or liquid crystal projector. A viewer, such as a user or an audience member, perceives a plurality of 2D images with different parallax angles at the positions where the lenticular lens focuses the images from the playback device. When the viewer changes viewpoint while watching, the image closest to each viewpoint is perceived, so the viewer can perceive a 3D image based on the spatial convolution of 2D images from a plurality of shooting positions.

That is, to provide 3D video, for example, both the IP method and the parallax barrier method require a plurality of video streams to be played back in temporal synchronization. Techniques are known in which a plurality of playback devices are arranged and stream images captured from different shooting angles are generated. For example, Japanese Patent Application Laid-Open No. H11-38954 (Patent Document 1) describes an image display device that includes a plurality of display programs, each for playing back video data by a different method and displaying it on a display screen, and an image data integration program that extracts display conditions for the video data according to user instructions and selectively sends the data to one of the display programs.

Patent Document 1 divides an MPEG stream or the like into a plurality of playback units and selects and displays each playback unit. Although Patent Document 1 mentions switching to a 3D image for display, it says nothing about how to generate the compressed data that provides the 3D image.

Japanese Patent Application Laid-Open No. 2006-140618 (Patent Document 2) discloses a 3D video information recording device and program that pack depth information data into differential packs (D_PACK) and MPEG-multiplex them in a format compliant with the DVD-Video standard. The resulting format outputs 3D video when the differential packs are used and standard 2D video under the DVD-Video standard when they are not. By adding depth information obtained at the encoding stage to the MPEG data and recording it as 3D video information, the technique supports switching between 2D and 3D display and provides 3D compressed images that do not depend on the 3D image playback method.

Furthermore, JP 2009-513074 A (Patent Document 3) discloses an apparatus including an encoder that encodes at least two viewpoint images corresponding to multi-view video content. The encoder encodes a specific one of the viewpoint images as a base layer and encodes each of at least one other viewpoint image as an enhancement layer, using prediction from a lower layer corresponding to at least one of the specific viewpoint image and the at least one other viewpoint image.

Japanese Patent Application Laid-Open No. 2006-54500 (Patent Document 4) discloses a video encoding technique that uses an interlace technique such as that of MPEG-2 to encode a left-viewpoint image and a right-viewpoint image into two frames and displays the left and right viewpoint images in temporal synchronization. The technique of Patent Document 4 can also temporally multiplex and display a plurality of video streams, but it is limited to left and right viewpoint images.

Patent Document 1: JP 11-38954 A
Patent Document 2: JP 2006-140618 A
Patent Document 3: JP 2009-513074 A
Patent Document 4: JP 2006-54500 A

Patent Document 1 describes playback in playback units and the ability to select 3D display when switching displays. Patent Document 2 describes generating depth information for producing 3D from encoding information and registering it in MPEG data. Patent Document 3 discloses a device that encodes a multi-view image by encoding one image and predictively encoding the others. Patent Document 4 makes it possible to encode two video streams using an interlace scheme and synchronize them in time as a video stream.

The techniques of Patent Documents 1 to 4 make it possible to provide, for example, a video stream for displaying 3D video. In recent years, however, with higher-definition display devices, higher-performance information processing devices, and more complex, higher-definition content, it has become necessary to encode a plurality of time-synchronized video streams more efficiently and to play them back in time synchronization.

In addition, as content becomes more complex and higher in definition, more efficient image compression techniques are needed, and a plurality of video streams must be played back from the encoded stream produced by such a technique while time synchronization is guaranteed.

In short, there has been a need for a technique that efficiently compresses and decodes a plurality of image streams sharing common image features while guaranteeing time synchronization.

The present invention has been made in view of the problems of the prior art described above. In the present invention, a plurality of image streams are integrated to generate an integrated stream, which is encoded by a conventional encoding method. The encoded stream is decoded back into the integrated stream by a decoder that employs a decoding method corresponding to the encoding method. The integrated stream is then time-synchronized and played back as a plurality of moving images.

The integrated stream is generated as follows. For each specified time slice, the images constituting the image streams are extracted in a specified order and placed in extraction order from the head of the integrated stream, forming an image stack. If a further time slice remains to be processed, the images belonging to that time slice are extracted in the same order and placed, in extraction order, immediately after the last image of the preceding image stack. The integrated stream is thus built up as a succession of stacks, such as a T_{n-1} stack followed by a T_n stack.

In the present invention, the plurality of image streams are integrated by time convolution of their images. The integrated stream generated by the time convolution is encoded using an encoding method such as MPEG, MPEG-2, MPEG-4, or H.264. Encoding is performed using the inter-frame correlation of consecutive images and yields an encoded stream. The encoded stream is decoded by a decoder that employs a decoding method corresponding to the encoding method, and the integrated stream is reproduced. The reproduced integrated stream is separated into the images of the individual image streams, time-synchronized, converted to analog, and played back as moving images on, for example, a personal computer display device or a liquid crystal projector.

In particular embodiments, the time convolution is preferably applied to a plurality of image streams that share common image features, such that the images belonging to a given time slice can be expected to compress well under inter-frame correlation prediction. More specifically, it can be applied to image streams from different viewpoints for providing 3D video, or to image sequences that provide a plurality of game scenes sharing a common story development.

According to the present invention, encoding uses inter-frame correlation prediction between images of the same time slice that have been time-convolved from a plurality of image sequences, which enables efficient encoding and a high compression ratio. In addition, the images corresponding to each image sequence can be separated from the decoded integrated stream and passed to the playback processing unit in time synchronization, so time synchronization between pictures is guaranteed and good, high-quality video stream playback becomes possible.

Furthermore, in this embodiment the image playback system can switch efficiently between playback streams in response to the user's wishes or operations, which diversifies the ways information can be provided through digital content.

FIG. 1 is a functional block diagram of encoder 100 of this embodiment.
FIG. 2 shows detailed functional blocks of encoder 160 shown in FIG. 1.
FIG. 3 shows the functional blocks for the case where encoder 160 of this embodiment encodes in the H.264 format.
FIG. 4 illustrates a data structure 400 comprising a plurality of image streams, the images forming the image streams, time slices, and the integrated stream generated in this embodiment.
FIG. 5 shows the functional blocks of decoder 500 of this embodiment.
FIG. 6 is a functional block diagram of moving image playback system 600 of this embodiment.
FIG. 7 is a detailed functional block diagram of frame synchronization processing unit 640 described in FIG. 6.
FIG. 8 is a flowchart of the encoding process executed by encoder 160 of this embodiment.
FIG. 9 is a flowchart of the image decoding method of this embodiment.
FIG. 10 shows the images of image stack 1000 for the first time slice T₁ of nine image streams corresponding to nine viewpoints, used to display 3D video.
FIG. 11 shows an embodiment of image stack 1100 of VIEW1 to VIEW9 at time slice T = T₂, immediately following time slice T₁ shown in FIG. 10.
FIG. 12 illustrates inter-frame correlation prediction 1200 employed in this embodiment.
FIG. 13 illustrates the inter-frame correlation prediction method of this embodiment, showing the flying object, which is the object with the largest amount of movement, as a wireframe with the background removed.
FIG. 14 shows an embodiment of the moving-image switching process reproduced by the encode/decode scheme and moving image playback system of this embodiment.

The present invention is described below by way of embodiments, but it is not limited to those embodiments. FIG. 1 is a functional block diagram of encoder 100 of this embodiment. As shown in FIG. 1, the encoder includes an A image stream supply unit 110 through an N image stream supply unit 140 (N is a positive integer of 1 or more). The stream supplied by A image stream supply unit 110 can be, for example, a time series of still images captured by a digital camera, or an animation in which images generated by CG (Computer Graphics) are arranged in a specific sequence and played back at a specified frame advance rate.

Each of the image stream supply units 110 to 140 can be configured using a device that generates images on the fly, such as a digital camera, or as an information processing device that can read still images stored on a recording medium such as an HDD, DVD, or CD-ROM as an image stream through a suitable interface.

The image streams supplied from the image stream supply units 110 to 140 are input to mixer 150. For each configured time slice, mixer 150 extracts images from the image streams in a configured order and places them in order from the head of the stream, earliest-extracted first, thereby time-convolving the image streams 110a to 140n.

When mixer 150 finishes generating the image stack for the first time slice T₁, it extracts the images corresponding to the next time slice in the order configured for the image streams and inserts them, in extraction order, immediately after the last image of the already generated image stack for T₁, producing the image stack for time slice T₂.
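The stack-by-stack interleaving performed by mixer 150 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `time_convolve`, the use of string labels as stand-ins for images, and the requirement that all streams have equal length are assumptions made for the example.

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")


def time_convolve(streams: Sequence[Sequence[Frame]]) -> List[Frame]:
    """Interleave N image streams into one integrated stream.

    For each time slice, one image is taken from every stream in a fixed
    order, forming an image stack; stacks are appended one after another.
    """
    if not streams:
        return []
    # Assumption: the streams are synchronized and equally long.
    length = len(streams[0])
    if any(len(s) != length for s in streams):
        raise ValueError("all image streams must have the same number of images")

    integrated: List[Frame] = []
    for t in range(length):        # time slice T_(t+1)
        for stream in streams:     # fixed extraction order A, B, ..., N
            integrated.append(stream[t])
    return integrated


# Hypothetical labels standing in for images of streams A, B and C.
stream_a = ["A1", "A2", "A3"]
stream_b = ["B1", "B2", "B3"]
stream_c = ["C1", "C2", "C3"]
print(time_convolve([stream_a, stream_b, stream_c]))
```

In the resulting list, each run of three consecutive entries is one image stack, so images that share a playback timing sit next to each other in the integrated stream.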

The interval of a time slice T can be set to the interval needed to play back the image streams smoothly, for example T = 1/P, where P is the frame rate in frames·s⁻¹ and is typically about 16 to 30. Mixer 150 can be implemented by installing suitable image processing software on an information processing device, or as dedicated hardware for the time convolution such as an ASIC (Application Specific Integrated Circuit); the implementation is not particularly limited.
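As a worked example of the relation T = 1/P, the sketch below computes the time slice interval for a few frame rates in the typical range. The helper names and the assumption that slices are uniform and start at time zero are illustrative only.

```python
def time_slice_seconds(frame_rate: float) -> float:
    """Return T = 1 / P for a frame rate P in frames per second."""
    if frame_rate <= 0:
        raise ValueError("frame rate must be positive")
    return 1.0 / frame_rate


def stack_start_time(slice_index: int, frame_rate: float) -> float:
    """Playback time in seconds of time slice T_i (1-based index).

    Assumption: slices are uniform and the first slice starts at t = 0.
    """
    return (slice_index - 1) * time_slice_seconds(frame_rate)


for p in (16, 24, 30):
    print(f"P = {p} frames/s -> T = {time_slice_seconds(p) * 1000:.1f} ms")
```

At the low end of the range (P = 16) each image stack covers 62.5 ms of playback; at P = 30 it covers roughly 33 ms.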

In particular embodiments, images captured by an analog camera can also be used; in that case, the image stream to be input to encoder 160 can be generated by A/D-converting the analog images and converting them to a format such as BMP. In other embodiments, an image stream in the format known as Moving JPEG, which digital cameras implement as standard, can be used. The image streams 110a to 140n are time-convolved by mixer 150 into an integrated stream. The integrated stream is sent to encoder 160, encoded in an MPEG-series format or H.264 to form an encoded stream, and output from encoder 160. The encoded stream can be supplied as packets over the Internet or digital broadcasting, or stored as digital content on a recording medium such as a DVD.

FIG. 2 shows detailed functional blocks 200 of encoder 160 shown in FIG. 1. The encoder 160 of FIG. 2 is an embodiment that generates an encoded stream in an MPEG-series format such as MPEG, MPEG-2, or MPEG-4. Encoder 160 includes an input buffer 210, an adder/subtractor 212, and a DCT unit 214. Input buffer 210 is configured as a FIFO (first in, first out) buffer and stores the integrated stream generated from the image streams 110a to 140n in first-in, first-out order. The images input to input buffer 210 may be reordered within an image stack to suit the form of inter-frame correlation prediction used by encoder 160, for example forward prediction, backward prediction, bidirectional prediction, or no predicted image.

The images of the integrated stream stored in input buffer 210 are sent to adder/subtractor 212, where information based on inter-frame correlation prediction is typically calculated, and then to DCT (Discrete Cosine Transform) unit 214, which performs the DCT. The DCT result is sent to quantizer 216 and quantized, then encoded by variable-length encoder 218, for example by Huffman coding, and sent to output buffer 232. An encoded stream (MPEG series) is generated under feedback control of the code amount by code amount controller 230 and becomes the output stream of encoder 160.
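The transform and quantization steps in this path can be illustrated with a small, standard-library-only sketch. It applies an orthonormal 2D DCT-II to a square block and quantizes the coefficients with a single uniform step. The 4x4 block size, the uniform step, and all function names are simplifying assumptions for the example; MPEG-series encoders use their own block sizes and quantization matrices, and the entropy coding stage is omitted. The inverse functions are included only to show the reconstruction error.

```python
import math
from typing import List

Block = List[List[float]]


def dct_matrix(n: int) -> Block:
    """Orthonormal DCT-II basis matrix C, so that coeffs = C * block * C^T."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * x + 1) * k / (2 * n))
                     for x in range(n)])
    return rows


def matmul(a: Block, b: Block) -> Block:
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a: Block) -> Block:
    return [list(col) for col in zip(*a)]


def dct2(block: Block) -> Block:
    """Forward 2D DCT of a square block."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs: Block) -> Block:
    """Inverse 2D DCT (C is orthonormal, so its inverse is its transpose)."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


def quantize(coeffs: Block, step: float) -> List[List[int]]:
    """Uniform quantization: map each coefficient to an integer level."""
    return [[round(v / step) for v in row] for row in coeffs]


def dequantize(levels: List[List[int]], step: float) -> Block:
    return [[v * step for v in row] for row in levels]


# Hypothetical 4x4 difference-image block with small values.
residual = [[1.0, 1.0, 2.0, 1.0],
            [1.0, 2.0, 2.0, 1.0],
            [1.0, 1.0, 1.0, 0.0],
            [0.0, 1.0, 1.0, 1.0]]
levels = quantize(dct2(residual), step=2.0)
approx = idct2(dequantize(levels, step=2.0))
max_error = max(abs(a - r) for arow, rrow in zip(approx, residual)
                for a, r in zip(arow, rrow))
print("quantized levels:", levels)
print("max reconstruction error:", round(max_error, 3))
```

A coarser step produces more zero levels, which costs fewer bits after variable-length coding but increases the reconstruction error.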

The output of quantizer 216 is also sent through inverse quantizer 220, inverse DCT unit 222, adder 224, image buffer 226, and inter-frame correlation prediction unit 228, where it is used to generate difference images and, in the described embodiment, to calculate prediction information. In this embodiment, encoder 160 performs compression by inter-frame correlation prediction not only between temporally consecutive images but also between images of different image streams 110a to 140n that belong to a particular time slice T_j.

For example, when an encoded stream for playing back 3D video is generated, the inter-frame correlation prediction yields difference images corresponding to the change in viewpoint angle; when a game story or the like is played back, it yields difference images corresponding to the differences between stories within the same time slice T_j. Encoder 160 in this embodiment can therefore use the same encoding scheme as a conventional encoder.
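The idea that neighboring views within one time slice leave only a small difference image can be shown with a toy sketch. The two tiny "views" below are made-up pixel values, and the prediction is a plain pixel-wise subtraction with no prediction vectors; a real encoder would also search for motion or disparity. The function names are assumptions for the example.

```python
from typing import List

Image = List[List[int]]


def difference_image(current: Image, reference: Image) -> Image:
    """Pixel-wise difference between the current image and its reference."""
    return [[c - r for c, r in zip(crow, rrow)]
            for crow, rrow in zip(current, reference)]


def reconstruct(diff: Image, reference: Image) -> Image:
    """Decoder side: add the difference image back onto the reference."""
    return [[d + r for d, r in zip(drow, rrow)]
            for drow, rrow in zip(diff, reference)]


def energy(img: Image) -> int:
    """Sum of squared values, a rough proxy for how costly an image is to code."""
    return sum(v * v for row in img for v in row)


# Hypothetical images of two adjacent viewpoints in the same time slice T_j.
view1 = [[100, 102, 104, 106],
         [100, 102, 104, 106]]
view2 = [[101, 103, 105, 107],
         [101, 103, 105, 107]]

diff = difference_image(view2, view1)
print("energy of view2 itself:   ", energy(view2))
print("energy of difference image:", energy(diff))
```

Because the two views are nearly identical, the difference image carries far less energy than the image itself, which is the property the time-convolved ordering is meant to exploit.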

FIG. 3 shows functional blocks 300 for the case where encoder 160 of this embodiment encodes in the H.264 format. As described for FIG. 2, input buffer 310 is configured as a FIFO (first in, first out) buffer and stores the integrated stream generated from the image streams 110a to 140n in first-in, first-out order. The integrated stream in input buffer 310 is then supplied to adder/subtractor 312. In the embodiment of FIG. 3, the image stream being processed is inter-coded: to perform inter-frame correlation prediction, adder/subtractor 312 subtracts the predicted image supplied from inter-frame correlation predictor 324 from the picture of the integrated stream to generate difference image data, which it supplies to orthogonal transform unit 314.

As described for FIG. 2, inter-frame correlation predictor 324 calculates prediction vectors between images belonging to different image streams. More specifically, for the image currently being processed, inter-frame correlation predictor 324 reads the image to be used as a reference from image buffer 322, generates a predicted image from that reference image based on the prediction vector, and supplies it to adder/subtractor 312. Adder/subtractor 312 subtracts the predicted image supplied from inter-frame correlation predictor 324 from the picture currently being processed to generate a difference image, and supplies the difference image to orthogonal transform unit 314.

Orthogonal transform unit 314 takes the picture or difference image supplied from adder/subtractor 312, applies an orthogonal transform such as the DCT, and sends the transform coefficients to quantizer 316. Under feedback control by code amount controller 328, described below, quantizer 316 quantizes the transform coefficients from orthogonal transform unit 314 and supplies the resulting quantized coefficients to encoder 326. The quantized coefficients, prediction vectors, and so on are encoded by encoder 326, using for example variable-length coding or arithmetic coding, and are then sent to output buffer 330 and accumulated in first-in, first-out order. Code amount controller 328 feedback-controls quantizer 316, based on the storage amount of the column reserved in the output buffer for the image set of a given image stream, so that output buffer 330 neither overflows nor underflows.
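The text does not specify how code amount controller 328 adjusts the quantizer, so the following is only one possible feedback rule, given as a sketch: the quantization step is doubled when the output buffer is nearly full and halved when it is nearly empty. The thresholds, step limits, and the function name `next_quant_step` are all hypothetical.

```python
def next_quant_step(step: float, buffer_bits: int, buffer_capacity: int,
                    low: float = 0.25, high: float = 0.75,
                    min_step: float = 1.0, max_step: float = 64.0) -> float:
    """Return the quantization step to use for the next picture.

    Hypothetical rule: coarser quantization (fewer bits) when the buffer
    risks overflow, finer quantization (more bits) when it risks underflow.
    """
    fullness = buffer_bits / buffer_capacity
    if fullness > high:
        step *= 2.0
    elif fullness < low:
        step /= 2.0
    # Keep the step inside the allowed range.
    return min(max(step, min_step), max_step)


# Simulated buffer occupancies (in bits) for a 1000-bit buffer.
step = 8.0
for occupancy in (500, 800, 900, 400, 100):
    step = next_quant_step(step, occupancy, 1000)
    print(f"occupancy {occupancy:4d} -> next step {step}")
```

Production rate control is considerably more elaborate, but the sketch shows the direction of the feedback: buffer occupancy drives the quantizer, and the quantizer in turn changes how fast the buffer fills.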

Inverse quantizer 318 inversely quantizes the quantized coefficients supplied from quantizer 316, in correspondence with the quantization applied by quantizer 316, and supplies the resulting transform coefficients to inverse orthogonal transformer 320. Inverse orthogonal transformer 320 applies an inverse orthogonal transform to the transform coefficients from inverse quantizer 318, decoding either the intra-coded picture currently being processed or the difference image obtained by subtracting the predicted image from the original inter-coded picture, and sends the result to image buffer 322.

In output buffer 330, the encoded stream obtained by encoding the integrated stream is accumulated, in the described embodiment in the H.264 encoding format. After various control data are attached as header information and the like, it is output as the encoded stream.

FIG. 4 illustrates a data structure 400 comprising a plurality of image streams, the images forming those streams, the time slices, and the integrated stream generated in this embodiment. In FIG. 4, N separate image streams can be used, from the A image stream 410 and the B image stream 420 through the N image stream 430. An image stream, for example the A image stream 410, consists of the image sequence image 1, image 2, ..., image 9, ..., and the sequence contains images up to the end of the A image stream.

Similarly, the B image stream 420 and the N image stream 430 each consist of the image sequence image 1, image 2, ..., image 9, ..., and, like the A image stream, contain images up to the end of the respective stream. The images denoted image i in the image streams 410 to 430 are extracted across the streams by the mixer 150 to form the same time slice T_i and are placed in the integrated stream 440.

For example, image 1 of the A image stream 410, image 1 of the B image stream 420, and image 1 of the N image stream are extracted in the configured order, as shown in FIG. 4, and form within the integrated stream the image stack denoted as time slice T_1. In the embodiment of FIG. 4, the image stack designated by a time slice T_i contains N images corresponding to the N streams, and, as shown in FIG. 4, "last" image stacks are formed, for i = 1 to last.

Likewise, for time slice T_2, image 2 of each of the A image stream 410 through the N image stream 430 is extracted, and these images are placed in extraction order immediately after the last image of the image stack designated by time slice T_1. When image 2 of the N image stream 430 has been placed, the image stack designated by time slice T_2 is complete. The mixer 150 continues the same processing up to the time slice designated by T_last, finally forming an integrated stream in which all images are time-convolved.
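The interleaving performed by the mixer 150 can be sketched in a few lines. This is an illustrative sketch under the assumption that all streams have the same number of images; the function name `time_convolve` and the string labels standing in for images are hypothetical.

```python
from itertools import chain


def time_convolve(streams):
    """Interleave N equally long image streams into one integrated stream.

    streams[k][i] is image i+1 of stream k. The result lists all images of
    time slice T_1 in stream order, then all images of T_2, and so on.
    """
    if len({len(s) for s in streams}) != 1:
        raise ValueError("this sketch assumes streams of equal length")
    # zip(*streams) yields one tuple (one image stack) per time slice.
    return list(chain.from_iterable(zip(*streams)))


a = ["A1", "A2", "A3"]
b = ["B1", "B2", "B3"]
n = ["N1", "N2", "N3"]
print(time_convolve([a, b, n]))
# ['A1', 'B1', 'N1', 'A2', 'B2', 'N2', 'A3', 'B3', 'N3']
```

Each run of N consecutive entries in the output corresponds to one image stack, so the position of an image in the integrated stream determines both its time slice and its source stream.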

The integrated stream 440 shown in FIG. 4 is input to the encoder 160, which contains the functional blocks described with reference to FIG. 2 or FIG. 3, and the encoder 160 encodes the integrated stream 440 according to its settings. In doing so, it applies the same processing as when encoding a conventional image stream, following the number of pictures in a GOP and the intervals of I and P pictures set in the encoder 160.
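As a rough illustration of how GOP settings might map onto the integrated stream, the sketch below assigns picture types in the simplified MPEG style, using a GOP size and an interval between anchor (I or P) pictures. The parameter names, the use of B pictures between anchors, and the choice of a GOP size equal to the number of streams are assumptions for illustration only; H.264 encoders allow far more flexible structures.

```python
def picture_types(n_pictures, gop_size, p_interval):
    """Assign I/P/B types in display order (simplified, assumed convention).

    Every gop_size-th picture is an I picture; within a GOP every
    p_interval-th picture is a P picture; the rest are B pictures.
    """
    types = []
    for k in range(n_pictures):
        pos = k % gop_size
        if pos == 0:
            types.append("I")
        elif pos % p_interval == 0:
            types.append("P")
        else:
            types.append("B")
    return types


# Nine views per time slice, one GOP per time slice (an assumed setting).
print("".join(picture_types(18, gop_size=9, p_interval=3)))
# IBBPBBPBBIBBPBBPBB
```

With this assumed setting, each time slice would begin with an intra-coded view and the remaining views in the stack would be predicted from their neighbors.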

FIG. 5 shows the functional blocks of the decoder 500 of this embodiment. The embodiment of FIG. 5 is described as using the H.264 scheme for encoding and decoding. The input buffer (not shown) is a FIFO buffer. The decoding unit 520 decodes the input encoded stream 510, which carries the integrated stream 440 with the data structure shown in FIG. 4, and outputs the decoded integrated stream 540 as the output stream. In the embodiment of FIG. 5, the encoded stream 510 is delivered from time slice T_1 through the final time slice T_last, passed to the decoding unit 520 in the order in which it entered the input buffer, and decoded.

The decoding unit 520 of the embodiment of FIG. 5 includes a variable length decoder 522, an inverse quantizer 524, and an inverse DCT unit 526. The variable length decoder 522 decodes data encoded by Huffman coding or the like. The inverse quantizer 524 applies the inverse of the quantization performed by the encoder 160 shown in FIG. 3 to the quantized DCT coefficients, generates DCT coefficients, and supplies them to the inverse DCT unit 526. The inverse DCT unit 526 uses the DCT coefficients to reconstruct the difference-image picture currently being processed and sends it to the adder 528.

Meanwhile, the output of the variable length decoder 522 is sent to the inter-frame correlation predictor 530, which uses the decoded prediction vector values to generate a predicted picture from the reference image stored in the image buffer 532. The predicted picture is sent to the adder 528 and combined with the difference-image picture; the result is accumulated in an output buffer (not shown) and then sent out from the decoding unit 520 as the output stream in first-in first-out order. Before playback, the output stream is separated, with time synchronization, into the individual image streams contained in the integrated stream. Inter-frame correlation prediction has been described above in terms of whole pictures for convenience; in practice, part of an image may be obtained as a difference image, or image data combining a plurality of difference images may be used for the inter-frame correlation prediction processing.
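The role of the adder 528 can be illustrated with a deliberately simplified sketch: the reconstructed picture is the predicted picture plus the decoded difference image. Here the prediction is taken to be the reference image itself (zero motion), pictures are flat lists of 8-bit samples, and the function name is hypothetical; all of these are assumptions, not details from the description.

```python
def reconstruct(predicted, residual, lo=0, hi=255):
    """Add the difference image to the predicted picture, clipping to the sample range."""
    return [min(hi, max(lo, p + d)) for p, d in zip(predicted, residual)]


reference = [10, 250, 0, 128]   # stands in for the image held in the image buffer
residual = [5, 10, -3, 0]       # stands in for the inverse DCT output
print(reconstruct(reference, residual))  # [15, 255, 0, 128]
```

When neighboring images in a time slice are nearly identical, the residual values are small, which is what makes the difference image cheap to encode.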

After separation, each image stream undergoes format conversion and D/A conversion into analog data, is turned into a video signal by a graphics accelerator of a personal computer or the like, and is sent to a playback device such as a display screen or a liquid crystal projector for moving image playback.

FIG. 6 is a functional block diagram of the moving image playback system 600 of this embodiment. The moving image playback system 600 shown in FIG. 6 can be configured as a personal computer, a workstation, a game device, or a controller for a liquid crystal projector, and includes the decoder 500 shown in FIG. 5 as one of its functional modules.

As shown in FIG. 6, the moving image playback system 600 includes a main control device 630, a display device 650, and input/output peripheral devices 660, such as a mouse, keyboard, and joystick, for issuing various commands to the main control device 630. Depending on the application, the main control device 630 can be implemented as a personal computer or as a control unit for a game device, a digital television, a liquid crystal projector, or the like.

The main control device 630 includes a central processing unit (CPU) 636 that controls the various functional units; a RAM 642, a storage device that provides the execution space for application programs; a network interface 632 for transmitting and receiving data over a public network 610 such as terrestrial digital infrastructure, the Internet, or a local area network (LAN); and a storage interface 634 conforming to a standard such as IDE, ATA, SERIAL-ATA, or ULTRA-ATA, for reading and writing data through a hard disk drive (HDD) or optical recording devices such as MO, CD, and DVD.

Examples of the CPU 636 in the main control device 630 include CISC-architecture microprocessors such as PENTIUM (registered trademark), XEON (registered trademark), and PENTIUM (registered trademark) compatible chips, and RISC-architecture microprocessors such as POWER PC (registered trademark). The CPU 636 may be single-core or multi-core. The main control device 630 runs an operating system such as WINDOWS (registered trademark), UNIX (registered trademark), or LINUX (registered trademark) and, under the control of that OS, performs exception handling, management of external devices, communication session management, and the execution and execution management of programs written in C, C++, Java (registered trademark), or JavaScript (registered trademark).

Furthermore, the main control device 630 may be controlled by any operating system, such as WINDOWS (registered trademark), UNIX (registered trademark), LINUX (registered trademark), or MACOS. The main control device 630 can also run browser software such as Internet Explorer (trademark), Mozilla (trademark), Opera (trademark), or Firefox (registered trademark). When the main control device 630 is implemented as the control unit of a game device or a liquid crystal projector, it may run an OS dedicated to embedded devices rather than Windows (registered trademark) or LINUX.

The main control device 630 further includes a decoding unit 638, a frame synchronization processing unit 640, and a graphics accelerator (hereinafter simply GA) 646. The decoding unit 638 and the GA 646 can be implemented as application programs. Alternatively, to enable high-speed image processing, they can be implemented as dedicated ASICs in the form of an expansion board, an expansion card, or an on-board chip.

The frame synchronization processing unit 640 is a processing module that separates, with time synchronization, the individual image streams from the integrated stream output by the decoding unit 638. It is preferably implemented as a DLL or other runtime library, or as a plug-in program. When the main control device 630 is less extensible than a personal computer, as with a game device, it can instead be implemented as a dedicated chip. On receiving the integrated stream, the frame synchronization processing unit 640 separates the time-convolved images in units of time slices T_i and outputs them to the GA 646 as N time-synchronized image streams.
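The separation step can be sketched as the inverse of the interleaving done at the encoder side. The sketch assumes that the number of streams N is known to the player and that the decoded integrated stream contains only complete time slices; the function name and the string labels are hypothetical.

```python
def split_time_slices(integrated, n_streams):
    """Return (stacks, streams) recovered from a decoded integrated stream.

    stacks[i] holds the N images of time slice T_(i+1);
    streams[k] holds the images of stream k in playback order.
    """
    if len(integrated) % n_streams != 0:
        raise ValueError("integrated stream does not hold whole time slices")
    stacks = [tuple(integrated[i:i + n_streams])
              for i in range(0, len(integrated), n_streams)]
    streams = [list(column) for column in zip(*stacks)]
    return stacks, streams


decoded = ["A1", "B1", "N1", "A2", "B2", "N2"]
stacks, streams = split_time_slices(decoded, 3)
print(stacks)   # [('A1', 'B1', 'N1'), ('A2', 'B2', 'N2')]
print(streams)  # [['A1', 'A2'], ['B1', 'B2'], ['N1', 'N2']]
```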

The GA 646 receives the image streams, performs format conversion, D/A conversion, and video modulation such as VGA or XGA, and sends the video signal to the display device 650 through the graphics interface 644 so that the moving image is played back on the display device 650. Depending on the application, the video signal may carry all of the image streams or only a specific image stream set by the image playback system 600 or by a user command. Audio data is sent as a single stream to an audio control unit (not shown) in synchronization with the flushing of images from the frame synchronization processing unit 640, enabling synchronized playback of video and audio.

FIG. 7 is a detailed functional block diagram of the frame synchronization processing unit 640 described with reference to FIG. 6. FIG. 7 shows two embodiments of the frame synchronization processing unit, which differ in how the image streams are output. The frame synchronization processing unit 640-A of the first embodiment includes a FIFO buffer 720 that functions as a frame buffer, and a synchronization buffer 730, formed of a line buffer or ring buffer, for outputting in synchronization the images of different image streams that belong to the same time slice T_i. The integrated stream 710 from the decoding unit 638 is input to the frame synchronization processing unit 640-A, and the FIFO buffer 720 passes it to the synchronization buffer 730 in first-in first-out order.

In the synchronization buffer 730, pointers designate as many storage areas as there are streams in the integrated stream. In the described embodiment, the synchronization buffer 730 is controlled so that it becomes full once it holds the nine images of a time slice T_i, that is, the images of the nine image streams that share a playback timing. The synchronization buffer 730 stores the integrated stream arriving from the FIFO buffer 720 while counting the images.

At this point, the address areas designated by pointer 1 through pointer N of the synchronization buffer 730 hold the images belonging to the same time slice T_i. In the first embodiment, the synchronization buffer 730 instructs the FIFO buffer 720 to stop writing out and, at the same time, instructs the GA 646 to read data from the address areas designated by pointer 1 through pointer N. The GA 646 obtains the image stored in each address area, generates the video signal corresponding to each image stream, and sends it to the display device 650 to display, for example, 3D video.

After the contents of the synchronization buffer 730 have been flushed, the frame synchronization processing unit 640 instructs the FIFO buffer 720 to resume writing out data, and data is sent until the synchronization buffer 730 is full again. The same processing is repeated until no images remain in the integrated stream, and the moving image is thus played back. The GA 646 can carry a plurality of graphics chips in order to display 3D video. In another embodiment, the image streams output from the frame synchronization processing unit 640 can be sent to the GAs of independent liquid crystal projectors, one per viewpoint image; the GA in each projector generates the video signal to be projected, and each projector projects its own image. The GA 646 holds an appropriate number of frame buffers so that the delay introduced by the time synchronization processing of the frame synchronization processing unit 640 does not affect playback.
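The fill, read, and flush cycle of the first embodiment can be sketched as a simple loop. In this sketch a `deque` stands in for the FIFO buffer 720, a plain list stands in for the synchronization buffer 730, and the `present` callback stands in for the read by the GA 646; these stand-ins and the function name are assumptions made for illustration.

```python
from collections import deque


def play_synchronized(integrated, n_streams, present):
    """Feed images from a FIFO into a sync buffer and emit one time slice at a time.

    Returns the number of images left over if the stream ends mid-slice.
    """
    fifo = deque(integrated)
    sync = []
    while fifo:
        sync.append(fifo.popleft())      # FIFO writes out one image
        if len(sync) == n_streams:       # buffer full: one complete time slice
            present(tuple(sync))         # all N images are read together
            sync.clear()                 # flush, after which the FIFO resumes
    return len(sync)


shown = []
leftover = play_synchronized(["A1", "B1", "A2", "B2"], 2, shown.append)
print(shown)     # [('A1', 'B1'), ('A2', 'B2')]
print(leftover)  # 0
```

Because nothing is presented until the buffer holds a full stack, the N outputs always carry images from the same time slice.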

The second embodiment 640-B of the frame synchronization processing unit 640 plays back only a selected one of the plurality of image streams. The FIFO buffer 740 and the synchronization buffer 750 are configured as in the first embodiment. The output of the synchronization buffer 750, however, is not sent directly to the GA 646 but is first input to the selector 760. In the described embodiment, the selector 760 selects from the output of the synchronization buffer 750, and the images corresponding to the A image stream are output to the GA 646 and used for playback. The selector 760 receives a select signal that the main control device 630 generates on receiving an image switching command from the input/output peripheral devices 660, and the images of the B image stream through the N image stream are discarded.

When a command to switch to, for example, the B image stream is received from the user, the main control device 630 generates a select signal for outputting the B image stream and sends it to the selector 760. On receiving the select signal, the selector 760 switches the output image stream from the A image stream to the B image stream. In this embodiment, the images stored in the synchronization buffer 750 belong to the same time slice T_i, so even when the output is switched to another image stream after a select signal is input, the played-back video does not skip, and high-quality image switching is possible.
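The behavior of the selector can be sketched as picking one image per time slice, with the selected stream index changing when a select signal arrives. Modeling the select signals as a dictionary from time-slice index to stream index is an assumption made for this sketch, as are the function and variable names.

```python
def select_stream(stacks, switch_events, initial=0):
    """Pick one image per time slice; unselected images are discarded.

    switch_events maps a time-slice index to the stream index selected
    from that slice onward (a stand-in for the select signal).
    """
    current = initial
    output = []
    for i, stack in enumerate(stacks):
        current = switch_events.get(i, current)
        output.append(stack[current])
    return output


stacks = [("A1", "B1"), ("A2", "B2"), ("A3", "B3")]
print(select_stream(stacks, {1: 1}))  # ['A1', 'B2', 'B3']
```

The output moves from image 1 of stream A to image 2 of stream B without repeating or skipping a time slice, which is the property the description attributes to holding a whole stack in the synchronization buffer.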

FIG. 8 is a flowchart of the encoding process executed by the encoder 160 of this embodiment. The process of FIG. 8 starts at step S800. In step S801, the encoder 160 obtains the integrated stream that the mixer 150 generated from the image streams supplied by the image stream supply units 110 to 140, and accumulates it in the input buffer. In step S802, the integrated stream stored in the input buffer is encoded using inter-frame correlation prediction, and in step S803 the encoded integrated stream is generated as the output stream.

The processing up to S801 can be decoupled from the processing from S802 onward by storing the intermediate result in a storage device, and this is typically done.

When the images belonging to a time slice T_i differ from one another by a threshold or more, inter-frame correlation prediction can be left unused. In this embodiment, a time slice means a temporal cross-section of the image streams that yields the set of images, among those making up the plurality of image streams, that are to be played back in temporal synchronization. In step S804, the output stream is stored through an appropriate storage interface on a recording medium such as an HDD or a DVD so that it can be distributed, and the encoding process ends in step S805.
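The threshold test mentioned above could take many forms; the description does not specify the difference measure. The sketch below assumes a mean absolute difference between adjacent images in a stack, with images as flat lists of samples. The metric, the adjacent-pair comparison, and the function names are all hypothetical.

```python
def mean_abs_diff(img_a, img_b):
    """Mean absolute sample difference between two equally sized images (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(img_a, img_b)) / len(img_a)


def use_inter_prediction(stack, threshold):
    """Use inter-frame prediction only if every adjacent pair differs by less than threshold."""
    return all(mean_abs_diff(p, q) < threshold for p, q in zip(stack, stack[1:]))


stack = [[10, 10, 10, 10], [11, 10, 9, 10], [12, 10, 8, 10]]
print(use_inter_prediction(stack, threshold=1.0))  # True
print(use_inter_prediction(stack, threshold=0.5))  # False
```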

FIG. 9 is a flowchart of the image decoding method of this embodiment. The method starts at step S900, and the encoded integrated stream is read into the input buffer of the decoding unit 638. In step S902, the stream is decoded using inter-frame correlation prediction to generate the decoded integrated stream.

In step S903, the decoded integrated stream is passed to the frame synchronization processing unit 640, and the synchronization buffer 730 or the synchronization buffer 750 separates the images of the plurality of image streams while keeping them time-synchronized. In step S904, it is determined whether image stream selection has been commanded. If there is a user command selecting an image stream (yes), the selected image stream is sent to the GA 646 in step S905 for playback.

If image stream selection has not been commanded (no), the image streams are sent to the GA in parallel in step S907 for playback. The user issues the image stream selection command from the input/output peripheral devices 660 when switching the display to a specific image stream. In another embodiment, unless an explicit command is received from the user, the image playback system 600 may treat, for example, the A image stream as the default for playback and switch to another image stream in response to a user command or the progress of a game story.

In step S906, it is determined whether the decoded integrated stream has been decoded to its end. If not (no), the process returns to step S904 and repeats. If decoding is complete (yes), playback continues to the end of the decoded integrated stream, and the process ends in step S908.
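The branch structure of steps S904 to S908 can be summarized in a short loop. The sketch treats decoding and separation (S902, S903) as already done and takes the separated image stacks as input; `get_selection` is a hypothetical callback that returns a stream index when a selection command is active and `None` otherwise.

```python
def playback_loop(stacks, get_selection):
    """Walk the flow of FIG. 9 over already separated time slices.

    Returns, per time slice, the list of images handed to the GA.
    """
    sent = []
    for i, stack in enumerate(stacks):    # S906: repeat until the stream ends
        choice = get_selection(i)         # S904: selection commanded?
        if choice is not None:
            sent.append([stack[choice]])  # S905: send the selected stream only
        else:
            sent.append(list(stack))      # S907: send all streams in parallel
    return sent                           # S908: end


stacks = [("A1", "B1"), ("A2", "B2")]
print(playback_loop(stacks, lambda i: 1 if i == 1 else None))
# [['A1', 'B1'], ['B2']]
```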

The encoding and playback of the time-convolved integrated stream of this embodiment are now described concretely with reference to FIGS. 10 to 14. FIG. 10 shows the images of the image stack 1000 for the first time slice T_1 of nine image streams, corresponding to nine viewpoints, used to display 3D video. The views shown for time slice T = T_1 form the image stack 1010, and inter-frame correlation prediction is performed across VIEW1 to VIEW9. More specifically, an image stack for reproducing 3D video contains images that differ only in viewpoint angle, as shown in FIG. 10. In the integrated stream, the images shown in plan view as VIEW1 to VIEW9 in FIG. 10 belong to the image stack 1010 as VIEW1 to VIEW9, respectively.

FIG. 11 shows an embodiment of the image stack 1100 of VIEW1 to VIEW9 at time slice T = T_2, immediately after the time slice T_1 shown in FIG. 10. Among the images labeled VIEW1 to VIEW9, images with the same VIEW number form the same original image stream. In FIG. 11 as well, VIEW1 to VIEW9 form an image stack 1110, and, as in FIG. 10, the images of the views differ only in viewpoint angle, so the embodiments of FIGS. 10 and 11 allow efficient compression. More specifically, comparing for example VIEW4 and VIEW5 of FIG. 10 shows that the displacement of objects is small and the only difference is the slight one caused by the change in viewpoint angle. Video compression based on efficient inter-frame correlation prediction is therefore possible for an image stack belonging to a single time slice.

That is, by encoding, with inter-frame correlation prediction, the integrated stream generated by time-convolving images across the image streams according to this embodiment, a high compression effect is obtained, and compression is more efficient than encoding each image stream individually by conventional methods.

FIG. 12 illustrates the inter-frame correlation prediction 1200 employed in this embodiment. In the described embodiment, the image stack 1210 belonging to a particular time slice T_i contains nine images, which are nearly identical and differ only in viewpoint angle. As a result, the difference images among VIEW1 to VIEW9 reflect only the difference in viewpoint angle, and highly efficient compression is possible. It can also be seen that the prediction methods of existing encoders, such as bidirectional, forward, and backward prediction, can be applied to the inter-frame correlation prediction of this embodiment to generate the encoded stream with sufficient accuracy.

Furthermore, in the embodiment in which image streams for 3D video are time-convolved to generate the integrated image, compression by inter-frame correlation prediction is performed between images whose viewpoint-angle differences are scalable in advance, so the scalability of the compression processing is also easy to guarantee.

FIG. 13 illustrates the inter-frame correlation prediction method of this embodiment, showing the flying object, which is the object with the largest displacement, as a wireframe with the background removed. In FIG. 13, image 1300 and image 1310 belong to the same time slice: they share a playback timing and differ in viewpoint angle. Image 1300 and image 1320 belong to the same image stream and correspond to playback timings separated in time. As FIG. 13 shows, in this embodiment the image set making up a single time slice T_i or T_j consists of closely similar images that differ only through the difference in viewpoint angle. Conventional image coding compresses images by predicting from the left-hand side of the page to the right-hand side (backward prediction), in the reverse direction (forward prediction), or in both directions (bidirectional prediction). In this embodiment, by contrast, compression by inter-frame correlation prediction is performed in the vertical direction of the page as well as the horizontal direction. FIG. 13 indicates that, when a plurality of image streams are temporally multiplexed, higher compression can be achieved by applying inter-frame correlation prediction between images of different image streams that share a playback timing than by applying it along the time series.

As also shown in FIG. 12, the image stack 1210 belonging to a particular time slice T_i contains nine images in the described embodiment, and these nine images are nearly identical, differing only in viewpoint angle. As a result, the difference images among VIEW1 to VIEW9 reflect only the difference in viewpoint angle, and highly efficient compression is possible. The inter-frame correlation prediction of this embodiment can use the prediction methods of existing encoders, such as bidirectional, forward, and backward prediction, to generate the encoded stream with sufficient accuracy.

FIG. 14 shows an embodiment of the moving image switching process realized by the encoding/decoding method and the moving image reproduction system of the present embodiment. The embodiment shown in FIG. 14 is described on the assumption that image stream 1410 and image stream 1420 have been integrated.

By extracting the images in the same time slice across a plurality of image streams, integrating them, and decoding them in time synchronization, the time slices of the plurality of image sequences can be synchronized in time. Consequently, moving images from different, time-synchronized image streams can be displayed simply by selecting the image stream to be reproduced. This applies, for example, when the user issues an image switching command from the input/output peripheral device 660, or when the image reproduction system 600 is implemented as a game device or the like and switches to a bonus item according to the development of the story.
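
The claims describe a frame synchronization processing unit that buffers the decoded integrated stream in a FIFO buffer and releases the images of one time slice together once as many images as there are image streams have been collected. A minimal Python sketch of that behavior follows. The class name `FrameSynchronizer` and the `write_slice` callback, which stands in for the write to the graphic accelerator, are assumptions for illustration. The sketch copies frames into a list rather than designating address areas by pointer, so it simplifies the claimed structure.

```python
from collections import deque
from typing import Callable, Deque, List


class FrameSynchronizer:
    """Groups decoded frames of the integrated stream into time slices."""

    def __init__(self, num_streams: int,
                 write_slice: Callable[[List[object]], None]) -> None:
        if num_streams < 1:
            raise ValueError("num_streams must be at least 1")
        self.num_streams = num_streams
        self.write_slice = write_slice      # hypothetical accelerator write
        self.fifo: Deque[object] = deque()  # decoded frames, extraction order
        self.sync: List[object] = []        # frames of the current time slice

    def push(self, frame: object) -> None:
        """Accept one decoded frame, in the order it was extracted."""
        self.fifo.append(frame)
        self._drain()

    def _drain(self) -> None:
        # Move frames from the FIFO into the synchronization buffer and
        # release a complete time slice as soon as it is full.
        while self.fifo:
            self.sync.append(self.fifo.popleft())
            if len(self.sync) == self.num_streams:
                self.write_slice(list(self.sync))
                self.sync.clear()
```

Because a slice is only released when it is complete, every image stream advances by exactly one frame per write, which is what keeps the streams aligned.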

For example, assume that the user is currently playing back image stream 1410 as a moving image. If the user issues, at time slice 1420, a command to switch the image stream to 1430 and play it back, an image stream switching command is issued to the selector 760, and playback is switched to the moving image corresponding to image stream 1430 with minimal time lag.
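
The switching behavior can be sketched as a selector that picks one image out of each synchronized time slice. This is an illustrative Python sketch only: `StreamSelector` and its methods are hypothetical names, and the internal design of the selector 760 is not specified here.

```python
from typing import List, Sequence


class StreamSelector:
    """Chooses which image stream of each synchronized time slice is shown."""

    def __init__(self, num_streams: int, active: int = 0) -> None:
        if num_streams < 1:
            raise ValueError("num_streams must be at least 1")
        self.num_streams = num_streams
        self.active = 0
        self.switch_to(active)

    def switch_to(self, index: int) -> None:
        """Handle a switching command; takes effect from the next time slice."""
        if not 0 <= index < self.num_streams:
            raise ValueError("no such image stream")
        self.active = index

    def select(self, time_slice: Sequence[object]) -> object:
        """Return the frame of the active stream from one complete time slice."""
        if len(time_slice) != self.num_streams:
            raise ValueError("time slice is incomplete")
        return time_slice[self.active]


def play(selector: StreamSelector, slices: List[Sequence[object]]) -> List[object]:
    """Select one frame per time slice with the current selector setting."""
    return [selector.select(s) for s in slices]
```

Since all streams are already decoded for every time slice, a switch requires no seek or re-synchronization; the delay is at most one time slice.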

In addition to switching the reproduced moving image, the embodiment of FIG. 14 can also be applied to 3D moving image reproduction that uses two viewpoints, such as the parallax barrier method, in order to reproduce the left-eye image and the right-eye image with a high degree of synchronization.

The above functions of the present embodiment can be realized by a device-executable program written in a legacy or object-oriented programming language such as assembler, C, C++, or Java (registered trademark), or by an equivalent integrated circuit. The program of the present embodiment can be stored and distributed on a device-readable recording medium such as a hard disk device, CD-ROM, MO, flexible disk, EEPROM, or EPROM, and can also be transmitted via a network in a format usable by other devices.

Although the present embodiment has been described above, the present invention is not limited to the embodiment described. The embodiment was described as processing in units of pictures, that is, one frame at a time, but the encoding and decoding may instead be applied to part of an image for a specific purpose, or to a plurality of superimposed images. Other embodiments, additions, modifications, and deletions can be made within the range conceivable by those skilled in the art, and any such aspect is included in the scope of the present invention as long as it provides the operation and effects of the present invention.

Description of symbols: 100 ... encoder, 110 to 140 ... image stream supply units, 110a to 140n ... image streams, 150 ... mixer, 160 ... encoder, 210 ... input buffer, 212 ... adder/subtractor, 214 ... DCT unit, 216 ... quantizer, 218 ... variable-length encoder, 220 ... inverse quantizer, 222 ... inverse DCT unit, 224 ... adder, 226 ... image buffer, 228 ... inter-frame correlation predictor, 230 ... code amount controller, 232 ... output buffer, 310 ... input buffer, 312 ... adder/subtractor, 314 ... orthogonal transformer, 316 ... quantizer, 318 ... inverse quantizer, 320 ... inverse orthogonal transformer, 322 ... image buffer, 324 ... inter-frame correlation predictor, 326 ... encoder, 328 ... code amount controller, 330 ... output buffer, 332 ... adder,

Claims

A moving image reproduction method executed by a main control device that reproduces an encoded stream in which a plurality of image streams are multiplexed in time synchronization, wherein the main control device includes a decoding unit that decodes the encoded stream and a frame synchronization processing unit that outputs the plurality of image streams in time synchronization from the output of the decoding unit, the method comprising the following steps executed by the main control device:
the decoding unit obtaining, from a persistent recording medium or a public network, an encoded stream obtained by encoding an integrated stream generated by extracting, across a plurality of different image streams, images having a common reproduction timing and temporally convolving the images in the extraction order;
decoding the encoded stream in the decoding unit to generate the time-convolved integrated stream;
the frame synchronization processing unit receiving the integrated stream from the decoding unit, storing the images constituting the integrated stream in a first-in first-out manner, stopping the writing of the images constituting the integrated stream when a number of images corresponding to the number of image streams has been stored, and writing all the stored images to a graphic accelerator in time synchronization; and
the graphic accelerator creating the respective image streams and reproducing the moving image, wherein the frame synchronization processing unit includes a FIFO buffer that buffers the integrated stream decoded in the extraction order, and a synchronization buffer that has as many address areas as there are image streams and designates, by pointers and in the extraction order, the images in the FIFO buffer having a common reproduction timing,
wherein the frame synchronization processing unit further executes a step of instructing the graphic accelerator to read image data from the address areas designated by the pointers when the synchronization buffer has designated a number of images corresponding to the number of image streams, and wherein the frame synchronization processing unit is configured as a plug-in, DLL, or runtime library of the main control device.

The moving image reproduction method according to claim 1, wherein the step of generating the integrated stream includes a step of decoding the encoded stream according to an MPEG series format or an H.264 format.

The moving image reproduction method according to claim 1, further comprising a step of switching, during reproduction, the image stream to be sent to the graphic accelerator.

The moving image reproduction method according to claim 1, further comprising a step of starting to write an image constituting the integrated stream when writing to the graphic accelerator is completed.

The moving image reproduction method according to claim 1, wherein the image stream includes different viewpoint images for reproducing 3D video.

A moving image reproduction system that reproduces an encoded stream in which a plurality of image streams are multiplexed in time synchronization, the moving image reproduction system comprising:
an interface unit that reads an encoded stream obtained by encoding an integrated stream generated by extracting, across a plurality of different image streams, images having a common reproduction timing and temporally convolving the images in the extraction order;
a decoding unit that decodes the encoded stream to generate the time-convolved integrated stream;
a frame synchronization processing unit that acquires the integrated stream from the decoding unit, separates the images of the respective image streams constituting the integrated stream, and writes out the images in time synchronization;
a graphic accelerator that receives the output of the frame synchronization processing unit and generates a video signal of at least one of the plurality of image streams; and
a display device that receives the video signal from the graphic accelerator and displays a moving image, wherein the frame synchronization processing unit includes a FIFO buffer that buffers the integrated stream decoded in the extraction order, and a synchronization buffer that has as many address areas as there are image streams and designates, by pointers and in the extraction order, the images in the FIFO buffer having a common reproduction timing, and wherein the frame synchronization processing unit is implemented as a plug-in, DLL, or runtime library that executes a step of instructing the graphic accelerator to read image data from the address areas designated by the pointers when the synchronization buffer has designated a number of images corresponding to the number of image streams.

The moving image reproduction system according to claim 6, wherein the interface unit is a network interface that acquires the encoded stream via a public network or a storage interface that performs writing to or reading from a persistent storage device.

The moving image reproduction system according to claim 6 or 7, wherein the decoding unit decodes the encoded stream in an MPEG series format or an H.264 format.

The moving image reproduction system according to any one of claims 6 to 8, wherein, when the synchronization buffer is full of images, the frame synchronization processing unit stops writing to the FIFO buffer and writes the images to the graphic accelerator.

The moving image reproduction system according to any one of claims 6 to 9, wherein the moving image reproduction system reproduces a 3D image or reproduces a game story.

A device-executable program for causing a main control device, which reproduces an encoded stream in which a plurality of image streams are multiplexed in time synchronization, to execute the moving image reproduction method according to any one of claims 1 to 5.