JP2016213698A - Image coding device and program

Info

Publication number: JP2016213698A
Application number: JP2015096384A
Authority: JP
Inventors: Shinichi Sakaida (境田 慎一); Toshie Misu (三須 俊枝); Atsuro Ichigaya (市ヶ谷 敦郎)
Original assignee: Nippon Hoso Kyokai NHK
Current assignee: Japan Broadcasting Corp
Priority date: 2015-05-11
Filing date: 2015-05-11
Publication date: 2016-12-15
Anticipated expiration: 2035-05-11
Also published as: JP6527384B2

Abstract

PROBLEM TO BE SOLVED: To provide an image coding device and a program which, when compression-coding and multiplexing images of plural resolutions, can transmit the images within the band of a transmission path without deteriorating the images.

SOLUTION: A low resolution image and a high resolution image are coded separately. When the generated coded information quantity exceeds a predetermined value, the phase of the GOP (Group of Pictures) processing unit in the time (frame) direction is shifted between the low resolution image and the high resolution image, thereby smoothing the information quantity generated at the same time and keeping the total generated information quantity of the images within the bandwidth.

SELECTED DRAWING: Figure 1

Description

The present invention relates to a video encoding device, and more particularly to a device and a program that compression-encode and multiplex video of two or more spatial resolutions.

In recent years, the development and application of ultra-high-definition video such as 8K (spatial resolution 7680 × 4320) and 4K (spatial resolution 3840 × 2160) has been advancing in the broadcasting field. However, broadcasting (distributing) only ultra-high-definition video may not be sufficient. Depending on the state of the receiving equipment (the reception capability of the broadcast receiver or television, the display resolution, and so on), it may be necessary to transmit the same content at multiple resolutions: not only 8K alone, but also simultaneous 8K and 4K broadcasts, or broadcasts that additionally include 2K.

Transmitting such ultra-high-definition video requires highly efficient compression encoding. Video compression encoding schemes such as MPEG (Moving Picture Experts Group)-2, AVC (Advanced Video Coding)/H.264, and HEVC (High Efficiency Video Coding)/H.265 have been developed, standardized, and put to use. For transmission, a suitable scheme is selected from among these.

To transmit multiple videos simultaneously, each video signal is compression-encoded, and the results are multiplexed and sent over the transmission path as a multiplexed stream. However, because the available transmission path is constrained in bandwidth (bit rate) and other respects, efficient compression encoding and multiplexing that make effective use of the bandwidth are required.

When compression encoding holds image quality constant, the amount of information generated varies with the content and texture of the video being encoded. This scheme is called VBR (Variable Bit Rate). In contrast, when the bandwidth of the transmission path is fixed, as in broadcasting, the encoder is controlled so that the amount of generated information stays constant. This scheme is called CBR (Constant Bit Rate). Furthermore, when multiple contents are transmitted over a bandwidth-constrained path, there is a technique that controls encoding so that the sum of the information generated by all contents stays constant. This is called statistical multiplex coding, and it is used to exploit the bandwidth with as little waste as possible when the highest possible image quality must be maintained in a narrow band.

For transmitting the same content at multiple resolutions, hierarchical coding has been developed as an extension of ordinary compression encoding. In this scheme, the same video at multiple (for example, two) spatial resolutions is compression-encoded as a single stream. Decoding only part of the stream yields only the low-resolution video, while decoding the entire stream also yields the high-resolution video.

On the other hand, the transmission bands of existing terrestrial and satellite broadcasting systems are limited by internationally established channel plans and by technical constraints, so sufficient bandwidth to transmit ultra-high-definition video at multiple spatial resolutions may not be available. In that case, the sum of the information generated by compression-encoding each video can momentarily exceed the amount of information that can be transmitted (the bit rate), and the video may be degraded or break up as a result.

To address this problem, a multi-channel encoding device has been proposed that keeps the target amount of generated information after multiplexing constant by shifting the phase of the GOP (Group of Pictures) of the video signals between channels (Patent Document 1).

In the multi-channel encoding device described in Patent Document 1, the encoding of the I pictures (intra pictures) of the channels is distributed along the time axis. In addition, when the device detects that the multiplexed stream may exceed a predetermined code rate, an I-picture prohibition signal is output. The structure determination unit that receives this signal prohibits intra-frame encoding in all image encoding units and has them perform predictive encoding other than intra-frame encoding.

Patent Document 1: Japanese Patent No. 3147859

As described above, when multiple ultra-high-definition videos are compression-encoded and transmitted in a band that is insufficient for ultra-high-definition video, the decoded video may be severely degraded or may break up.

In a broadcast system in particular, the same content is expected to be transmitted at different resolutions. Because the content is the same, the frames that carry a large or small amount of information coincide across the videos, so the instantaneous sum of the information generated by compression encoding is more likely to exceed the bandwidth of the transmission path.

When the encoding device of Patent Document 1 detects that the multiplexed stream may exceed a predetermined code rate, it prohibits I-picture encoding. While the I-picture prohibition signal is being output, I pictures are eliminated on all channels, so the number of I pictures decreases overall and the image may deteriorate. Moreover, Patent Document 1 ordinarily just shifts the placement of I pictures in sequential, cyclic order and does not disclose a method for placing them optimally. It also does not examine the case of distributing the same content at different resolutions, nor does it consider application to hierarchical coding.

Some conventional video encoding devices provide a buffer that holds the information of multiple frames and encode within the bandwidth of the transmission path. However, these devices reduce the total amount of generated information by controlling quantization according to the buffer capacity. Because they flatten the amount of information over a certain time interval, image quality may be lowered and processing may be delayed.

Accordingly, in view of the problems described above, an object of the present invention is to provide a video encoding device and a program that, when compression-encoding and multiplexing multiple videos, enable video transmission within the bandwidth of the transmission path without degrading the video.

To solve the above problems, a video encoding device according to the present invention simultaneously compression-encodes and multiplexes a plurality of different videos and is characterized in that, when the amount of generated encoded information exceeds a predetermined value, it changes the phase of the reference structure for motion estimation and compensation used in encoding at least one of the plurality of different videos, thereby smoothing the amount of encoded information generated at the same time to the predetermined value or less.

The video encoding device preferably comprises: a plurality of video encoding units; a multiplexing unit that multiplexes the outputs of the plurality of video encoding units and outputs a command when the amount of encoded information generated by the plurality of video encoding units exceeds the predetermined value; and a GOP control unit that, based on the command, changes the phase of the reference structure for motion estimation and compensation used in encoding by at least one of the plurality of video encoding units.

In the video encoding device, the plurality of video encoding units preferably include a first video encoding unit that encodes a high-resolution video and a second video encoding unit that encodes a low-resolution video, and the input video signal to the second video encoding unit is preferably a video signal obtained by down-converting the input video signal to the first video encoding unit.

To solve the above problems, a video encoding device according to the present invention simultaneously compression-encodes and multiplexes a plurality of videos of different resolutions, where the compression encoding has a structure that predicts a higher-resolution video from a base video. The device is characterized in that, when the amount of generated encoded information exceeds a predetermined value, it changes the phase of the reference structure for motion estimation and compensation used in encoding at least one of the plurality of videos of different resolutions, thereby smoothing the amount of encoded information generated at the same time to the predetermined value or less.

The video encoding device preferably comprises: a plurality of video encoding units including a video encoding unit that encodes the base video and a video encoding unit that encodes the high-resolution video; a local decoding unit that decodes the output of the video encoding unit for the base video to generate a low-resolution video; an up-conversion unit that enlarges the low-resolution video and outputs it to the video encoding unit for the high-resolution video; a multiplexing unit that multiplexes the outputs of the plurality of video encoding units and outputs a command when the amount of encoded information generated by the plurality of video encoding units exceeds the predetermined value; and a GOP control unit that, based on the command, changes the phase of the reference structure for motion estimation and compensation used in encoding by at least one of the plurality of video encoding units.

Preferably, when the amount of generated encoded information exceeds the predetermined value, the video encoding device shifts the phase of the reference structure for motion estimation and compensation used in encoding by at least one of the plurality of video encoding units by a predetermined number of frames and determines whether the resulting amount of encoded information is within the predetermined value. If it is, the reference structure is set to the shifted phase. If it still exceeds the predetermined value, the shift by the predetermined number of frames is repeated.

To solve the above problems, a program according to the present invention causes a computer to function as the video encoding device described above.

With the video encoding device of the present invention, when multiple ultra-high-definition videos are compression-encoded and transmitted in a band that is insufficient for ultra-high-definition video, the instantaneous total amount of generated code can be smoothed without degrading the video (that is, without reducing the number of intra pictures). This enables transmission within the band and prevents degradation or breakup of the picture. In addition, because the processing times of the videos themselves are not shifted, no delay difference arises between the videos.

Furthermore, when hierarchical coding is performed, the present invention can improve the image quality of the enhancement layer by making effective use of the I pictures of the base layer in encoding the enhancement layer.

FIG. 1 is a block diagram of the video encoding device of Embodiment 1.
FIG. 2 shows the GOP structures and generated information amounts of two videos in the conventional approach.
FIG. 3 shows the GOP structures and generated information amounts of two videos in Embodiment 1.
FIG. 4 is a flowchart for controlling the phase of the GOP structure.
FIG. 5 is a block diagram of the video encoding device of Embodiment 2.
FIG. 6 shows the GOP structure and reference relationships of conventional hierarchical coding.
FIG. 7 shows the GOP structure and reference relationships of hierarchical coding in Embodiment 2.

Embodiments of the present invention are described below. In the following description, the input signal serving as the high-resolution video is assumed to be ultra-high-definition video such as 8K or 4K, but the high-resolution video is not limited to 8K or 4K. To simplify the description, it is assumed here that the same video (content) is transmitted at two different resolutions, but the present invention is not limited to two resolutions.

When the transmission band is limited, the following three methods are conceivable for transmitting two videos while keeping their total amount of encoded information constant.
(1) CBR-encode each video independently and keep each at a constant bit rate.
(2) Keep a constant bit rate by statistical multiplexing of the two videos.
(3) Use hierarchical coding and keep a constant bit rate.

Method (1) applies ordinary CBR encoding to each video. No information is exchanged between the two videos, and the total amount of generated code is neither actively reduced nor adjusted, so this method is not adopted in the present invention.

Method (2) uses statistical multiplexing to allocate the code amount appropriately between the two videos. It is described as Embodiment 1 of the present invention.

Method (3) allocates the code amount while treating the two videos as a single stream. It is described as Embodiment 2 of the present invention.

(Embodiment 1)
FIG. 1 shows a block diagram of a video encoding device 100 that uses statistical multiplexing. The video encoding device 100 includes a first video encoding unit 110, a second video encoding unit 120, a multiplexing unit 130, and a GOP control unit 140.

A high-resolution video, which is the first input signal, is input to the first video encoding unit 110, which compression-encodes it. Any encoding technology that uses inter-frame prediction can be used, such as MPEG-2, MPEG-4 AVC, or MPEG-H HEVC as standardized by ISO/IEC. The first video encoding unit 110 outputs the encoded video signal to the multiplexing unit 130 and outputs information about the GOP structure of the encoded video signal (the reference structure for motion estimation and compensation during encoding) to the GOP control unit 140.

A low-resolution video, which is the second input signal, is input to the second video encoding unit 120. When the first and second input signals are the same video content, a down-conversion unit 150 that converts high-resolution video to low-resolution video can be provided so that the second input signal is created by down-converting the first input signal.

The down-conversion unit 150 creates the low-resolution video by reducing the high-resolution video of the first input signal to, for example, 1/2 resolution both horizontally and vertically. An existing filter such as Lanczos3 can be used for the reduction, and the down-conversion method is not otherwise restricted.
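As a minimal sketch of the 1/2 horizontal and vertical reduction, the following uses a plain 2 × 2 box average. This is a deliberately simpler filter than the Lanczos3 example named in the text, chosen only to keep the illustration short. The function name `downconvert_half` and the list-of-rows frame representation are assumptions made for this sketch.

```python
def downconvert_half(frame):
    """Halve a frame horizontally and vertically with a 2x2 box average.

    `frame` is a list of rows of sample values (hypothetical representation).
    A box filter stands in for Lanczos3 here purely for brevity.
    """
    height, width = len(frame), len(frame[0])
    if height % 2 or width % 2:
        raise ValueError("frame dimensions must be even")
    return [
        [
            (frame[y][x] + frame[y][x + 1]
             + frame[y + 1][x] + frame[y + 1][x + 1]) / 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]
```

Applied to a 7680 × 4320 frame, a reduction of this kind would yield a 3840 × 2160 frame, which corresponds to the 8K to 4K case mentioned in the background.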

The second video encoding unit 120 compression-encodes the low-resolution video of the second input signal. Any encoding technology that uses inter-frame prediction can be used, such as MPEG-2, MPEG-4 AVC, or MPEG-H HEVC as standardized by ISO/IEC, but it is preferably a scheme that uses the same GOP processing unit as the first video encoding unit 110. The second video encoding unit 120 outputs the encoded video signal to the multiplexing unit 130 and outputs information about the GOP structure of the encoded video signal (the reference structure for motion estimation and compensation during encoding) to the GOP control unit 140.

The multiplexing unit 130 multiplexes and sends out the two encoded video signals output by the first video encoding unit 110 and the second video encoding unit 120. It also monitors the amount of encoded information generated by the two encoding units and outputs a command to the GOP control unit 140 when the total instantaneous amount of generated information exceeds a threshold. The threshold is a predetermined upper limit chosen so that the instantaneous total amount of generated information can be kept at or below the bandwidth (bit rate) of the transmission path.
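The monitoring performed by the multiplexing unit 130 can be sketched as follows. The text only says that the threshold is a predetermined upper limit tied to the channel bandwidth; deriving it as channel bit rate divided by frame rate, and the `margin` parameter, are assumptions of this sketch, as are both function names.

```python
def per_frame_threshold(channel_bitrate_bps, frame_rate_hz, margin=1.0):
    """Hypothetical threshold: bits available per frame period.

    This derivation is an assumption; the source only requires a
    predetermined upper limit at or below the channel bandwidth.
    """
    return channel_bitrate_bps * margin / frame_rate_hz


def frames_over_threshold(bits_high, bits_low, threshold_bits):
    """Return the frame indices where the summed output of both encoders
    exceeds the threshold (the condition that triggers the command)."""
    return [
        t
        for t, (h, l) in enumerate(zip(bits_high, bits_low))
        if h + l > threshold_bits
    ]
```

A non-empty result from `frames_over_threshold` corresponds to the case in which the multiplexing unit would issue its command to the GOP control unit 140.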

The GOP control unit 140 receives the GOP structure information of the video encoding from the first video encoding unit 110 and the second video encoding unit 120 and manages the relationship between the GOP structures of the two encoding units. Based on the command from the multiplexing unit 130, it outputs a control signal to at least one of the first video encoding unit 110 and the second video encoding unit 120 to change the phase of the GOP structure (the reference structure for motion estimation and compensation during encoding) used in the encoding process of the first video encoding unit 110 and/or the second video encoding unit 120.

The operation of the GOP control unit 140 is described in more detail below.

A GOP (Group of Pictures) is the processing unit of video compression and decompression. It generally consists of three picture types: intra pictures (I), which are encoded by intra prediction without inter-frame prediction; forward-predicted inter pictures (P), which are predicted only from past frames in the time direction; and bidirectionally predicted inter pictures (B), which are predicted from multiple frames in the time direction. Examples of GOP structures include IBBPBBPBBIBBP..., in which I and P pictures are inserted at fixed periods with B pictures in between, and IBBBBBBBBIBB..., which consists only of I and B pictures. The span from one intra picture to the next is defined as one GOP. The reference relationship (reference structure) for motion estimation and compensation within a GOP is referred to here as the GOP structure. Intra pictures are generally inserted at a regular period such as 0.5 seconds or 1 second, but an intra picture may also be forced at a cut change in the video. Because intra pictures do not use inter-frame prediction, they tend to generate more information when encoded.
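The two example structures above can be generated by a short sketch such as the following. The function name `gop_picture_types` and its parameters are illustrative assumptions; frames are listed in display order, and the reordering that a real encoder applies for B pictures is ignored.

```python
def gop_picture_types(num_frames, intra_period=9, p_period=3):
    """Return a string of picture types, one character per frame.

    intra_period: distance in frames between intra pictures.
    p_period: distance between P (or I) pictures; 0 means no P pictures.
    Both parameters are hypothetical names for this illustration.
    """
    types = []
    for t in range(num_frames):
        pos = t % intra_period  # position inside the current GOP
        if pos == 0:
            types.append("I")
        elif p_period and pos % p_period == 0:
            types.append("P")
        else:
            types.append("B")
    return "".join(types)
```

With the default parameters, the first 13 frames come out as IBBPBBPBBIBBP, and with `p_period=0` the first 12 frames come out as IBBBBBBBBIBB, matching the two patterns given in the text.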

Because the high-resolution and low-resolution videos of the same content originate from the same video, the amount of information generated for each picture follows almost the same trend in both when the GOP structure and the position (phase) of the intra pictures are the same. FIG. 2 shows the amounts of information generated when the conventional technique encodes a high-resolution video and a low-resolution video with, for example, the same IBBP... GOP structure. In this case, as FIG. 2 shows, the amount of information is largest for I pictures, followed by P pictures and then B pictures. When the high-resolution and low-resolution videos are encoded and transmitted simultaneously with generated information following the trend in FIG. 2, the amounts of information rise and fall together at the same times (frames).

Therefore, in the present invention, the multiplexing unit 130, which performs statistical multiplexing, monitors the amounts of information generated by encoding the high-resolution and low-resolution videos at the same time (frame). When their sum exceeds the threshold, it outputs to the GOP control unit 140 a command to change the GOP reference relationship of at least one of the two videos. The GOP control unit 140 outputs a control signal to the first video encoding unit 110 and/or the second video encoding unit 120 to change the reference structure for motion estimation and compensation used when encoding the video. Specifically, the time (frame) of the reference intra picture is shifted without changing the periodic structure of the motion estimation and compensation reference relationship (for example, a combination such as IBBP). This is referred to as shifting, or changing, the phase of the GOP structure.

FIG. 3 shows an example in which the intra pictures of the low-resolution video are delayed by two frames relative to those of the high-resolution video. By shifting the times of the intra pictures of the two videos without changing the periodic GOP structure, the sum of the information generated at the same time is reduced and the amount of generated information can be smoothed.
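The effect of the two-frame delay can be illustrated numerically. The per-picture bit costs below are invented values in arbitrary units, not figures from this document, and the names `HIGH_BITS`, `LOW_BITS`, `shifted`, and `per_frame_totals` are assumptions of this sketch. Frames are treated in display order over one GOP period.

```python
# Hypothetical bit costs per picture type (arbitrary units, not from the source).
HIGH_BITS = {"I": 100, "P": 40, "B": 10}
LOW_BITS = {"I": 24, "P": 10, "B": 3}
GOP = "IBBPBBPBB"  # one period of the IBBP... structure


def shifted(pattern, delay):
    """Delay a periodic picture-type pattern by `delay` frames (cyclic)."""
    delay %= len(pattern)
    return pattern[-delay:] + pattern[:-delay] if delay else pattern


def per_frame_totals(high_types, low_types):
    """Sum the two encoders' bit costs frame by frame."""
    return [HIGH_BITS[h] + LOW_BITS[l] for h, l in zip(high_types, low_types)]


aligned = per_frame_totals(GOP, GOP)              # same GOP phase
delayed = per_frame_totals(GOP, shifted(GOP, 2))  # low resolution delayed by 2

print(max(aligned), max(delayed))  # peak per-frame total in each case
print(sum(aligned), sum(delayed))  # total over the GOP period
```

Under these assumed costs the peak per-frame total drops while the total over the GOP period is unchanged, which is the smoothing behaviour described for FIG. 3: no intra picture is removed, only its timing is moved.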

Note that an encoding device or multiplexing device may include a buffer so that the amount of generated information can be totaled over multiple frames or per GOP. For this reason, the decision on whether to change the GOP structure is preferably made in the multiplexing unit 130, where a buffer is usually provided. If buffers are provided in multiple encoding units, the decision may be made by referring to the amounts of information in those buffers.

Next, the procedure for deciding whether to change the phase of the GOP structure, and by how many frames to shift it, is described.

FIG. 4 shows a flowchart for controlling the phase of the GOP structure of an encoding unit.

First, in step 1 (S1), the first video encoding unit 110 and the second video encoding unit 120 compression-encode the high-resolution video and the low-resolution video, respectively. At this initial stage, each video encoding unit encodes in continuity with the GOP structure it has been using.

In step 2 (S2), the amounts of information generated by the first video encoding unit 110 and the second video encoding unit 120 at the same time (frame) are added, and the sum is compared with the threshold. If the sum exceeds the threshold, the process proceeds to step 3 (S3). If it does not, the process ends for the time being, and S1 and S2 are repeated so that the total amount of generated information continues to be monitored.

In step 3 (S3), the amount by which the phase of the GOP structure changes is set. That is, a GOP phase control signal is output that shifts the intra picture (I) by a predetermined number of frames (the shift amount per iteration). In terms of the device, when the total amount of generated information exceeds the threshold, the multiplexing unit 130 outputs a command to change the phase of the GOP structure of a video encoding unit, and the GOP control unit 140, based on this command, outputs to the first video encoding unit 110 and/or the second video encoding unit 120 a control signal that changes the phase of the GOP structure (the reference structure for motion estimation and compensation) by the predetermined number of frames. For example, the GOP control unit 140 outputs to the second video encoding unit 120 a control signal that shifts the position of the intra picture of the GOP by a predetermined number of frames (for example, one frame). One video may be either delayed or advanced relative to the other, and the shift per iteration may be any number of frames.

Next, in step 4 (S4), the video encoding unit (for example, the second video encoding unit 120) changes the GOP phase of its encoding process, that is, the position (time) of the intra picture (I), to the frame set in S3. The process then returns to S1.

In S1, the video frames are encoded according to the newly set GOP structure (reference structure for motion estimation/compensation), and in S2 the sum of the encoded information generated by the first video encoding unit 110 and the second video encoding unit 120 under the new setting is compared with the threshold.

If the comparison shows that the sum of generated information still exceeds the threshold, the sequence S3 → S4 → S1 → S2 is repeated. With each iteration, the intra picture position used in encoding shifts by a further predetermined number of frames.

If the comparison in S2 shows that the maximum per-frame sum of generated information is at or below the predetermined threshold, the GOP structure phase shift is fixed at the number of frames reached at that point, and the process ends.

Finally, the signals encoded after the phase of the GOP structure has been changed are multiplexed by the multiplexing unit 130 and output. The monitoring of the sum of generated information in S2 continues thereafter.
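The S1 to S4 loop can be illustrated with a minimal Python sketch. It is not the patented implementation: the per-picture bit counts (800/200 for the high-resolution stream, 300/80 for the low-resolution stream), the GOP length of 15, and the function names `frame_bits` and `find_gop_phase` are all assumptions chosen for illustration, and B pictures are ignored for simplicity.

```python
from typing import List, Optional


def frame_bits(gop_len: int, i_bits: int, other_bits: int,
               phase: int, n_frames: int) -> List[int]:
    """Toy per-frame bit counts: one I picture every gop_len frames, offset by phase."""
    return [i_bits if (t - phase) % gop_len == 0 else other_bits
            for t in range(n_frames)]


def find_gop_phase(threshold: int, gop_len: int = 15, step: int = 1,
                   n_frames: int = 60) -> Optional[int]:
    """Return the first phase shift of the second encoder whose peak sum fits."""
    # Hypothetical bit counts for the first (high-resolution) encoder, phase fixed at 0.
    high = frame_bits(gop_len, 800, 200, 0, n_frames)
    phase = 0
    while phase < gop_len:
        # S1: encode the second (low-resolution) stream with the current GOP phase.
        low = frame_bits(gop_len, 300, 80, phase, n_frames)
        # S2: sum the bits generated at the same frame and compare with the threshold.
        peak = max(h + l for h, l in zip(high, low))
        if peak <= threshold:
            return phase  # the phase is fixed at this shift
        # S3 and S4: shift the intra picture by a further `step` frames and retry.
        phase += step
    return None  # no shift within one GOP satisfies the threshold


print(find_gop_phase(1000))
```

With these assumed numbers the aligned I pictures produce a peak of 1100 bits, which exceeds a threshold of 1000, while a one-frame shift lowers the peak to 880, so the search stops at a shift of 1. A threshold below 880 cannot be met by phase shifting alone, and the sketch returns `None` in that case.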

In the flowchart of FIG. 4, the condition for proceeding to S3 is that the amount of generated information exceeds the threshold, but it may instead be that the amount is equal to or greater than the threshold; any condition that detects that the generated information would exceed the bandwidth of the transmission path may be set as appropriate.

In this embodiment, the input signals are two versions of the same video (content) at different resolutions, but three or more resolutions may be used. In that case, the video encoding device is provided with as many video encoding units as there are inputs. When the sum of the generated encoded information exceeds the threshold, the phase of the GOP structure of any of the video encoding units can be adjusted. The shift can be set separately for each video encoding unit. For example, taking the intra picture (I) position of the first video encoding unit as the reference, the second video encoding unit shifts its intra picture position in steps of a first predetermined number of frames (for example, one frame) and the third video encoding unit shifts its intra picture position in steps of a second predetermined number of frames (for example, two frames), thereby adjusting the sum of the generated encoded information.
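The three-stream case can be sketched in the same way. The following Python example is illustrative only: the bit counts for the three hypothetical streams, the GOP length of 12, and the helper name `peak_sum` are assumptions, not values from the specification.

```python
from typing import Sequence, Tuple


def peak_sum(streams: Sequence[Tuple[int, int]], phases: Sequence[int],
             gop_len: int = 12, n_frames: int = 48) -> int:
    """Peak of the per-frame bit sum over all streams.

    Each stream is (bits of an I picture, bits of any other picture);
    phases[k] is the intra picture offset of stream k in frames.
    """
    total = [0] * n_frames
    for (i_bits, other_bits), phase in zip(streams, phases):
        for t in range(n_frames):
            total[t] += i_bits if (t - phase) % gop_len == 0 else other_bits
    return max(total)


# Hypothetical bit counts for three resolutions of the same content.
streams = [(1600, 400), (800, 200), (400, 100)]

aligned = peak_sum(streams, (0, 0, 0))    # all intra pictures coincide
staggered = peak_sum(streams, (0, 1, 2))  # second stream +1 frame, third +2 frames
print(aligned, staggered)
```

Under these assumptions the aligned peak is 2800 and the staggered peak is 1900, because no frame then carries more than one intra picture.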

In this embodiment, the same effect of enabling transmission within the band is also obtained when the two videos of different resolutions carry different content. Furthermore, the video encoding device is equally effective when the first input signal and the second input signal have the same resolution.

(Embodiment 2)
Embodiment 2 of the present invention is described below. The video encoding device of Embodiment 2 uses hierarchical encoding.

FIG. 5 shows a block diagram of a video encoding device 200 that uses hierarchical encoding. The video encoding device 200 includes a first video encoding unit 110, a second video encoding unit 120, a multiplexing unit 130, a GOP control unit 140, a base layer video local decoding unit 160, and an up-conversion unit 170.

The high-resolution video, which is the first input signal, is input to the first video encoding unit 110. The first video encoding unit 110 compresses and encodes the high-resolution video of the first input signal. Here, the high-resolution video is sent to the first video encoding unit 110 as the enhancement layer (EL) of the hierarchical encoding, and the first video encoding unit 110 performs enhancement layer encoding. In doing so, the base layer video (the video obtained by decoding the base layer encoded signal and enlarging it in the up-conversion unit 170) can be used as one of the prediction candidates for enhancement layer encoding. As the compression encoding technique, any technique that uses inter-frame prediction can be used, for example MPEG-2, MPEG-4 AVC, or MPEG-H HEVC as standardized by ISO/IEC. The first video encoding unit 110 outputs the encoded video signal (enhancement layer encoded signal) to the multiplexing unit 130, and outputs information on the GOP structure of the encoded video signal (the reference structure for motion estimation/compensation during encoding) to the GOP control unit 140.

The second video encoding unit 120 compresses and encodes the low-resolution video of the second input signal. Here, the low-resolution video is sent to the second video encoding unit 120 as the base layer (BL) of the hierarchical encoding, and the second video encoding unit 120 performs base layer encoding. As the compression encoding technique, any technique that uses inter-frame prediction can be used, for example MPEG-2, MPEG-4 AVC, or MPEG-H HEVC as standardized by ISO/IEC, but it is desirable to use a technique that employs the same GOP configuration as the first video encoding unit 110. The second video encoding unit 120 outputs the encoded video signal (base layer encoded signal) to the multiplexing unit 130 and to the base layer video local decoding unit 160, and outputs information on the GOP structure of the encoded video signal (the reference structure for motion estimation/compensation during encoding) to the GOP control unit 140.

The base layer video local decoding unit 160 decodes the base layer encoded signal output by the second video encoding unit 120 and generates the low-resolution video (base layer) of the second input signal. However, because the generated video is decoded after lossy compression by encoding, it is affected by the GOP structure used at encoding (for example, whether a picture was compressed as I, P, or B) and does not perfectly reproduce the low-resolution video of the second input signal.

The up-conversion unit 170 up-converts the low-resolution video (base layer) decoded by the base layer video local decoding unit 160, enlarging it to the original resolution, and outputs it to the first video encoding unit (enhancement layer encoding unit) 110. The enlargement filter may be any existing method, such as Lanczos3. Note that the enlarged video is not identical to the original video, because it is the result of reducing the original, encoding and decoding it, and then enlarging it.
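As an illustration of the kind of enlargement filter mentioned above, the following Python sketch up-converts a single row of samples by an integer factor with a Lanczos kernel of support 3. It is a minimal example under stated assumptions, not the filter of the device: the function names, the edge handling (clamping to the nearest sample), and the weight normalization are choices made here. A two-dimensional picture would apply the same operation to rows and then to columns.

```python
import math
from typing import List, Sequence


def lanczos(x: float, a: int = 3) -> float:
    """Lanczos kernel sinc(x) * sinc(x / a), zero outside |x| < a."""
    if x == 0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def upscale_row(row: Sequence[float], factor: int = 2, a: int = 3) -> List[float]:
    """Enlarge one row of samples by an integer factor."""
    n = len(row)
    out = []
    for j in range(n * factor):
        # Position of output sample j in input sample coordinates.
        x = (j + 0.5) / factor - 0.5
        left = math.floor(x) - a + 1
        acc = 0.0
        wsum = 0.0
        for k in range(left, left + 2 * a):
            w = lanczos(x - k, a)
            # Clamp indices at the borders (an assumed edge policy).
            acc += w * row[min(max(k, 0), n - 1)]
            wsum += w
        # Normalize so that flat regions keep their level.
        out.append(acc / wsum)
    return out


doubled = upscale_row([10.0, 10.0, 20.0, 20.0, 10.0, 10.0])
print(len(doubled))
```

The output row has twice as many samples as the input, and a constant input row stays constant because the weights are normalized.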

The operations of the multiplexing unit 130 and the GOP control unit 140 are basically the same as in Embodiment 1. That is, the multiplexing unit 130 multiplexes and transmits the two encoded video signals output by the first video encoding unit 110 and the second video encoding unit 120. The multiplexing unit 130 also observes the amount of encoded information generated by the first video encoding unit 110 and the second video encoding unit 120 (the encoded information generated for the base layer video and the enhancement layer video at the same time (frame)) and, when the total instantaneous amount exceeds the threshold, outputs a command to the GOP control unit 140.

The GOP control unit 140 receives GOP structure information from the first video encoding unit 110 and the second video encoding unit 120 and manages the relationship between their GOP structures. Based on a command from the multiplexing unit 130, it outputs a control signal to at least one of the first video encoding unit 110 and the second video encoding unit 120, thereby changing the phase of the GOP structure (the reference structure for motion estimation/compensation during encoding) used by the first video encoding unit 110 and/or the second video encoding unit 120 by a predetermined number of frames.
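The division of roles between the multiplexing unit 130 and the GOP control unit 140 can be sketched as follows. This is a hypothetical Python model, not an interface defined in the specification: the class `GopController`, its methods, the function `exceeds`, and the encoder labels "110" and "120" used as dictionary keys are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GopController:
    """Tracks the GOP phase of each encoder and shifts it on command."""
    gop_len: int
    step: int = 1  # predetermined number of frames per shift
    phases: Dict[str, int] = field(default_factory=dict)

    def register(self, encoder_id: str, phase: int = 0) -> None:
        """Record the GOP phase reported by an encoder."""
        self.phases[encoder_id] = phase % self.gop_len

    def on_command(self, target: str) -> int:
        """Shift the target encoder's intra picture position by `step` frames."""
        self.phases[target] = (self.phases[target] + self.step) % self.gop_len
        return self.phases[target]

    def relative_phase(self, a: str, b: str) -> int:
        """Offset of encoder b's intra picture relative to encoder a, in frames."""
        return (self.phases[b] - self.phases[a]) % self.gop_len


def exceeds(bits_by_encoder: Dict[str, int], threshold: int) -> bool:
    """Multiplexer-side check on the bits generated at one frame time."""
    return sum(bits_by_encoder.values()) > threshold


ctrl = GopController(gop_len=15)
ctrl.register("110")  # enhancement layer encoder
ctrl.register("120")  # base layer encoder
if exceeds({"110": 800, "120": 300}, threshold=1000):
    ctrl.on_command("120")
print(ctrl.relative_phase("110", "120"))
```

In this toy run the summed 1100 bits exceed the assumed threshold of 1000, so the base layer encoder is shifted by one frame relative to the enhancement layer encoder.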

Hierarchical encoding is briefly described with reference to FIG. 6. Hierarchical encoding can use a scheme standardized as an extension of an encoding technique such as MPEG-2, MPEG-4 AVC, or MPEG-H HEVC mentioned above. For example, in MPEG-H HEVC, the video obtained by decoding the base layer encoded signal and enlarging it in the up-conversion unit 170 is used as one of the prediction candidates in enhancement layer encoding. The arrows in FIG. 6 indicate the reference relationships for motion estimation/compensation. In hierarchical encoding, the base layer (BL) is processed in the same way as in ordinary encoding, whereas the enhancement layer (EL) is predicted both from the BL and from other EL frames.

FIG. 7 shows an example in which the intra picture time of the enhancement layer video is delayed by one frame relative to the base layer video. Shifting the intra picture times of the two videos makes it possible to reduce the sum of the information generated at the same time. Whether to change the phase of the GOP structure, and by how many frames, can be decided in the same way according to the flowchart of FIG. 4. The comparison between the amount of generated information and the threshold is preferably performed in the multiplexing unit 130, where a buffer is normally provided, but it can also be performed in the first video encoding unit (enhancement layer encoding unit), where buffers for both encoded outputs are provided.

As indicated by the thick-line boxes in FIG. 7, in some frames an I picture or P picture of the BL is used to predict a B picture of the EL. When the GOP structure is identical in the BL and the EL, as in the conventional hierarchical encoding of FIG. 6, a B picture of the EL is predicted from a B picture of the BL. In the present invention (FIG. 7), prediction from an I or P picture allows a reference frame of higher quality than a B picture to be used, which in turn improves the quality of the EL B pictures.
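The pairing of picture types under a shifted GOP phase can be checked with a short Python sketch. The GOP pattern `IBBPBBPBBPBBPBB` (display order, length 15) is an assumption made for illustration; the actual pattern of FIG. 7 is not reproduced here, and the function name is hypothetical.

```python
# Hypothetical GOP pattern in display order; FIG. 7 may use a different one.
PATTERN = "IBBPBBPBBPBBPBB"


def el_b_with_bl_anchor(el_delay: int, n_frames: int = len(PATTERN)) -> int:
    """Count frames where an EL B picture coincides with a BL I or P picture."""
    count = 0
    for t in range(n_frames):
        bl_type = PATTERN[t % len(PATTERN)]
        # The EL uses the same pattern, delayed by el_delay frames.
        el_type = PATTERN[(t - el_delay) % len(PATTERN)]
        if el_type == "B" and bl_type in "IP":
            count += 1
    return count


print(el_b_with_bl_anchor(0), el_b_with_bl_anchor(1))
```

With identical phases no EL B picture has a co-timed BL I or P picture, whereas with a one-frame delay 5 of the 15 frames in each GOP do under this assumed pattern. These are the frames in which the inter-layer reference is an I or P picture rather than a B picture.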

The signals encoded after the phase of the GOP structure has been changed are multiplexed by the multiplexing unit 130 and output.

In this embodiment, the input signals are two versions of the same video (content) at different resolutions, namely the base layer video and the enhancement layer video, but intermediate layer videos may be added so that three or more resolutions are handled. In that case, the video encoding device is provided with as many video encoding units as there are inputs, and each lower-layer video encoding unit is provided with a local decoding unit that decodes its encoded signal and an up-conversion unit that up-converts the decoded video and outputs it to the video encoding unit of the layer above. When the sum of the generated encoded information exceeds the threshold, the phase of the GOP structure of any of the video encoding units can be adjusted. The shift can be set separately for each video encoding unit so as to adjust the sum of the generated encoded information.

In Embodiment 2, when HEVC hierarchical encoding is used, the video encoding device functions in the same way even if the two videos of different resolutions carry different content.

A computer can suitably be used to function as the video encoding device described above. This is realized by storing, in a storage unit of the computer, a program describing the processing that implements each function of the video encoding device, and having the central processing unit (CPU) of the computer read out and execute the program. The program can be recorded on a computer-readable recording medium.

The above embodiments have been described as representative examples, but it will be apparent to those skilled in the art that many changes and substitutions can be made within the spirit and scope of the present invention. Accordingly, the present invention should not be construed as limited by the above embodiments, and various modifications and changes are possible without departing from the scope of the claims. For example, a plurality of the constituent blocks or steps described in the embodiments can be combined into one, or a single block or step can be divided.

100 Video encoding device
110 First video encoding unit
120 Second video encoding unit
130 Multiplexing unit
140 GOP control unit
150 Down-conversion unit
160 Base layer video local decoding unit
170 Up-conversion unit
200 Video encoding device

Claims

A video encoding device that simultaneously compresses, encodes, and multiplexes a plurality of different videos,
characterized in that, when the amount of generated encoded information exceeds a predetermined value, the device changes the phase of the motion estimation/compensation reference structure used in encoding at least one of the plurality of different videos, thereby smoothing the amount of encoded information generated at the same time so that it is equal to or less than the predetermined value.

The video encoding device according to claim 1, comprising:
a plurality of video encoding units;
a multiplexing unit that multiplexes the outputs of the plurality of video encoding units and outputs a command when the amount of encoded information generated in the plurality of video encoding units exceeds the predetermined value; and
a GOP control unit that, based on the command, changes the phase of the motion estimation/compensation reference structure used in encoding by at least one of the plurality of video encoding units.

The video encoding device according to claim 1 or 2, wherein
the plurality of video encoding units includes a first video encoding unit that encodes a high-resolution video and a second video encoding unit that encodes a low-resolution video, and
the input video signal to the second video encoding unit is a video signal obtained by down-converting the input video signal to the first video encoding unit.

A video encoding device that simultaneously compresses, encodes, and multiplexes a plurality of videos having different resolutions, the compression encoding having a structure in which a high-resolution video is predicted from a basic video,
characterized in that, when the amount of generated encoded information exceeds a predetermined value, the device changes the phase of the motion estimation/compensation reference structure used in encoding at least one of the plurality of videos having different resolutions, thereby smoothing the amount of encoded information generated at the same time so that it is equal to or less than the predetermined value.

The video encoding device according to claim 4, comprising:
a plurality of video encoding units including a video encoding unit that encodes the basic video and a video encoding unit that encodes the high-resolution video;
a local decoding unit that decodes the output of the video encoding unit that encodes the basic video and generates a low-resolution video;
an up-conversion unit that enlarges the low-resolution video and outputs it to the video encoding unit that encodes the high-resolution video;
a multiplexing unit that multiplexes the outputs of the plurality of video encoding units and outputs a command when the amount of encoded information generated in the plurality of video encoding units exceeds the predetermined value; and
a GOP control unit that, based on the command, changes the phase of the motion estimation/compensation reference structure used in encoding by at least one of the plurality of video encoding units.

The video encoding device according to any one of claims 1 to 5, wherein,
when the amount of generated encoded information exceeds the predetermined value, the phase of the motion estimation/compensation reference structure used in encoding by at least one of the plurality of video encoding units is shifted by a predetermined number of frames, it is determined whether the amount of encoded information is then within the predetermined value, and, if it is within the predetermined value, the reference structure is set to the shifted phase, whereas, if it exceeds the predetermined value, the phase is shifted by a further predetermined number of frames.

A program for causing a computer to function as the video encoding device according to any one of claims 1 to 6.